Analysis of a Quasi-Chemical Model for Bacterial Growth

We analyze a quasi-chemical model for bacterial growth in the context of a parameter estimation problem using available experimental data sets. For many of the data sets the dynamics are simple, and we show that the quasi-chemical model (QCM) is overparameterized in these cases. This results in larger uncertainty in the estimated parameters and, in some cases, instability in the inverse problem. We illustrate methods for reducing the uncertainty present in the estimated model parameters. We first consider a model reduction technique in which subsets of the QCM parameters are fixed at nominal values and hypothesis testing is used to compare the nested models. An orthogonal decomposition of the sensitivity matrix is used to guide the choice of which parameters are fixed. Additionally, surrogate models are developed for comparison with the QCM. In most cases one of the surrogate models fits the data nearly as well as the QCM while reducing the uncertainty in the model parameters.

standing and prediction of bacteria growth on food in various environmental settings. High-pressure processing (HPP) is one of the most widely used nonthermal food processing techniques. HPP results in different bacteria growth dynamics than would result from traditional thermal or chemical processing techniques. Often, this results in complex survival curves with at least one change in concavity [12]. This has led to the development of more sophisticated models, including a class of quasi-chemical models (QCM), that are capable of capturing the effects of microbial lag, inactivation, and tailing [11,8,9]. These models have successfully been used to fit laboratory data by means of an inverse problem. However, to the authors' knowledge, little effort has been made to account for the uncertainty present in the estimated model parameters. Using the data provided in [8], we will perform the inverse problem with various models and quantify the uncertainty in the estimated parameters by constructing asymptotic confidence intervals.
In this work we will focus on the QCM originally derived in [11], but these results can be readily extended to other quasi-chemical models. It has been observed that the inverse problem using this model is ill-posed, with the presence of many local minima. Additionally, there are subsets of parameter values which cause the solutions of the differential equations to grow so rapidly that many standard ODE solvers fail. Our focus in this work is not to improve the robustness of the optimization using the current bacterial growth model. Rather, we aim to illustrate that for some data sets the resulting estimates have very large uncertainty bounds. We will consider methods aimed at improving the confidence in the estimated parameters. One such method is a model reduction in which the number of estimated parameters is reduced. We will detail this approach in Section 2.1. For some of the data sets where the uncertainty is large, we will propose simple models capable of achieving model fits similar to those of the bacterial growth model, but with better uncertainty properties. Since our goal is not to develop a robust optimization of the bacterial growth model, we will typically choose a starting point for the optimization near a local minimum that yields a model fit similar to what is shown in [8]. Additionally, for a particular data set, if the optimization routine searches a region of parameter space which causes the model to grow unboundedly, we will enforce ad hoc bounds on the parameters.

Quasi-Chemical Model
The original quasi-chemical model proposed in [11,8] is given by

    dM/dt  = -k_1 M,
    dM*/dt = k_1 M + (k_2 - k_4) M* - h k_3 M* A,      (1.1)
    dA/dt  = k_2 M* - h k_3 M* A,
    dD/dt  = k_4 M* + h k_3 M* A.

In the above equations M is the concentration of lag phase cells, M* the growth phase cells, A the antagonist cells, and D the dead cells; the values k_j, j = 1, 2, 3, 4, are the rate constants; and h is a scaling factor set to h = 10^-9. The initial conditions are given by [M, M*, A, D]^T(0) = [I, 0, 0, 0]^T. We can calculate the solution for the lag phase cells as M(t) = I e^{-k_1 t}. Then, the total microbial population is given by

    U(t) = M(t) + M*(t).      (1.2)

Notice that the concentration of dead cells D is uncoupled from the other equations, and since we have no measurement data on the number of dead cells, we may ignore this equation. Furthermore, the term (k_2 - k_4) M* will lead to a correlation between the parameters k_2 and k_4. In consideration of this, we define α = k_2 - k_4. Hence, we can now consider (1.1) as a system of two differential equations given by

    dM*/dt = k_1 I e^{-k_1 t} + α M* - h k_3 M* A,      (1.3)
    dA/dt  = k_2 M* - h k_3 M* A.

Then, with x = [M*, A]^T and θ = (k_1, k_2, k_3, α)^T, (1.3) can be written compactly as dx/dt = f(x; θ), where f(x; θ) represents the right hand side of (1.3).
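As a concrete illustration, the reduced system described above can be integrated numerically. The following is a minimal Python sketch (the paper itself uses MATLAB's ode15s); the parameter values are illustrative placeholders, not fitted values, and the right-hand side assumes the reduced form with M(t) = I e^{-k_1 t} substituted in.

```python
import numpy as np
from scipy.integrate import solve_ivp

H = 1e-9  # scaling factor h from the model

def qcm_reduced_rhs(t, x, k1, k2, k3, alpha, I):
    """Right-hand side of the reduced two-equation QCM.

    x[0] = M* (growth-phase cells), x[1] = A (antagonist);
    the lag-phase cells are known explicitly: M(t) = I*exp(-k1*t).
    """
    M = I * np.exp(-k1 * t)
    Mstar, A = x
    dMstar = k1 * M + alpha * Mstar - H * k3 * Mstar * A
    dA = k2 * Mstar - H * k3 * Mstar * A
    return [dMstar, dA]

def total_population(t, sol_y, k1, I):
    """Total microbial population U(t) = M(t) + M*(t)."""
    return I * np.exp(-k1 * t) + sol_y[0]

# Illustrative (not fitted) parameter values.
k1, k2, k3, alpha, I = 0.5, 1.0, 1.0e8, 0.3, 1.0e3
ts = np.linspace(0.0, 10.0, 101)
sol = solve_ivp(qcm_reduced_rhs, (0.0, 10.0), [0.0, 0.0],
                args=(k1, k2, k3, alpha, I), t_eval=ts, method="LSODA")
U = total_population(sol.t, sol.y, k1, I)
```

With these values the antagonist term h k_3 A M* eventually dominates the growth term, producing the characteristic growth-then-decline shape the QCM was designed to capture.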
The total microbial population can be measured by plate counts that do not distinguish between cells in the growth and lag phases. Furthermore, the collected data are typically presented on a log scale. Thus, we will fit the above model to the log-scaled data. In light of this we assume a statistical model of the form

    Y_j = g(t_j; θ_0) + E_j,  j = 1, ..., n,      (1.5)

where Y_j is a random variable composed of the log-scaled total microbial population count, g(t; θ) = log(U(t; θ)), at the sampling time t_j, with θ_0 the "true" or nominal parameters, and E_j the measurement error. With this assumption, the estimate θ̂ can be found using an ordinary least squares formulation with realizations

    y_j = g(t_j; θ_0) + ε_j,

where y_j is a realization of Y_j, j = 1, ..., n, in (1.5) and ε_j is a realization of E_j. That is,

    θ̂ = argmin_θ Σ_{j=1}^n [y_j − g(t_j; θ)]².
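The OLS formulation can be sketched in a few lines of Python. To keep the example self-contained, a plain exponential stands in for U(t; θ) (the actual objective would use the QCM solution); the data are synthetic, generated with a known rate and additive noise on the log scale.

```python
import numpy as np
from scipy.optimize import least_squares

def g(t, theta, I=1.0e3):
    # Log-scaled model output; a plain exponential stands in for U(t; theta).
    return np.log10(I * np.exp(theta[0] * t))

rng = np.random.default_rng(0)
t_obs = np.linspace(0.0, 8.0, 9)
theta_true = np.array([0.4])
y_obs = g(t_obs, theta_true) + rng.normal(0.0, 0.05, t_obs.size)

# Ordinary least squares: minimize sum_j (y_j - g(t_j; theta))^2.
fit = least_squares(lambda th: y_obs - g(t_obs, th), x0=[0.1])
theta_hat = fit.x
```

`least_squares` minimizes the sum of squared residuals returned by the supplied function, which is exactly the OLS criterion above.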

Sensitivity Analysis and Uncertainty Quantification
In this section we investigate the sensitivity of the QCM with respect to the parameters θ_k, k = 1, 2, 3, 4. Since the model is relatively simple, we can compute the traditional sensitivities using the well-known sensitivity equations

    d/dt (∂x/∂θ) = (∂f/∂x)(∂x/∂θ) + ∂f/∂θ,      (2.3)

where the entries ∂x_i/∂θ_k are the individual traditional sensitivity functions. We are also concerned with the sensitivity of the total microbial population U(t). We find that

    ∂U/∂θ_k = ∂M*/∂θ_k − δ_{1k} I t e^{−k_1 t},

where δ_{1k} is the Kronecker delta function. Since we are fitting the log-scaled total microbial population, we must consider the sensitivity of g(t; θ),

    ∂g/∂θ_k = (1/U) ∂U/∂θ_k.

In order to give a level of confidence in the parameter estimates, we will compute confidence intervals. The standard least squares estimator θ̂ has the asymptotic properties [4,6,7]

    θ̂ ~ N(θ_0, Σ_0),

where θ_0 is the nominal or "true" parameter vector, and the 4 × 4 covariance matrix is given by the approximation

    Σ_0 ≈ σ_0² [F(θ_0)^T F(θ_0)]^{-1}.

Here the n × 4 sensitivity matrix is given by

    F_{jk}(θ) = ∂g(t_j; θ)/∂θ_k.

Since θ_0 is unknown, we will use the estimates Σ̂ = σ̂² [F(θ̂)^T F(θ̂)]^{-1}, with the approximation for the variance

    σ̂² = (1/(n − κ)) Σ_{j=1}^n [y_j − g(t_j; θ̂)]²,

where κ is the number of estimated parameters. We can then construct the 100(1 − ρ)% level confidence intervals by

    [θ̂_j − t_{1−ρ/2} SE_j , θ̂_j + t_{1−ρ/2} SE_j],

where SE_j = sqrt(Σ̂_{jj}), and the critical value t_{1−ρ/2} satisfies P(S ≥ t_{1−ρ/2}) = ρ/2, where S has a Student's t distribution with n − κ degrees of freedom.
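The confidence-interval construction can be sketched as follows. The sensitivity matrix F and residual vector here are toy stand-ins for those produced by the QCM fit (a linear model, for which F is exact); only the formulas Σ̂ = σ̂² (F^T F)^{-1} and θ̂_j ± t SE_j are being illustrated.

```python
import numpy as np
from scipy.stats import t as student_t

def standard_errors(F, residuals):
    """SEs from the n x kappa sensitivity matrix F evaluated at theta_hat
    (F[j, k] = d g(t_j; theta)/d theta_k) and the OLS residuals."""
    n, kappa = F.shape
    sigma2 = residuals @ residuals / (n - kappa)   # bias-corrected variance
    Sigma = sigma2 * np.linalg.inv(F.T @ F)        # asymptotic covariance
    return np.sqrt(np.diag(Sigma))

def confidence_intervals(theta_hat, F, residuals, rho=0.05):
    """100(1-rho)% asymptotic intervals theta_hat_j +/- t_crit * SE_j."""
    n, kappa = F.shape
    se = standard_errors(F, residuals)
    tcrit = student_t.ppf(1.0 - rho / 2.0, df=n - kappa)
    return theta_hat - tcrit * se, theta_hat + tcrit * se

# Toy check on a linear model g(t; theta) = theta_0 + theta_1 * t.
t_obs = np.linspace(0.0, 1.0, 20)
F = np.column_stack([np.ones_like(t_obs), t_obs])
resid = 0.01 * np.sin(37.0 * t_obs)   # stand-in residual vector
lo, hi = confidence_intervals(np.array([1.0, 2.0]), F, resid)
```

Note how near-zero sensitivity columns in F make F^T F nearly singular, which is exactly the mechanism behind the very large standard errors reported later for k_2 and k_3.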

Model Reduction via Parameter Ranking and Model Comparison Testing
As we will see in the next section, the data sets typically do not support reliable estimation of all of the model parameters. For this reason we will perform a parameter ranking based on an orthogonalization of the sensitivity matrix F(θ). This is implemented using an "economy size" QR decomposition of the matrix F with column pivoting, so that F P = Q R, where P is a permutation matrix. The order of the permutations gives the ranking of the parameters, where the rankings are chosen according to the 2-norm of the sensitivity with respect to each parameter (see [10] for a detailed description). In [1,5] global techniques similar to those presented here were considered. At this time, we do not have reliable ranges for the parameter values, and so we do not consider global methods.
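A pivoted QR ranking of this kind can be sketched with SciPy, whose `scipy.linalg.qr` with `pivoting=True` performs the greedy column-norm pivoting described above. The sensitivity matrix below is synthetic, built so that one column dominates and one is nearly zero.

```python
import numpy as np
from scipy.linalg import qr

def rank_parameters(F, names):
    """Rank parameters by a pivoted (economy-size) QR factorization of the
    sensitivity matrix F: F[:, piv] = Q R, with pivots chosen greedily by
    remaining column norm, so piv lists parameters in decreasing importance."""
    _, _, piv = qr(F, mode="economic", pivoting=True)
    return [names[j] for j in piv]

# Toy sensitivity matrix: the "alpha" column dominates, "k3" is nearly zero.
rng = np.random.default_rng(1)
F = np.column_stack([rng.normal(size=50),
                     10.0 * rng.normal(size=50),
                     1e-6 * rng.normal(size=50)])
ranking = rank_parameters(F, ["k1", "alpha", "k3"])
```

The permutation array returned by the factorization is the parameter ranking: the first pivot is the column of largest norm, and each subsequent pivot is chosen after projecting out the columns already selected.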
Our goal is to fit the data with the minimal set of parameters that can be reliably estimated. To accomplish this we will take the following approach. We will first estimate only the most important parameter (determined from the sensitivity-based parameter ranking scheme) and fix the three remaining parameters at nominal values. Then we will estimate the two most important parameters and use a statistically-based model comparison test to determine whether the data can be adequately described by the single-parameter model compared to the two-parameter model. If it is determined that the second parameter significantly improves the model fit to the data, then we will estimate the three most important parameters and compare with the two-parameter model, again using a model comparison test. In this way we have an automated method for ranking the parameters and determining the minimal set of parameters needed to describe the data.
Since we are comparing nested models, we are testing the null hypothesis that the constrained model provides an adequate fit to the data. Let J(θ̂; y) and J(θ̂_H; y) denote the values of the objective function, where θ_H ⊂ θ and dim(θ) − dim(θ_H) = 1 (i.e., θ_H contains one less parameter than θ). Then the test statistic can be computed by T(y) = J(θ̂_H; y) − J(θ̂; y) ≥ 0, and we define U(y) = n T(y) / J(θ̂; y). Under the null hypothesis, U(y) converges in distribution (as n → ∞) to a chi-square random variable with one degree of freedom [4], which provides the critical values for the test.
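Assuming, as in the model comparison framework of [4], that U(y) is compared against a chi-square(1) critical value under the null hypothesis, the test can be sketched as follows. The objective-function values in the example are made-up numbers for illustration.

```python
from scipy.stats import chi2

def model_comparison_test(J_full, J_constrained, n, alpha=0.05):
    """Nested-model test: H0 says the constrained model (one fewer
    parameter) fits adequately. U = n * (J_H - J) / J is compared to a
    chi-square(1) critical value (its asymptotic null distribution)."""
    T = J_constrained - J_full          # always >= 0 for nested fits
    U = n * T / J_full
    crit = chi2.ppf(1.0 - alpha, df=1)
    return U, crit, U > crit            # True -> reject H0

# Example: a large drop in the objective rejects H0.
U, crit, reject = model_comparison_test(J_full=0.5, J_constrained=2.0, n=8)
```

Here U = 8 * 1.5 / 0.5 = 24, well above the 5% chi-square(1) critical value of about 3.84, so the extra parameter is judged to significantly improve the fit.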

Aggregate vs. Individual Data
Here we discuss potential issues that can occur with mathematical modeling and estimation in connection with bacteria kinetics experiments. Consider an experiment in which independent samples of inoculated food are prepared and incubated. At each sampling time, a sample is extracted and the bacteria are recovered and plate counted. If the sample is discarded after enumeration, then we are not repeatedly measuring the same sample over time. This type of data is called aggregate data, which is similar to the type of data one would obtain in ecological catch and release experiments; see [4, Chapter 5]. Repeated measurements of the same sample over a period of time would result in individual data. Aggregate data is often treated as individual data, and modeling and estimation techniques derived for the purpose of data assimilation with individual data are applied. When this is done, the effects of variability across samples are ignored, which can lead to overconfidence in the estimation results and in subsequent predictions. Techniques designed to handle aggregate data have been developed [4], but they require sufficient replicates at each sampling time, which can greatly increase the cost of an experiment. Examples of the pitfalls which can occur when aggregate data is treated as individual data, and a more detailed discussion of aggregate data approaches applied in a pharmacokinetic setting, can be found in [2]. We briefly discuss potential problems that may arise when utilizing an individual data-based mathematical modeling framework to describe an aggregate data experiment by considering the following hypothetical scenario. Suppose we have an experiment involving two different strains of a microorganism with different growth rates. In Figure 1 we present two simulations of the model: in one strain α = 2.7 (green circles) and in the other strain α = 3.7 (blue dots). In this depiction we have created data which matches our hypothetical scenario. Assume that there are microorganisms from several different strains, and that any sample is randomly selected from this data. In Figure 2 we depict how a data set collected in this manner might appear. To construct this data set, at each time point we randomly select a data point from one of the two trajectories shown in Figure 1. The data set given in Figure 2 appears to resemble a noisy data set representing a population with a constant growth rate. Therefore, if we use this data set to estimate α, we will estimate a value for α somewhere roughly midway between α = 2.7 and 3.7 (the two values used to produce the data set).
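The scenario above can be reproduced in a few lines. The sampling scheme below mimics the construction of Figure 2: two noise-free exponential trajectories, one chosen at random per time point, after which a single growth rate is fit to the pooled data.

```python
import numpy as np

rng = np.random.default_rng(2)
t = np.linspace(0.0, 1.0, 11)
x_slow = np.exp(2.7 * t)   # strain with alpha = 2.7
x_fast = np.exp(3.7 * t)   # strain with alpha = 3.7

# Aggregate-style sampling: each time point comes from a randomly chosen strain.
picks = rng.integers(0, 2, size=t.size)
y = np.where(picks == 0, x_slow, x_fast)

# Fitting log(y) = alpha * t by least squares recovers an intermediate rate,
# not either of the two rates that actually generated the data.
alpha_hat = np.sum(t * np.log(y)) / np.sum(t * t)
```

Since log(y_j) = α_j t_j exactly, the least squares slope is a weighted average of 2.7 and 3.7, which is the "roughly midway" estimate described in the text.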
Ideally, we would like to be able to use the data shown in Figure 2 to infer that the samples contain microorganisms that grow at two distinctly different rates. In this way, we can consider α as an individual-based parameter which has an associated distribution. If we impose a distribution on the model parameter α to obtain a model of the form

    v(t; G) = ∫ x(t; α) dG(α),

we will not be able to determine that two growth rates produced the data, regardless of the assumptions we place on the distribution G(α). Even if we assume G is a discrete distribution such that Prob{α = 2.7} = Prob{α = 3.7} = 0.5, the resulting model v(t; G) will only exhibit the average behavior. In order to gain more insight into the underlying distribution G(α), we would need to acquire replicate data at each sampling time, where the number of replicates counted at each sampling time is sufficient to identify the distribution G (see [2] for more details).
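A sketch of the discrete-mixture version of v(t; G), illustrating the point above: the 50/50 mixture is a single smooth curve lying between the two pure trajectories, so fitting it cannot reveal that two distinct rates generated the data.

```python
import numpy as np

def v(t, alphas, probs):
    """Aggregate model v(t; G) = sum_i P(alpha_i) * x(t; alpha_i) for a
    discrete distribution G on the growth rate alpha, with x(t; a) = exp(a*t)."""
    t = np.asarray(t, dtype=float)
    return sum(p * np.exp(a * t) for a, p in zip(alphas, probs))

t = np.linspace(0.0, 1.0, 5)
v_mix = v(t, alphas=[2.7, 3.7], probs=[0.5, 0.5])

# The mixture is bounded pointwise by the two pure trajectories:
# exp(2.7 t) <= v(t; G) <= exp(3.7 t) for t >= 0.
```
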

Data Fitting Results Using the Quasi-chemical Model
In spite of the above warning regarding potential difficulties with using the data of [8] to estimate parameters with confidence, we too treat the aggregate data as individual data. We investigate the inverse problem using the four-parameter QCM given by (1.3) on various data sets taken from [8].

Inverse Problem
Here we present the results of the inverse problem using the counts of S. aureus in bread at a pH of 5.4 and a temperature of T = 35°C, over a wide range of environmental conditions of water activity a_W; see Figure 1 in [8]. For each data set we give the model fit to the data, the sensitivity of g(t; θ) with respect to the parameters, and the residual plots (see Figures 3-6). The estimated values for the parameters along with the standard errors (SE) are given in Table 1. While the uncertainty (SE) in the estimated parameters varies widely across the examples, for each data set we achieve a reasonable model fit to the data, similar to those presented in [8].
To implement the inverse problem, we used the MATLAB routine fmincon with the interior-point algorithm, with the function tolerance set to 10^-13 and the step size tolerance set to 10^-8. The system of differential equations (1.3) coupled with the sensitivity equations (2.3) was solved using ode15s with the relative tolerance set to 10^-4. The sensitivity equations (2.3) are used to compute the gradient of the objective function via

    ∇J(θ) = −2 Σ_{j=1}^n [y_j − g(t_j; θ)] ∇_θ g(t_j; θ),

where ∇_θ g(t_j; θ) is assembled from the solution of the sensitivity equations. The gradient is supplied to the optimization routine fmincon.
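The gradient assembly can be sketched generically. The model g below is a simple stand-in (not the QCM), with its sensitivities written by hand; a central finite difference is used to verify the assembled gradient.

```python
import numpy as np

def grad_J(theta, t_obs, y_obs, g, grad_g):
    """Gradient of J(theta) = sum_j (y_j - g(t_j; theta))^2 assembled from
    model sensitivities: grad J = -2 * sum_j r_j * grad g(t_j; theta)."""
    r = y_obs - g(t_obs, theta)   # residual vector
    S = grad_g(t_obs, theta)      # n x kappa sensitivity matrix
    return -2.0 * (S.T @ r)

# Stand-in model g(t; theta) = theta_0 * exp(theta_1 * t) and its sensitivities.
def g(t, th):
    return th[0] * np.exp(th[1] * t)

def grad_g(t, th):
    e = np.exp(th[1] * t)
    return np.column_stack([e, th[0] * t * e])

t_obs = np.linspace(0.0, 1.0, 6)
y_obs = g(t_obs, np.array([1.0, 0.5]))
theta = np.array([1.2, 0.4])
g_analytic = grad_J(theta, t_obs, y_obs, g, grad_g)

# Central finite-difference check of the assembled gradient.
def J(th):
    r = y_obs - g(t_obs, th)
    return r @ r

h = 1e-6
g_fd = np.array([(J(theta + h * e) - J(theta - h * e)) / (2 * h)
                 for e in np.eye(2)])
```
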
Below we present the results of the inverse problem on a selection of data sets. As mentioned previously, the goal in the subsequent section is not to improve upon the estimation results given in [8], but to illustrate that for many of the data sets considered, the uncertainty present in the estimated values is so large that it is difficult to make confident conclusions based on the estimates. This is particularly relevant if one wishes to use the calibrated parameters to provide predictions either on the interior of the sampling domain (interpolation) or outside the sampling domain (extrapolation). For the case of a_W = 0.79, from Table 1 we see that the standard errors are unreasonably large for all of the parameters, particularly for k_2 and k_3. This indicates that we have very little confidence in our estimated values for θ. From Figure 3 we see that we have a good fit to the data; however, the sensitivity with respect to k_2 and k_3 is near zero for the entire sampling interval. Observe also that the residual plot illustrates an approximately random pattern, indicating that we have correctly specified the statistical model. That is, there is no need at this stage to consider more sophisticated error models, such as a relative error model [4].
For different starting points chosen for the optimization algorithm, the fit of the model to the data was consistent. However, we saw considerable variation in the parameter estimates of k_2 and k_3 (as much as an order of magnitude). This is consistent with the fact that we have such large standard errors (a direct consequence of the lack of sensitivity). The estimated values for k_1 and α were consistent; the maximum difference between estimated values was observed to be 0.015 for k_1 and 0.002 for α.
For this data set we only have 7 data points (excluding the data point at t = 0, which is not considered since we have assumed that it provides the initial condition for M, that is, M(0) = I = y_0). Therefore, in this case we have fewer than twice as many data points as parameters, so we cannot reasonably expect to obtain a large degree of confidence in our estimates. For the case of a_W = 0.84, we again see from Table 1 that the standard errors are unreasonably large for all of the parameters, particularly for k_2 and k_3, again indicating that we have very little confidence in our estimated values for θ. In Figure 4 we see that the sensitivities of k_2 and k_3 are again near zero over the entire sampling region, leading to the large standard errors. Here, we obtained consistent estimates with regard to the initial starting values for the optimization. For the data set with a_W = 0.87, the standard errors are much smaller (see Table 1); however, they remain close to (and, for k_2 and k_3, larger than) the actual values of the parameter estimates. We see from Figure 5 that k_3 now has a nonzero sensitivity at the end of the sampling region. This undoubtedly aids in the reduction of the standard error values. The residual plots also exhibit an independent and identically distributed (i.i.d.) pattern, leading us to again conclude that the statistical model is specified correctly. The estimates were observed to be robust with respect to the initial starting points selected for the optimization. For this data set, the inverse problem was found to be severely ill-posed. This was resolved by enforcing a stringent upper bound on α, namely α ≤ 6. If α was taken to be larger than 6, then the system of differential equations became very stiff, and the integration could not be completed. Previously, all of the estimated values for α were less than 1.2, so we may be justified in our upper bound. Yet more attention is required both on the biological justification for such a bound and on understanding the reason for the ill effects observed with this data set. Alternatively, if an initial guess was chosen near the estimated value reported in Table 1, then the inverse problem was carried out with no issue.
In Table 1 we give the results of the inverse problem. For a_W = 0.91 the standard error for k_1 is approximately the same value as the estimate of k_1 itself. The standard errors for the remaining parameters are reasonable in this case. Overall, the results presented for these data sets indicate that the model may be overparameterized. That is, the data sets may not contain enough information to estimate all of the parameters with any significant degree of confidence.

Parameter Ranking
In this section we illustrate by example one possible technique for obtaining improved uncertainty bounds on the estimated parameters. This will be accomplished by the method outlined in Section 2.1. For all of the following results, each parameter which is held fixed is set to the value of the corresponding starting point used in the optimization.
For the case of a_W = 0.79, the parameters were ranked in order of decreasing sensitivity as: α, k_1, k_3, k_2. We begin by comparing the situation when only α is estimated to that when both α and k_1 are estimated. Thus, θ̂_H = α̂ = −0.2759 and θ̂ = (α̂, k̂_1) = (0.5990, 2.4122). From Table 2, we see that we reject the null hypothesis at the 99.9% confidence level. That is, the data set supports, at a statistically significant level, the model in which both α and k_1 are estimated. Next we test whether the data will support the estimation of the third parameter k_3. Hence, we now compare estimating θ_H = (α, k_1) with estimating θ = (α, k_1, k_3), where θ̂ = (0.5411, 1.591, 215.1). From the values presented in Table 2, we see that we fail to reject the null hypothesis. That is, we conclude that the data does not support the estimation of the third parameter k_3.
Therefore, we conclude that the model can sufficiently describe this data set when only the parameters α and k_1 are estimated. The standard errors for α and k_1 are 0.0982 and 0.5615, respectively, which are now at a much more acceptable level compared to the standard errors obtained using the full four-parameter model.
We now consider the data set with a_W = 0.84. In this case the parameters were ranked in decreasing importance as: α, k_1, k_3, k_2. We again attempt to determine the minimal number of parameters needed to describe the data. We first consider estimating only α, compared with estimating both α and k_1. From Table 3, we conclude that we can reject the null hypothesis at the 95% confidence level. Next, we test whether the third parameter k_3 provides a significantly better fit to the data. We see from Table 3 that we fail to reject the null hypothesis. Again, we conclude that the data does not support the estimation of the third parameter k_3. The estimated values were α̂ = 0.4082 and k̂_1 = 0.4333 with standard errors of 0.1037 and 8.9582 × 10^-2, respectively. Again we achieve increased confidence in the parameter estimates while sacrificing very little with regard to the model fit to the data.
Next we consider the case of a_W = 0.87. The parameters were ranked in descending order as: α, k_1, k_2, k_3. For this data set, we reject the null hypothesis until we compare the three- and four-parameter models, as can be seen in Table 4. Hence, the three-parameter model can adequately describe the data. The estimated values were found to be α̂ = 1.1199, k̂_1 = 0.2286, and k̂_2 = 1.7384, with standard errors of 0.1036, 0.1584, and 0.4397, respectively. We again observe that the standard errors decrease using the three-parameter model. The decrease in the standard errors is less significant here, most likely due in part to the fact that we only reduced the model by a single parameter.
For the final case of a_W = 0.91, the parameters were ranked in descending order as: k_2, α, k_3, k_1. From Table 5 it is seen that we fail to reject the null hypothesis when comparing the three- and four-parameter models. The estimated values were determined to be k̂_2 = 4.6079, α̂ = 3.8890, and k̂_3 = 0.7537, and the standard errors were 0.1544, 0.1568, and 0.7537, respectively. One major drawback of this approach is the need for reasonable values for the fixed parameters. Even if a parameter has low sensitivity, the possible range of parameter values is generally so large that a poor choice of nominal value will not allow the reduced model to provide a reasonable approximation. Thus, this approach does not alleviate one of the major shortcomings of the QCM, namely the sensitivity to the starting point for the optimization in the data fitting procedure. Because of this, we will next focus on alternative (surrogate) models which can be used to describe the same, or similar, dynamics captured by the quasi-chemical model.

Alternative Candidate Models
Several of the available data sets exhibit primarily only a growth or a death phase. The QCM provides adequate fits to the data in these cases, yet the uncertainty in the estimated parameters becomes extremely large. This is due to the fact that the model is overparameterized with respect to the rather simple dynamics observed in the data. For such cases it may be sufficient to describe the data using a simple exponential model, or an exponential model with a lag phase. Let x(t) denote the total concentration of cells alive at time t. The simple exponential model is then dx/dt = a x with x(0) = I, while the exponential lag model retains a lag compartment: dx_1/dt = −k_1 x_1, dx_2/dt = k_1 x_1 + a x_2, with x(t) = x_1(t) + x_2(t). Note that the exponential lag model can be considered as a subset of the QCM (i.e., set k_3 = 0 in (1.3) and identify a with α). The basic dynamics that the QCM was derived to model are a lag phase which transitions into a growth phase, followed by a death phase. In this section we propose a simple model derived from first principles which can also be used to describe this two-stage growth-death cycle.
Assume that over the interval 0 ≤ t ≤ τ the cells reproduce at a rate proportional to the total concentration at time t. This gives

    dx/dt = a x,  x(0) = x_0,  0 ≤ t ≤ τ,

where a is the growth rate and x_0 is the initial cell concentration. Solving the above equation, we obtain x(t) = x_0 e^{a t} for t ∈ [0, τ]. Now suppose that at time t = τ the cells enter the death phase. During the death phase we assume that the cells die at a rate proportional to the total concentration at time t. This gives

    dx/dt = −b x,  t > τ,

where b is the death rate, and the initial condition at t = τ is chosen so that the cell concentration is continuous. Putting the two stages together, we arrive at the simple piecewise-defined model

    x(t; q) = x_0 e^{a t},                 0 ≤ t ≤ τ,
    x(t; q) = x_0 e^{a τ} e^{−b(t − τ)},   t > τ,

where q = (a, b, τ)^T. This model is of course a crude approximation of the bacteria dynamics. However, it is appealing because of its simplicity. Notice that we do not attempt to separate the bacteria concentration into two subpopulations as is done in the QCM. Thus, we take as our initial condition x_0 = I.
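The piecewise model is straightforward to implement directly; a sketch, with illustrative parameter values:

```python
import numpy as np

def two_stage(t, a, b, tau, x0):
    """Piecewise growth-death model: exponential growth at rate a up to the
    transition time tau, then exponential decay at rate b, with the cell
    concentration kept continuous at t = tau."""
    t = np.asarray(t, dtype=float)
    return np.where(t <= tau,
                    x0 * np.exp(a * t),
                    x0 * np.exp(a * tau) * np.exp(-b * (t - tau)))

t = np.linspace(0.0, 10.0, 101)
x = two_stage(t, a=0.8, b=0.3, tau=4.0, x0=1.0e3)
```

The trajectory rises until t = τ and decays afterward, with its peak value x_0 e^{aτ} attained at the transition time.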
If we wish to incorporate the dynamic that the cells initially start in a lag phase, transitioning at a constant rate to the growth phase, we can modify the two-stage exponential model as follows:

    dx_1/dt = −k_1 x_1,
    dx_2/dt = k_1 x_1 + a x_2,  0 ≤ t ≤ τ,
    dx_2/dt = k_1 x_1 − b x_2,  t > τ,

where x_1(0) = I and x_2(0) = 0, and we impose that the solution is continuous at t = τ. In this case the total bacteria population is given by x(t) = x_1(t) + x_2(t), which can be given explicitly by solving each linear equation in turn: x_1(t) = I e^{−k_1 t}, and for 0 ≤ t ≤ τ,

    x_2(t) = (k_1 I / (a + k_1)) (e^{a t} − e^{−k_1 t}),

while for t > τ (assuming b ≠ k_1),

    x_2(t) = (k_1 I / (b − k_1)) (e^{−k_1 t} − e^{−k_1 τ} e^{−b(t − τ)}) + x_2(τ) e^{−b(t − τ)},

where q = (k_1, a, b, τ). We will refer to this model as the two-stage exponential lag model. This model has the ability to represent many of the same general dynamics as the QCM. Both the two-stage exponential lag model and the QCM have 4 unknown parameters. However, the two-stage exponential lag model has an explicit solution, and shows no indication of being ill-posed with regard to the inverse problem. We can further compare the QCM and all of the surrogate models using the Akaike Information Criterion (AIC), a model selection criterion which allows comparison between models without the requirement that the models be nested. The AIC is based on the Kullback-Leibler information and maximum likelihood estimation as a way to balance model bias and variance (see [4] for details). Under the least squares framework the AIC is given by

    AIC = n log(J(θ̂; y)/n) + 2κ,

where n is the number of observations, J is the residual sum of squares between the model and the data, and κ is the number of estimated parameters in the model. Since we have relatively few data points, we will use the small sample AIC (AIC_c), which is given by

    AIC_c = AIC + 2κ(κ + 1)/(n − κ − 1).

For a given data set, among the competing models, the best approximating model is the one with the minimum AIC_c.
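A sketch of the AIC_c comparison. The least-squares form below (AIC = n log(J/n) + 2κ plus the standard small-sample correction) is one common variant — some authors count the error variance as an additional parameter — and the residual sums of squares are made-up numbers for illustration.

```python
import numpy as np

def aic_c(n, J, kappa):
    """Small-sample corrected AIC under least squares:
    AIC = n*log(J/n) + 2*kappa, plus the correction
    2*kappa*(kappa + 1)/(n - kappa - 1)."""
    aic = n * np.log(J / n) + 2 * kappa
    return aic + 2 * kappa * (kappa + 1) / (n - kappa - 1)

# Competing fits to the same n = 10 observations: the simpler model wins
# unless the richer model reduces the residual sum of squares enough.
score_simple = aic_c(n=10, J=0.20, kappa=2)
score_rich = aic_c(n=10, J=0.18, kappa=4)
```

With these numbers the two extra parameters buy only a 10% reduction in the residual sum of squares, so the penalty terms dominate and the two-parameter model is selected.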
In Table 6 we report the frequency with which each of the models achieved the minimum AIC_c score over a total of 39 data sets. The exponential model, the 2-stage exponential model, and the QCM were most commonly found to be the best model according to the AIC_c. When the QCM achieved the minimum AIC_c score on data sets which exhibited primarily only growth or only death dynamics, the uncertainty was unreasonably large. However, if the data exhibited both growth and death phases and the QCM was selected as the best model, then it was consistently observed that the uncertainty in the estimated parameters remained reasonable. A detailed discussion regarding the pros and cons of each model applied to the individual data sets is given in [3].

Model                          Frequency of minimum AIC_c
Exponential                    11
Exponential with lag phase      1
2-stage exponential            14
2-stage exponential with lag    2
QCM                            11

Table 6: A summary of the frequency with which each model achieved the minimum AIC_c value over a total of 39 data sets.
To be clear, we are not proposing that the surrogate models are superior to the QCM. But it is important to acknowledge the shortcomings of the QCM in the context of the given parameter estimation problem. The inverse problem using the QCM is quite ill-posed, and we have seen that for some of the data sets the estimated parameters have uncertainty bounds many orders of magnitude larger than the estimates themselves. Given the current data sets, it is useful to consider alternative, less complex models. The appeal of the class of quasi-chemical kinetic models is that they are sufficiently flexible to model many different experimental results. Yet the cost of this flexibility is often seen directly in large uncertainties in the parameter estimates for cases where the data does not contain dynamics which require fitting by a sophisticated model.

Concluding Remarks
In summary, we have analyzed a quasi-chemical kinetic model which provides very good fits to experimental data collected from various bacteria species under different environmental conditions. However, we have shown that although the QCM has great flexibility, it often results in an ill-posed inverse problem, or in estimated parameter values with little statistical significance. We illustrated that the use of statistical model reduction techniques provides a way to reduce the uncertainty in model parameters for cases where the data does not exhibit the complex behavior the QCM was designed to capture. Additionally, we developed surrogate models which at the very least provide alternatives to the QCM, and in the case of the so-called hybrid quasi-chemical model developed in [3], might be used to estimate a subset of the QCM parameters. Indeed, focus on the hybrid quasi-chemical model in future efforts may lead to reduced uncertainty and better initial guesses for the associated inverse problems, and subsequently less ill-posedness and more readily identifiable parameters.

Figure 1: Example of how the data differs using a growth rate of α = 3.7 (blue dots) and α = 2.7 (green circles).

Figure 2: Data set produced by randomly sampling the two trajectories in Figure 1.

Figure 3: The model fit (solid line) to the colony counts (circles) with a_W = 0.79 (top left), the sensitivity of g(t; θ) with respect to the parameters (top right), and the residuals (bottom).

Figure 4: The model fit (solid line) to the colony counts (circles) with a_W = 0.84 (top left), the sensitivity of g(t; θ) with respect to the parameters (top right), and the residuals (bottom).

Figure 5: The model fit (solid line) to the colony counts (circles) with a_W = 0.87 (top left), the sensitivity of g(t; θ) with respect to the parameters (top right), and the residuals (bottom).

Figure 6: The model fit (solid line) to the colony counts (circles) with a_W = 0.91 (top left), the sensitivity of g(t; θ) with respect to the parameters (top right), and the residuals (bottom).

Table 1: The parameter estimates and standard errors (SE) for each data set.

Table 2: Model comparison results for the data set with a_W = 0.79.

Table 4: Model comparison results for the data set with a_W = 0.87.

Table 5: Model comparison results for the data set with a_W = 0.91. This situation is similar to the a_W = 0.87 case in that we do not observe a large decrease in the standard errors.