## Parameter Estimation

The previous sections presented the mathematical techniques necessary to solve the model equations. Thus, with knowledge of the input function Ca(t), the model configuration, and its rate constants, the tissue concentration curve can be predicted mathematically. This section provides an overview of the inverse problem, i.e., given measurements of the tissue activity and the input function and a proposed model configu ration, one can produce estimates of the underlying rate constants. Many references are available on the topic of parameter estimation [42-44].

There are many ways to accomplish the estimation of model parameters. The choices available and the success of any given method depend upon the form of the model and the sampling and statistical quality of the measured data. If only a single measurement of tissue radioactivity is made, obviously only a single parameter can be determined. Collection of multiple time points permits the estimation of some or all of the parameters of a model. Since measured data always have some associated noise, the estimates of model parameters from such data will also be noisy. It is often the goal of a statistical estimation method to minimize the variability of the resulting parameter estimates. Note also that the values of the parameters will affect the statistical quality of the results. For example, blood flow estimates produced by a particular method may be reliable for high-flow regions but unreliable for low-flow regions.

When many tissue measurements are collected after radionuclide administration, the most commonly used method of parameter estimation is called least-squares estimation. Qualitatively, the goal of this technique is to find values for the model rate constants that, when inserted into the model equations, produce the "best" fit to the tissue measurements. Quantitatively, the goal is to minimize an optimization function, specifically the sum of the squared differences between the measured tissue concentration data and the model prediction, i.e.,

where there are N tissue measurements, C¡, i=1,...,N, at times T,and C(T) is the model prediction of tissue activity at each of these times. This particular form is used because of the nature of the noise in the measured data. Parameter estimates produced by minimizing the sum of squared differences have minimum variability if the noise in each scan measurement is statistically independent, additive, Gaussian, and of equal magnitude. Additive and independent statistical noise is usually a good assumption for PET image data, however, often the variance of the measurements will not be constant across different scans in one multiple-scan acquisition, particularly for short-lived isotopes such as 15O or 11C. In this case, the least-squares function can be modified to accommodate variable noise levels as follows:

where wi is a weight assigned to data point i. This method is called weighted least-squares estimation,and the optimal weight for each sample is the inverse of the variance of the data [43]. For simple count data, the variance of the data can be estimated from the count data itself based on its Poisson distribution [45]. For reconstructed data, many algorithms have been proposed to calculate or approximate the noise in pixel or region-of-interest data [46-52].

It is important to recognize that there are many non-random or deterministic error sources in the modeling process that cause inconsistencies between the model and the measured data (see section on random and deterministic errors). When fitting data to a model, the parameter estimation procedure is naive in that it believes that the specified model is absolutely correct. The algorithm will do its best to minimize the optimization function. Therefore, if there are deterministic errors in the model or the input function, the estimation algorithm can produce unsuitable results. It may be appropriate in some situations to adjust the weights of some data points (e.g., early time points where errors in the model due to intravascular activity are most significant) to reduce the sensitivity of the model to the presence of deterministic errors.

Once an optimization function (Eq. 30 and Eq. 31) has been defined, there are many algorithms available to determine the values of the model parameters that minimize it [41,43]. Unfortunately, in most cases with compartmental models, there are no direct solutions for the parameters. This is true because, although the models themselves are linear (i.e., all fluxes between compartments are linear multiples of the concentration in the source compartment), the solutions to these models are functions that are non-linear in at least one of the model parameters. For example, Eq. 27, the solution to the one-compartment model

(Fig. 6.3B) is linear with respect to the parameter K1 but is non-linear with respect to the parameter k2. To solve for the parameters, iterative algorithms are required. First, an initial guess is made for the parameter values. Then the algorithm repeatedly modifies the parameters, at each step reducing the value of the optimization function. Convergence is reached when changes to the parameters from one iteration to the next become exceedingly small. Great care is required in the use of iterative algorithms, because incorrect solutions can be obtained, particularly if the initial guess is not appropriate.

Figure 6.6 provides an example of the process of parameter estimation applied to time-activity data collected after a bolus injection of fluorodeoxyglucose (FDG) [53]. Figure 6.6a shows a plot of region-of-inter-est values (occipital cortex) taken from reconstructed PET images. The solid line through the data points is the best fit obtained by minimizing the weighted sum of squared differences between the data and the two-compartment model (Figure 6.3c). Figure 6.6b shows a plot of the weighted residuals versus time. The residual is the difference between each data point and the model prediction. When weighted least squares is used, the residuals are scaled by the square root of each weight, w;, so that the sum-of-squares optimization function equals the sum of squared residuals. Ideally the residuals would be random, have zero mean, and uniform variance. If a good estimate of the noise level in the data is known, the weighted residuals should have a standard deviation of approximately 1. Thus, when plotting the residuals versus time or versus concentration, the residuals would appear as a uniform band centered on zero. The residuals in Fig. 6.6b reasonably satisfy these expectations.

Many parameter estimation algorithms provide estimates of the uncertainties of the parameter estimates

O 100

20 40 60 80

Time (min)

40 60

Time (min)

Figure 6.6. Examples of parameter estimation results from PET data after bolus injection of FDG. a: tissue time-activity curve from region of interest in occipital cortex. Symbols are measured data points. Solid line is fitted function for the two-compartment model (Fig. 6.3C) based on weighted least-squares parameter estimation. b Plot of weighted residuals (difference between data and fitted value scaled by regression weight) versus time.

called standard errors. These values can be used as estimates of the minimum random uncertainty in the estimate. The algorithm determines these standard errors based on the structure of the model, the parameter estimates, and the magnitude of the residual sum of squares. However, this measure is an underestimate of the true uncertainty of the parameters, since there are usually many sources of "real-world" errors that are not explicitly included.

It is often useful to calculate functions of the rate constants which provide different physiological information. For example, in the one-compartment model (Fig. 6.3B), a parameter estimation problem may be posed to estimate the rate constants K1 and k2. From these parameters, the distribution volume (V = K1/k2) can be calculated. To determine the uncertainty in the distribution volume estimate, information about the individual standard errors in K1 and k2 is required along with the correlation between them. The parameter estimates will be correlated since both values are determined simultaneously from the same noisy data. The coefficient of variation (CV, the ratio of the standard error to the parameter value) of the distribution volume can be calculated by propagation of errors calculations:

CV2 (V) = CV2(k1) + CV2(k2) - 2p12CV(K 1 )CV(k2) (32)

where p12 is the estimated correlation coefficient between the parameter estimates K1 and k2.

Least squares is the best optimization criterion for estimating parameters when a large number of assumptions are met. If any of these assumptions are not true, better estimates may be obtained by other methods (see section on error analysis). A better estimate is one that may be more accurate (less biased) or more precise (less variable). In addition, iterative least-squares algorithms may be very computationally intensive, particularly if it must be carried out individually for every pixel in an imaging volume. Often, iterative least-squares procedures are used only for a small number of regions of interest. However, it is often more useful if the data analysis procedure produces functional images where each pixel represents a physiological parameter of interest. To do this, rapid computation schemes are required. Rapid implementations of iterative least-squares procedures have been developed for the simplest non-linear models with just one nonlinear parameter, e.g., the one-compartment model with solution in Eq. 27. These techniques have been applied to the measurement of cerebral blood flow [54, 55] and total volume of distribution of receptors [56-58]. In addition, a number of methods have been derived that allow direct non-iterative calculation of the parameter estimates by reformulating the problem in terms of integrals of the tissue and blood data [59-66]. These methods do not minimize the sum-of-squares optimization function, but in many cases have been shown to have comparable statistical quality to the least-squares techniques and often have less sensitivity to deterministic errors in the model. Another interesting approach for parameter estimation from nonlinear models is called spectral analysis and uses the methods of linear programming with the knowledge that all the exponential clearance terms (a¡ in Eq. 19 and Eq. 20, for example) are positive [67].

As shown above, the measured tissue activity is the convolution of the input function with the underlying impulse-response function (Eq. 28). This impulse-response function has a much simpler mathematical form (usually a sum of exponentials) and is therefore more easily analyzed. Some investigators have used the approach of deconvolution, whereby an estimate of the impulse-response function is determined from measurements of the tissue response and the input function [68]. However,because the process of deconvolution greatly amplifies noise in the tissue measurements and is often mathematically unstable, great care is required in the application of these techniques.