Abstract
To evaluate the postulated role of extrastriatal D1 receptors in human cognition and psychopathology requires an accurate and reliable method for quantification of these receptors in the living human brain. [11C]NNC 112 is a promising novel radiotracer for positron emission tomography imaging of the D1 receptor. The goal of this study was to develop and evaluate methods to derive D1 receptor parameters in striatal and extrastriatal regions of the human brain with [11C]NNC 112. Six healthy volunteers were studied twice. Two methods of analysis (kinetic and graphical) were applied to 12 regions (neocortical, limbic, and subcortical regions) to derive four outcome measures: total distribution volume, distribution volume ratio, binding potential (BP), and specific-to-nonspecific equilibrium partition coefficient (k3/k4). Both kinetic and graphic analyses provided BP and k3/k4 values in good agreement with the known distribution of D1 receptors (striatum > limbic regions = neocortical regions > thalamus). The identifiability of outcome measures derived by kinetic analysis was excellent. Time-stability analysis indicated that 90 minutes of data collection generated stable outcome measures. Derivation of BP and k3/k4 by kinetic analysis was highly reliable, with intraclass correlation coefficients (ICCs) of 0.90 ± 0.06 (mean ± SD of 12 regions) and 0.84 ± 0.11, respectively. The reliability of these parameters derived by graphical analysis was lower, with ICCs of 0.72 ± 0.17 and 0.58 ± 0.21, respectively. Noise analysis revealed a noise-dependent bias in the graphical but not the kinetic analysis. In conclusion, kinetic analysis of [11C]NNC 112 uptake provides an appropriate method with which to derive D1 receptor parameters in regions with both high (striatal) and low (extrastriatal) D1 receptor density.
The D1 receptors are the most abundant dopaminergic receptor subtype in neocortical areas including the prefrontal cortex (PFC) as well as in the hippocampus and amygdala (for reviews, see Seeman, 1992; Meador-Woodruff et al., 1996). Studies in nonhuman primates have shown that activation of D1 receptors in the PFC is involved in working memory (Arnsten et al., 1994; Sawaguchi and Goldman-Rakic, 1994; Williams and Goldman-Rakic, 1995) and that D1 receptors in the hippocampus are implicated in short- and long-term memory (Huang and Kandel, 1995; Otmakhova and Lisman, 1996; Bernabeu et al., 1997; Seamans et al., 1998). An accurate and reliable method for measuring these receptors in extrastriatal regions of the living human brain will permit the study of their role in mediating cognition in health and disease states.
The radiolabeled benzazepine [11C]SCH 23390 (KD = 0.4 nmol/L) was the first radiotracer developed to image the D1 receptor with positron emission tomography (PET) (Billard et al., 1984; Halldin et al., 1986; Farde et al., 1987). Studies in humans have demonstrated that [11C]SCH 23390 is an appropriate radiotracer with which to measure D1 receptors in the striatum, where these receptors are present in high density (Laihinen et al., 1994; Chan et al., 1998). However, [11C]SCH 23390 displays relatively low specific/nonspecific ratios, which compromises the sensitivity and reliability of the D1 receptor measurement in extrastriatal areas such as the PFC (Karlsson et al., 1997), where the density of these receptors is approximately fivefold lower than in the striatum (De Keyser et al., 1988; Laruelle et al., 1991; Hall et al., 1994).
Recently, two new benzazepines developed by Novo Nordisk (Malov, Denmark) (Andersen et al., 1992) have been evaluated as PET radiotracers in baboons: [11C]NNC 756 (KD = 0.17 nmol/L) and [11C]NNC 112 (KD = 0.18 nmol/L) (Halldin et al., 1993, 1998; Karlsson et al., 1993). Both radiotracers provide higher specific/nonspecific ratios than [11C]SCH 23390. The disadvantage of [11C]NNC 756 is its low selectivity against serotonin 5-HT2 receptors. In baboons, ∼25% of the binding of [11C]NNC 756 in the frontal cortex was displaced by the 5-HT2 antagonist ketanserin (1.5 mg/kg) (Karlsson et al., 1993). In contrast, ketanserin injection (2 mg/kg) did not affect the frontal uptake of [11C]NNC 112 (Halldin et al., 1998), indicating that [11C]NNC 112 provides appropriate D1/5-HT2 selectivity in vivo. Thus, [11C]NNC 112 appears to be a promising candidate radiotracer with which to measure both striatal and extrastriatal D1 receptors, and preliminary results in humans have confirmed the potential of this radiotracer (Halldin et al., 1998).
The purpose of the present study was to further evaluate [11C]NNC 112 as a radiotracer with which to quantify D1 receptors in the human brain. Specifically, the aim of this study was to define the optimal scanning protocol and analytic method to derive accurate and reliable D1 receptor parameters in striatal and extrastriatal regions. Six healthy volunteers were studied twice, after injection of [11C]NNC 112. Twelve brain regions were studied, ranging from high (caudate and putamen) to low (neocortex, limbic regions, and thalamus) receptor density regions. Two approaches were compared for measurement of D1 receptor parameters: compartmental kinetic and graphical analysis. Four outcome measures were compared: total distribution volume (VT), distribution volume ratio (DVR), binding potential (BP), and specific-to-nonspecific equilibrium partition coefficient (k3/k4). The comparison included four attributes of the outcome measures: their identifiability, which describes the degree of certainty in parameter estimation; validity, which is derived by comparison with in vitro values; stability, which evaluates the dependence of the outcome measures on the duration of the scan; and reliability, which is assessed with test/retest reproducibility studies.
MATERIALS AND METHODS
Human subjects
We report here the results of six consecutive weekly experiments. Six healthy volunteers participated in this study (age 33 ± 9 years, range 23 to 45 years; these and subsequent values given as means ± SD; five men and one woman). All subjects were scanned twice on the same day to evaluate the reproducibility of the outcome measures. The study was approved by the Columbia Presbyterian Medical Center Institutional Review Board, and subjects provided written informed consent after receiving an explanation of the study. The absence of pregnancy and medical, neurological, and psychiatric history (including alcohol and drug abuse) was assessed by history, review of systems, physical examination, routine blood tests including pregnancy test, urine toxicology, and electrocardiography.
Radiochemistry
[11C]NNC 112 was prepared by N-methylation of the precursor using [11C]methyltriflate as previously described (Halldin et al., 1998). Specific radioactivity at the time of injection was 1,255 ± 355 Ci/mmol (n = 12, range from 883 to 2,195 Ci/mmol). Injected dose was 15.0 ± 2.3 mCi (n = 12, range from 11.8 to 18.2 mCi).
Positron emission tomography protocol
Subject preparation included placement of arterial and venous catheters, fiducial markers, and a polyurethane head immobilizer. An arterial catheter was inserted in the radial artery after completion of the Allen test and infiltration of the skin with 1% lidocaine. A venous catheter was inserted in a forearm vein on the opposite side. Four fiducial markers filled with 11C (∼3 μCi/marker at the time of injection) were glued on the subject's head. Head movement minimization was achieved with a polyurethane head immobilizer system (Soule Medical, FL, U.S.A.) molded around the head of the subject. This system provides better restraint than a thermoplastic mask (unpublished data). The PET imaging was performed with the ECAT EXACT 47 (Siemens/CTI, Knoxville, TN, U.S.A.; 47 slices covering an axial field of view of 16.2 cm, axial sampling of 3.375 mm, three-dimensional mode in-plane and axial resolution of 6.0 and 4.6 mm full width at half-maximum at the center of the field of view, respectively) (Wienhard et al., 1992). A 10-minute transmission scan was obtained before radiotracer injection. [11C]NNC 112 was injected intravenously over 45 seconds. Emission data were collected in the three-dimensional mode for 120 minutes as 21 successive frames of increasing duration (3 × 20 s, 3 × 1 min, 3 × 2 min, 2 × 5 min, 10 × 10 min). Images were reconstructed to a 128 × 128 matrix (pixel size of 2.5 × 2.5 mm2). Reconstruction was performed with attenuation correction using the transmission data and a Shepp 0.5 filter (cutoff 0.5 cycle/projection rays). Subjects were allowed to rest outside of the camera for 15 to 30 minutes between the two injections.
Input function measurement
After radiotracer injection, arterial samples were collected every 5 seconds with an automated sampling system for the first 2 minutes (Graham and Lewellen, 1993) and manually thereafter at longer intervals. A total of 31 samples were obtained per experiment. After centrifugation (10 minutes at 1,800 g), plasma was collected in 200-μL aliquots and counted in a gamma counter (Wallac 1480 Wizard 3M Automatic Gamma Counter). Gamma counter efficiency was calibrated at regular intervals with the PET camera using an 18F solution. In addition, a long-lived source (22Na) was counted with each set of samples to control for between-run variance in counting efficiency.
Six selected samples (collected at 2, 8, 16, 30, 50, and 70 minutes) were further processed by protein precipitation using acetonitrile followed by high-pressure liquid chromatography to measure the fraction of plasma activity representing unmetabolized parent compound. Plasma samples (0.5 mL) were added to 0.7 mL of acetonitrile in a centrifuge tube. After mixing, the samples were centrifuged at 14,000 rpm for 3.5 minutes. The acetonitrile solution was separated and analyzed by high-pressure liquid chromatography. The system consisted of a Waters 510 isocratic pump, a Rheodyne injector with a 2-mL loop, a Phenomenex C18 ODS column (10-μm particle size, 250 × 4.6 mm, stainless steel), and a gamma detection system (Bioscan Flow Count unit). The column was eluted with a solvent mixture of acetonitrile/0.1 mol/L aqueous ammonium formate (30:70) at a flow rate of 2.5 mL/min. Five fractions collected over 12 minutes were counted. A standard [11C]NNC 112 solution was processed with each experiment. Parent eluted with fractions 3 and 4. For each sample, the fraction parent was estimated by the ratio of decay-corrected activity in collections 3 and 4 to that in the total collection.
A biexponential function was fitted to the six measured fractions parent and used to interpolate values between and after the measurements. The smallest exponential of the fraction parent curve, λpar, was constrained to the difference between λcer, the terminal rate of washout of cerebellar activity, and λtot, the smallest elimination rate constant of the total plasma (Abi-Dargham et al., 1999). The input function was calculated as the product of total counts and interpolated fraction parent at each time. The measured input function values [Ca(t); μCi/mL] were fitted to a sum of three exponentials, and the fitted values were used as input to the kinetic and graphical analyses. The clearance of the parent compound (CL; L/h) was calculated as the ratio of the injected dose to the area under the curve of the input function (Abi-Dargham et al., 1994).
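As a rough illustration of the input-function steps described above, the following Python sketch multiplies total plasma activity by an interpolated fraction parent and computes clearance as injected dose divided by the area under a tri-exponential curve. The function names, the normalization of the biexponential to 1 at time 0, the use of positive decay rates, and the numbers in the usage note are illustrative assumptions, not code or values from the study.

```python
import math

def fraction_parent(t_min, a, lam_fast, lam_par):
    """Biexponential fraction of plasma activity that is unmetabolized parent.

    Assumed form (not given in the text):
        a*exp(-lam_fast*t) + (1 - a)*exp(-lam_par*t)
    with positive rates in min^-1, so the fraction equals 1 at t = 0.
    """
    return a * math.exp(-lam_fast * t_min) + (1.0 - a) * math.exp(-lam_par * t_min)

def constrained_lam_par(lam_cer, lam_tot):
    """Slow rate of the fraction-parent curve, constrained to lam_cer - lam_tot."""
    return lam_cer - lam_tot

def parent_input(total_activity, t_min, a, lam_fast, lam_par):
    """Input function value: total plasma activity times fraction parent."""
    return total_activity * fraction_parent(t_min, a, lam_fast, lam_par)

def clearance_l_per_h(dose_uci, amplitudes_uci_per_ml, rates_per_min):
    """Clearance = injected dose / AUC for Ca(t) = sum_i A_i * exp(-r_i * t).

    Each term integrates to A_i / r_i (uCi * min / mL) from 0 to infinity,
    so dose / AUC is in mL/min; the factor 60/1000 converts it to L/h.
    """
    auc = sum(a / r for a, r in zip(amplitudes_uci_per_ml, rates_per_min))
    return dose_uci / auc * 60.0 / 1000.0
```

For example, `clearance_l_per_h(15000.0, [1.0, 0.1, 0.01], [1.0, 0.1, 0.01])` uses made-up amplitudes and rates whose three terms each contribute an AUC of 1 μCi·min/mL, which gives 5,000 mL/min, or 300 L/h.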
For the determination of the plasma free fraction (f1), triplicate 200-μL aliquots of plasma collected before injection were mixed with radiotracer, pipetted into ultrafiltration units (Centrifree, Amicon, Danvers, MA, U.S.A.), and centrifuged at room temperature (20 minutes at 4,000 rpm). At the end of centrifugation, plasma and ultrafiltrate activities were counted, and f1 was calculated as the ratio of ultrafiltrate to total activity concentrations (Gandelman et al., 1994). Triplicate aliquots of saline solution mixed with radiotracer were also processed to determine the filter retention of free [11C]NNC 112.
Magnetic resonance imaging acquisition and segmentation procedures
Magnetic resonance imaging (MRI) was performed on a GE 1.5 T Signa Advantage system (Milwaukee, WI, U.S.A.). After a sagittal scout (localizer) image performed to identify the anterior-posterior commissural (AC-PC) plane (1 minute), a transaxial T1-weighted sequence with 1.5-mm slice thickness was acquired in a coronal plane orthogonal to the AC-PC plane over the whole brain with the following parameters: three-dimensional SPGR (spoiled gradient-recalled acquisition in the steady state); repetition time 34 milliseconds; echo time 5 milliseconds; flip angle 45°; slice thickness 1.5 mm and zero gap; 124 slices; field of view 22 × 16 cm; 256 × 192 matrix, reformatted to 256 × 256, yielding a voxel size of 1.5 × 0.9 × 0.9 mm; and time of acquisition 11 minutes.
The MRI segmentation was performed within MEDx (Sensor Systems, Sterling, VA, U.S.A.), with original subroutines implemented in MATLAB (Math Works, Natick, MA, U.S.A.). Steps for MRI segmentation included correction for field inhomogeneities, fitting of the voxel distribution to a combination of three Gaussian functions, voxel classification, and postfiltering. To correct for field inhomogeneity, histograms of one anterior coronal slice and one posterior coronal slice were computed, as described below. The gradient component g_x was defined as the difference in peak locations divided by the distance between slices. In a similar manner, g_y was defined from two sagittal slices, and g_z was defined from two transaxial slices. Next, an image G was constructed of the same dimensions as the MR image, with G being an affine function with gradient (g_x, g_y, g_z). Gradient-corrected MRI of the image I was given by I/G. The principal advantage of this approach is that if there is no gradient inhomogeneity, the gradient correction causes no loss of resolution. Gradient-corrected images were then imported into MATLAB. Voxel intensity distribution was fit to a linear combination of three Gaussian functions:
$$f(x) = \sum_{k=1}^{3} \frac{\alpha_k}{\sigma_k\sqrt{2\pi}} \exp\left[-\frac{(x-\mu_k)^2}{2\sigma_k^2}\right]$$
where x is the voxel intensity and $\alpha_k$, $\mu_k$, and $\sigma_k$ are the weight, mean, and SD of the Gaussian function for compartment k. The values of these parameters were estimated using maximum likelihood (Fig. 1), and the threshold values were used in the MEDx system to perform the segmentation of the MRI into gray matter (GM), white matter (WM), and cerebrospinal fluid (CSF) compartments. These thresholds were applied to create GM, WM, and CSF masks. The masks were finally smoothed and thresholded to remove isolated voxels.

Fig. 1. Distribution of voxel intensity from a T1-weighted SPGR (spoiled gradient-recalled acquisition in the steady state) magnetic resonance imaging file and fitted sum of three Gaussians estimating voxel intensity of cerebrospinal fluid and gray and white matter. The solid line represents the sum of the fitted Gaussians. The intersection between the Gaussians was used as threshold for pixel classification.
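The thresholding step can be sketched in Python as follows: given two fitted, weighted Gaussians, the classification threshold is the intensity between their means at which the two weighted densities are equal. This is a minimal sketch under stated assumptions; the function names and the example parameters are hypothetical, and the maximum likelihood fit itself (performed in MATLAB in the study) is not reproduced here.

```python
import math

def weighted_pdf(x, alpha, mu, sigma):
    """Weighted Gaussian density alpha * N(x; mu, sigma)."""
    z = (x - mu) / sigma
    return alpha * math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def gaussian_intersection(a1, mu1, s1, a2, mu2, s2):
    """Intensity between mu1 and mu2 where the two weighted densities are equal.

    Equating the log densities gives a quadratic A*x^2 + B*x + C = 0.
    """
    A = 1.0 / (2.0 * s2 ** 2) - 1.0 / (2.0 * s1 ** 2)
    B = mu1 / s1 ** 2 - mu2 / s2 ** 2
    C = (mu2 ** 2 / (2.0 * s2 ** 2) - mu1 ** 2 / (2.0 * s1 ** 2)
         + math.log(a1 / s1) - math.log(a2 / s2))
    if abs(A) < 1e-12:            # equal SDs: the equation is linear
        return -C / B
    disc = B * B - 4.0 * A * C
    if disc < 0:
        raise ValueError("the weighted Gaussians do not intersect")
    roots = [(-B + sign * math.sqrt(disc)) / (2.0 * A) for sign in (1.0, -1.0)]
    low, high = sorted((mu1, mu2))
    between = [r for r in roots if low <= r <= high]
    if not between:
        raise ValueError("no intersection between the two means")
    return between[0]             # first root found between the means

def classify_voxel(intensity, csf_gm_threshold, gm_wm_threshold):
    """T1-weighted ordering: CSF is darkest, WM is brightest."""
    if intensity < csf_gm_threshold:
        return "CSF"
    if intensity < gm_wm_threshold:
        return "GM"
    return "WM"
```

With two equally weighted Gaussians of equal SD, for instance `gaussian_intersection(0.5, 0.0, 2.0, 0.5, 10.0, 2.0)`, the threshold falls at the midpoint of the means (5.0); unequal weights or SDs shift it toward the smaller or narrower component.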
Image analysis
Image analysis was performed with MEDx according to the following steps: frame realignment, PET-MRI registration, and time-activity curve measurement.
1. Frame realignment. To correct for head movement during the acquisition, all frames were coregistered to the first frame of the study, using a least-squares algorithm for within-modality coregistration (automated image registration) (Woods et al., 1992). Images were thresholded to the level of the 11C fiducial markers, and only the signals from the markers were used for automated image frame-to-frame registration. The x, y, and z coordinates of the pixel with highest intensity of each fiducial marker were used to monitor the quality of between-frame coregistration.
2. PET-MRI registration. After frame-to-frame registration, the 21 frames were summed, and the summed PET image was coregistered and resampled to the MR image, using automated image registration (Woods et al., 1992). The summed PET image was used for the coregistration because it contains counts from the initial flow-dependent activity distributions that enhance detection of boundaries of regions with low receptor density, such as the cerebellum. The parameters of the spatial transformation matrix of the summed PET data set were then applied to each individual frame. Thus, each PET frame was resampled in the coronal plane to a voxel volume of 1.5 × 0.9 × 0.9 mm3.
3. Region-of-interest (ROI) tracing and activity sampling. The ROI boundaries were drawn on the MR image according to criteria based on brain atlases (Talairach and Tournoux, 1988; Duvernoy, 1991) and on published reports (Pani et al., 1990; Kates et al., 1997; Killiany et al., 1997). Neocortical regions included dorsolateral prefrontal cortex (DLPFC), medial prefrontal cortex, orbitofrontal cortex (OFC), anterior cingulate cortex, parietal cortex, temporal cortex, and occipital cortex. Subcortical regions included caudate, putamen, thalamus, amygdala, hippocampus, and cerebellum. Two methods were used for final ROI definition. A segmentation-based method was used for neocortical regions, and a direct identification method was used for subcortical regions. For neocortical regions, “large” regions were first drawn to delineate the boundaries of the ROIs. Within these regions, only the voxels classified as GM were used to measure activity distribution. This process is illustrated in Fig. 2 for DLPFC and OFC. Because of the mixture of GM and WM in central gray structures (especially thalamus), the segmentation-based approach was not used to define subcortical ROIs, and the boundaries of these regions were identified by anatomical criteria.

Fig. 2. The process of region-of-interest (ROI) definition for dorsolateral prefrontal cortex (DLPFC) and orbitofrontal cortex (OFC) in one coronal plane.
Quantitative analysis
Derivation of [11C]NNC 112 regional distribution volumes was performed using kinetic and graphical analyses. A three-compartment model (i.e., two-tissue compartment model) provided the general framework for each method. The model included the arterial plasma compartment (Ca), the intracerebral free and nonspecifically bound compartment (nondisplaceable compartment; C2), and the specifically bound compartment (C3). The equilibrium distribution volume of a compartment i (V i ; mL/g) was defined as the ratio of the tracer concentration in this compartment to the free plasma concentration at equilibrium:
$$V_i = \frac{C_i}{f_1 C_a}$$
V2 and V3 were defined as the distribution volumes of the second (nondisplaceable) and third (specific) compartments, respectively. V3 is equal to BP (mL/g), which is the ratio of the receptor density (Bmax; nmol/L/g of tissue) and affinity (KD; nmol/L/mL of brain water) (Mintun et al., 1984; Laruelle et al., 1994d). VT was defined as the total regional equilibrium distribution volume, equal to the sum of V2 and V3. Both kinetic and graphical analyses shared two assumptions. First, given the negligible concentration of D1 receptors in the cerebellum (Hall et al., 1994), cerebellar VT was assumed to represent only free and nonspecific binding and to provide a reasonable estimate of V2 in the ROIs. Second, the contribution of plasma total activity to the regional activity was calculated assuming a 5% blood volume in the ROIs (Mintun et al., 1984).
Kinetic analysis
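Kinetic analysis was performed according to the differential equations:
$$\frac{dC_2(t)}{dt} = K_1 C_a(t) - (k_2 + k_3)\,C_2(t) + k_4 C_3(t)$$
$$\frac{dC_3(t)}{dt} = k_3 C_2(t) - k_4 C_3(t)$$
with the kinetic parameters defined as
$$K_1 = F E, \qquad E = 1 - e^{-PS/F}$$
$$k_2 = \frac{K_1}{V_2 f_1} \qquad (5)$$
$$k_3 = \frac{k_{on} B'_{max}}{V_2} \qquad (6)$$
$$k_4 = k_{off} \qquad (7)$$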
where F (mL g−1 min−1) is the regional blood flow, E (unitless) the unidirectional extraction fraction, PS (mL g−1 min−1) the permeability-surface area product of the tracer, kon [(nmol/L)−1 min−1] the bimolecular ligand-receptor association rate constant, Bmax′ (nmol/L) the concentration of receptors available for binding (equal to Bmax as experiments were performed at tracer dose), and koff (min−1) the receptor dissociation rate constant. Kinetic parameters were derived by nonlinear regression using a Levenberg-Marquardt least-squares minimization procedure (Levenberg, 1944) implemented in MATLAB, as previously described (Laruelle et al., 1994c). In the absence of precise knowledge of the variance associated with the measurement of each frame, uniform weighting was used in the regression (Carson, 1986; Burger and Buck, 1997). Goodness of fit of models with different levels of complexity was compared using the Akaike information criterion (Akaike, 1974) and the F test (Landaw and DiStefano, 1984; Carson, 1986).
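The model comparison can be illustrated with a short Python sketch. The text cites the Akaike information criterion and the F test without giving formulas, so the expressions below are the common forms for uniformly weighted least-squares fits and should be read as an assumption, not as the study's exact implementation; the function names and example numbers are hypothetical.

```python
import math

def aic_least_squares(rss, n_points, n_params):
    """AIC for a uniformly weighted least-squares fit: N*ln(RSS/N) + 2p.

    Lower values indicate a better trade-off between fit and complexity.
    """
    return n_points * math.log(rss / n_points) + 2 * n_params

def f_statistic(rss_simple, p_simple, rss_complex, p_complex, n_points):
    """F statistic for nested models (the simple model is a special case).

    Compare against an F distribution with (p_complex - p_simple,
    n_points - p_complex) degrees of freedom.
    """
    numerator = (rss_simple - rss_complex) / (p_complex - p_simple)
    denominator = rss_complex / (n_points - p_complex)
    return numerator / denominator
```

As a usage example with made-up residual sums of squares, a 21-frame curve fitted with a two-parameter model (RSS = 10) and a four-parameter model (RSS = 5) gives `f_statistic(10.0, 2, 5.0, 4, 21)` = 8.5 on (2, 17) degrees of freedom, and the four-parameter model also has the smaller AIC.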
In the ROIs, volumes of distribution were then calculated according to
$$V_2 = \frac{K_1}{k_2 f_1} \qquad (9)$$
and
$$V_3 = \frac{K_1 k_3}{k_2 k_4 f_1} = BP \qquad (10)$$
Substituting in Eq. 10 the terms K1/k2f1, k3, and k4 with Eqs. 5, 6, and 7, respectively, and recalling that KD = koff/kon establish the equivalence between V3 and BP (Laruelle et al., 1994a).
The BP, as defined above, is the only expression of BP that is dependent on receptor parameters (i.e., KD and Bmax) and can be compared with in vitro values (see derivations in Laruelle et al., 1994a). Another outcome measure of interest is the specific-to-nonspecific equilibrium partition coefficient, termed k3/k4 or V3″ (Laruelle et al., 1994d):
$$\frac{k_3}{k_4} = \frac{BP}{V_2} = \frac{B_{max}}{K_D V_2} \qquad (11)$$
k3/k4 is often used as the outcome measure in the PET literature and is referred to as the BP (Gjedde and Wong, 1990; Logan et al., 1990). In fact, this ratio does not correspond to BP (=Bmax/KD) as classically defined by Mintun et al. (1984) but to the ratio of BP to V2, as shown by dividing Eq. 10 by Eq. 9 (Eq. 11). To avoid confusion, we use BP to designate the “true” BP, as defined in Eq. 10, and k3/k4 to designate the “practical” BP, as defined in Eq. 11.
The ratio of total to nonspecific distribution volume is termed the DVR:
$$DVR = \frac{V_T}{V_2} = 1 + \frac{k_3}{k_4}$$
DVR is reported to facilitate comparison with results of studies reporting reproducibility of [11C]raclopride DVR (Volkow et al., 1993), [18F]altanserin DVR (Smith et al., 1998), and [11C]SCH 23390 DVR (Chan et al., 1998).
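The four outcome measures follow directly from the fitted rate constants. The Python sketch below collects the definitions given in the text (V2, BP = V3, VT = V2 + V3, k3/k4, and DVR); the function name, the dictionary keys, and the example rate constants are illustrative assumptions, and `f1` defaults to 1 to mirror distribution volumes expressed relative to total parent.

```python
def outcome_measures(K1, k2, k3, k4, f1=1.0):
    """Outcome measures of a two-tissue compartment model.

    K1 in mL g^-1 min^-1; k2, k3, k4 in min^-1; f1 is the plasma free
    fraction (f1 = 1 expresses volumes relative to total parent).
    """
    v2 = K1 / (k2 * f1)          # nondisplaceable distribution volume
    ratio = k3 / k4              # specific-to-nonspecific partition coefficient
    bp = v2 * ratio              # V3 = K1*k3 / (k2*k4*f1) = BP
    vt = v2 + bp                 # total distribution volume
    return {"V2": v2, "BP": bp, "VT": vt, "k3/k4": ratio, "DVR": vt / v2}
```

With the made-up values K1 = 0.2, k2 = 0.08, k3 = 0.1, and k4 = 0.05, this gives V2 = 2.5 mL/g, BP = 5 mL/g, VT = 7.5 mL/g, k3/k4 = 2, and DVR = 3, which also shows that DVR = 1 + k3/k4.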
Results from kinetic analysis were evaluated with four criteria: identifiability, validity, stability, and reliability.
Identifiability. The term “identifiability” refers to the error in the parameter estimation due to the inherent uncertainty of this estimation. The identifiability was assessed by the standard error of the parameter at convergence. The standard error of the parameters was given by the diagonal of the covariance matrix (Carson, 1986) and expressed as percentage of the parameters (coefficient of variation; %CV). This standard error should not be confused with the variance of the parameter in the investigated population (between-subject SD or %CV) or with the lack of reproducibility (within-subject SD or %CV). It should also be noted that uniform weighting of the least-squares estimation might be associated with an overestimation of the standard error of the parameters if the measurement variances range over orders of magnitude (Carson, 1986).
Validity. Physiological plausibility was assessed by comparing regional BP values with distribution of D1 receptors in the human brain in vitro, as reported by De Keyser et al. (1988), who used autoradiography and [3H]SCH 23390 in the presence of 20 nmol/L mianserin to preclude binding to 5-HT2A/2C receptors, and Hall et al. (1988), who used homogenized membranes, [3H]SCH 23390, and 40 nmol/L ketanserin.
Stability. Experimental data were collected for 120 minutes. The relationship between BP derivation and the duration of the scan was evaluated by fitting shorter-duration data sets (110, 100, 90, 80, 70, 60, 50, 40, and 30 minutes) and comparing the results with the reference value obtained with the 120-minute data set. For each region and each duration, the averages ± SD (n = 12) of the results expressed as percentage of the reference value were calculated to provide an estimate of the bias (average) or dispersion (SD) induced in the outcome measure by analyzing shorter data sets. The solution was considered stable after time t if all results derived from time t to the end of the experiment had a mean within 10% of the reference value and an SD that did not exceed 15%.
Reliability. The test/retest variability was calculated as the absolute value of the difference between test and retest values, expressed as percentage of the mean value of both measurements. To evaluate the within-subject variability relative to the between-subject variability, both within-subject SD and between-subject SD were calculated and expressed as percentage of mean value (within-subject %CV and between-subject %CV). The reliability of the measurements was assessed by the intraclass correlation coefficient (ICC) calculated as follows (Kirk, 1982):
$$ICC = \frac{BSMSS - WSMSS}{BSMSS + (n - 1)\,WSMSS}$$
where BSMSS is the mean sum of squares between subjects, WSMSS is the mean sum of squares within subjects, and n is the number of repeated observations (n = 2 in this study). This coefficient estimates the reliability of the measurement and assumes values from −1 (no reliability; i.e., BSMSS = 0) to 1 (maximum reliability, achieved in case of identity between test and retest; i.e., WSMSS = 0).
Graphical analysis
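Regional time-activity curves, C_ROI(t), were graphically analyzed according to the equation
$$\frac{\int_0^t C_{ROI}(t')\,dt'}{C_{ROI}(t)} = a\,\frac{\int_0^t C_a(t')\,dt'}{C_{ROI}(t)} + b$$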
where the values of the slope a and the intercept b were obtained by linear regression (Logan et al., 1990). This method allows the determination of regional VT of reversible ligands as the slope of the regression line without assuming a particular compartmental configuration. Assuming, as in the kinetic analysis, the equivalence between the cerebellum distribution volume (VT cer) and the nondisplaceable distribution volume in the ROI, outcome measures were calculated as
$$BP = V_{T\,ROI} - V_{T\,cer}$$
and
$$\frac{k_3}{k_4} = \frac{V_{T\,ROI}}{V_{T\,cer}} - 1$$
Reliability of the graphical analysis was evaluated as described for the kinetic analysis.
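The graphical analysis can be sketched in Python as follows: running integrals of the tissue and plasma curves are formed by the trapezoidal rule, and the slope of the late linear portion estimates VT. The function names, the trapezoidal integration, the choice of the fitting start time, and the simulated curves in the usage note are assumptions made for illustration; blood volume correction is omitted.

```python
def cumulative_trapezoid(times, values):
    """Running integral of `values` over `times` by the trapezoidal rule."""
    out = [0.0]
    for i in range(1, len(times)):
        step = 0.5 * (values[i] + values[i - 1]) * (times[i] - times[i - 1])
        out.append(out[-1] + step)
    return out

def logan_vt(times, tissue, plasma, t_start):
    """Slope (VT) and intercept of the graphical plot for t >= t_start."""
    int_tissue = cumulative_trapezoid(times, tissue)
    int_plasma = cumulative_trapezoid(times, plasma)
    xs, ys = [], []
    for t, c, ic, ip in zip(times, tissue, int_tissue, int_plasma):
        if t >= t_start and c > 0:
            xs.append(ip / c)     # integral of plasma over tissue activity
            ys.append(ic / c)     # integral of tissue over tissue activity
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def graphical_outcomes(vt_roi, vt_cer):
    """BP and k3/k4, taking cerebellar VT as the nondisplaceable volume."""
    return vt_roi - vt_cer, vt_roi / vt_cer - 1.0
```

As a check on simulated data, a one-tissue compartment curve with K1 = 0.1 and k2 = 0.04 driven by a monoexponential input should return a slope close to K1/k2 = 2.5 and an intercept close to −1/k2 = −25, because the plot is exactly linear for that model.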
Noise analysis
Simulations were performed to evaluate the sensitivity of the kinetic and graphical analysis to experimental noise. This analysis was carried out for the cerebellum and one ROI (caudate). First, noise-free data [T(t)] were created, using a standard input function and a standard impulse response function. Data were simulated for 120 minutes, with values calculated every minute (n = 120). Second, various levels of random noise were introduced into the data. Due to the rapid decay of 11C, the noise increased over the course of an experiment, and the simulated noise was designed to reflect this effect. For noise level {α j }, a normal random noise vector with mean 1 and SD at time t equal to α j [exp(τt)/SD of exp(τt)] was generated, where τ was the exponent associated with 11C half-life [20 minutes, τ = 0.034 = ln (2)/20]. The “true” datum T(t) was multiplied at each point by the noise vector, resulting in a “measured” data set M(t) with mean = T(t) and SDj(t) = T(t)α j [exp(0.034t)/SD of exp(τt)]. At each noise level α j , this operation was repeated 500 times. Each data set was analyzed with kinetic or graphical analysis. The average VT derived from these noisy data sets was compared with the true value to estimate the bias introduced by the noise in the derivation of VT, whereas the SD provided a measure of the noise-related error in VT.
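The noise model can be sketched in Python as follows. The sketch assumes that "SD of exp(τt)" means the SD of the exponential taken across the sampled time points (population form), which the text does not specify; the function names, the random seed, and the flat "true" curve in the usage note are illustrative only.

```python
import math
import random
import statistics

TAU = math.log(2) / 20.0   # min^-1, from the 20-minute half-life used in the text

def noise_sd_profile(times_min, alpha):
    """SD of the multiplicative noise at each time: alpha*exp(tau*t)/SD(exp(tau*t))."""
    growth = [math.exp(TAU * t) for t in times_min]
    scale = statistics.pstdev(growth)   # assumption: population SD across time points
    return [alpha * g / scale for g in growth]

def noisy_replicate(true_curve, sd_profile, rng):
    """Multiply each true value by normal noise with mean 1 and the given SD."""
    return [value * rng.gauss(1.0, sd) for value, sd in zip(true_curve, sd_profile)]

def bias_and_sd(estimates, true_value):
    """Percentage bias of the mean estimate, and SD of the estimates."""
    mean_estimate = statistics.mean(estimates)
    bias_percent = 100.0 * (mean_estimate - true_value) / true_value
    return bias_percent, statistics.stdev(estimates)
```

In use, one would generate 500 replicates per noise level with `noisy_replicate`, fit each replicate with the kinetic or graphical method, and pass the resulting VT estimates to `bias_and_sd`. With this scaling, the SD of the noise profile across time points equals the noise level α, and the noise grows with time as the 11C signal decays.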
Statistical analysis
For each outcome measure and each subject (n = 6), the average of test and retest values was calculated, and results are given as means ± SD of these six average measurements. Thus, unless otherwise specified, SD refers to between-subject SD, that is, to the estimated variability in the investigated population (each subject counts only once, n = 6). When the SD refers to variability between the experiments rather than between subjects (such as for the injected dose), the SD is followed by n = 12. Statistical analysis was performed with repeated-measures analysis of variance. Relationships between continuous variables were analyzed with the Pearson product-moment correlation coefficient. A two-tailed probability value of 0.05 was selected as the significance level.
RESULTS
Plasma analysis
[11C]NNC 112 underwent rapid metabolism measured in human plasma (Fig. 3). At 30 minutes, only 18 ± 6% (n = 12) of the total activity corresponded to the parent compound. The six measures (2, 8, 16, 30, 50, and 70 minutes) of fraction parent were fit to a sum of two exponentials. In all cases, the unconstrained fit of the fraction parent curve yielded a value of 0 for the smallest exponent (λpar). This solution was consistent with the low rate of metabolism after 30 minutes and the absence of data after 70 minutes. However, this solution was not consistent with the observation of a faster washout rate in the cerebellum compared with the total plasma activity after 30 minutes. This problem was addressed by constraining λpar to the difference between λcer, the terminal rate of washout of cerebellar activity, and λtot, the smallest elimination rate constant of the total plasma. For each experiment, the terminal cerebellar washout rates were calculated by fitting the cerebellum data from 30 to 120 minutes to a one-exponential function (λcer = −0.00764 ± 0.00287 min−1, n = 12). Similarly, the smallest exponential of the total plasma activity was derived by fitting the total plasma activity from 30 to 120 minutes to one exponential (λtot = −0.00322 ± 0.00083 min−1, n = 12). The smallest exponential of the fraction parent curve was constrained to the difference between the two functions λcer and λtot (−0.00450 ± 0.00296 min−1, n = 12), and this fit was used to correct total activity by the fraction parent.

Fig. 3. Fraction of plasma activity associated with parent compound (% total) following [11C]NNC 112 injection in humans. Each point is the average ± SD of 12 measurements (6 subjects studied twice). The solid line represents the sum of two exponentials, with the second exponent constrained to −0.00440 min−1. This value corresponded to the average difference between the cerebellar and total plasma terminal washout rates.
The input function was fit to a sum of three exponentials (Fig. 4), and fitted values were used as input function for kinetic and graphical analysis. The average [11C]NNC 112 plasma clearance was 96 ± 31 L/h. Significant between-subject differences in plasma clearance were detected (P = 0.006). Test/retest variability of the clearance measurement was 15 ± 8%, and its reliability was excellent with within-subject %CV of 10%, between-subject %CV of 32%, and ICC of 0.83.

Fig. 4. Plasma concentration of [11C]NNC 112 following injection of 18 mCi in one healthy volunteer. Points represent measured values. The solid line represents the result of a three-exponential fit to the measured values. Clearance was 100 L/h.
The mean plasma f1 fraction was 1.02 ± 0.28%. The within-subject CV of the f1 measurement (34% CV) exceeded the between-subject CV (27%), which translated into poor reliability (ICC = −0.20). In addition, significant retention of free radiotracer on the filter was noted (the free fraction measured in saline solution was 77 ± 4%, n = 12). Because of the failure to reliably identify between-subject differences in free fraction, this correction factor was neglected, and distribution volumes were expressed relative to the total parent.
Brain uptake
Highest brain regional uptake was observed in caudate and putamen, and lowest uptake was observed in the cerebellum and thalamus (Fig. 5). In other regions (neocortex and limbic system), activity levels were intermediate between striatal and cerebellar regions. Activity peaked early in the cerebellum, neocortical, and limbic regions (<15 minutes) and later in striatum (20-40 minutes). Appreciable washout was observed in all regions, suggesting that the association and dissociation rates were fast enough to allow derivation of D1 receptor BP within the time frame of the scan. At 60 minutes, the ROI/cerebellar activity ratios were as follows: putamen, 3.72 ± 0.58; caudate, 3.26 ± 0.58; amygdala, 1.77 ± 0.32; hippocampus, 1.48 ± 0.23; anterior cingulate, 1.78 ± 0.29; DLPFC, 1.53 ± 0.20; temporal cortex, 1.69 ± 0.33; occipital cortex, 1.30 ± 0.17; and thalamus, 1.19 ± 0.21.

Fig. 5. [11C]NNC 112 in a human volunteer.
Kinetic analysis
Cerebellum. Cerebellar time-activity curves were fit to both a two- and a three-compartment model. In all cases, the three-compartment model provided a slightly but significantly better fit (P < 0.01; Fig. 6), with smaller values of Akaike information criterion (data not shown). Cerebellum VT values calculated with the two- and three-compartment models were 2.54 ± 0.42 and 3.41 ± 0.58 mL g−1, respectively. Thus, cerebellar VT was 43% larger when calculated with a three-compartment model than a two-compartment model. The improved goodness of fit of the three-compartment model favored this model. However, several other factors favored the two-compartment model: (1) The identifiability of cerebellar VT derived with a two-compartment model analysis (error of 4.13 ± 1.0%) was superior to that of the three-compartment model analysis (9.52 ± 7.13%). (2) The cerebellar VT values derived by the Logan analysis (2.75 ± 0.48 mL g−1) were closer to the two-compartment model cerebellar VT (2.54 ± 0.42 mL g−1) than the three-compartment model cerebellar VT (3.41 ± 0.58 mL g−1). (3) The stability over time of the two-compartment model was superior to that of the three-compartment model: For the two-compartment model, all values derived for periods from 60 to 120 minutes were within 10% of the reference value (= 120-minute value). In contrast, values were very unstable for the three-compartment model, with no discernible pattern over time and large standard errors (Fig. 7). (4) Derivation of cerebellar VT by the two-compartment model was very reproducible. The test/retest variability and reliability of the two-compartment model VT (test/retest variability = 6 ± 6%, between-subject %CV = 17%, within-subject %CV = 3%, ICC = 0.92) were much better than for the three-compartment model VT (test/retest variability = 21 ± 23%, between-subject %CV = 17%, within-subject %CV = 19%, ICC = −0.10). 
Together, these data suggest that the three-compartment model cerebellar fit was unstable and sensitive to the noise associated with cerebellar measurements at the end of the experiments. For these reasons, the two-compartment model fit was selected as the fit of choice to derive cerebellar VT.
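The trade-off between goodness of fit and model complexity described above can be illustrated with a short sketch. It assumes the common least-squares form of the Akaike information criterion, AIC = n ln(RSS/n) + 2p; the exact form used in the study, the number of data points, the parameter counts, and the residual sums of squares below are all illustrative assumptions, not values from the study.

```python
import math

def aic_least_squares(rss, n_points, n_params):
    """Least-squares form of the Akaike information criterion (assumed form).

    rss      : residual sum of squares of the fit
    n_points : number of fitted data points (time frames)
    n_params : number of free model parameters
    """
    return n_points * math.log(rss / n_points) + 2 * n_params

# Hypothetical numbers chosen only to illustrate the comparison.
n_frames = 20                      # assumed number of time frames
aic_2c = aic_least_squares(rss=1.00, n_points=n_frames, n_params=2)  # K1, k2
aic_3c = aic_least_squares(rss=0.60, n_points=n_frames, n_params=4)  # K1..k4

# A smaller AIC favors the model: here the drop in RSS outweighs the
# penalty for the two extra parameters.
print(f"AIC two-compartment:   {aic_2c:.2f}")
print(f"AIC three-compartment: {aic_3c:.2f}")

# If the extra parameters barely improve the fit, the penalty dominates.
aic_3c_marginal = aic_least_squares(rss=0.95, n_points=n_frames, n_params=4)
print(f"AIC three-compartment (marginal gain): {aic_3c_marginal:.2f}")
```

A lower AIC addresses goodness of fit only; as the text notes, identifiability, stability, and reproducibility are separate criteria and can point the other way.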

Brain time-activity curves measured in putamen (open circles), dorsolateral prefrontal cortex (filled circles), and cerebellum (triangles) after injection of 18 mCi of [11C]NNC 112 (same subject as in Fig. 4). The solid line represents the results of a three-compartment model fit, and the dashed line illustrates a two-compartment model fit in the cerebellum. In the cerebellum, the three-compartment model provided a slightly but significantly better fit.

Relationship between duration of data collection and kinetic derivation of total distribution volume (VT) in the cerebellum, according to two-compartment model
Regions of interest. Data from all the other regions were fitted to a three-compartment model, with the K1/k2 ratio constrained to the value of the cerebellum VT (two-compartment model). Results are presented in Table 1. The BP ranged from 8.44 ± 2.57 mL g−1 (putamen) to 1.45 ± 0.41 mL g−1 (thalamus). Despite these large regional differences in D1 receptor density, the constrained model performed well in all regions.
Kinetic analysis of regional [11C]NNC 112 uptake in human brain (three-compartment model constrained)
Values are mean ± SD of six subjects, with each value measured twice. Ant., anterior; BP, binding potential; Ctx, cortex; DLPF Ctx, dorsolateral prefrontal cortex; DVR, distribution volume ratio; MPF Ctx, medial prefrontal cortex; OF Ctx, orbitofrontal cortex
Identifiability. Overall, the goodness of fit was excellent, and VT values were well identified. Regional VT mean identifiability was 5.75 ± 6.25% (n = 144, range from 2.60 ± 0.69% in the putamen to 10.6 ± 8.43% in the hippocampus). The BP values were also well identified, with slightly more uncertainty in small-size regions (hippocampus and amygdala) or regions with lower D1 receptor density (thalamus). The BP mean identifiability was 11.9 ± 13.0% (n = 144, range from 3.42 ± 0.98% in putamen to 24.4 ± 16.7% in thalamus). The K1 values were very well identified in all regions (2.36 ± 0.45%, n = 144, range from 1.60 ± 0.65% in temporal cortex to 3.03 ± 1.24% in caudate). A larger but still acceptable uncertainty was associated with values of k3 (23.4 ± 16.3%, n = 144) and k4 (33.5 ± 27.6%, n = 144). The k3/k4 ratios were well identified (12.9 ± 16.1%, n = 144).
Validity. A significant relationship was observed between PET and autoradiography measures (r2 = 0.97, P < 0.001) and between PET and homogenate membrane measures (r2 = 0.94, P < 0.001) (Table 2).
Regional distribution of D1 receptors in human brain: in vitro versus in vivo comparison
Ant., anterior; BP, binding potential; Ctx, cortex; DLPF Ctx, dorsolateral prefrontal cortex; OF Ctx, orbitofrontal cortex; PET, positron emission tomography.
Specific binding of [3H]SCH 23390 (1 nmol/L) in the presence of mianserin (20 nmol/L) measured with autoradiography (from De Keyser et al., 1988).
Bmax of [3H]SCH 23390 in the presence of ketanserin (40 nmol/L) measured in homogenized tissue (from Hall et al., 1988).
BP values obtained with [11C]NNC 112 imaging in 6 healthy volunteers.
Stability. The ROI time-activity curves were fit to shorter scanning time durations, and BP values derived from shorter scanning times were expressed relative to BP values derived with 120-minute scanning time. Average BP values were 101 ± 8% at 105 minutes (n = 120, 12 regions × 10 scans), 102 ± 8% at 95 minutes, 104 ± 8% at 85 minutes, 117 ± 42% at 75 minutes, and >120% for shorter scanning times. At 85 minutes (corresponding to a total scanning time of 90 minutes), all regions were within 10% of the reference value, with the exception of the thalamus (112 ± 30%). Figure 8 displays time-stability analyses of the caudate and the DLPFC.
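The time-stability criterion described above (values from truncated data sets expressed as a percentage of the 120-minute value and judged against a 10% band) can be sketched as follows. The function names and the BP values are hypothetical; the values were chosen only so that the percentages mirror the averages quoted in the text.

```python
def percent_of_reference(values_by_duration, reference_duration=120):
    """Express each estimate as a percentage of the reference-duration estimate."""
    ref = values_by_duration[reference_duration]
    return {d: 100.0 * v / ref for d, v in values_by_duration.items()}

def shortest_stable_duration(values_by_duration, tolerance_pct=10.0,
                             reference_duration=120):
    """Shortest duration such that it and every longer duration stay within
    +/- tolerance_pct of the reference value."""
    pct = percent_of_reference(values_by_duration, reference_duration)
    stable = None
    for duration in sorted(pct, reverse=True):      # longest to shortest
        if abs(pct[duration] - 100.0) <= tolerance_pct:
            stable = duration
        else:
            break                                   # stability lost; stop here
    return stable

# Hypothetical BP estimates (mL/g) keyed by minutes of data used in the fit.
bp_by_duration = {120: 7.00, 105: 7.07, 95: 7.14, 85: 7.28, 75: 8.19, 65: 9.10}

for duration, pct in sorted(percent_of_reference(bp_by_duration).items()):
    print(f"{duration:>3} min: {pct:.0f}% of 120-minute value")
print("Shortest stable duration:", shortest_stable_duration(bp_by_duration))
```

Requiring every longer duration to be stable as well avoids declaring stability on the basis of a single truncated fit that happens to land near the reference value.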

Relationship between duration of data collection and kinetic derivation of binding potential (BP) in the caudate
Reproducibility. The reproducibility of regional VT and DVR is presented in Table 3. For each outcome measure, Table 3 lists the regional mean, between-subject SD (%CV), within-subject SD (%CV), test/retest variability, and ICC. For each outcome measure and each region, the between-subject SD exceeded the within-subject SD (Fig. 9). The test/retest variabilities of VT and DVR were 8 ± 2 and 6 ± 2%, respectively. The ICC of VT (0.91 ± 0.06) was slightly but significantly better than the ICC of DVR (0.84 ± 0.10, paired t test, P = 0.0409). The reproducibility of BP and k3/k4 is presented in Table 4. The test/retest variabilities of BP (14 ± 5%) and k3/k4 (13 ± 5%) were comparable to each other and larger than those of VT (8 ± 2%) and DVR (6 ± 2%). However, for both measures, the between-subject SD exceeded the within-subject SD and the reliability was excellent (BP ICC = 0.90 ± 0.06, k3/k4 ICC = 0.84 ± 0.11). The reliability of BP was significantly better than that of k3/k4 (paired t test, P = 0.017).
Reliability of [11C]NNC 112 distribution volume and distribution volume ratio measurements in human brain: kinetic analysis
Ant., anterior; BSSD, between-subject SD; Ctx, cortex; DLPF Ctx, dorsolateral prefrontal cortex; DVR, distribution volume ratio; ICC, intraclass coefficient; MPF Ctx, medial prefrontal cortex; OF Ctx, orbitofrontal cortex; WSSD, within-subject SD.
Reliability of [11C]NNC 112 BP and k3/k4 measurements in human brain: kinetic analysis
Ant., anterior; BP, binding potential; BSSD, between-subject SD; Ctx, cortex; DLPF Ctx, dorsolateral prefrontal cortex; ICC, intraclass coefficient; MPF Ctx, medial prefrontal cortex; OF Ctx, orbitofrontal cortex; WSSD, within-subject SD.

Relationship between [11C]NNC 112 binding potential (BP) in dorsolateral prefrontal cortex in test (x axis) and retest (y axis) conditions. The solid line represents the line of identity. Linear regression of measured values had an r2 of 0.82, corresponding to an intraclass coefficient of 0.93.
Graphical analysis
By visual inspection, Logan plots achieved linearity at times of >25 minutes, and a linear regression was performed on data from the 30- to 120-minute interval (Fig. 10). The VT, BP, DVR, and k3/k4 values derived from graphical analysis are listed in Table 5. Regional VT derived by the graphical method was highly correlated with VT derived by the kinetic method (r2 = 0.99, P < 0.001, n = 144) but across regions was slightly lower by an average of 7 ± 3% (repeated-measures analysis of variance, P < 0.001). Regional DVR, BP, and k3/k4 values derived by the graphical method were also highly correlated with values derived with the kinetic method (r2 = 0.99, P < 0.001) but were significantly lower, by averages of 14 ± 3, 23 ± 10, and 28 ± 9%, for DVR, BP, and k3/k4, respectively.
Graphical analysis of regional [11C]NNC 112 uptake in human brain
Values are mean ± SD of six subjects, with each value measured twice. Ant., anterior; BP, binding potential; Ctx, cortex; DLPF Ctx, dorsolateral prefrontal cortex; DVR, distribution volume ratio; MPF Ctx, medial prefrontal cortex; OF Ctx, orbitofrontal cortex.

Graphical analysis of [11C]NNC 112 uptake in caudate (open circles), dorsolateral prefrontal cortex (filled circles), and cerebellum (triangles) after injection of 18 mCi of [11C]NNC 112 (same experiment as in Fig. 6). The linearity of the relationship at normalized times of >15 min (equal to 25 minutes of real time) confirms the reversibility of the binding.
Tables 6 and 7 display the reproducibility parameters of the graphical method. As observed for kinetic analysis, the test/retest variabilities of VT (10 ± 2%) and DVR (11 ± 2%) were lower than those of BP (24 ± 7%) and k3/k4 (28 ± 8%). The reliability of the absolute measures (VT ICC = 0.88 ± 0.05, BP ICC = 0.72 ± 0.17) was superior to the reliability of the relative measures (DVR ICC = 0.60 ± 0.19, k3/k4 ICC = 0.58 ± 0.21). The test/retest variability of all outcome measures derived with graphical analysis was higher than that obtained with kinetic analysis (e.g., graphical BP variability = 24 ± 7%, kinetic BP variability = 14 ± 5%, paired t test, P < 0.001). As a consequence, the reliability of outcome measures derived with graphical analysis was lower than with the kinetic analysis (e.g., graphical BP ICC = 0.72 ± 0.17, kinetic BP ICC = 0.90 ± 0.06, paired t test, P < 0.001).
Reliability of [11C]NNC 112 distribution volumes and distribution volume ratio measurements in human brain: graphical analysis
Ant., anterior; BSSD, between-subject SD; Ctx, cortex; DLPF Ctx, dorsolateral prefrontal cortex; DVR, distribution volume ratio; ICC, intraclass coefficient; MPF Ctx, medial prefrontal cortex; OF Ctx, orbitofrontal cortex; WSSD, within-subject SD.
Reliability of [11C]NNC 112 BP and k3/k4 measurements in human brain: graphical analysis
Ant., anterior; BP, binding potential; BSSD, between-subject SD; Ctx, cortex; DLPF Ctx, dorsolateral prefrontal cortex; ICC, intraclass coefficient; MPF Ctx, medial prefrontal cortex; OF Ctx, orbitofrontal cortex; WSSD, within-subject SD.
Noise analysis
Noise analysis was done to compare the effect of realistic noise in the images on results from the kinetic and graphical analyses (Fig. 11). A typical experiment was used to generate a simulated data set that included one cerebellar curve (two-compartment model, K1 = 0.120 mL g−1 min−1, k2 = 0.0526 min−1, VT = 2.28 mL g−1) and one caudate curve (three-compartment model, K1 = 0.130 mL g−1 min−1, k2 = 0.0570 min−1, k3 = 0.110 min−1, k4 = 0.034 min−1, VT = 9.65 mL g−1, BP = 7.38 mL g−1, k3/k4 = 3.24). Random noise was then introduced as described in Materials and Methods, and noisy data sets were analyzed with kinetic and graphical analyses. Seven noise levels were investigated (SDJ of 0.04, 0.05, 0.09, 0.11, 0.13, 0.17, and 0.21), and for each noise level, 500 curves were created and analyzed. For each region and noise level, Table 8 lists the average ±SD VT values derived with the 500 simulations, the bias (i.e., difference between average of 500 simulations and true VT), and the error (%CV of the 500 simulations).
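The simulated parameter sets can be cross-checked against the outcome measures quoted with them. The sketch below assumes the standard compartmental relations VT = K1/k2 for the two-compartment model and, for the three-compartment model, BP = (K1/k2)(k3/k4) and VT = K1/k2 + BP; the function names are illustrative only.

```python
def two_compartment_vt(k1, k2):
    """Total distribution volume (mL/g) for a one-tissue (two-compartment) model."""
    return k1 / k2

def three_compartment_outcomes(k1, k2, k3, k4):
    """Return (VT, BP, k3/k4) for a two-tissue (three-compartment) model.

    Assumes V2 = K1/k2 (free plus nonspecific), BP = V2 * k3/k4,
    and VT = V2 + BP.
    """
    v2 = k1 / k2
    ratio = k3 / k4
    bp = v2 * ratio
    vt = v2 + bp
    return vt, bp, ratio

# Rate constants of the simulated cerebellar and caudate curves (from the text).
cer_vt = two_compartment_vt(0.120, 0.0526)
cau_vt, cau_bp, cau_ratio = three_compartment_outcomes(0.130, 0.0570, 0.110, 0.034)

print(f"cerebellum VT = {cer_vt:.2f} mL/g")
print(f"caudate VT = {cau_vt:.2f} mL/g, BP = {cau_bp:.2f} mL/g, k3/k4 = {cau_ratio:.2f}")
```

Under these assumptions the computed values agree with the quoted VT, BP, and k3/k4 to within rounding of the rate constants.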
Noise analysis of cerebellum and caudate: comparison of kinetic and graphical methods
Noise level observed in experimental data set.

Noise analysis.
Both analyses were very robust as far as the cerebellum was concerned. For cerebellar kinetic analysis, the bias was negligible, and the error was very small (both bias and error < 1%). Cerebellar graphical analysis showed higher bias and errors than the kinetic analysis but remained within acceptable levels at each noise level.
Analysis of caudate curves was more sensitive to noise. Kinetic analysis showed a small bias (+2.7% at the highest noise level), and the error increased to ±14% at the highest noise level. Graphical analysis was characterized by a pronounced bias. As noise increased, graphical VT significantly decreased, up to −38% at the highest noise level. The error on graphical VT was comparable with that on the kinetic VT, except that kinetic errors were larger than graphical errors at the two highest noise levels. The bias associated with graphical analysis (lower value as noise increases) was consistent with the results obtained on the experimental data set.
DISCUSSION
This study confirms the excellent imaging properties of [11C]NNC 112 previously reported in two healthy volunteers (Halldin et al., 1998). In addition, model-based analysis demonstrates that kinetic analysis of brain regional uptake of [11C]NNC 112 provides reliable estimates of D1 receptor BP not only in the striatum but also in extrastriatal areas such as the limbic and neocortical regions. To our knowledge, this study represents the first demonstration of the reliability of the measurement of D1 receptor BP in extrastriatal areas. In a recent study of the reproducibility of D1 receptor quantification with [11C]SCH 23390, only measurements in the striatum were reported (Chan et al., 1998).
Comparison with [11C]SCH 23390
Results of this study indicate that [11C]NNC 112 achieves higher specific/nonspecific DVRs at equilibrium than [11C]SCH 23390. In the putamen, [11C]SCH 23390 k3/k4 value derived by graphical analysis was 0.85 ± 0.18 (Chan et al., 1998), lower than the value reported here with [11C]NNC 112 in the same region (3.32 ± 0.80). In the PFC, Okubo et al. (1997) reported an [11C]SCH 23390 k3/k4 value of 0.41 ± 0.06, lower than the k3/k4 value observed with [11C]NNC 112 in the DLPFC (0.77 ± 0.33). This improvement is due not only to the higher affinity of [11C]NNC 112 (KD = 0.18 nmol/L; Andersen et al., 1992) compared with [11C]SCH 23390 (0.4 nmol/L; Billard et al., 1984) but also to lower nonspecific binding. Cerebellar VT is 2.54 ± 0.42 mL g−1 for [11C]NNC 112 versus 3.58 ± 0.64 mL g−1 for [11C]SCH 23390 (Chan et al., 1998). This higher signal/noise ratio is especially valuable for quantification of D1 receptors in the neocortex.
Neocortical region-of-interest sampling
In this study, an original method was developed for sampling of activity in neocortical regions. Measurement of activity in large subcortical regions such as the striatum is generally straightforward, inasmuch as the boundaries of activity can be well defined anatomically. In the striatum, MRI-guided ROI definition does not substantially improve the reliability of the measurement compared with direct definition of the ROI on the PET image based on activity distribution (Wang et al., 1996). In contrast, measurement of activity distribution in GM cortical areas is a difficult problem due to the extent and complexity of cortical gray matter geometry (see discussion in Koepp et al., 1997). The simplest method is to draw geometrically simple structures (spheres, rectangles) directly on the PET scan, but this method is unreliable and influenced by the activity distribution on the PET image. Thus, defining the ROI on a coregistered MR image is preferable. However, tracing contours of a cortical region manually around the GM on high-resolution 128-slice MRI is time consuming and subject to rater bias and errors. In this study, we used a method based on coregistration of the PET data set to a segmented MRI data set, that is, a data set in which each pixel has been designated as GM, WM, or CSF. Only the activity located in the pixels classified as GM is sampled. The advantage of constraining the ROI to GM is that the segmentation procedure is operator independent and much faster and more reliable than manual tracing of GM/WM/CSF boundaries. The only operator-driven choice is the choice of the limits between cortical regions, which follows well-defined criteria. The excellent reliability of the cortical measurement of [11C]NNC 112 BP reported in this study indirectly supports the robustness of the method.
It is, however, important to keep in mind that this method has several limitations. Like all MRI-based ROI definitions, this method is sensitive to errors in PET-MRI coregistration. In this study, we used the minimization procedure developed by Woods et al. (1993). This procedure has a reported average three-dimensional error of <2 mm for between-modality registration, which is acceptable, given a camera resolution of 6 mm. Second, this sampling method results in inherent subsampling, inasmuch as the activity spilling over the WM is not measured. This subsampling is desirable from an accuracy point of view but results in a loss of information that might translate into increased noise. Finally, this method does not correct for partial voluming, which is considerable in the cortex. Yet, it provides a necessary first step for implementation of ROI-based partial volume correction (Rousset et al., 1998). In fact, partial volume correction would result in a recovery of the spilled-over information. Thus, partial volume correction has the potential not only to increase the accuracy of the measurement but also to reduce the error by using the information that is rejected in the present method. We are currently developing and testing a partial volume correction method that would address these issues.
Evaluation of kinetic approach
Evaluation of model-based methods for quantification of receptor parameters includes assessment of various criteria, namely, identifiability, validity, stability, and reliability. The identifiability is a mathematical dimension referring to the confidence in the value of the parameter as derived by the fitting process. The validity refers to the physiological plausibility of the parameters and is usually determined in relationship to in vitro results. The stability refers to the independence of the parameter estimate from the amount of data collected, in this case the duration of the experiment. The reliability expresses the test/retest variability in relation to between-subject differences. Kinetic analysis of [11C]NNC 112 regional uptake performed well on all of these criteria.
Kinetic analysis of the cerebellum presented an interesting problem. The three-compartment model provided a slightly but significantly superior fit to the data compared with the two-compartment model. The observation of a better fit by the three- compared with the two-compartment model for a region of reference is the rule rather than the exception in neuroreceptor imaging (Frost et al., 1989; Logan et al., 1990; Laruelle et al., 1994b, 1994c; Lammertsma et al., 1996; Abi-Dargham et al., 1997; Ito et al., 1998). The absence of specific binding in the cerebellum is confirmed by the absence of displacement of [11C]NNC 112 uptake in the cerebellum after injection of SCH 23390 (2.5 mg/kg) (Halldin et al., 1998). Thus, the small second tissue compartment revealed by the three-compartment model's superior fit might reflect a slow buildup of radiolabeled metabolites in the brain or the existence of a nonspecific binding component with slow kinetics. The presence of a statistically superior fit with the three-compartment model would usually suggest the use of a three-compartment model to derive cerebellar VT. However, in this case, VT derived by the three-compartment model was less well identified than with a two-compartment model, was markedly affected by the duration of the scan, had a poor test/retest reproducibility (negative ICC), and provided values much larger than the value of VT derived by the graphical analysis. Thus, whereas the goodness of fit favored the three-compartment model, all other criteria favored the two-compartment model. For these reasons, we favored the two-compartment model, that is, the solution that provides superior identifiability, stability, validity, and reproducibility.
For ROIs, we used a three-compartment model with K1/k2 ratio constrained to the value of VT measured in the cerebellum. Whereas an unconstrained three-compartment model is usually appropriate to derive VT, the separate derivation of V2 and BP is not reliable with an unconstrained three-compartment model (Laruelle et al., 1994d, 1994e; Lammertsma et al., 1996). The constrained approach takes advantage of the knowledge of V2 derived from the region of reference while accommodating regional differences in blood flow. The validity of this approach is supported by the homogeneous distribution of the inactive enantiomer (−)[11C]NNC 112 within the human and baboon brain (Halldin et al., 1998). The goodness of fit was excellent, with even distribution of the residuals, and regional distribution volumes were well identified.
A significant relationship was observed between regional BP and regional density of D1 receptor measured in vitro in the human brain (Table 2). Yet, subtle differences were observed. In vitro, D1 receptor density is slightly higher in caudate than putamen, whereas the reverse is true in the PET measurements. In vitro, D1 receptor density is higher in the amygdala than hippocampus, but this difference is less pronounced in the PET measurements. In vitro, frontal D1 receptor density is higher than occipital D1 receptor density, but PET measurements gave similar BP. These differences can be accounted for by partial volume effects, resulting in relative underestimation of smaller regions versus larger regions (caudate versus putamen, amygdala versus hippocampus, frontal GM versus occipital GM).
Results of kinetic modeling were essentially stable from 90 minutes to the end of the experiment. The stability over time of the solution means that the model describes the data appropriately and that enough data were collected to inform the model. Remarkably, this was true for both striatal and extrastriatal areas. It is often difficult to accurately measure receptor parameters with the same radiotracer both in regions with high and low receptor densities. High affinity is required for a high signal/noise ratio in regions with low receptor density, but high affinity might preclude favorable kinetics in regions with high receptor density. [11C]NNC 112 achieves an effective balance between these competing requirements. However, 90 minutes of data was needed to achieve stability of the outcome measure. The three-dimensional acquisition mode, with its higher sensitivity, is crucial to enable data collection up to 120 minutes. The use of the polyurethane head mold was effective, as only minimal head movement was recorded during this prolonged time frame. Head movement was measured by comparing the spatial location of the 11C markers on each frame and was <2.5 mm per frame in ∼90% of the cases. Thus, the combination of the high sensitivity of the three-dimensional mode and the effective headholder allowed collection of enough data to achieve stability of the results.
Reproducibility and outcome measures
Several statistical parameters and outcome measures are used in the PET/single photon emission computed tomography neuroreceptor literature to report the reproducibility of the measurements. The most straightforward reproducibility parameter is the absolute test/retest variability, calculated as the absolute difference between test and retest values divided by their average (Abi-Dargham et al., 1995; Seibyl et al., 1995; Smith et al., 1998). A second method consists of reporting the within-subject SD (or %CV) of the measurement (Chan et al., 1998). When the number of repeat scans is 2, the within-subject %CV is smaller than the variability by a factor of √2. Finally, the ICC provides a measure of the reproducibility of the measurement relative to between-subject differences and is the most useful measure to compare results of various methods.
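The three reproducibility statistics described above can be sketched in a few lines. The test/retest variability and within-subject %CV follow the definitions given in the text; the ICC is written here as a one-way random-effects ICC for two repeats, which is an assumption, since the exact formula is specified in the Methods and not in this section. The function names and the example pairs are hypothetical.

```python
import math

def retest_variability(test, retest):
    """Absolute test/retest difference divided by the mean of the two values."""
    return abs(test - retest) / ((test + retest) / 2.0)

def within_subject_cv(test, retest):
    """Sample SD of the two measurements (n - 1 = 1) divided by their mean."""
    mean = (test + retest) / 2.0
    sd = math.sqrt((test - mean) ** 2 + (retest - mean) ** 2)
    return sd / mean

def icc_oneway(pairs):
    """One-way random-effects ICC for k = 2 repeats per subject (assumed form):
    (BMS - WMS) / (BMS + (k - 1) * WMS)."""
    n, k = len(pairs), 2
    grand = sum(a + b for a, b in pairs) / (n * k)
    means = [(a + b) / 2.0 for a, b in pairs]
    bms = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    wms = sum((a - m) ** 2 + (b - m) ** 2
              for (a, b), m in zip(pairs, means)) / (n * (k - 1))
    return (bms - wms) / (bms + (k - 1) * wms)

# Hypothetical test/retest BP values (mL/g) for six subjects.
pairs = [(7.1, 7.9), (9.8, 9.0), (6.2, 6.6), (8.5, 8.1), (10.4, 11.2), (7.7, 7.0)]

v = retest_variability(10.0, 12.0)
cv = within_subject_cv(10.0, 12.0)
print(f"variability = {v:.3f}, within-subject CV = {cv:.3f}, ratio = {v / cv:.4f}")
print(f"ICC of the hypothetical pairs = {icc_oneway(pairs):.2f}")
```

For two repeats the SD equals the absolute difference divided by √2, which is why the ratio of variability to within-subject %CV is √2 regardless of the values used.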
In this study, four outcome measures (VT, DVR, BP, and k3/k4) were compared in terms of reproducibility and reliability. These commonly used outcome measures in the PET literature have very different relationships vis à vis receptor Bmax and KD. In essence, VT and BP are “absolute” measures that relate brain distribution volumes to the input function and are expressed as milliliters (of plasma) over grams (of tissue). VT is the total distribution volume, including free, nonspecific, and specific binding. The use of VT as outcome measure should be restricted to situations when nonspecific binding cannot be measured for a lack of region of reference and is negligible compared with the specific binding, such as in the case of [11C]flumazenil (Koeppe et al., 1991) or [123I]iomazenil (Abi-Dargham et al., 1994). Otherwise, BP is the outcome measure of choice because it is directly equal to the Bmax/KD ratio and does not include nonspecific binding. The reproducibility of VT is intrinsically superior to BP, as BP is derived by the difference between two VT measurements (VT of the ROI and VT of the region of reference), and this is especially true in regions with low BP relative to V2. In this study, the test/retest variability of VT (8 ± 2%) was better than the variability of BP (14 ± 5%).
In contrast to VT and BP, DVR and k3/k4 are “relative” outcome measures in the sense that they represent distribution volumes “normalized” by the region of reference, in this case the cerebellum. Both outcome measures are related (DVR − 1 = k3/k4), yet only k3/k4 expresses a meaningful quantity. Although both outcome measures are linearly related to BP, the relationship between DVR and BP has an intercept of 1, meaning that DVR assumes a value of 1 in a region with BP = 0. Thus, DVR is inappropriate to express receptor density. However, studies reporting reproducibility of neuroreceptor measurements frequently report DVR instead of k3/k4 (Volkow et al., 1993; Chan et al., 1998; Smith et al., 1998). The reproducibility of DVR is intrinsically better than the reproducibility of k3/k4, and this is especially true in regions with low receptor density. In this study, the test/retest variabilities of DVR and k3/k4 were 6 ± 2 and 13 ± 5%, respectively. The DVR variability observed in this study (6%) is in the same range as the variability of DVR for [11C]raclopride in the striatum (5%) (Volkow et al., 1993), [11C]SCH 23390 in the striatum (within-subject %CV of 3%) (Chan et al., 1998), and [18F]altanserin in cortical regions (ranging from 7 to 9%, n = 8) (Smith et al., 1998).
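The reason DVR reproduces better than k3/k4, particularly where receptor density is low, follows directly from DVR − 1 = k3/k4: the same absolute error is carried by both measures, but k3/k4 has the smaller magnitude. The sketch below is a first-order illustration with hypothetical DVR values and an assumed 6% relative error; the function names are illustrative.

```python
def k3k4_from_dvr(dvr):
    """k3/k4 from the distribution volume ratio, using DVR - 1 = k3/k4."""
    return dvr - 1.0

def relative_error_k3k4(dvr, rel_err_dvr):
    """Relative error of k3/k4 implied by a given relative error of DVR.

    Subtracting the constant 1 leaves the absolute error unchanged,
    so the relative error is scaled by DVR / (DVR - 1).
    """
    return (dvr * rel_err_dvr) / (dvr - 1.0)

# Hypothetical regions: one with high and one with low receptor density.
for label, dvr in [("high density", 4.0), ("low density", 1.5)]:
    err = relative_error_k3k4(dvr, 0.06)
    print(f"{label}: DVR = {dvr:.1f}, k3/k4 = {k3k4_from_dvr(dvr):.1f}, "
          f"6% error on DVR -> {100 * err:.0f}% error on k3/k4")
```

The amplification factor DVR/(DVR − 1) approaches 1 for large DVR and grows without bound as DVR approaches 1, which matches the statement that the effect is strongest in regions with low receptor density.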
BP versus k3/k4
As a region of reference can be measured for [11C]NNC 112 (i.e., cerebellum), the only meaningful choices of outcome measures for clinical studies are either BP or k3/k4. Usually, the reproducibility of k3/k4 is better than BP because the data are normalized by cerebellum VT and k3/k4 is less sensitive to errors associated with plasma input function measurement and cross-calibration between PET camera and well counter. However, in this study, the average test/retest variability of BP (14 ± 5%) was similar to that of k3/k4 (13 ± 5%), and the reliability of BP (ICC = 0.90 ± 0.06) was slightly better than that of k3/k4 (0.84 ± 0.11). Thus, according to these data, both outcome measures essentially perform similarly, and the choice between the two depends on the inherent assumptions one is willing to accept.
The BP as expressed in Eq. 10 is the only expression of BP that is exclusively dependent on receptor parameters and unbiased by factors unrelated to receptors. However, in this study, BP was not corrected for f1. The free fraction of [11C]NNC 112 in plasma was low (1.02 ± 0.28% of total plasma concentration), and no between-subject differences in f1 were reliably detected (f1 ICC = −0.20). Thus, the within-subject %CV associated with f1 measurement (±34%) was of the same magnitude as the between-subject %CV in f1 (±27%). Under these conditions, individual f1 values should not be used in the calculation of the distribution volumes (Abi-Dargham et al., 1995). Yet, the use of this uncorrected BP is associated with a potential bias introduced by between-subject differences in f1. Although this study did not support the utility of this correction in healthy subjects, it may still be necessary to control for potential alterations of plasma protein binding in pathological conditions. On the other hand, the cost associated with the use of k3/k4 is the potential bias introduced by between-subject differences in nonspecific binding, which might lead to between-subject differences in k3/k4 that are unrelated to receptor parameters. True between-subject differences in nonspecific binding were demonstrated in this study (the between-subject %CV of cerebellum VT was 17%). Thus, both the uncorrected BP used in this study and k3/k4 have potential pitfalls. Between-subject differences in plasma protein binding would bias BP but not k3/k4. Between-subject differences in nonspecific binding to brain tissue would bias k3/k4 but not BP. With these limitations in mind, the slightly better reliability of BP compared with k3/k4 suggests that BP should be the outcome measure of choice for between-subject comparison. 
Yet, this advantage might be offset by the convenience of measuring k3/k4 using the cerebellum as input function instead of the plasma (Ichise et al., 1996; Lammertsma et al., 1996; Logan et al., 1996). Additional analysis comparing the reliability of k3/k4 measurement with and without plasma input function is required to resolve this issue.
Kinetic versus graphical analysis
Distribution volumes derived by graphical and kinetic analyses were strongly correlated (r2 = 0.99, P < 0.001), but graphical analysis yielded lower values of VT (by 7 ± 3%), BP (by 23 ± 10%), and k3/k4 (by 28 ± 9%). Differences between graphical and kinetic BP and k3/k4 can be partially accounted for by the fact that cerebellum VT was higher with graphical than kinetic analysis, which might be linked to an underestimation of cerebellum VT as derived by the two-compartment model. However, these differences are also due to differences in ROI VT. A systematic difference between kinetic and graphical VT was an unexpected finding. Another unexpected difference was the lower reproducibility of graphical VT (test/retest variability = 10 ± 2%, ICC = 0.88) compared with kinetic VT (8 ± 2%, ICC = 0.91). The lower reproducibility of graphical compared with kinetic outcomes was even more accentuated for BP and k3/k4. This result undermines the claim that graphical analysis is less sensitive to noise and statistical fluctuation of the data than kinetic analysis (Volkow et al., 1993).
Analysis of the effect of noise on the outcome measure derived by both methods was performed to better understand this phenomenon. Realistic noisy data sets were generated in which the noise increased with the duration of the experiment proportionally to the decay of 11C. The impact of the noise on the outcome measure was very different for kinetic and graphical analyses. As noise increased, both methods were associated with a similar variability, but VT derived by kinetic analysis remained essentially unbiased, whereas the graphical analysis yielded systematically lower values. As the noise was randomly distributed around the true values with zero mean, the noise-introduced bias toward lower outcome measures in the graphical analysis was not intuitively predicted. This noise-dependent bias was introduced by the linearization process in which both predictor and response variables are subject to error. This phenomenon, observed here for the graphical analysis, is comparable with the noise-dependent bias introduced by the Scatchard transformation used in analysis of saturation binding data (Munson and Rodbard, 1980) and has been previously described (Carson, 1993).
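The way zero-mean noise can bias a linear regression downward when the predictor is also noisy can be seen in a generic toy example. This is not a simulation of the Logan plot or of the noise model used in the study: the straight line, the noise level, and the sample size are arbitrary assumptions, and the noise on the two axes is taken to be independent, which need not hold for graphical analysis.

```python
import random

def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

rng = random.Random(0)            # fixed seed so the run is repeatable
true_slope, intercept = 9.65, -20.0
noise_sd = 2.0                    # arbitrary zero-mean Gaussian noise level
n = 5000

x_true = [10.0 * i / (n - 1) for i in range(n)]
y_true = [true_slope * x + intercept for x in x_true]

y_noisy = [y + rng.gauss(0.0, noise_sd) for y in y_true]
x_noisy = [x + rng.gauss(0.0, noise_sd) for x in x_true]

# Noise on the response only: the slope estimate stays centered on the truth.
slope_y_only = ols_slope(x_true, y_noisy)
# Noise on predictor and response: the slope is pulled toward zero.
slope_both = ols_slope(x_noisy, y_noisy)

print(f"true slope:             {true_slope:.2f}")
print(f"noise in y only:        {slope_y_only:.2f}")
print(f"noise in both x and y:  {slope_both:.2f}")
```

Because the slope of the linear portion is the quantity read off as VT in a graphical analysis, an attenuated slope corresponds to an underestimated distribution volume, even though the added noise averages to zero.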
At the highest noise level (SDJ = 0.21), this bias resulted in a 38% underestimation of VT by graphical analysis (i.e., the mean of the 500 simulations was 38% lower than the true value used to generate the curves). Simulations performed on other regions such as the DLPFC yielded similar results (data not shown). We also observed that the noise-dependent bias was related to the slope of the regression (i.e., on the magnitude of VT). Therefore, regions with lower receptor density were less vulnerable to this effect than regions with high receptor density (cerebellum < DLPFC < caudate). This slope dependency also contributed to the greater underestimation of BP and k3/k4 by graphical analysis compared with VT, to the extent that regions with receptors were more affected than cerebellum.
A large range of noise was used to generate these simulations (SDJ from 0.04 to 0.21). To determine which level of noise corresponded to the experimental noise, we compared the mean sum of squares of the residuals as well as the identifiability (percent error) of VT at each level of noise in the simulations with the mean sum of squares and percent error effectively observed in the experimental data set. The mean sum of squares observed in the caudate in the experimental data was 0.35 ± 0.21 nCi/mL (n = 12). The simulations with SDJ of 0.04 and 0.05 gave mean sums of squares of 0.25 and 0.39 nCi/mL, respectively. The percent error associated with VT in the caudate in the experimental data was 2.6 ± 0.7% (n = 12). The simulations with SDJ of 0.05 and 0.09 had percent errors of 1.8 and 3.2%, respectively. Thus, the experimental mean sum of squares falls between SDJ of 0.04 and 0.05, and the experimental percent error falls between SDJ of 0.05 and 0.09. Therefore, the simulation with SDJ of 0.05 appears to provide a noise level comparable with the noise observed experimentally.
At this noise level, caudate kinetic analysis had no bias (−0.02%), but the graphical analysis had a significant bias (−9.41%). The bias associated with graphical analysis at this noise level might account for the lower VT measured in the caudate with the graphical analysis (9.3 ± 3.6 mL g−1) compared with the kinetic analysis (10.3 ± 3.6 mL g−1). This difference was significant (repeated-measures analysis of variance, P = 0.007) and averaged −11 ± 7%. In addition, as the noise might vary from test to retest condition, this phenomenon might explain the lower reproducibility of the graphical method compared with the kinetic analysis. These results indicate that kinetic modeling is the method of choice for the derivation of D1 receptor BP or k3/k4 with [11C]NNC 112.
CONCLUSION
This study confirms that [11C]NNC 112 is a superior radiotracer with which to measure D1 receptors in both the striatal and the extrastriatal areas. The main limitation of this radiotracer is that 90 minutes of data collection is needed to yield stable outcome measures. Simplified approaches that do not require input function measurement still need to be evaluated. Because it allows reliable estimation of D1 receptor parameters in extrastriatal areas, [11C]NNC 112 might play an important role in elucidating the role of D1 receptors in cognition in health and disease and in psychopathology.
Acknowledgements
The authors thank Suehee Chung, Bryan Bergert, Ann Shinn, Alexei Kartachov, Richard Weiss, Analia Arevaldo, Julie Montoya, Daniel Schneider, and the technologists of the Columbia PET Center for excellent technical assistance.
