Abstract
Background
The relative clinical benefit of histopathology and computed tomography (CT) in patients with idiopathic interstitial pneumonia (IIP) is under debate.
Purpose
To analyze thin-section CT features and histopathologic findings in patients with usual interstitial pneumonia (UIP) in the clinical context of idiopathic pulmonary fibrosis (IPF), and to evaluate and compare diagnostic accuracy of the two methods among patients with an appropriate spectrum of IIP.
Material and Methods
The study included 91 patients (49 men; mean age 53.2 years; median follow-up 7.2 years) with clinically suspected interstitial lung disease. All underwent surgical lung biopsy and thin-section CT. Two independent readers retrospectively assessed the CT images for the extent and pattern of abnormality and made a first-choice diagnosis. Two pathologists retrospectively assessed the histopathologic slides. In 64 patients with IIP, a retrospective composite reference standard identified 41 patients with UIP. CT characteristics of UIP and IIPs other than UIP were compared with univariate and multivariate analyses.
Results
There was good agreement between the readers for the correct first-choice CT diagnosis of UIP (κ = 0.79). The sensitivity, specificity, and positive predictive value of the CT diagnosis of UIP were 63%, 96%, and 96%, respectively. The sensitivity, specificity, and positive predictive value of the histological diagnosis of UIP were 73%, 74%, and 83%, respectively. The CT feature that best differentiated UIP from IIPs other than UIP was the extent of reticular pattern (odds ratio, 5.1).
Conclusion
Surgical lung biopsy may not be warranted in patients with thin-section CT diagnosis of UIP.
Keywords
A group of non-infectious, acute or chronic, diffuse parenchymal lung disorders are classified as interstitial lung disease (ILD). More than 150 clinical conditions and/or causes are associated with ILD (1). The idiopathic interstitial pneumonias (IIPs) are a heterogeneous subset of ILD resulting from damage to the lung parenchyma by varying patterns of inflammation and fibrosis. The American Thoracic Society (ATS) and European Respiratory Society (ERS) described seven clinicopathologic entities that are sufficiently different from one another to be designated as separate disease entities (2). The two most common are idiopathic pulmonary fibrosis (IPF) (formerly termed cryptogenic fibrosing alveolitis) and non-specific interstitial pneumonia (NSIP). However, NSIP remains an area of uncertainty, and there is continuing debate on whether it is a distinct clinical entity (3).
IPF is characterized by the histological and radiological pattern of usual interstitial pneumonia (UIP), and accounts for up to 64% of the patients with IIP (4). In this article the term UIP is used in histological and radiological contexts even though the clinical diagnosis of all the patients used in the analyses was IPF. Patients with IPF are mainly over 50 years of age; onset of symptoms is typically gradual, with dry cough and progressive dyspnea. IPF has a poor prognosis, with a median survival of 2.8 years and a 5-year survival of between 20% and 40% (4, 5). The prognosis of patients with IPF is considerably worse than that of patients with disorders mimicking IPF, and, in contrast to other IIPs, IPF does not respond to steroid treatment (2, 6–8). Thus, correct diagnosis is crucial for determining patient prognosis and therapeutic intervention, and one of the key messages from the ATS/ERS International Multidisciplinary Consensus Classification is to separate patients with IPF from those with other chronic IIPs (2).
Thin-section computed tomography (CT) is the most accurate non-invasive method of imaging the lung parenchyma, and enables surveillance of whole-lung morphologic changes. In contrast, confidence of histopathologic diagnosis is restricted both by the small volume of tissue used for making a diagnosis, and by the frequent histopathologic variability between different lung segments observed in patients with IIP (9, 10). Furthermore, there is significant discordance between general and pulmonary pathologists in their interpretation of the histopathology in ILD (11). Although several groups have highlighted the diagnostic accuracy of thin-section CT in IIP diagnosis and management (2, 8, 12–14), surgical lung biopsy has been considered necessary for diagnosing most IIPs, in particular because the ability of thin-section CT to distinguish between NSIP and UIP is debated (3, 7, 15).
In the present study we retrospectively analyzed thin-section CT features and histopathologic findings in patients with IIP. In contrast to previous reports in which thin-section CT findings were compared with histopathologic features, we questioned the status of lung biopsy as a robust reference diagnostic tool by using a retrospective ‘composite’ reference standard for which all clinical, radiological, and histopathologic data were available for evaluation. The composite reference standard enabled us to evaluate the power of CT features and histopathologic findings as diagnostic tools at baseline.
The clinical benefit of histopathology and CT is under debate. The purpose of our study was to evaluate and compare the accuracy of thin-section CT and histopathology in the diagnosis of UIP (and the clinical diagnosis of IPF) among patients with an appropriate spectrum of IIP, referred to a tertiary medical centre.
Material and Methods
The institutional review board for medical research approved this retrospective study; informed consent was waived.
Patients
As a regional and national centre for chronic lung diseases in Norway, the department of lung medicine at our university hospital has for many years received a large number of patients with ILD. Patients were selected by reviewing the medical records of all patients between 1992 and 2007 who were clinically suspected of having interstitial lung disease. One hundred and twenty-eight patients who underwent both high-resolution CT and surgical lung biopsy were identified. From this initial cohort, patients were included in the study if (a) the CT examinations were judged to be consistent, and (b) the lung disease was not associated with connective tissue disease, environmental exposure and/or drug toxicity. A total of 104 patients met these criteria. Thirteen patients whose histopathologic specimens were not available or were of suboptimal quality were excluded from analysis. Thus, the study group included 91 patients (49 men and 42 women; mean age 53.2 years; age range 23–79 years). Survival status was identified from medical records, and all patients were followed up from the time of diagnosis until the end of the study, with survival status updated as of September 2009. The duration of follow-up ranged from 3 to 17 years (median follow-up 7.2 years). During the follow-up period 45 patients died and five patients underwent lung transplantation due to respiratory failure.
Histopathologic evaluation
The lung biopsies were performed by open thoracotomy or by thoracoscopy. Formalin-fixed paraffin-embedded material was stained with hematoxylin-eosin-saffron. Specimens were retrospectively studied by light microscopy in consensus by two experienced lung pathologists (EHS and HS, with more than 10 years of experience in evaluating pulmonary surgical biopsies) who were blinded to clinical and radiological features. Interstitial lung disease was classified according to current histopathologic criteria (2).
The pathology core made a highly confident diagnosis in 55 of the patients. In the remaining patients, either alternative diagnoses were considered or the first-choice diagnosis was not specific. For statistical analyses in this study, the most probable diagnosis was used.
CT scanning protocol
Thin-section CT images were obtained in the supine position during breath-holding and deep inspiration at 120–140 kV, with 1 or 1.25-mm section thickness at 10-mm intervals. Tube current setting was adjusted to each patient's age and weight. Images were reconstructed with a high-spatial-frequency (bone) algorithm. The lungs were examined from the apex to the base. Supplementary expiratory scans were obtained in 32 patients, and, in five patients, supplementary images were obtained in the prone position. Four different CT scanners (CT/T 9800, HiSpeed CT/i, LightSpeed 16 and LightSpeed VCT 64; GE Healthcare, Milwaukee, WI, USA) were used for this study.
Thin-section CT images were interpreted at window settings appropriate for viewing lung parenchyma (width 1000–1500 HU; level −450 to −700 HU). In 52 of the patients, images were reviewed on a picture archiving and communication system (PACS) workstation (Sectra Medical Systems, Linköping, Sweden). In the remaining 39 patients, hard-copy images were available.
Review of thin-section CT images
The images were reviewed separately and in random order by two chest radiologists (TMA and GM, with 18 and 12 years of experience, respectively). The observers were blinded to clinical information and histological diagnosis.
The observers evaluated the presence, extent, and distribution of established CT criteria of ILD on the basis of the recommendations of the Nomenclature Committee of the Fleischner Society (16). These findings included ground-glass opacity, airspace consolidation, reticular pattern, and interlobular septal thickening. The presence of associated findings, such as architectural distortion, traction bronchiectasis and traction bronchiolectasis, micronodules, bronchovascular bundle thickening, pleural irregularity, cysts, emphysema, and air trapping, was also assessed. Cysts were defined as air-filled spaces with sharply demarcated thin walls (16).
Reticular pattern (i.e. the coarseness of fibrosis) was graded as follows: 1, fine intralobular fibrosis without evident cysts; 2, predominantly microcystic reticular pattern involving air spaces smaller than or equal to 4 mm in diameter; and 3, a predominantly macrocystic reticular pattern with air spaces larger than 4 mm in diameter. When ground-glass opacification was superimposed on a reticular pattern, the abnormality was recorded as being reticular. When fine intralobular fibrosis was superimposed on a microcystic reticular pattern, the abnormality was recorded as being a microcystic reticular pattern.
The distribution of disease was reviewed in four zones: (a) above the aortic arch; (b) between the aortic arch and the level of the carina; (c) between the level of the carina and the level of the inferior pulmonary veins; and (d) below the inferior pulmonary veins.
Each CT finding was assessed and its presence determined, and the extent of involvement was evaluated independently for each lung zone. The observers thereby defined the vertical distribution of lung changes. The extent of ground-glass opacity, airspace consolidation, reticular pattern, and interlobular septal thickening in each zone was assigned a score based on the percentage of lung parenchyma that showed evidence of an abnormality. The score was classified as follows: 0, no involvement; 1, 1–4% involvement; 2, 5–14% involvement; 3, 15–29% involvement; 4, 30–49% involvement; and 5, more than 50% involvement. An overall score of parenchymal involvement for each patient was derived by summing the scores of the four CT levels for each finding. Thus, observers were able to score both the overall extent of interstitial lung disease (regardless of pattern) and the extent of individual findings. The observations of the two radiologists were summed for the purposes of analysis.
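The zone-based scoring described above can be illustrated with a minimal sketch. The function names and the example percentages are hypothetical and are not taken from the study; the sketch also assumes that an involvement of exactly 50% is assigned score 5, a boundary that the score definition leaves open.

```python
from typing import Sequence

# Lower bounds (percent of the zone involved) for scores 1 to 5, as listed in
# the text. Assumption: exactly 50% is scored 5 (the text says "more than 50%").
SCORE_LOWER_BOUNDS = (1, 5, 15, 30, 50)


def extent_score(percent_involved: float) -> int:
    """Map the percentage of a lung zone involved to a 0-5 extent score."""
    if not 0 <= percent_involved <= 100:
        raise ValueError("percentage must be between 0 and 100")
    # The score equals the number of lower bounds that the percentage reaches;
    # values falling between two listed bands are placed in the lower band.
    return sum(percent_involved >= bound for bound in SCORE_LOWER_BOUNDS)


def overall_score(zone_percentages: Sequence[float]) -> int:
    """Sum the extent scores of the four zones (range 0-20) for one finding."""
    if len(zone_percentages) != 4:
        raise ValueError("expected one percentage per zone (four zones)")
    return sum(extent_score(p) for p in zone_percentages)


# Hypothetical reading of one finding, increasing from apex (zone 1) to base (zone 4).
example_zones = [0, 3, 20, 60]
example_total = overall_score(example_zones)  # 0 + 1 + 3 + 5 = 9
```

With this layout, one reader's overall score per finding ranges from 0 to 20, and the readings of the two observers can then be combined as described in the text.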
The extent and severity of traction bronchiectasis and bronchiolectasis were scored following a method used by Sumikawa et al. (17).
After the initial assessment of the CT examination, the two observers independently made a first-choice diagnosis for each patient according to accepted diagnostic criteria (2, 18, 19). Discrepancies in the most probable thin-section CT diagnosis were reviewed jointly. The consensus evaluation was used in analysis. The criteria for a confident diagnosis of UIP included a large extent of coarse fibrosis (reticular pattern grade 2 and 3), no or minimal ground-glass opacity, and peripheral and basal predominance.
The extent of various CT abnormalities was combined by calculating the average score of the two observers' readings.
Statistical analysis
Statistical analyses were performed with statistical software (STATA version 10 and SPSS, version 16.0; SPSS, Chicago, IL, USA).
Inter-observer variation for evaluation of first-choice HRCT diagnosis was analyzed with the kappa statistic. Inter-observer agreement was classified as poor (κ = 0.00–0.20), fair (κ = 0.21–0.40), moderate (κ = 0.41–0.60), good (κ = 0.61–0.80), or excellent (κ = 0.81–1.00). Inter-observer variation for the extent of various HRCT abnormalities was evaluated with weighted kappa statistics. The association between the score and each lung zone was assessed using generalized estimating equations (GEE) for continuous outcomes. In univariate analysis the Mann-Whitney U test was used to compare the various thin-section CT findings in patients with UIP with those in patients with other IIPs. Variables identified as significant at univariate analysis were included as covariates in a multivariate dichotomized logistic model using UIP as the dependent variable to assess the predictive value of the various thin-section CT findings in distinguishing UIP from the other types of IIPs. We used standard methods to calculate the sensitivity, specificity, and positive and negative predictive values.
A P value of less than 0.05 was considered to indicate a significant difference.
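As an illustration of the agreement analysis, the sketch below computes an unweighted Cohen's kappa for two readers and maps it to the agreement bands listed above. It is not the STATA/SPSS procedure used in the study; the function names and the example readings are hypothetical, and treating negative kappa values as "poor" is an assumption.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(reader1: Sequence[Hashable], reader2: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa for two readers rating the same cases."""
    if len(reader1) != len(reader2) or len(reader1) == 0:
        raise ValueError("both readers must rate the same non-empty set of cases")
    n = len(reader1)
    # Observed proportion of cases on which the readers agree.
    observed = sum(a == b for a, b in zip(reader1, reader2)) / n
    # Agreement expected by chance, from each reader's marginal frequencies.
    counts1, counts2 = Counter(reader1), Counter(reader2)
    expected = sum(counts1[c] * counts2[c] for c in counts1) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


def agreement_band(kappa: float) -> str:
    """Classify kappa using the bands given in the text."""
    # Assumption: values below 0 are reported as "poor".
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "excellent"


# Hypothetical first-choice diagnoses for ten cases (not study data).
r1 = ["UIP", "UIP", "NSIP", "NSIP", "UIP", "NSIP", "UIP", "NSIP", "UIP", "UIP"]
r2 = ["UIP", "UIP", "NSIP", "UIP", "UIP", "NSIP", "UIP", "NSIP", "NSIP", "UIP"]
example_kappa = cohen_kappa(r1, r2)
example_band = agreement_band(example_kappa)
```

The weighted kappa used for the extent scores additionally gives partial credit for near-agreement on an ordinal scale; the weighting scheme is not specified in the text and is therefore not sketched here.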
Composite reference standard
A composite reference standard was used to provide an overall clinical diagnosis at the end of the study. A multidisciplinary team composed of a pulmonologist (AN, with 30 years of experience in interstitial lung disease) and a radiologist (AB, with 6 years of experience in pulmonary radiology) independently evaluated all available clinical, radiological, and histopathologic data by review of patient hospital records. As previous exposure to mineral dust, toxic drug effects, and a diagnosis of connective tissue disease were excluded at inclusion, clinical data included assessment of age, sex, smoking habits, type of treatment and response to therapy, number of deaths and causes thereof, and duration and nature of symptoms. The team also had access to initial and follow-up pulmonary function test results and CT examinations. The multidisciplinary team was blinded to the results of the retrospective review of the initial thin-section CT scans and the retrospective histopathologic evaluation made for the purpose of this study.
Results
Observer agreement and extent of various HRCT findings
There was good agreement between the two readers for the correct first-choice diagnosis of UIP (κ = 0.79) and NSIP (κ = 0.62).
The mean scores of abnormalities observed on HRCT (for each reader and as mean values of both readers) are presented in Table 1. There was fair to good inter-observer agreement between the readers for the extent of the various abnormalities on CT images (κ = 0.31–0.74). When the different levels of coarseness of reticular pattern were assessed separately, the kappa values were 0.34, 0.29, and 0.39 for grades 1, 2, and 3, respectively (data not shown in the table). However, there was good inter-observer agreement for the extent of overall reticular pattern (κ = 0.67).
Table 1. Mean scores for each reader and mean values of both readers in the four zones. The test for trend (p-trend) was performed using the GEE procedure (explained in the Methods section). The results of the kappa statistics are also shown.
*p < 0.001
R1 = Reader 1, R2 = Reader 2, MS = mean score of R1 + R2, Zone 1 = Above the aortic arch, Zone 2 = Between the aortic arch and the level of the carina, Zone 3 = Between the level of the carina and the level of the inferior pulmonary veins, Zone 4 = below the inferior pulmonary veins, κ (95% CI) = weighted kappa for agreement (95% confidence interval)
The p-trend analyses showed significant lower-zone predominance for ground-glass opacities, traction bronchiectasis, and reticular pattern for both readers, except for one reader's evaluation of fine intralobular fibrosis.
HRCT diagnoses and histological diagnoses
Of the 91 included patients with available histological diagnosis, 27 with non-idiopathic interstitial lung diseases were excluded from further analyses. In the remaining 64 patients a composite reference diagnosis was established. Median time difference between HRCT and lung biopsy in these patients was 1 month (range 0–40 months). The group included 41 patients with IPF. Based on the composite reference standard the overall sensitivity, specificity, accuracy, positive predictive value, and negative predictive value of the CT diagnosis of UIP were 63%, 96%, 75%, 96%, and 59%, respectively. One patient with the final composite diagnosis of respiratory bronchiolitis-associated interstitial lung disease (RB-ILD)/desquamative interstitial pneumonia (DIP) was the only CT-diagnosed UIP false-positive patient.
Based on the composite reference standard, the overall sensitivity, specificity, accuracy, positive predictive value, and negative predictive value of the histological diagnosis of UIP were 73%, 74%, 73%, 83%, and 61%, respectively. Of the six patients with a false-positive biopsy diagnosis of UIP, four had NSIP, one had RB-ILD/DIP, and one was given the final composite diagnosis of chronic hypersensitivity pneumonitis.
The consensus CT reading assigned an overall correct diagnosis in 37 (58%) of 64 readings, including 26 (63%) readings in patients with UIP. The histological consensus diagnosis was correct in 34 (53%) of 64 readings, including 30 (73%) cases of UIP.
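As a worked check of the accuracy figures reported above, the following minimal sketch recomputes them from 2 × 2 counts. The cell counts are back-calculated from the reported percentages and group sizes (41 patients with UIP and 23 without, 26 and 30 correctly identified UIP cases, one false-positive CT diagnosis, and six false-positive biopsy diagnoses); they are an assumption rather than tabulated study data, and the class and variable names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts of a diagnostic test against the composite reference standard."""
    tp: int  # test says UIP, reference says UIP
    fp: int  # test says UIP, reference says not UIP
    fn: int  # test says not UIP, reference says UIP
    tn: int  # test says not UIP, reference says not UIP

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fp + self.fn + self.tn)

    @property
    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    @property
    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)

    def as_percentages(self) -> tuple:
        """Sensitivity, specificity, accuracy, PPV, NPV as rounded percentages."""
        values = (self.sensitivity, self.specificity, self.accuracy, self.ppv, self.npv)
        return tuple(round(100 * v) for v in values)


# Back-calculated counts (assumption, see lead-in): 41 UIP and 23 non-UIP patients.
ct = TwoByTwo(tp=26, fp=1, fn=15, tn=22)
histology = TwoByTwo(tp=30, fp=6, fn=11, tn=17)

ct_percentages = ct.as_percentages()
histology_percentages = histology.as_percentages()
```

Under these counts the rounded values agree with the reported 63%, 96%, 75%, 96%, and 59% for CT and 73%, 74%, 73%, 83%, and 61% for histology.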
Thin-section CT findings, comparison between UIP and non-UIP patients
Univariate analysis showed no significant difference in the overall extent of abnormalities detected on CT images between patients with UIP and patients in the group with other IIPs (Table 2). Patients with UIP had a larger extent of reticular opacities (P = 0.002) and traction bronchiectasis (P < 0.001) than did patients in the other group. However, there was no significant difference in the proportion of reticular pattern grade 1 (fine intralobular fibrosis without evident cysts) between patients in the two groups. Patients with UIP had a smaller extent of ground-glass opacities and nodules than patients with IIPs other than UIP (P < 0.001).
Table 2. HRCT characteristics of UIP and idiopathic interstitial pneumonias other than UIP
Data are mean score of involvement ± standard deviation
Reticular pattern score includes grade 1, grade 2, and grade 3
The univariate analyses indicated that airspace consolidation and interlobular septal thickening were not associated with UIP. Thus, the following variables were included in the logistic model: ground-glass opacities, reticular pattern, nodules, and bronchiectasis. The results of these analyses are presented in Table 3. The CT feature that best differentiated UIP from IIPs other than UIP was the extent of reticular pattern (odds ratio 5.1). With two exceptions, all patients with the composite consensus diagnosis of UIP had a reticular pattern grade 2 and 3 sum score of five or more. Neither of these two patients was given a first-choice CT diagnosis of UIP. They died after 14 and 7 years of observation, at 68 and 50 years of age, respectively.
Table 3. Odds ratio (OR) of histologically verified usual interstitial pneumonia with corresponding 95% confidence interval (95% CI) and P values by different CT findings dichotomized at the upper tertile using logistic regression
Nodules and ground-glass opacities were negatively associated with a diagnosis of UIP, in agreement with the crude analyses (Table 2).
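To make the odds-ratio scale concrete, the sketch below computes a crude (unadjusted) odds ratio with a Woolf-type 95% confidence interval for a single dichotomized CT finding. It is a simplified illustration only: the odds ratios reported in the study come from a multivariate logistic model, whereas this calculation uses one 2 × 2 table, and the counts and function name are hypothetical and not study data.

```python
import math
from typing import Tuple


def crude_odds_ratio(
    high_uip: int, high_other: int, low_uip: int, low_other: int, z: float = 1.96
) -> Tuple[float, float, float]:
    """Odds ratio of UIP for a high versus low score, with a Woolf confidence interval.

    Returns (odds_ratio, lower_limit, upper_limit). All four cells must be
    positive, because the interval relies on the logarithm of the odds ratio.
    """
    cells = (high_uip, high_other, low_uip, low_other)
    if min(cells) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (high_uip * low_other) / (high_other * low_uip)
    # Standard error of log(OR): square root of the summed reciprocal cell counts.
    se_log_or = math.sqrt(sum(1.0 / c for c in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: finding above vs. below the cut-off, in UIP vs. other IIPs.
example_or, example_lower, example_upper = crude_odds_ratio(
    high_uip=20, high_other=5, low_uip=10, low_other=15
)
```

An odds ratio above 1 with a confidence interval excluding 1 indicates that a high score is associated with UIP; a value below 1, as for nodules and ground-glass opacities, indicates a negative association.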
Discussion
A continuing source of bias in observational studies is the strength of the ‘gold standard’. Indeed, comparative studies of diagnostic tests without a benchmark that is regarded as definitive are challenging. In the present study we attempted to overcome the absence of a robust gold standard at baseline by exploiting the retrospective design of the study to establish a ‘composite’ reference standard. A multidisciplinary team was presented with all clinical, radiological, and histopathologic information and was able to establish a retrospective diagnosis, which, because it draws on follow-up data, is likely to be more accurate than a diagnosis given at baseline.
In the present study HRCT findings were analyzed in order to determine the most distinguishing findings for the diagnosis of UIP. It is generally agreed that in many cases the diagnosis of UIP is based on clinical and imaging features, without the need for lung biopsy (6, 20), and several studies have documented a good positive predictive value of the HRCT diagnosis of UIP (1, 8, 20, 21). However, these studies are biased because they included only patients with a biopsy-proven diagnosis, and histopathologic evaluation has traditionally been regarded as a robust gold standard. CT appearances of UIP may overlap with those of other IIPs, and in those cases the diagnosis can only be made with the aid of lung biopsy. The inherent problem of interstitial lung diseases is, however, their histopathologic variability, including the variation between stages of the disease and the variable appearance in different lung segments. Significant inter-observer disagreement remains even among expert pathologists, in particular in the distinction between UIP and NSIP, with more than 50% inter-observer variation in the diagnosis of NSIP (9, 22, 23). Shin et al. reported a discordance rate of 36%, with fair-to-good agreement for the diagnosis of UIP and fibrotic NSIP (24). In contrast to HRCT, only small pieces of lung tissue are used for making histopathologic diagnoses. Even when multiple biopsies are taken from the same lobe, variation in the histopathologic pattern between different IIPs may be seen (24). Consequently, it can be questioned whether histopathologic findings alone can be regarded as a robust reference standard. The main message of our study is that HRCT diagnosis of UIP at baseline is more accurate and has a better positive predictive value than histological diagnosis.
This conclusion is supported by survival analyses which showed that HRCT diagnosis of UIP appears to be the strongest independent predictor of mortality, while a histological diagnosis of UIP is not a significant predictive marker (data not shown). In clinical practice multidisciplinary teams composed of pulmonologists, pathologists, and radiologists frequently make the final diagnosis of interstitial lung diseases (25). However, this approach may be insufficient because a confident diagnosis of IIP is difficult in patients who do not show all typical clinical, radiological, and histopathologic features at baseline, and the diagnosis may be altered according to clinical follow-up such as treatment response and mortality. Furthermore, initial CT findings may change over time; one study showed that 28% of the patients with initial CT findings suggestive of NSIP progressed to findings suggestive of UIP at follow-up of 3 years or longer (19).
For these reasons we established a retrospective composite reference standard to make the final diagnosis of interstitial lung disease. This method allowed us to determine the value of radiological findings at presentation by retrospectively taking into consideration also clinical follow-up and mortality. To our knowledge, this method has not been used in similar studies, but we believe it is more accurate than the commonly used baseline reference standard. For study purposes a combination of clinical and histopathologic data as the reference standard has been reported in other contexts (17, 24, 26).
The most useful thin-section CT finding in this study when differentiating between UIP and other IIPs was the greater extent of coarse fibrosis (reticular pattern grade 2 and 3) in the cases of UIP. This is in accordance with previous studies in which honeycombing was shown to be a core finding when distinguishing UIP from other interstitial lung diseases (6, 7, 19). NSIP is characterized by a predominant pattern of ground-glass opacities with minimal to no honeycombing (19, 27). However, a proportion of patients with fibrotic NSIP have HRCT features that resemble those of UIP, and NSIP may be difficult to distinguish from UIP (15, 27). This view is consistent with the conclusions of Silva et al. (19) in a study of changes in the disease pattern of NSIP over time. In that study, no initial CT features suggestive of NSIP allowed distinction between patients who progressed to a UIP pattern and those who maintained an NSIP pattern.
In our study, UIP was characterized by a lower proportion of ground-glass opacities than other ILDs. The extent of ground-glass opacities was in addition an independent variable that could help predict survival in patients with UIP (data not shown). These findings concur with the understanding that ground-glass opacities are reversible and suggest a better prognosis. This finding differs, however, from those of two recent studies in which the extent of ground-glass opacities did not have prognostic value (26, 28). A possible explanation for the discrepancy between the findings of the two previous studies and those of our study is the considerable overlap in HRCT features between UIP and NSIP, and the better prognosis of NSIP compared with UIP (17). Therefore, cases of NSIP might have been misclassified as UIP.
In our study, as in most other studies, the extent of various findings was subjectively and qualitatively evaluated by the readers. However, substantial inter-observer and intra-observer variation has been reported (29). In the future, automated systems for quantification and discrimination of different interstitial lung diseases may be useful. In a recent paper, an automated system based on specific HRCT features was evaluated (30). The authors concluded that their system was able to discriminate UIP from NSIP, and that it may be used for objective and reproducible assessment of regional disease severity.
Our study had limitations. First, it was retrospective. We believe, however, that we benefited from the long median follow-up time and the retrospective design by establishing a composite reference standard for which all histopathologic, clinical, and radiological follow-up data were available for evaluation. Second, the patients underwent CT examinations with different scanners. However, the scanning technique was comparable, and it is unlikely that our findings were distorted.
The study is biased towards patients with atypical CT findings because, at our institution, surgical biopsy is seldom required as part of the routine clinical work-up in patients with an evident diagnosis (including UIP) on CT images. Consequently, such patients are rarely included in our study. Another limitation is that relatively few patients with idiopathic interstitial pneumonias other than UIP were included, and statistical analyses between UIP and NSIP or other subgroups of IIPs could not be performed.
In conclusion, in contrast to previous reports in which thin-section CT findings were compared with histopathology on the assumption that lung biopsy is a robust reference diagnostic tool, a composite reference standard enabled us to retrospectively evaluate the power of both CT features and histopathologic findings as diagnostic tools at baseline. Given the results of our study, CT diagnosis of UIP at presentation is more accurate than histological diagnosis, although the histological diagnosis is more sensitive. Based on our findings, surgical lung biopsy may not be warranted in patients with a well-founded thin-section CT diagnosis of UIP.
