Published ahead of print on December 11, 2003, doi:10.1164/rccm.200307-1018OC
© 2004 American Thoracic Society
The Predictive Value of Bronchiolitis Obliterans Syndrome Stage 0-p
Division of Pulmonary and Critical Care Medicine; and Division of Cardiothoracic Surgery, Washington University School of Medicine, St. Louis, Missouri
Correspondence and requests for reprints should be addressed to Ramsey R. Hachem, M.D., Division of Pulmonary and Critical Care Medicine, Washington University School of Medicine (Campus Box 8052), 660 South Euclid Avenue, St. Louis, MO 63110. E-mail: rhachem{at}im.wustl.edu
Bronchiolitis obliterans syndrome (BOS) remains the main cause of graft loss after lung transplantation. Stage 0-p was recently added to the staging criteria to detect early deterioration in allograft function that might presage BOS stage 1. We assessed the predictive value of stage 0-p by retrospectively analyzing spirometric data for 203 adult bilateral lung transplant recipients. The FEV1 criterion for stage 0-p had a positive predictive value of 79% and a negative predictive value of 82%. In contrast, the FEF25-75% criterion for stage 0-p had a positive predictive value of 52% and a negative predictive value of 72%. Fifty-seven percent of subjects who developed stage 0-p by the FEV1 criterion progressed to stage 1 within 1 year, whereas only 37% of those who developed stage 0-p by the FEF25-75% criterion progressed to stage 1 within 1 year. We conclude that the FEV1 criterion for stage 0-p is a reasonable predictor of BOS stage 1 after bilateral lung transplantation, but the FEF25-75% criterion for stage 0-p is not.
Key Words: lung transplantation; chronic rejection; diagnosis; spirometry
Obliterative bronchiolitis is the histologic hallmark of chronic allograft dysfunction after lung transplantation. Because histologic confirmation of obliterative bronchiolitis is difficult, bronchiolitis obliterans syndrome (BOS), defined by spirometric changes, is the clinical surrogate. Within 5 years after transplantation, the prevalence of BOS approaches 50% (1-4). Because of a marginal response to therapy, BOS has emerged as the leading cause of late mortality and the major obstacle to long-term success after lung transplantation (3, 4).
In the original BOS classification system, the average of the two highest FEV1 measurements after transplantation was defined as the baseline FEV1, and a persistent decrease of 20% or more in FEV1 from this baseline was defined as BOS stage 1 (5). Progressive stages of BOS were defined according to the magnitude of decrease from the baseline FEV1. Midexpiratory flow rates were not used in the original staging system because of wide intrasubject variability. New staging criteria were recently proposed to detect early changes in allograft function that might foretell the onset of BOS stage 1 (6). A potential BOS stage (0-p) was defined as a 10 to 19% decrease in FEV1 or a 25% or more decrease in FEF25-75% from baseline (6). The baseline FEF25-75% is defined as the average of the two highest FEF25-75% measurements obtained after transplantation. The validity of this modification to the staging system has not been examined. The purpose of this study was to determine the predictive value and operating characteristics of BOS stage 0-p among bilateral lung transplant recipients. Some of the results of this study have been previously reported in the form of an abstract (7).
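The staging thresholds described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the software used in the study: the function names (`baseline`, `percent_decline`, `classify`) and the returned labels are hypothetical, and the requirement that a decline be persistent is not modeled.

```python
def baseline(values):
    """Baseline as defined in the text: mean of the two highest
    post-transplant measurements (FEV1 or FEF25-75%)."""
    if len(values) < 2:
        raise ValueError("need at least two measurements")
    top_two = sorted(values, reverse=True)[:2]
    return sum(top_two) / 2


def percent_decline(current, base):
    """Percent decrease of the current value relative to baseline."""
    return 100.0 * (base - current) / base


def classify(fev1, fev1_base, fef, fef_base):
    """Apply the thresholds quoted in the text to a single visit.

    Labels are hypothetical; persistence of the decline is NOT checked.
    """
    fev1_drop = percent_decline(fev1, fev1_base)
    fef_drop = percent_decline(fef, fef_base)
    if fev1_drop >= 20:
        return "stage 1 or higher"   # 20% or more decrease in FEV1
    if fev1_drop >= 10 or fef_drop >= 25:
        return "stage 0-p"           # 10 to 19% FEV1 or >= 25% FEF25-75%
    return "neither"
```

For example, `classify(3.5, 4.0, 4.0, 4.0)` corresponds to a 12.5% decline in FEV1 with no change in FEF25-75%, which meets the FEV1 criterion for stage 0-p.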
Design
A retrospective cohort study design was used to assess the predictive values of the different criteria for BOS stage 0-p. Pulmonary function test files, lung biopsy results, and microbiologic data were reviewed, and the onset of BOS stage 0-p and stage 1 was determined. The protocol was approved by the Washington University School of Medicine Institutional Review Board for human studies.
Patient Population
Pulmonary Function Testing
All spirometry was performed according to the American Thoracic Society guidelines in the Washington University School of Medicine pulmonary function laboratory during the first 3 months after transplantation. Thereafter, patients who did not live locally had spirometry performed elsewhere, and decrements in lung function obtained at outside pulmonary function laboratories were repeated in our laboratory.
Variables
The sensitivity, specificity, positive predictive value, and negative predictive value of stage 0-p by each criterion were calculated. Patients who developed BOS stage 0-p but had less than 1 year of follow-up thereafter were excluded from the sensitivity, specificity, and predictive value analysis.
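The four operating characteristics named above all derive from one two-by-two table (stage 0-p reached or not, against BOS stage 1 reached or not). The sketch below shows the arithmetic; the class name `TwoByTwo` is hypothetical, and the counts in the usage note are made up for illustration and are not the study's data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from a two-by-two table of test result vs. outcome."""
    tp: int  # stage 0-p reached, BOS stage 1 developed
    fp: int  # stage 0-p reached, BOS stage 1 did not develop
    fn: int  # stage 0-p not reached, BOS stage 1 developed
    tn: int  # stage 0-p not reached, BOS stage 1 did not develop

    @property
    def sensitivity(self):
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self):
        return self.tn / (self.tn + self.fp)

    @property
    def ppv(self):
        """Positive predictive value."""
        return self.tp / (self.tp + self.fp)

    @property
    def npv(self):
        """Negative predictive value."""
        return self.tn / (self.tn + self.fn)
```

With the illustrative counts `TwoByTwo(tp=30, fp=10, fn=10, tn=50)`, sensitivity and positive predictive value are both 30/40, and specificity and negative predictive value are both 50/60.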
Statistical Analysis
Freedom from BOS stage 1 was determined after the onset of stage 0-p by each criterion using the Kaplan-Meier method. Patients who developed stage 0-p but did not progress to stage 1 were censored at the time of their last spirometry. Kaplan-Meier curves were compared using the log rank test. Statistical analysis was conducted using SPSS (version 11.0 for Windows; SPSS, Chicago, IL).
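For readers unfamiliar with the Kaplan-Meier method, the product-limit estimate with right censoring can be sketched as follows. The study itself used SPSS; this pure-Python version is only an illustration, the function names `kaplan_meier` and `median_time` are hypothetical, and the log rank comparison is not implemented here.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of freedom from an event.

    times  -- follow-up time for each patient (e.g. years after stage 0-p)
    events -- 1 if the event (e.g. BOS stage 1) occurred at that time,
              0 if the patient was censored at that time
    Returns a list of (time, estimated freedom from event) pairs,
    one per distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients whose follow-up ends at the same time.
        while i < len(data) and data[i][0] == t:
            n_events += int(bool(data[i][1]))
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving  # events and censored patients leave the risk set
    return curve


def median_time(curve):
    """First time at which estimated freedom from the event is <= 50%,
    or None if the curve never falls that far."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None
```

Censored patients contribute to the number at risk until their last observation but never count as events, which is how patients who did not progress to stage 1 are handled in the analysis described above.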
Follow-up was completed through June 30, 2002. The mean duration of observation per patient was 2.7 ± 1.7 years (median of 2.3 years), and the study included 548 person-years of follow-up. We reviewed 6,090 spirometry measurements; the mean number of spirometry measurements per patient was 30 ± 10 (median of 28; range of 15-65). The values of the FEF25-75% and the modified FEF25-75% at baseline and at stage 0-p were normally distributed. The times to baseline and to stage 0-p for both criteria were skewed to the right. The baseline FEF25-75% was higher than the baseline modified FEF25-75% (4.6 ± 1.7 vs. 3.88 ± 1.44 L/second, respectively; p < 0.0005) and occurred earlier after transplantation (median 62 vs. 264 days, respectively; p < 0.0005) (Table 1). Likewise, the FEF25-75% was higher than the modified FEF25-75% at the onset of stage 0-p (3.0 ± 1.3 vs. 2.45 ± 1.08 L/second, respectively; p = 0.002). Stage 0-p occurred earlier by the FEF25-75% criterion than by the modified FEF25-75% criterion (median 335 vs. 486 days, respectively; p = 0.003).
Freedom from BOS stage 1 is shown in Figure 2; the median time from transplantation to the onset of stage 1 was 5.3 years. Our results showed that the FEV1 criterion for stage 0-p had good performance characteristics among bilateral recipients (Table 2; corresponding two-by-two tables are shown in Figure E1 in the online supplement), and it was a reasonable predictor of BOS stage 1. Both the positive predictive value and the negative predictive value were relatively high, and thus, the false-positive and false-negative rates were low. However, the sensitivity was only fair (74%), and approximately one-fourth of recipients destined to develop BOS stage 1 were not prospectively identified by this criterion. The FEF25-75% criterion for stage 0-p had a slightly better sensitivity (78%), but a much higher false-positive rate diminished its specificity and positive predictive value. Our modified FEF25-75% criterion for stage 0-p had operating characteristics that were very similar to those of the FEV1 criterion. Combining the FEV1 criterion for stage 0-p with either the standard or modified FEF25-75% criterion increased the sensitivity slightly, but the combinations essentially reflected the operating characteristics of their components.
Freedom from BOS stage 1 after the onset of stage 0-p by each criterion is shown in Figure 3. The median time from the onset of stage 0-p to the onset of BOS stage 1 was longer for the FEF25-75% criterion than for the FEV1 and the modified FEF25-75% criteria (2.86 vs. 0.65 and 0.7 years, respectively). In fact, 63% and 62% of patients developed BOS stage 1 within 2 years of developing stage 0-p by the FEV1 and the modified FEF25-75% criteria, respectively; in contrast, only 41% of patients developed BOS stage 1 within 2 years of developing stage 0-p by the FEF25-75% criterion.
We retrospectively analyzed spirometric data for 203 adult bilateral lung transplant recipients to determine the predictive value of BOS stage 0-p. Our results suggest that the FEV1 criterion for stage 0-p has reasonable operating characteristics, but its sensitivity is only fair. Furthermore, the FEF25-75% criterion for stage 0-p has unacceptably low specificity and positive predictive value. Finally, although the characteristics of the modified FEF25-75% criterion parallel those of the FEV1 criterion, the combination of either the FEV1 or the modified FEF25-75% criterion has the best overall performance.
Screening tests for the early diagnosis of disease are often imperfect and invariably yield some false-positive and false-negative results. An ideal test would be both highly sensitive and highly specific. Unfortunately, this is not often possible when clinical data are distributed over a range of values, and an arbitrary threshold distinguishes health from disease. In such situations, sensitivity can only be increased at the expense of specificity and vice versa. This is evident in our data; the FEF25-75% criterion for stage 0-p has a higher sensitivity than the FEV1 criterion but a significantly lower specificity. Consequently, a compromise must be reached, and the criterion with the best overall operating characteristics should be used clinically. Unfortunately, the operating characteristics of all criteria for stage 0-p are far from ideal, but the FEV1 criterion has an adequate overall performance. Furthermore, the majority of patients who developed stage 0-p by the FEV1 criterion developed BOS stage 1 within the following year.
Midexpiratory flow rates were not used to categorize BOS in the original classification system because of wide intrasubject variability and the very high measurements obtained early after transplantation among bilateral recipients (5). The physiology behind the very high early FEF25-75% measurements is unclear.
To our knowledge, this phenomenon has not been studied previously; nevertheless, it is not an infrequent occurrence (6). These changes are typically seen early after transplantation when the FVC and the FEV1 are still reduced in a pattern consistent with a restrictive ventilatory defect without radiographic or histologic evidence of parenchymal lung disease. They are possibly related to healing of the bilateral thoracotomies, resulting in restriction of the chest wall and increased expiratory airflow. Alternatively, the high FEF25-75% measurements may result from an occult interstitial pulmonary process that increases the force pulling outward on airway walls, thus increasing airway diameter and airflow, as is sometimes seen early in the course of interstitial lung disease (8, 9). Nonetheless, we hypothesized that these very high early measurements would reduce the predictive value of a decline in FEF25-75% because they would be used to define baseline lung function. Thus, we also tested a modified midexpiratory flow rate criterion that defined the baseline value of FEF25-75% as the average of the two measurements obtained with the two highest FEV1 measurements. The basis for this definition is that the two highest FEV1 measurements represent a recipient's best graft function, and concurrent measurements of FEF25-75% would be more appropriate. Furthermore, this avoids the problem of using early, sometimes supranormal, midexpiratory flow rates that would overestimate the true baseline FEF25-75%. Indeed, our results suggest that the standard FEF25-75% criterion for stage 0-p lacks specificity (44%) and has a low positive predictive value (52%). Furthermore, only one-third of patients who developed stage 0-p by the FEF25-75% criterion developed BOS stage 1 within the following year. The modified FEF25-75% criterion was more specific, albeit less sensitive.
Nonetheless, approximately two-thirds of patients who developed stage 0-p by the modified FEF25-75% criterion developed BOS stage 1 within the following year. Overall, the predictive value of the modified FEF25-75% criterion parallels that of the FEV1 criterion. This is not unexpected because the two criteria are linked temporally, but it calls the utility of a midexpiratory flow rate criterion into question. Nonetheless, there were patients who developed stage 0-p by the modified FEF25-75% criterion and subsequently BOS stage 1 without developing stage 0-p by the FEV1 criterion. This is highlighted by the improvement in the sensitivity of stage 0-p when the combination of either the FEV1 or the modified FEF25-75% criterion is used rather than either criterion alone.
There are some potential limitations of this study. First, variability in the frequency of pulmonary function testing may have influenced the sensitivity and negative predictive value of BOS stage 0-p. However, patients who had gaps of greater than 12 weeks between spirometry measurements were excluded from the study. Closer follow-up of lung function may detect significant changes earlier; however, this approach is not always attainable in clinical practice. Second, the use of spirometry values from different pulmonary function laboratories may have introduced bias. However, decrements in FEV1 or FEF25-75% obtained at outside pulmonary function laboratories were confirmed at our center. Finally, augmentation of immunosuppression in response to a decline in FEV1 or FEF25-75% may have altered the progression to BOS stage 1 and thus limited the specificity and positive predictive value of stage 0-p. It has not been our practice to augment immunosuppression solely in response to a decline in FEF25-75%; however, we have augmented immunosuppression in the past when the FEV1 declined by 10-15% from baseline.
Unfortunately, this limitation is intrinsic to a retrospective study, and a prospective study randomizing patients with declining lung function to augmented immunosuppression versus observation is not feasible because of the risk of irreversible loss of lung function.
Previous studies have suggested that the FEF25-75% may be a more sensitive marker than the FEV1 for the early detection of obliterative bronchiolitis (10-12). Patterson and colleagues showed that a decline in FEF25-75% occurred 112 days before a 20% decline in FEV1 (10). However, these investigators used a different definition of baseline from the recently proposed International Society for Heart and Lung Transplantation guidelines and set the threshold of a significant decline in FEF25-75% to less than 70% of baseline. Likewise, Reynaud-Gaubert and colleagues showed that a decline in FEF25-75% to 70% or less of baseline occurred significantly earlier than the decline in FEV1 (11). Most recently, Nathan and colleagues studied the predictive value of a decline in FEF25-75% as defined by the new International Society for Heart and Lung Transplantation guidelines in a cohort of single-lung recipients (12). The FEF25-75% criterion for stage 0-p had a sensitivity of 80% and a specificity of 82.6%. The authors further noted that for single-lung recipients with obstructive lung disease, the specificity of this criterion was 69.2% and the sensitivity was 91.7%, whereas for those with restrictive lung disease, the specificity was 100% and the sensitivity was 62.5%. We did not include single-lung recipients in this study because of the relatively small sample size (n = 55) and because we were concerned that combining single-lung recipients with underlying obstructive and restrictive lung disease may interfere with the operating characteristics of stage 0-p.
When spirometry measurements are taken months apart in normal subjects, changes are considered significant if they exceed 12% for the FEV1 and 21% for the FEF25-75% (13, 14). Martinez and colleagues evaluated the normal variability of spirometry measurements in stable lung transplant recipients; they found that the FEV1 varied by 12.1% and 13.1% among bilateral and single-lung recipients, respectively, and the FEF25-75% varied by 24.7% and 29.1% among bilateral and single-lung recipients, respectively (15). Because this normal variability is comparable in size to the thresholds that define stage 0-p, these findings help explain the less than ideal predictive value of stage 0-p that we report in this study. Furthermore, because of the large number and cross-sectional area of terminal and respiratory bronchioles, these small airways contribute little to airflow resistance, and a large proportion may be damaged before spirometry becomes impaired (16, 17). Thus, changes in spirometry may be late findings in the course of obliterative bronchiolitis, and other surrogate markers, such as exhaled nitric oxide (18) and indices of ventilation distribution (17), may prove more sensitive for the early detection of graft dysfunction.
In summary, the FEV1 criterion for stage 0-p has fair operating characteristics, but the FEF25-75% criterion's test characteristics limit its clinical utility. This is likely explained, at least in part, by supranormal midexpiratory flow rate measurements obtained early after transplantation. Finally, the combination of either the FEV1 or the modified FEF25-75% criterion has slightly better operating characteristics than the FEV1 criterion alone.
This article has an online supplement, which is accessible from this issue's table of contents online at www.atsjournals.org
Conflict of Interest Statement: R.R.H. has no declared conflict of interest; M.M.C. has no declared conflict of interest; R.D.Y. has no declared conflict of interest; J.P.L. has no declared conflict of interest; A.A.A. has no declared conflict of interest; G.A.P. has no declared conflict of interest; E.P.T. has no declared conflict of interest.
Received in original form July 23, 2003; accepted in final form December 5, 2003