ABSTRACT
Since 1951, the tuberculin PPD-S1 has been used to standardize commercial PPD reagents and perform special tuberculin surveys. PPD-S1 is now in short supply and a new standard (PPD-S2) has been manufactured. To determine if PPD-S2 is equivalent to and can replace PPD-S1, we conducted a double-blind clinical trial. Between May 14 and October 28, 1997, 69 subjects with a history of culture-proven tuberculosis (TB patients) and 1,189 subjects with a very low risk for TB infection were enrolled, received four skin tests (with PPD-S1, PPD-S2, and one each of the commercially available PPDs), and had reactions measured by two trained observers. Among the TB patients, we found statistically indistinguishable immunogenicity (mean reaction size ± standard deviation): 15.6 ± 6.6 mm for PPD-S1 and 14.8 ± 5.6 mm for PPD-S2. Among low-risk subjects, the tests had equally high specificities (PPD-S1, 98.7% and PPD-S2, 98.5%), using a 10-mm cutoff. The number of discordant (negative versus positive) interpretations for PPD-S2, assuming that low-risk subjects who had a ≥ 10 mm reaction to PPD-S1 were truly infected, was low (0.5%) and indistinguishable from the rate of discordant interpretations of the same test when read by two different observers (0.8%). The study results indicate that PPD-S2 is qualified to be used as the new U.S. reference standard for PPD tuberculin.

INTRODUCTION
In the United States, 10 to 15 million persons are estimated to have latent infection with Mycobacterium tuberculosis (1). Every year a large proportion of the country's active, potentially infectious tuberculosis (TB) cases originate from this pool of latently infected persons (2). Treating latent infection substantially reduces the risk that the TB infection will progress to disease. At present, the tuberculin skin test is the only widely used method for diagnosing TB infection. Therefore, a critical national TB control strategy depends on the availability of a tuberculin skin test that is reliable and accurate. The tuberculin skin test is also an important component of epidemiologic studies to evaluate the prevalence of latent TB infection in various populations (3, 4). In addition, tuberculin testing is used as a diagnostic aid for patients with active TB disease but without a positive culture for M. tuberculosis.
Tuberculin skin testing involves the intracutaneous injection of 5 tuberculin units (TU) of purified protein derivative (PPD) by the Mantoux technique (5). The reference standard tuberculin (PPD-S1) used in the United States was prepared in 1941 (6) and adopted in 1951 by the World Health Organization Expert Committee on Biological Standardization (7, 8). The Food and Drug Administration (FDA) is responsible for storing tuberculin standard and releasing it for use. Master batches of commercial PPD are standardized against PPD-S1 by comparative testing in human populations, using some subjects known to be infected and others presumed to be uninfected with M. tuberculosis (8). Individual lots of commercial PPD are subsequently standardized against PPD-S1 using guinea pig potency tests.
Within the last 10 yr, the implementation of tuberculin skin testing programs for screening populations at high risk for latent tuberculosis infection has become common practice and the demand for commercial PPD reagents has increased in the United States. Because there are only a few vials of PPD-S1 remaining, a new standard, PPD-S2, has been manufactured for use when the supply of PPD-S1 is depleted. We compared the immunogenicity and specificity of PPD-S1 and PPD-S2 in a randomized, double-blind clinical trial, to determine if PPD-S2 is equivalent to and can replace PPD-S1.

METHODS
Study Participants and Skin Testing Methods
Study participants were recruited by investigators from six health departments in Denver, Colorado, Marion County, Indiana, and Seattle-King County, Washington, and universities in Atlanta, Georgia, San Diego, California, and Tucson, Arizona. The study protocol was approved by the human subject review committees of each study site, the Centers for Disease Control and Prevention, and the FDA. The study had two objectives: first, to compare the specificity of the two commercial reagents against the "gold standard," PPD-S1 [results reported in prior publication (9)]; and second, to assess the bioequivalency of PPD-S1 and PPD-S2, the subject of the present report. The recruitment, evaluation, and skin testing methods have been previously described (9). In brief, we studied two populations of subjects: persons known to be infected with M. tuberculosis, and persons presumed uninfected with M. tuberculosis. The subjects in the first group (TB patients) all had a history of culture-positive TB disease (within 5 yr of the study) and at least 2 mo of therapy with good clinical response. The presumed uninfected subjects (low-risk group) all had a negative history for risk factors for TB, defined as being born and living mostly in the United States or Canada after 1947, no prior positive tuberculin skin test, no history of illness compatible with TB or immunization with bacillus Calmette-Guérin (BCG), and no known exposure to persons or places associated with a high likelihood of TB transmission (work in hospitals, homeless shelters, prisons, drug treatment units). We excluded from both study groups any person known to have a condition that could suppress delayed-type hypersensitivity (human immunodeficiency virus infection, cytotoxic chemotherapy, systemic corticosteroid therapy, or recent live viral vaccination).
Each study subject was given four simultaneous skin tests (two on each forearm). The sites where each reagent was injected (e.g., left or right arm, upper or lower position) were determined by randomization lists prepared separately for the two study groups. The skin test reagents used were the two tuberculin standards (PPD-S1 and PPD-S2), provided by the FDA, and the two commercial reagents (Tubersol [lot nos 2443-11 and 2458-11] and Aplisol [lot nos 01206p and 00417p]), donated by the manufacturers. Two observers, each blinded to the identity of the test reagents as well as to the other observer's readings, read the skin-test results 48 to 72 h after the testing.
Sample Size Determinations
We evaluated the immunogenicity (i.e., mean reaction sizes and distribution of reaction sizes) of the two tuberculin standards in the group of TB patients. In a previous study (10), the mean reaction size to tuberculin skin testing in such patients was reported as 15 mm (standard deviation [SD] ± 4 mm). Using these parameters, a sample size (11) of 64 TB patients was required to detect a 2-mm difference in mean reaction sizes between PPD-S2 and PPD-S1, with 80% power and 95% certainty. We evaluated test specificity (i.e., the percentage of persons believed uninfected who have a negative test result) in the low-risk group. Assuming that the false-positivity rate of the tuberculin skin test is 4% (12, 13), a sample size (11) of 1,146 low-risk subjects was required to detect a 2% difference in the rates of false-positivity of PPD-S2 and PPD-S1, with 80% power and 95% certainty.
Data Analysis Methods
We used analysis of variance (ANOVA) and Student's t tests adjusted for multiple comparisons (14, 15) to assess the skin test reaction sizes (immunogenicity) observed after testing TB patients with different tuberculin reagents. Because the results obtained after testing the low-risk subjects were not normally distributed, we used nonparametric ANOVA and Friedman rank tests, adjusted for multiple comparisons (14, 15), to assess the reaction sizes (inter-reagent variability) among those subjects who had at least one result measured as > 0 mm after testing with any of the reagents used. Using the results obtained from testing TB patients, we also estimated the test sensitivities (i.e., percentage of subjects who had positive [≥ 5 mm] reactions) after testing with the tuberculin standards, and compared them with the test sensitivities of the commercial reagents.
To calculate test specificity, we used the false-positive rate observed after testing the low-risk study group. Because the recommended reaction-size cutoff for defining a tuberculin skin test result as positive varies according to the characteristics of the population tested (1, 5), we used two different definitions of a positive result, ≥ 10 mm and ≥ 15 mm, to simulate field scenarios. For the low-risk group, we also compared the number of results interpreted as positive after testing with one of the standard reagents but not with the other (i.e., discordant results), using 10-mm and 15-mm cutoffs.
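The discordance count described here can be illustrated with a short sketch. The function name and the reaction sizes below are hypothetical and are not study data; a result is treated as positive when the reaction size is at or above the cutoff.

```python
def discordant_counts(sizes_a, sizes_b, cutoff_mm):
    """Count paired results positive with reagent A only and with reagent B only."""
    if len(sizes_a) != len(sizes_b):
        raise ValueError("paired readings are required")
    a_only = sum(a >= cutoff_mm and b < cutoff_mm for a, b in zip(sizes_a, sizes_b))
    b_only = sum(a < cutoff_mm and b >= cutoff_mm for a, b in zip(sizes_a, sizes_b))
    return a_only, b_only


# Hypothetical reaction sizes (mm) for six subjects, one entry per subject.
ppd_s1 = [0.0, 11.5, 9.0, 16.0, 3.0, 10.0]
ppd_s2 = [0.0, 9.5, 12.0, 15.5, 0.0, 10.5]

for cutoff in (10, 15):
    print(cutoff, discordant_counts(ppd_s1, ppd_s2, cutoff))
```

Reporting the two directions separately keeps "positive with PPD-S1 only" distinct from "positive with PPD-S2 only", which is how the discordant results are later discussed.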
When assessing differences between two skin test reagents, it is necessary to take two known potential variations into account: the clinician variations in reading the skin test results (interobserver variability) and the geographic variation in the prevalence of small (< 10 mm) tuberculin cross-reactions that appear to be related to the prevalence of certain nontuberculous mycobacteria (NTM) in the environment (geographic variability). We evaluated the interobserver variability in two ways. First, for each standard reagent, we calculated the interobserver agreement and the kappa (κ) statistic (16, 17) for the reaction size results recorded by two observers, grouped into three categories (0-4 mm, 5-9 mm, and ≥ 10 mm) for the low-risk group, or into two categories (< 5 mm and ≥ 5 mm or < 15 mm and ≥ 15 mm) for the TB patient group. Second, for the low-risk group, we compared how many results were interpreted as positive by one observer but not by the other observer (i.e., discordant results), using 10-mm and 15-mm cutoffs. We evaluated the geographic variability by comparing, for the low-risk group, the frequency distribution of reactions to the two tuberculin standards among subjects grouped by risk for NTM infection. For this grouping we defined study subjects to be either at high risk for NTM infection if their place of birth or main residence was a southeastern state or Arizona, or at low risk for NTM infection if their place of birth or main residence was elsewhere in the United States or Canada. These NTM high-risk areas correspond to areas in a previous study in which at least 50% of study participants reacted to 0.0001 mg of PPD-B, an antigen prepared from the Battey bacillus, M. intracellulare [the U.S. Public Health Study of Navy Recruits conducted between 1958 and 1965 (4)].
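For readers unfamiliar with the kappa statistic used for interobserver agreement, a minimal sketch of an unweighted Cohen's kappa computed from a square agreement table follows. The function name and the counts are hypothetical, and the source does not say whether a weighted or unweighted kappa was used; the unweighted form is assumed here.

```python
def cohens_kappa(table):
    """Return (observed agreement, unweighted kappa) for a square count table.

    table[i][j] is the number of tests that observer 1 placed in category i
    and observer 2 placed in category j.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2
    return observed, (observed - expected) / (1 - expected)


# Hypothetical counts for the three low-risk categories
# (0-4 mm, 5-9 mm, >= 10 mm); these are not study data.
example = [
    [950, 12, 2],
    [15, 20, 3],
    [1, 4, 12],
]
agreement, kappa = cohens_kappa(example)
print(f"agreement = {agreement:.3f}, kappa = {kappa:.3f}")
```

Kappa corrects raw agreement for chance, which matters here because most low-risk readings fall in the lowest category and would agree often even with careless reading.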

RESULTS
Study Population
Between May 14 and October 28, 1997, 1,596 persons with a low risk for tuberculous infection were enrolled and skin-tested; of these, 1,189 were eligible for and are included in the present analysis (Figure 1). We also enrolled and skin-tested 99 persons with culture-positive TB; of these, 69 were eligible for and are included in this analysis. For our analyses (except for interobserver variability), the average of the reaction sizes recorded by the two observers was used. Although the focus of this report is the concordance between PPD-S1 and PPD-S2, selected results from skin testing with the two commercial reagents (either lot) are also included for comparative purposes.
Immunogenicity and Sensitivity of Tuberculin Reagents in TB Patients
The 69 TB patients were mostly male (81%) and nonwhite (71%) and had a median age of 50 yr (range 22 to 78 yr). Their skin test reaction sizes (Figure 2) had a mean of 15.6 mm (SD ± 6.6 mm) after testing with PPD-S1 and a mean of 14.8 mm (SD ± 5.6 mm) after testing with PPD-S2. (The means ± SD of the reaction sizes for Aplisol and Tubersol were 16.5 ± 5.4 mm and 14.3 ± 6.3 mm, respectively.) The ANOVA comparison of the four means showed statistically significant differences (p = 0.02), but none of the six t-test comparisons showed significant differences between paired means (i.e., PPD-S1 versus PPD-S2, PPD-S1 or PPD-S2 versus either of the two commercial reagents, and Aplisol versus Tubersol). One TB patient had < 5 mm reactions to all four skin-test reagents, an indication of anergy to tuberculin; of the remaining 68 subjects, 63 (93%) and 66 (97%) had ≥ 5-mm reactions after testing with PPD-S1 and PPD-S2, respectively. These sensitivity calculations do not differ significantly from the test sensitivity calculations for Aplisol (98.5%) or Tubersol (92%).
Inter-Reagent Variability and Specificity of Tuberculin Reagents in Persons with Low Risk of TB
The 1,189 low-risk subjects were mostly female (63.2%) and white (68.6%) and had a median age of 27 yr. Among these persons, 236 had at least one test result recorded as > 0 mm (Figure 3) and were included in the analysis of inter-reagent variability. The mean (± SD) reaction sizes for the two standard reagents were 2.3 ± 3.6 mm for PPD-S1 and 2.8 ± 3.6 mm for PPD-S2. (The mean [± SD] reaction sizes for the two commercial reagents were 3.1 ± 4.2 mm for Aplisol and 1.9 ± 3.2 mm for Tubersol.) In the overall comparison there was a significant difference (p = 0.001) among the four reaction size means; however, when we compared paired means, the only significant differences were between PPD-S2 and Tubersol (p = 0.001) and between Aplisol and Tubersol (p = 0.001).
Among all low-risk persons tested, the number of false-positive reactions and the test specificities were as follows: for PPD-S1, 16 (98.7%) and for PPD-S2, 18 (98.5%) at the 10-mm cutoff; for PPD-S1, three (99.7%) and for PPD-S2, two (99.8%) at the 15-mm cutoff. (Aplisol and Tubersol had test specificities of 98.1% and 99.1% at the 10-mm cutoff, and 99.2% and 99.8% at the 15-mm cutoff.) There were no statistically significant differences in the test specificities between the two standards or between the standards and the commercial reagents. If we assume that low-risk subjects testing positive with PPD-S1 are truly infected with M. tuberculosis, then testing with PPD-S2 would have presumably failed to detect infections (tested negative) in six (0.5%) persons at the 10-mm cutoff and three (0.3%) persons at the 15-mm cutoff.
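The specificities quoted above follow directly from the false-positive counts. The short sketch below recomputes them, assuming (as the text implies but does not state outright) that all 1,189 low-risk subjects form the denominator for each reagent; the function name is hypothetical.

```python
def specificity_pct(false_positives: int, n_presumed_uninfected: int) -> float:
    """Specificity as a percentage: 100 x (1 - false-positive rate)."""
    return 100.0 * (1 - false_positives / n_presumed_uninfected)


N_LOW_RISK = 1189  # assumed denominator for every reagent and cutoff

# False-positive counts reported in the text, keyed by (reagent, cutoff in mm).
false_positive_counts = {
    ("PPD-S1", 10): 16,
    ("PPD-S2", 10): 18,
    ("PPD-S1", 15): 3,
    ("PPD-S2", 15): 2,
}

for (reagent, cutoff), count in false_positive_counts.items():
    pct = specificity_pct(count, N_LOW_RISK)
    print(f"{reagent} at {cutoff} mm: {pct:.1f}%")
```

Rounded to one decimal place, these reproduce the reported 98.7%, 98.5%, 99.7%, and 99.8%.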
Interobserver and Geographic Variability
We included in the analysis of interobserver variability the results recorded by two different observers who read the same test. The interobserver agreement and kappa coefficients by categories of reaction sizes are included in Table 1. The kappa coefficients for both study groups (0.52 to 0.78) all represent intermediate to good agreement (16). Moreover, for the low-risk group, the number of interpretations that were discordant by observer (i.e., how many results were interpreted as positive by one observer but not by the other observer) were similarly low for all the reagents: nine for PPD-S1 (0.75%) and 10 for PPD-S2 (0.84%) at the 10-mm cutoff, and seven for PPD-S1 (0.59%) and 10 for PPD-S2 (0.84%) at the 15-mm cutoff. In the analysis of geographic variability, 41% (482) of the 1,189 low-risk subjects were born or mainly resided in an area of high risk for NTM infection (Table 2). However, we found that the risk for NTM infection caused no significant differences in the size distributions of skin test reactions, nor in the test specificities (97.8 to 99.0% at the 10-mm cutoff), after testing with the two tuberculin standards.

DISCUSSION
To ensure consistency in the potency of commercial PPD tuberculin reagents, it is essential that they be compared with a well-defined reference standard before their release for widespread distribution and use. Both the current standard tuberculin reagent PPD-S1 and the proposed new reference standard PPD-S2 produced, in the group of TB patients, an expected normal distribution of reaction sizes and reaction size means of approximately 15 to 16 mm. These results were statistically indistinguishable and also not significantly different from the results obtained by testing with the currently available commercial reagents, Aplisol (Parkdale Pharmaceuticals, Rochester, MI) and Tubersol (Pasteur Merieux-Connaught Laboratories, Swiftwater, PA). The immunogenicity that we observed after testing with PPD-S1 and PPD-S2 is consistent, too, with previous reports of skin testing using standard tuberculin that reported reaction size averages of approximately 16 mm in populations infected with M. tuberculosis (18). Our assessment of the test's sensitivity found a nonreaction rate of 4 to 7% after testing TB patients with both standard reagents; these rates also correspond with a previously reported rate of 4% of patients with TB who did not react to standard tuberculin (18).
The most important use of the tuberculin skin test is to correctly classify asymptomatic persons as infected or uninfected with M. tuberculosis. Using a 10-mm cutoff in the low-risk-of-infection study group, the test specificities of the two standards were equally high: PPD-S1, 98.7%, and PPD-S2, 98.5%.
Reactions measured as 10 mm or greater with the current standard, PPD-S1, and less than 10 mm with the proposed standard, PPD-S2, occurred in six (0.5%) persons tested. If we assume that the low-risk persons testing positive with PPD-S1 are truly infected with M. tuberculosis, this represents a very low rate of infection, indistinguishable from chance, that would have been missed by testing with PPD-S2. Conversely, there were only eight (0.7%) presumable "false-positive" PPD-S2 reactions (i.e., measuring ≥ 10 mm) among persons who had reactions of less than 10 mm to PPD-S1. Moreover, the difference we observed between the two reference standards was smaller than or equal to the variability inherent in the skin test procedure itself; that is, the small rate of discordant interpretations (0.5 to 0.7%) between the two standard reagents could potentially be explained by two different sources of variability that are commonly recognized for the tuberculin skin test. These sources of variability are: (1) clinician variability, i.e., discordant interpretations of the same test by two different observers, which we found to be 0.6 to 0.8%; and (2) host variability, i.e., the difference in the results of two simultaneously placed tests of the same tuberculin reagent, which we had previously estimated to be 0.5% (or two of 366 subjects tested with two PPD-S1 injections) (9).
An accurate tuberculin skin test requires not only a properly
administered dose of a well-standardized tuberculin preparation, but also a correct interpretation of any observed reaction. We found that in our trial the interobserver agreement for both study populations was intermediate to good (16). We propose that by using the average of the two readings for our main statistical comparisons, we effectively reduced the effect of some of the human error associated with the skin test reading. Also, we found no evidence of geographic variability in skin test reaction-size distributions produced by the two tuberculin standards associated with the risk for NTM infection among our study population.
This absence of significant interobserver or geographic variability allowed us to best assess the variability associated with the reagents themselves. The availability of high-quality reference standards at the national level is basic to the continued production of
effective commercial preparations of PPD for public use. If there
is significant biovariability in the commercial tuberculin products
relative to the standard, this could result in either underdetection
or overdetection of latent TB infection. This study demonstrates
that the principal users of commercial PPD reagents (the clinician and his or her patient) will continue to be well served when, in the near future, PPD-S2 supplants PPD-S1 as the U.S. reference standard for PPD tuberculin.

Footnotes
Correspondence and requests for reprints should be addressed to Margarita E. Villarino, 1600 Clifton Rd., NE, MS E-10, Atlanta, GA 30333. E-mail: MEV1@CDC.GOV
(Received in original form June 10, 1999 and in revised form September 30, 1999).
Acknowledgments: The authors thank the following persons for their contribution to this study: Chris Anderson, Martha Bennington, Kim Birmingham, Jose Becerra, Theresa S. Butler, Kenneth Castro, Katherine Danner, Ken Dansbury, Anna M. Elarth, Teri Festa, Peach Francisco, Jolene Garret, Stefan V. Goldberg, Gilda Griffin, Inga Heacock, Constance D. Henderson, Patricia Junquera, Jana Knutson, Ann Lanner, Ellen F. Logue, Audry Martinez, Jean Morris, Katheryn L. K. Miskovish, Chris Nguyen, Maribeth O'Neill, Greg Pate, Sharon Peppler, Jolie Schaeffler, Jane Tapia, and Samira Vossough.
Supported by the Centers for Disease Control and Prevention and the Food and Drug Administration.

References
1. Centers for Disease Control and Prevention. 1995. Screening for tuberculosis and for tuberculosis infection in high-risk populations: recommendations of the Advisory Council for the Elimination of Tuberculosis. MMWR 44: 18-34.
2. Ferebee, S. H. 1967. An epidemiologic model of tuberculosis in the United States. Bull. Natl. Tuberc. Assoc. 53: 4-7.
3. Edwards, L. B., and C. E. Palmer. 1969. Tuberculosis infection (part II). In A. M. Lowell, editor. Tuberculosis. Harvard University Press, Cambridge. 123-202.
4. Snider, D. E. 1982. The tuberculin skin test. Am. Rev. Respir. Dis. 125: 108-118.
5. Huebner, R. E., M. F. Schein, and J. B. Bass Jr. 1993. The tuberculin skin test. Clin. Infect. Dis. 17: 968-975.
6. Seibert, F. B., and J. T. Glenn. 1941. Tuberculin purified protein derivative: preparation and analyses of a large quantity for standard. Amer. Rev. Tuberc. 41: 9-25.
7. Affronti, L. F., J. J. Caprio, P. Q. Edwards, M. L. Furculow, S. Grzybowski, J. Katz, F. E. Hesse, and F. B. Seibert. 1969. What is PPD-S? A statement by the Committee on Diagnostic Skin Testing. Am. Rev. Respir. Dis. 99: 460-461.
8. Landi, S. 1984. Production and standardization of tuberculin. In G. P. Kubica and L. G. Wayne, editors. The Mycobacteria. Marcel Dekker, New York. 505-535.
9. Villarino, M. E., W. Burman, Y.-C. Wang, L. Lundergan, A. Catanzaro, N. Bock, C. Jones, and C. Nolan. 1999. Comparable specificity of 2 commercial tuberculin reagents in persons at low risk for tuberculosis infection. J.A.M.A. 281: 169-171.
10. Duchin, J. S., J. A. Jereb, C. M. Nolan, P. Smith, and I. M. Onorato. 1997. Comparison of sensitivities to two commercially available tuberculin skin test reagents in persons with recent tuberculosis. Clin. Infect. Dis. 25: 661-663.
11. Pocock, S. J. 1983. Clinical Trials. John Wiley & Sons, New York. 123-127.
12. Rupp, M. E., A. W. Schultz Jr., and J. C. Davis. 1994. Discordance between tuberculin skin test results with two commercial purified protein derivative preparations. J. Infect. Dis. 169: 1174-1175.
13. Grabau, J. C., G. T. DiFerdinando Jr., and L. F. Novick. 1995. False positive tuberculin skin test results. Public Health Rep. 110: 703-706.
14. Westfall, P. H., and S. S. Young. 1993. Resampling-Based Multiple Testing. John Wiley & Sons, New York.
15. SAS/STAT User's Guide. 1989. Vols. 1 and 2, Version 6, 4th ed. SAS Institute, Cary, NC.
16. Landis, J. R., and G. G. Koch. 1977. The measurement of observer agreement for categorical data. Biometrics 33: 159-174.
17. Dean, A. G., J. A. Dean, D. Coulombier, A. H. Burton, K. A. Brendel, D. C. Smith, R. C. Dicker, K. M. Sullivan, and R. F. Fagan. 1994. Epi Info Version 6: A Word-Processing, Database, and Statistics Program for Epidemiology on Microcomputers. Centers for Disease Control and Prevention, Atlanta.
18. Palmer, C. E., and L. B. Edwards. 1967. The tuberculin test: in retrospect and prospect. Arch. Environ. Health 15: 792-808.