Published ahead of print on May 11, 2006, doi:10.1164/rccm.200408-1146OC
© 2006 American Thoracic Society
Impaired Performance in Commercial Drivers: Role of Sleep Apnea and Short Sleep Duration
Center for Sleep and Respiratory Neurobiology; Division of Sleep Medicine; Division of Pulmonary Allergy and Critical Care, Department of Medicine; Division of Sleep and Chronobiology, Department of Psychiatry, University of Pennsylvania, Philadelphia, Pennsylvania; Federal Aviation Administration, Washington, DC; and London Health Sciences Centre, University of Western Ontario, London, Ontario, Canada
Correspondence and requests for reprints should be addressed to Allan I. Pack, M.B., Ch.B., Ph.D., 125 South 31st Street, Suite 2100, Philadelphia, PA 19104-3403. E-mail: pack{at}mail.med.upenn.edu
Sleepiness plays an important role in major crashes of commercial vehicles. Because determinants are likely to include inadequate sleep and sleep apnea, we evaluated the role of short sleep durations over 1 wk at home and sleep apnea in subjective sleepiness (Epworth Sleepiness Scale), objective sleepiness (reduced sleep latency as determined by the Multiple Sleep Latency Test), and neurobehavioral functioning (lapses in performance, tracking error in Divided Attention Driving Task) in commercial drivers. Studies were conducted in 247 of 551 drivers at higher risk for apnea and in 159 of 778 drivers at lower risk. A multivariate linear association between the sets of outcomes and risk factors was confirmed (p < 0.0001). Increases in subjective sleepiness were associated with shorter sleep durations but not with increases in severity of apnea. Increases in objective sleepiness and performance lapses, as well as poorer lane tracking, were associated with shorter sleep durations. Associations with sleep apnea severity were not as robust and not strictly monotonic. A significant linear association with sleep apnea was demonstrated only for reduced sleep latency. The effects of severe apnea (apnea-hypopnea index, at least 30 episodes/h), which occurred in 4.7%, and of sleep duration less than 5 h/night, which occurred in 13.5%, were similar in terms of their impact on objective sleepiness. Thus, addressing impairment in commercial drivers requires addressing both insufficient sleep and sleep apnea, the former being more common.
Key Words: commercial drivers; excessive sleepiness; obesity; obstructive sleep apnea; sleep duration
In the United States, approximately 5,600 people are killed annually in crashes involving commercial trucks (1). Falling asleep while driving is an important factor in serious crashes involving commercial vehicles (2, 3), prompting the question, why? There are likely to be at least two mechanisms. The first is chronically insufficient sleep, that is, excessive sleepiness and performance impairments from accumulating sleep debt when individuals curtail the duration of their sleep day after day (4–6). Short sleep durations have previously been described in commercial drivers (7). The second mechanism is obstructive sleep apnea, which has been found to be common in commercial drivers (8, 9).
This study was conducted to assess the relative role of these risk factors for impairments in commercial driver's license holders. Daytime sleepiness and performance were assessed using neurobehavioral tests relevant to the driving task. These were evaluated relative to laboratory measures of sleep-disordered breathing and objective estimates of sleep duration at home. There are currently no published data to guide policy and decision makers regarding performance assessment, or regarding factors associated with impaired performance in drivers of commercial vehicles. The results presented in this article have previously been reported in abstract form (10, 11). There has also been a publication, using data collected from this study, that focused on screening for apnea in this population (12).
Sample Studied
Stratified sampling (13) was employed to enrich the sample with respect to the presence of sleep apnea (see Figure E1 in the online supplement). Government agencies provided contact information for n = 4,826 randomly selected holders of commercial driver's licenses living within 50 miles of the Center for Sleep and Respiratory Neurobiology (University of Pennsylvania, Philadelphia, PA). Individuals received, by mail, the Multivariable Apnea Prediction questionnaire (14) to facilitate oversampling of drivers at higher risk for sleep apnea. Information from this questionnaire, which included age, sex, body mass index (BMI), and an apnea symptom-frequency index determined from three individual questions, was used to determine the relative likelihood of apnea on a scale from 0 to 1. Complete responses were obtained for 1,329 of 4,410 (30.1%) valid contacts. By design we recruited n = 247 of 551 (44.8%) highest risk drivers and then, in randomized order, n = 159 of 778 (20.4%) lower risk drivers. Rationale and additional details for this design including a schematic diagram (see Figure E1) are provided in the online supplement. The study was approved by the Institutional Review Board of the University of Pennsylvania and all subjects gave written, informed consent.
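The response and recruitment fractions quoted above can be re-derived from the counts. The sketch below does that arithmetic and also shows inverse-sampling-fraction weights of the kind that could be used to weight stratum results back to the pool of respondents. The weighting scheme and the names `Stratum` and `sampling_weight` are illustrative assumptions, not the authors' documented procedure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stratum:
    name: str
    eligible: int  # respondents classified into this risk stratum
    studied: int   # drivers from the stratum who were studied

    @property
    def sampling_fraction(self) -> float:
        return self.studied / self.eligible

    @property
    def sampling_weight(self) -> float:
        # Assumed scheme: each studied driver stands in for
        # eligible / studied respondents of the same stratum.
        return self.eligible / self.studied


def pct(fraction: float) -> float:
    """Express a fraction as a percentage rounded to one decimal."""
    return round(100 * fraction, 1)


# Counts taken from the text.
higher = Stratum("higher risk", eligible=551, studied=247)
lower = Stratum("lower risk", eligible=778, studied=159)
response_rate = 1329 / 4410  # complete responses / valid contacts

if __name__ == "__main__":
    print("response rate (%):", pct(response_rate))
    for s in (higher, lower):
        print(s.name, pct(s.sampling_fraction), round(s.sampling_weight, 2))
```

Because lower risk drivers were sampled at a smaller fraction, each of them carries a larger weight under this scheme, which is why weighted percentages can differ from raw sample percentages.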
Impairment Outcomes
Risk Factors for Impairment
Sleep apnea.
Statistical Approaches
Description of Sample Demographics
Table E1 summarizes the sample; 93.3% of the sample were male and the mean age was 45.4 yr. In addition, 81.6% were employed as drivers of a commercial vehicle at the time of study enrollment. Of these, 9.3% reported exclusively long-distance driving, 65.5% reported local driving, and 25.2% reported both. Primary analyses were based on the total sample because those not working as commercial drivers at the time of the study had the capacity to do so. Similar results were obtained when analyses were repeated for the 81.6% currently working as commercial drivers. These results are not presented here but are summarized in the online supplement (see Table E4).
Description of Sleep Apnea and Sleep Durations
Test of Overall Study Hypothesis
The multivariate linear association between the set of outcomes and the set of risk factors was statistically significant (Wilks's lambda F[8, 594] = 5.7; p < 0.0001), with 14.2% shared variance between the two sets of variables.
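As a rough plausibility check on how a Wilks's lambda relates to an F statistic, the sketch below applies Rao's F approximation. It assumes that the reported shared variance equals 1 minus lambda, that there are p = 4 outcomes and q = 2 risk factors (which matches the numerator degrees of freedom of 8), and it takes the denominator degrees of freedom from the text. The function name is hypothetical, and this is not presented as the authors' computation; covariate adjustment and rounding mean the result need not reproduce the reported F exactly.

```python
import math


def rao_f_from_wilks(wilks_lambda: float, p: int, q: int, df2: float) -> float:
    """Rao's F approximation for Wilks's lambda with p and q variables per set."""
    df1 = p * q
    denom = p**2 + q**2 - 5
    # By convention the exponent parameter is 1 when the ratio is undefined.
    t = math.sqrt((p**2 * q**2 - 4) / denom) if denom > 0 else 1.0
    root = wilks_lambda ** (1 / t)
    return (1 - root) / root * (df2 / df1)


shared_variance = 0.142                # from the text
wilks_lambda = 1 - shared_variance     # assumption: shared variance = 1 - lambda
f_approx = rao_f_from_wilks(wilks_lambda, p=4, q=2, df2=594)

if __name__ == "__main__":
    print(round(f_approx, 2))
```

Under these assumptions the approximation lands in the same neighborhood as the reported F of 5.7, which is all the check is meant to show.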
Results from Primary Continuous Risk Factor Models
In contrast, reduced sleep durations were significantly associated with increases, with or without adjustment, in subjective sleepiness (p = 0.01), increases in objective sleepiness (p = 0.008), increases in vigilance lapses (p = 0.0001), and increases in tracking error (p = 0.0001) (all p values after covariate adjustment). For each 1.34-h decrease in estimated mean sleep time (the standard deviation for this variable), there was a 0.72 (SE, 0.28)-point increase in the predicted Epworth Sleepiness Scale total score, a 0.75 (SE, 0.28)-min reduction in the predicted value of MSLT, a 0.18 increase in the predicted value of the log of 1 plus the number of vigilance lapses per 10-min trial, and a 0.11 increase in the predicted value of log of 1 plus mean absolute tracking error on the DADT.
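The per-standard-deviation effects above can be restated per hour of sleep, and the log-scale effects can be read as multiplicative changes. The sketch below shows that arithmetic. It assumes the fitted relationships are linear over the range of interest and that the log transformation is a natural log; neither is stated explicitly in this section, and the dictionary keys are illustrative names.

```python
import math

SD_HOURS = 1.34  # standard deviation of mean sleep duration, from the text

# Effects per 1-SD decrease in sleep duration, from the text.
per_sd_effects = {
    "epworth_points_increase": 0.72,
    "mslt_minutes_reduction": 0.75,
    "log1p_pvt_lapses_increase": 0.18,
    "log1p_dadt_tracking_error_increase": 0.11,
}


def per_hour(effect_per_sd: float, sd_hours: float = SD_HOURS) -> float:
    """Rescale a per-SD effect to a per-hour effect, assuming linearity."""
    return effect_per_sd / sd_hours


def multiplicative_change(log_scale_effect: float) -> float:
    """Factor applied to (1 + outcome) for an additive change on the log scale.

    Assumes a natural-log transformation.
    """
    return math.exp(log_scale_effect)


if __name__ == "__main__":
    for name, effect in per_sd_effects.items():
        print(name, "per hour:", round(per_hour(effect), 3))
    print("lapses factor per SD:", round(multiplicative_change(0.18), 3))
    print("tracking factor per SD:", round(multiplicative_change(0.11), 3))
```

Read this way, a 1-SD shorter sleep duration corresponds to roughly a 20% increase in one plus the lapse count, under the natural-log assumption.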
Assessment of Risk Factors in Categories for Different Outcomes
In contrast, adjusted mean values for Epworth Sleepiness Scale scores among sleep duration categories were as follows: 10.0, 9.9, 8.6, 7.9, and 8.7 for mean durations of less than 5, 5 to less than 6, 6 to less than 7, 7 to 8, and more than 8 h, respectively, which did vary significantly (F[4, 313] = 2.5; p = 0.05). Planned contrasts versus the reference category of 7 to 8 h achieved nominal statistical significance and also appeared clinically significant in magnitude (
Objective sleepiness.
Adjusted mean values for the MSLT among sleep duration categories were as follows: 6.3, 7.9, 7.8, 8.6, and 9.9 for mean durations of less than 5, 5 to less than 6, 6 to less than 7, 7 to 8, and more than 8 h, respectively (F[4, 329] = 2.5; p = 0.04). These are illustrated in Figure 1B. Nominal significance relative to reference (7–8 h) was found for less than 5 h (p = 0.007). After controlling for multiple comparisons, the difference in means between those with less than 5 h and the reference category of 7 to 8 h remained significant (p = 0.05). The multiplicity-adjusted p value comparing less than 5 h with greater than 8 h was p = 0.06. None of the other category differences adjusted for multiple comparisons was statistically significant. There was an estimated 2.29 (SE, 0.85)-min reduction in sleep latency among those with sleep durations less than 5 h compared with the reference group with 7 to 8 h. Thus, the group difference between less than 5 h of average sleep and 7 to 8 h was similar in magnitude to that for severe sleep apnea compared with no apnea.
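To give a sense of the uncertainty around the 2.29-min estimate, the sketch below builds an approximate 95% confidence interval and a two-sided p value from the reported estimate and standard error. It uses a normal approximation, which is an assumption made here for simplicity; the authors' model-based inference would use the appropriate t reference distribution and may differ slightly.

```python
from statistics import NormalDist


def normal_ci(estimate: float, se: float, level: float = 0.95) -> tuple[float, float]:
    """Approximate confidence interval using a normal reference distribution."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate - z * se, estimate + z * se


def two_sided_p(estimate: float, se: float) -> float:
    """Two-sided p value for estimate / se under a normal approximation."""
    z = abs(estimate / se)
    return 2 * (1 - NormalDist().cdf(z))


# Reduction in mean sleep latency (min) for <5 h versus 7 to 8 h, from the text.
low, high = normal_ci(2.29, 0.85)
p_value = two_sided_p(2.29, 0.85)

if __name__ == "__main__":
    print(round(low, 2), round(high, 2), round(p_value, 4))
```

The interval excludes zero, and the approximate p value is close to the nominal p = 0.007 reported for this contrast.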
Although sleep latency was reduced in those with severe sleep apnea compared with no sleep apnea, evidence of differential effects of sleep apnea on objective sleepiness was observed among individuals. When we categorized subjects into those with pathologic sleepiness (MSLT, < 5 min), in the "gray zone" (MSLT, at least 5 and < 10 min), or normal (
Behavioral alertness. There were statistically significant differences among apnea severity categories for performance lapses (F[3, 325] = 4.0; p = 0.007). Comparisons among categories are illustrated in Figure 4A after back-transformation to the original scale. Pair-wise contrasts are expressed in terms of the ratio of expected values as a consequence of the log transformation. The regression-adjusted ratios comparing those with 30 or more, 15 to fewer than 30, and 5 to fewer than 15 events/h with those with fewer than 5 events/h were 1.24 (1.17), 0.98 (1.15), and 0.76 (1.09; mean [SE]), respectively. The only significant planned contrast versus the reference category of less than 5 events/h was for AHI at 5 to fewer than 15 events/h (p = 0.005). Unexpectedly, as seen in Figure 4A, those with mild apnea (AHI, 5 to fewer than 15 events/h) performed better than those with no apnea. Comprehensive analyses resulted in no evidence of confounding as a cause of this finding. These analyses are briefly described below, with further detail in the online supplement. After adjustment for multiplicity, significant pair-wise contrasts were observed between those with at least 30 events/h versus 5 to fewer than 15 events/h (p = 0.02) and between those with 5 to fewer than 15 events/h versus those with fewer than 5 events/h (p = 0.02).
Least-squares adjusted mean values back-transformed to the original scale are shown across sleep duration categories in Figure 4B. Sleep duration category differences were highly statistically significant (F[4, 325] = 3.8; p = 0.005). There was a significant difference between those with less than 5 h mean sleep duration relative to the reference category of 7 to 8 h (p = 0.0001). After adjustment for multiple comparisons, a significant contrast remained between those with less than 5 h mean sleep duration and those with 7 to 8 h (p < 0.01), as was also observed on comparing those with less than 5 h and those with more than 8 h (p = 0.009).
DADT.
In contrast, sleep duration category differences were statistically significant (F[4, 309] = 2.4; p = 0.05). The planned contrast between less than 5 h relative to the 7- to 8-h reference group also was significant (p = 0.003). After multiplicity adjustment, the significance of this contrast remained (p = 0.03). No other pair-wise comparison was significant.
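The categorical analyses in this section rest on fixed cut points for apnea severity and mean sleep duration. A minimal sketch of that binning, using the boundaries quoted in the text, is given below; the function names and label strings are illustrative, and how the authors handled values exactly on a boundary beyond what the text states is an assumption.

```python
def ahi_category(ahi: float) -> str:
    """Bin an apnea-hypopnea index (events/h) using the cut points in the text."""
    if ahi < 0:
        raise ValueError("AHI cannot be negative")
    if ahi < 5:
        return "<5"          # reference category (no apnea)
    if ahi < 15:
        return "5 to <15"    # mild apnea
    if ahi < 30:
        return "15 to <30"
    return ">=30"            # severe apnea


def sleep_duration_category(hours: float) -> str:
    """Bin mean nightly sleep duration (h) using the cut points in the text."""
    if hours < 0:
        raise ValueError("sleep duration cannot be negative")
    if hours < 5:
        return "<5"
    if hours < 6:
        return "5 to <6"
    if hours < 7:
        return "6 to <7"
    if hours <= 8:
        return "7 to 8"      # reference category
    return ">8"


if __name__ == "__main__":
    print(ahi_category(12.0), "|", sleep_duration_category(4.5))
```

Planned contrasts then compare each non-reference bin with the reference bin (fewer than 5 events/h, or 7 to 8 h of sleep).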
Investigation of Nonmonotonicity between Outcome Measures and Sleep Apnea Severity
Likelihood of Impairment
In this study, we showed that both subjective and objective sleepiness, as well as performance impairments, are common in our sample of commercial driver's license holders. Our analyses reveal that chronic short sleep duration is a risk factor for subjective sleepiness, objectively measured sleepiness, and performance impairments. The results for sleep apnea are less clear. Although increases in sleep-disordered breathing were associated with reduced sleep latency, corresponding associations with subjective sleepiness and performance lapses were not confirmed. Increased tracking error was statistically significant only without covariate adjustment. In complementary analyses expressing risk factors in categories, severe sleep apnea relative to fewer than 5 events/h and short sleep durations of less than 5 h relative to 7 to 8 h had similar magnitudes of effect for objective sleepiness as measured by the MSLT (average reductions of 2.5 and 2.3 min, respectively). Durations of sleep of less than 5 h were more common than severe sleep apnea (weighted percentages, 13.5 compared with 4.7%, respectively). Our study design involved a random sample of persons with commercial driver's licenses, avoiding bias that might have occurred if drivers had been selected from only one company or had been selected on the basis of some other common characteristic. Moreover, the stratified sampling that we employed improved statistical precision of estimates, whereas sampling in clusters across several companies has the potential for greatly reducing statistical precision (32). A disadvantage of our approach was the expected low response rate, which strictly limits statistical inference to a "theoretical" population of drivers that would respond to surveys such as ours. However, the age, sex, and zip code distributions of respondents and nonrespondents were essentially identical, providing some evidence of minimal response bias. 
The respondents to our mail questionnaire knew that the study was about sleep but not the specific hypothesis being tested. Although the response rate was relatively low, response bias would have affected estimated relationships between risk factors and impairment outcomes only had there been differential recruitment of individuals with respect to one of the risk factors, that is, sleep apnea and short sleep duration, and if the relationship between that risk factor and the impairment outcome differed between those responding and not responding. This seems unlikely. The issue of potential response bias means, however, that our estimates of the percentages of drivers with sleep apnea and short sleep durations need to be interpreted with caution.
The percentage of subjects that we found with any sleep apnea (28.2%) is similar to that in other general population studies (33–35) but less than that in other studies of commercial drivers (8, 9). This difference could be related to one or all of the following: (1) differences in response rate in different studies, with our study having the lowest response rate; (2) different sampling strategies: a population-based random sample in our study, a cluster-based sampling scheme across many companies in the study by Howard and coworkers (8), and a nonrandom sample of drivers from a single company in the study by Stoohs and coworkers (9); (3) different definitions of respiratory events; and (4) different technologies used to estimate the degree of sleep-disordered breathing, with a simplified method (MESAM) used in the study by Stoohs and coworkers (9). The differences in estimated percentages with apnea are particularly large among studies in those with mild to moderate apnea, whereas the percentages with severe sleep apnea are more similar, that is, 4.7% in our study, 10.6% in the study by Howard and coworkers (8), and 10.0% in the study by Stoohs and coworkers (9).
In our study, we measured sleep durations at home by actigraphy and demonstrated associations between sleep duration and self-report sleepiness; physiologic sleepiness by MSLT, with vigilance lapses indicated by PVT; and tracking error by the DADT. We considered two potential metrics of sleep duration derived from actigraphy. The first was the cumulative duration of relative inactivity during the sleep bout, thereby removing periods of movement as part of sleep. The second was the overall duration from the beginning to the end of the main sleep bout, continuing to count periods of movement as sleep. To validate these measurements, we performed simultaneous analyses of actigraphy during full sleep studies, with sleep duration being assessed by EEG, in a large convenience sample of the drivers being studied (n = 277); this comprised 69.2% of the higher risk group and 63.9% of the lower risk group. Validation analysis, as described in the online supplement, revealed that the duration of cumulative inactivity gave a closer approximation to actual sleep time (average difference, 0.31 h) than the total duration of the sleep bout, which overestimated sleep by 1.33 h, on average. However, this latter metric had a reduced standard deviation relative to the former, indicating a potential for improved agreement with the "gold standard" EEG in this population. We therefore repeated analyses, using this alternative metric after removing constant bias. The associations of sleep apnea with performance lapses and with tracking error became statistically significant (p = 0.05 and p = 0.005, respectively) and that with MSLT retained significance (p = 0.02) in the multivariable continuous risk factor models. We attribute the strengthening of these associations to better separation between estimates of sleep apnea severity and sleep duration, because activity due to termination of apneic events is eliminated from sleep duration estimates.
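The trade-off described above, in which one metric has the smaller average error while the other has the smaller spread of errors and becomes usable once its constant bias is subtracted, can be illustrated with a short agreement calculation. The sketch below uses made-up values for five nights; the data, variable names, and helper functions are hypothetical and serve only to show the mechanics, not the study's validation results.

```python
from statistics import mean, stdev


def agreement(estimates: list[float], reference: list[float]) -> tuple[float, float]:
    """Return (mean difference, SD of differences) of estimates versus reference."""
    diffs = [e - r for e, r in zip(estimates, reference)]
    return mean(diffs), stdev(diffs)


def remove_constant_bias(estimates: list[float], bias: float) -> list[float]:
    """Subtract a constant offset from every estimate."""
    return [e - bias for e in estimates]


# Hypothetical sleep durations in hours (not study data).
eeg = [6.0, 5.5, 7.0, 4.5, 6.5]          # reference measurement
bout = [7.4, 6.8, 8.3, 5.9, 7.8]         # whole sleep bout: large but steady offset
inactivity = [5.2, 5.6, 6.3, 4.6, 5.7]   # cumulative inactivity: small, noisier offset

bout_bias, bout_sd = agreement(bout, eeg)
inact_bias, inact_sd = agreement(inactivity, eeg)
corrected_bias, corrected_sd = agreement(remove_constant_bias(bout, bout_bias), eeg)

if __name__ == "__main__":
    print("bout:", round(bout_bias, 2), round(bout_sd, 2))
    print("inactivity:", round(inact_bias, 2), round(inact_sd, 2))
    print("bout, bias removed:", round(corrected_bias, 2), round(corrected_sd, 2))
```

Subtracting a constant changes the mean difference but leaves the spread of differences untouched, so a metric with a large yet consistent offset can track the reference closely after correction.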
These results are described in the online supplement (see Table E3). In our sample, only 81.6% of the drivers studied were working as commercial drivers at the time of this study although all had the capacity to do so. Analyses were performed by restricting attention to currently employed drivers. These results are presented in Tables E4 and E5. The associations between outcomes and the primary measure of sleep duration were all somewhat stronger when restricting attention to drivers currently employed at the time of our study. However, the AHI linear slope for the MSLT was reduced in absolute value from 0.54 (p = 0.05, all drivers) to 0.44 (p = 0.16, currently employed drivers). To identify important nonmonotonic associations, sleep apnea severity and average sleep duration at home were expressed as categorical variables in secondary multivariable models. Planned contrasts revealed significantly increased objective sleepiness (MSLT; p = 0.03) and tracking error (DADT; p = 0.04) for those with severe sleep apnea (AHI, at least 30 events/h) relative to no sleep apnea (< 5 events/h), but not for PVT performance lapses. Although sleep apnea severity category was significantly associated with performance lapses (p = 0.007), the relationship was nonmonotonic and greatly influenced by drivers with AHI values less than 5 events/h performing worse than those with values between 5 and 15 events/h. Although particularly marked for performance lapses, this nonmonotonic relationship with apnea severity was present for all four outcome variables. Therefore, we investigated potential causes of confounding to explain this. We examined differences in sleep duration, presence of medical conditions, and use of medications that could cause sleepiness and did not find any obvious reason. Thus, confounding due to group difference in medical status did not explain this nonmonotonicity. 
Although we cannot rule out confounding from other unobserved factors, random sampling variation is as likely an explanation as is confounding for these results. Details are provided in the online supplement (see Tables E6 and E7). Although sleep duration was found not to be a confounding factor for sleep apnea severity, there remains the potential for sleep duration to serve as an effect modifier, that is, the effect of increasing sleep apnea severity may result in different levels of impairment depending on mean sleep duration. We tested for interaction but found no statistically significant interaction in any of the multivariable models. However, the statistical power for detection of an interaction effect between sleep duration and apnea severity is low because of the low power of interaction tests generally, and specifically because of the relatively small sample sizes in some of the cells in the cross-tabulation of sleep duration and apnea severity categories. Another factor that needs to be considered in interpretation of our results is obesity. Obesity has been shown to be associated with subjective and objective sleepiness independent of sleep apnea (36). In our study, there were no statistically significant associations between outcomes and BMI. But because only 11.3% of our sample had a BMI less than 25 kg/m2 our study is limited in its ability to detect associations between obesity and performance. Other aspects of this study are also worthy of comment. We used thermistors to assess airflow. This study started before results conclusively showing the superiority of nasal pressure measurements to assess flow had been presented (37). In our study, performance tests were done on the day after the overnight sleep study in the laboratory. Because the duration of sleep during the overnight polysomnography was not limited it is possible that some subjects obtained more sleep than was their usual habit. 
Consequently, our findings on objective sleepiness and performance in the laboratory will likely underestimate the effect of reduced sleep duration. We found that a high proportion of subjects (32.6%) had an Epworth Sleepiness Scale score in the range compatible with self-report excessive sleepiness. Thus, in our sample, self-report sleepiness was extremely common. It is arguable that this may be the result of response bias, that is, that individuals with self-report sleepiness were more likely to respond to our survey. But, as noted, the proportion of subjects in our study with sleep apnea is less than in other studies of commercial drivers (8, 9). Moreover, self-report excessive sleepiness was also extremely common and similar (29.0%) in the only other study of commercial drivers to assess this (8). We found in multivariable modeling that when assessed as a continuous variable, reductions in sleep duration did increase the expected level of self-reported sleepiness. A 1-SD decrease in sleep duration (1.34 h) was associated with an expected increase of 0.75 points (p < 0.01) on the Epworth Sleepiness Scale. However, sleep apnea was not associated with self-report sleepiness. This is unlike the results of Howard and coworkers (8), who found a weak association between Epworth Sleepiness Scale scores and sleep apnea severity in a sample of commercial drivers. The basis for this lack of association in our study is unknown but appears to limit the usefulness in commercial drivers of self-report sleepiness in determining who is likely to have sleep apnea. Whereas sleep apnea was not associated with self-report sleepiness, it was associated with objective sleepiness as measured by the MSLT in the primary continuous risk factor model (p = 0.05). 
In our categorical model, this association was particularly evident among individuals with severe apnea relative to those with no apnea (reduction in mean, 2.5 min) and also relative to those with mild apnea (5 to 15 events/h; reduction in mean, 3.0 min). (Previous studies of commercial drivers [8, 9] have not assessed objective tests of sleepiness or performance.) Simultaneously, average sleep duration at home was associated with objective sleepiness, with sleep durations less than 5 h associated with the most severe sleepiness (reduction in mean, 2.3 min). Thus, multivariable modeling revealed that severe sleep apnea and sleep duration at home of less than 5 h had similar effects on objective sleepiness as determined by MSLT. However, the percentage of subjects in this sample with durations less than 5 h was nearly triple that of subjects with severe sleep apnea.
Our data are consistent with studies showing that performance is differentially susceptible to the effects of sleep deprivation (38, 39). The performance of some subjects is markedly affected by sleep deprivation whereas other subjects are relatively resistant (38, 39). We see evidence to support similar differential susceptibility to the effects of both chronic short sleep duration at home and the degree of sleep apnea (see Figures 2 and 3). However, it is possible that measurement error regarding apnea severity and mean sleep duration has contributed to our observations.
When we examined definitions of impairment for PVT performance lapses and DADT tracking error based on data comparing results with those produced by alcohol intoxication, we found that 29.2% (for lapses) and 36.9% (for mean tracking error) had performance decrements comparable to those induced in control subjects, albeit in different populations, after alcohol intoxication. Moreover, 25.7% had a mean sleep latency in the range (< 5 min) considered pathologic. This is an issue of concern.
However, we found considerable discordance among these objective impairment outcomes. The proportion of individuals with such performance impairments in our society is unknown, and there are no data that currently show an association between impairments in the tests we performed and crash risk in commercial drivers. Also, there are currently limited data on the association between, for example, the presence of sleep apnea and crash rates among commercial drivers. In the study by Stoohs and coworkers (40), no association was found between sleep apnea severity and crash risk, as was the case for the group described in the study by Howard and coworkers (8), who performed full sleep studies. The latter result is likely to be related to the small sample size (n = 161) because, in the larger questionnaire sample (n = 2,343) in that study, the multivariable apnea prediction index, a surrogate for apnea (14), was weakly related to the risk of a single vehicle accident but not to the total crash history (8). This topic has been more extensively studied, however, in drivers of passenger cars, where a clear association has been shown between crashes and obstructive sleep apnea (41–48). There is a need for more in-depth studies of crashes involving commercial drivers and the role of sleep apnea and insufficient sleep. In particular, there is a need for a case-control study focusing on the role of sleep apnea, and potentially sleep duration, in major crashes of commercial vehicles.
In conclusion, there are daytime neurobehavioral performance impairments that are found commonly in commercial drivers, and these are more likely among those with durations of average sleep less than 5 h/night and those with severe obstructive sleep apnea.
These results suggest that strategies employed by the Federal Motor Carrier Safety Administration to reduce sleepiness, and potentially crash risk, in commercial drivers include plans to (1) develop and implement approaches to identify "impaired" drivers by objective testing, (2) implement and ensure quality programs to identify and treat individuals with severe sleep apnea as well as monitor adherence to therapy, and (3) introduce approaches to assess and promote increased sleep durations among commercial drivers.
The authors acknowledge the statistical programming work of Mr. Robert Hachadoorian and of Mr. Daniel C. Barrett in the preparation of this manuscript.
Supported by a contract from the Trucking Research Institute, American Trucking Associations (DTFH61-93-R-00088), that was funded by the Federal Motor Carrier Safety Administration and by NIH grants HL-60287 and RR-00040.
* Formerly with the Trucking Research Institute, American Trucking Associations.
This article has an online supplement, which is accessible from this issue's table of contents at www.atsjournals.org
Originally Published in Press as DOI: 10.1164/rccm.200408-1146OC on May 11, 2006
Conflict of Interest Statement: A.I.P. has a grant from ResMed, Inc., to study the relative role of ambulatory recording of sleep-disordered breathing as it compares with full sleep study. He also receives royalties from Marcel Dekker Publishers for a book he edited, entitled Sleep Apnea: Pathogenesis, Diagnosis and Treatment. He has a patent pending related to the use of serotonin agonists to treat sleep apnea in mammals. G.M. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript. B.S. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript. F.M.P. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript. W.C.R. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript. C.F.P.G. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript. D.F.D. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript. Under the terms of the contract, the Trucking Research Institute and the Federal Motor Carrier Safety Administration had 30 days to comment on the draft manuscript, but could not mandate changes.
Received in original form August 31, 2004; accepted in final form May 11, 2006