Academic Editor: Jon Ver Halen, Baptist Cancer Center; Baptist Memorial Healthcare; Vanderbilt Ingram Cancer Center; St. Jude Children's Research Hospital
Checked for plagiarism: Yes
Review by: Single-blind
A Comparison Study of the Fitbit Activity Monitor and PSG for Assessing Sleep Patterns and Movement in Children
Despite its expense, labor and intrusiveness, polysomnography (PSG) is the gold standard for diagnosing obstructive sleep apnea (OSA). Recently, commercially available electronic activity monitors, such as Fitbit, have become widely accepted and can provide an estimate of sleep patterns for screening children with possible OSA. A previous study demonstrated Fitbit to be valid compared to PSG in adults. To date, these devices have not been extensively utilized for research in children with sleep disordered breathing (SDB).
To evaluate the validity of the Fitbit activity monitor compared to PSG in children and adolescents with SDB.
Data were collected from 14 children, ages 3 through 11, who were scheduled for a PSG during the study period. Fitbit was worn concurrently during the night of the PSG. Analyses were performed by comparing total sleep time, number of awakenings, sleep efficiency and wake after sleep onset (WASO) Fitbit parameters with the corresponding parameters measured by PSG using Spearman’s rho. Fitbit movement epochs were also compared to PSG epochs showing movement behavior.
Pilot data suggest that Fitbit demonstrates a high sensitivity for sleep, a low specificity for wake and a trend suggesting good association of movement measurements.
Sleep disordered breathing (SDB) is a spectrum of diseases ranging from primary snoring (PS), upper airway resistance syndrome (UARS), to obstructive sleep apnea (OSA). OSA constitutes the more severe end of the spectrum. Nighttime symptoms of OSA include snoring, labored breathing, observed apnea, restless sleep, enuresis and diaphoresis. PS is the least severe on the spectrum and consists of nightly snoring without apnea, hypoxemia, or hypoventilation. Accurate characterization of a child’s SDB is required not only to ensure adequate treatment but also to prevent possible complications such as failure to thrive, systemic or pulmonary hypertension, neurocognitive impairment and learning and behavioral problems.1
Recent studies of PS demonstrate that although it is considered to be a more benign condition than OSA, it may also cause significant neurocognitive and behavioral issues.1, 2, 3, 4, 5 PS has been associated with an increased risk for hyperactivity, inattentive behavior and poor school performance in comparison to children who do not snore.3 The mechanisms by which PS may cause these issues are not well understood. If blood gas abnormalities are absent in children with PS, one possible causal factor may be sleep fragmentation or disruption. Some studies have shown that children with SDB are more likely to have disturbances in their sleep architecture compared to healthy controls.6, 7, 8 Pediatric subjects with SDB have been demonstrated to have long sleep latencies, relatively low sleep efficiencies, and frequent arousals.
It is well accepted that polysomnography (PSG) is the most reliable and comprehensive method for diagnosing the severity of pediatric SDB, and is part of the American Academy of Sleep Medicine diagnostic guidelines. Such testing requires in-laboratory overnight multichannel and video recording of sleep and breathing with a trained technician in attendance. This procedure is time- and labor-intensive, as well as costly. Moreover, PSG provides a limited one night of information about a child’s sleep. Some have suggested that the intrusiveness of PSG and the disturbance of the usual bed environment may result in inaccuracy in assessing the severity of SDB in a child. More than for respiratory parameters, there is also evidence of greater night to night variability in sleep architecture and this cannot be feasibly measured by PSG.9 It has therefore become increasingly apparent that characterizing SDB in children and identifying those at risk for neurocognitive and behavioral impairment may be a challenge that cannot be reliant on PSG alone.
Accelerometer-based activity (actigraphy) monitors have recently been shown to provide a valid estimate of sleep patterns.10 They include devices such as small wrist-watch sized activity monitors that collect data with an internal accelerometer. The collected data are then translated into epochs (30 sec or 1 min) of activity. Using validated algorithms, data are generated that provide an estimation of sleep-wake states. The device is small and allows for multiple-day data collection. It can also be easily used in a child's natural environment, thereby conferring validity to collected sleep data. To date, however, actigraphy has not been widely used to diagnose sleep abnormalities, mainly due to the high cost of each unit (approximately $1,000 each) and the lengthy training and data analysis involved in retrieving useful information.
Recently, commercially available electronic activity monitors (such as those manufactured by Fitbit, Jawbone, and Nike) have become accessible to the general public and their popularity and use continue to grow. These small, relatively inexpensive (average price $100) devices improve on standard accelerometers by providing automated feedback and interactive tools via mobile device or personal computer. Their low cost, wide reach and apparent effectiveness make these activity monitoring devices appealing for clinical and research applications. But studies utilizing these devices in the pediatric population are limited, and to date, use of these devices for research in children with SDB has not been extensively explored.
The purpose of this study was to evaluate the reliability and validity of the Fitbit activity monitor to measure parameters of sleep, wake and movement compared to PSG in children and adolescents with SDB.
Children between 2 and 12 years of age scheduled for an overnight PSG at Children’s Hospital Los Angeles (CHLA) for the evaluation of sleep apnea or sleep disordered breathing between January and August 2015 were considered for inclusion. Children were excluded if their parents were not fluent in English, if they were taking any oral or systemic medications that may have affected sleep or respiration, were born prematurely, or had any significant cardiac, neurologic or developmental disease or disorder. Fourteen children aged 3 to 11 years met all inclusion criteria and participated in the study. The study was approved by the CHLA Institutional Review Board; written permission was obtained from all parents and written assent from children over the age of seven. Information was extracted from participant medical records including date of birth, height and weight, gender, medications, surgical and medical history, and race. Height and weight were used to calculate body mass index (BMI), which was converted to a BMI z-score to adjust for gender and age.
On the evening of the scheduled PSG, study participants were fitted with a Fitbit Flex™ wristband on their non-dominant wrist. The Fitbit was placed on the subjects by study staff and removed in the morning by the parent. Study staff instructed the parent on the method to activate and deactivate the Fitbit for sleep mode. On the following business day, the Fitbit was retrieved and the data uploaded. The Fitbit records patients’ sleep stage at 1 minute intervals as: sleep, restless, or awake by tracking frequency and intensity of movement with a three-dimensional accelerometer system. Data were extracted from Fitbit by Fitabase, a research platform from third party developer, SmallSteps Labs LLC and exported for analysis in Excel and SPSS.
Overnight PSG was conducted using the SomnoStar platform (CareFusion, Yorba Linda, CA). PSG-recorded measurements included left and right frontal, central, and occipital electroencephalography (International 10-20 placement); bilateral electrooculogram; submental and bilateral electromyogram; electrocardiogram; oronasal airflow with 3-pronged thermistor; nasal pressure with pressure transducer; rib cage and abdominal wall motion via respiratory impedance plethysmography; end-tidal capnometry; and arterial oxygen saturation accompanied by pulse waveform; snoring microphone, as well as continuous video and audio recordings. Behavioral observations and other events noted by the technician were transcribed directly into the PSG record. PSG’s were staged and scored as per American Academy of Sleep Medicine guidelines with each epoch reviewed by a physician with board certification in Sleep Medicine who provided the final interpretation. Epoch tables were exported from SomnoStar and relevant data points including arousals, apneas, recorded movements, changes in body positions, and sleep stage were transcribed into Excel for each 30-second epoch.
A comparison of the sleep/wake states coded epoch-by-epoch was performed, examining Fitbit sensitivity and specificity relative to PSG coding. Note that while the sample size of subjects was 14, a repeated measures, epoch-by-epoch analysis effectively increases the power of the within subjects analysis due to the repetition of events. We also compared PSG and Fitbit by epochs coded as showing movement of any kind. We assume that the greatest concordance between the two sleep measurement techniques will be for movement detection. Descriptive statistics were also conducted and all analyses were completed using SPSS 22.0 (SPSS, Inc., Chicago, IL) with a statistical significance set at p<.05.
Sleep/wake assessment: Overall sleep metrics were calculated for PSG and Fitbit: total sleep time (TST), wake after sleep onset (WASO), sleep efficiency (percent of time coded as “sleep” relative to TST), and number of minutes awake. Spearman’s rho was used to determine concordance of values across sleep measurement platforms.
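As an illustration of the concordance measure named above, the following is a minimal sketch of Spearman's rho computed as the Pearson correlation of average ranks. It is not the SPSS procedure used in the study; the function names and the per-child values are hypothetical and chosen only for demonstration.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical per-child WASO values in minutes (illustrative, not study data).
psg_waso = [26.5, 10.0, 99.5, 1.5, 40.0]
fitbit_waso = [0.0, 1.0, 4.0, 0.5, 2.0]
rho = spearman_rho(psg_waso, fitbit_waso)
```

With these made-up values rho is 0.7. The sketch does not compute a p-value and will divide by zero if either variable is constant across children.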
PSG coding was considered to be the “true” or correct assessment of the child’s sleep/wake state. For the epoch-by-epoch comparison, each Fitbit file, coded at one minute intervals, was doubled in order to create a row-for-row match with PSG epochs. Sensitivity was defined as the proportion of epochs scored by PSG as sleep that were also identified as sleep by Fitbit. Specificity was defined as the proportion of PSG-scored wake epochs that were also identified as wake by Fitbit. Accuracy of the Fitbit was calculated as the percentage of detecting sleep and wake epochs, as coded by the PSG.
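The epoch alignment and agreement measures described above can be sketched as follows. This is a hedged illustration, not the authors' analysis code: the function names are hypothetical, and treating Fitbit "restless" minutes as sleep for the sleep/wake comparison is an assumption, since the text does not state how that state was handled.

```python
def expand_to_30s(minute_codes):
    """Repeat each 1-minute Fitbit code twice to match 30-second PSG epochs."""
    return [code for code in minute_codes for _ in range(2)]


def to_sleep_wake(fitbit_code):
    """Collapse Fitbit states to sleep/wake.

    Assumption: 'restless' counts as sleep; only 'awake' counts as wake.
    """
    return "wake" if fitbit_code == "awake" else "sleep"


def sleep_wake_agreement(psg, fitbit):
    """Return (sensitivity, specificity, accuracy) with PSG as the reference.

    Both arguments are equal-length lists of 'sleep' / 'wake' labels.
    A measure is None when its denominator is zero.
    """
    if len(psg) != len(fitbit):
        raise ValueError("PSG and Fitbit epoch lists must be the same length")
    true_sleep = sum(p == "sleep" and f == "sleep" for p, f in zip(psg, fitbit))
    true_wake = sum(p == "wake" and f == "wake" for p, f in zip(psg, fitbit))
    n_sleep = psg.count("sleep")
    n_wake = psg.count("wake")
    sensitivity = true_sleep / n_sleep if n_sleep else None
    specificity = true_wake / n_wake if n_wake else None
    accuracy = (true_sleep + true_wake) / len(psg) if psg else None
    return sensitivity, specificity, accuracy


# Hypothetical data: four Fitbit minutes and eight PSG 30-second epochs.
fitbit_minutes = ["sleep", "restless", "awake", "sleep"]
psg_epochs = ["sleep", "sleep", "sleep", "wake", "wake", "wake", "sleep", "sleep"]

fitbit_epochs = [to_sleep_wake(c) for c in expand_to_30s(fitbit_minutes)]
sens, spec, acc = sleep_wake_agreement(psg_epochs, fitbit_epochs)
```

In this toy example every PSG sleep epoch is also scored as sleep by the Fitbit (sensitivity 1.0), two of the three PSG wake epochs are detected (specificity about 0.67), and seven of eight epochs agree (accuracy 0.875).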
A Movement variable was created from the PSG epoch data, selecting epochs in which one of the following was noted: restless behavior, periodic limb movement, limb movement, body position movement or technician comment on movement. Likewise a Movement variable was created from the Fitbit epochs coded as awake or restless. Sensitivity to movement was defined as the proportion of epochs scored by PSG as a movement that was also identified as movement by Fitbit. Specificity of movement was defined as the proportion of PSG-scored no movement epochs that were also identified as no movement epochs by Fitbit.
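A minimal sketch of how the two Movement variables might be derived and cross-tabulated is given below. The PSG field names are hypothetical stand-ins for the annotation categories listed above, and the sketch assumes the Fitbit codes have already been expanded to one code per 30-second epoch.

```python
# Hypothetical field names for the PSG annotations listed in the text.
PSG_MOVEMENT_FIELDS = (
    "restless_behavior",
    "periodic_limb_movement",
    "limb_movement",
    "position_change",
    "technician_movement_note",
)


def psg_movement(epoch):
    """True if any movement annotation is flagged for this PSG epoch (a dict)."""
    return any(epoch.get(field, False) for field in PSG_MOVEMENT_FIELDS)


def fitbit_movement(code):
    """True if the Fitbit epoch is coded as awake or restless."""
    return code in ("awake", "restless")


def movement_counts(psg_epochs, fitbit_codes):
    """Cross-tabulate movement epochs with PSG as the reference standard."""
    if len(psg_epochs) != len(fitbit_codes):
        raise ValueError("epoch lists must be the same length")
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for epoch, code in zip(psg_epochs, fitbit_codes):
        ref, test = psg_movement(epoch), fitbit_movement(code)
        # 't'/'f' records agreement with PSG; 'p'/'n' records the Fitbit call.
        key = ("t" if ref == test else "f") + ("p" if test else "n")
        counts[key] += 1
    return counts


# Hypothetical epochs (not study data).
psg = [{"limb_movement": True}, {}, {"position_change": True}, {}]
fitbit = ["restless", "sleep", "sleep", "awake"]
counts = movement_counts(psg, fitbit)

movement_sensitivity = counts["tp"] / (counts["tp"] + counts["fn"])
movement_specificity = counts["tn"] / (counts["tn"] + counts["fp"])
```

Here each cell of the table contains one epoch, so both movement sensitivity and movement specificity come out to 0.5.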
The mean age of the 14 participants was 6.5 years (SD 2.9) and nine (64%) were female. The calculated BMI percentile of participants ranged from below the 1st percentile to above the 99th percentile with a median at about the 94th percentile. As a group, the subjects showed mild OSA, averaging an AHI of 1.7 (0 – 4.1) with an average of 8 central apneas (2-22) and 12.9% of total sleep time spent in REM (0-26.7) (Table 2). Additional demographic data is shown in Table 1. On average, the children spent 435 minutes in bed, with an average sleep efficiency of 86% (minimum 55% - maximum 97%) and average time asleep of 374.8 minutes (SD=47.1) (Table 3).
Table 1. Subject Demographics
|Female (N, %)||9||64|
|Age (mean, SD)||6.6||3|
|BMI percentile median||94.5|
|Race (N, %)|
|Study ID||AHI||Central apnea (count)||Obstructive apnea (count)||Obstructive hypopneas (count)||Nadir O2 sat (%)||Highest PETCO2 (mmHg)||% TST in REM|
|Measure||PSG average (range)||Fitbit average (range)||Spearman’s rho|
|Total Sleep Time (TST)||435.8 minutes (398 – 476)||436 minutes (398 – 476)||.99, p<.05|
|Wake After Sleep Onset (WASO)||26.5 minutes (1.5 – 99.5)||0.6 minutes (0.0 – 4.0)||.32, n.s.|
|Sleep Efficiency||86% (55 – 97%)||94% (89 – 100%)||.22, n.s.|
|Awake Minutes (total)||61 minutes (11 – 195)||2.8 minutes (0 – 8)||-.36, n.s.|
Table 3 depicts the average and ranges of TST, WASO, sleep efficiency, and total minutes awake. Spearman’s rho, a measure of the similarity of the values across PSG and Fitbit, is also shown. The PSG and Fitbit total sleep times showed excellent similarity (p<0.05), primarily due to how the files were cleaned by manually starting both data files at the time of lights off and on for each patient. Notably, the Fitbit does not appear to be accurately measuring wake periods, for either total minutes awake or WASO. There is slight agreement between the measures for sleep efficiency, but the Fitbit consistently demonstrated higher sleep efficiency than the PSG.
Sensitivity and specificity of the Fitbit relative to the PSG for detection of sleep or wake epochs was calculated (Table 4). The Fitbit appeared to correctly identify sleep epochs as defined by the PSG with an average sensitivity of 99%. However, the Fitbit rarely detected a wake epoch (average=10% specificity).
Table 4. Sensitivity and Specificity for sleep/wake and movement across epochs (average epochs per subject=865)
|Sleep/wake||Mean (Standard Deviation, Median)|
|Sensitivity (correctly identify a sleep epoch)||99% (SD=.1%, 98%)|
|Specificity (correctly identify a wake epoch)||10% (SD=15%, 0%)|
|Movement Sensitivity (correct identification of movement)||44% (SD=23%, 39%)|
|Movement Specificity (correct identification of no movement)||98% (SD=2%, 98%)|
|Movement False Positive (Fitbit identified movement, PSG no movement after first sleep)||2.4% (SD=2%, 2.1%)|
When the epochs from both the PSG and Fitbit were coded for movement (presence or absence of movement), the Fitbit sensitivity for movement averaged 44%. Specificity for absence of movement averaged 98%. The false positive rate for Fitbit epochs (Fitbit showed movement, PSG no movement) averaged 2.4% and the false negative rate (Fitbit showed no movement, PSG movement) averaged 56%. The positive predictive value (PPV) for the Fitbit showing movement averaged 64%, whereas the negative predictive value (NPV) for the Fitbit showing no movement averaged 94%.
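The rates reported above all derive from the same four cells of an epoch-level confusion table. The sketch below shows the standard definitions; the counts are hypothetical, chosen only so that sensitivity and specificity land near the reported averages, and are not the study's data.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard rates from confusion counts, with PSG as the reference.

    tp: both score movement      fp: Fitbit movement, PSG none
    fn: PSG movement, Fitbit none  tn: both score no movement
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "false_negative_rate": fn / (tp + fn),  # equals 1 - sensitivity
        "false_positive_rate": fp / (fp + tn),  # equals 1 - specificity
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical counts for a single recording (illustrative only).
m = diagnostic_metrics(tp=44, fp=16, fn=56, tn=784)
```

Within one confusion table the false negative rate is always one minus sensitivity, which matches the reported 44% and 56%. Because the study averaged each measure across subjects, the averaged PPV and NPV cannot be reconstructed from the averaged sensitivity and specificity alone, so values computed from pooled counts like these would not be expected to match the reported 64% and 94%.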
Figure 1 demonstrates the total number of minutes coded by the PSG and Fitbit as movement for each subject. The individual data indicate that seven (50%) of the patients, all over 6 years of age, showed good correlation between the PSG and Fitbit for minutes the patient moved during the sleep interval (average = 9.9 minutes difference). The seven younger patients (less than 6 years of age), averaged 30.1 minutes difference, significantly greater than the older group (p=0.019, Table 5). In addition, a trend can be visualized indicating an association between total numbers of minutes of PSG-recorded movement also detected by the Fitbit (Figure 1). Finally, about half of patients had 50% agreement between PSG and Fitbit or higher for epochs showing movement (Figure 2). These two findings suggest that Fitbit is able to detect movement during a sleep period, but not at the same levels as the PSG.
Table 5. Mean difference in minutes between PSG and Fitbit, by age
|Age Group||Mean Difference||P-Value|
|3-6 years old||30.1 (14.9)||0.019|
|7-11 years old||9.9 (13.0)|
To our knowledge, this is the first comparison of the Fitbit Flex™ to polysomnography performed exclusively in children with sleep disordered breathing. It is also the first study that has attempted to correlate the measurements of movement between Fitbit and PSG. The increasing availability and sophistication of low-cost commercially available health technologies make them an attractive option for gaining additional information on children’s sleep.
Our results indicate that the Fitbit is not well suited for distinguishing wake from sleep in this pilot sample of children with suspected SDB. We found that the Fitbit has a high sensitivity for sleep but a low specificity for wake compared to PSG. Our data also demonstrate a similarity in TST but a higher estimate of sleep efficiency with Fitbit than PSG. This is consistent with previous studies in adults and children.11, 12 Most recently, Toon et al.13 compared a commercial wrist-based accelerometer, a smartphone application accelerometer, actigraphy (Actiwatch2) and PSG in children and adolescents with SDB. In this study, the wrist-band accelerometer demonstrated excellent sensitivity (0.92) and poor specificity (0.66) for sleep and wake, respectively, but good correlation for TST and sleep efficiency. Montgomery-Downs et al.11 compared the Fitbit Ultra to actigraphy (Actiwatch 64) and overnight PSG in 24 adults and found that the Fitbit overestimated sleep efficiency and total sleep time, and had an overall high sensitivity for sleep and low specificity for wake. Another recent study compared the Fitbit Ultra to actigraphy (Motion Logger Sleep Watch) and PSG in 63 children and adolescents.12 These authors also found the Fitbit to have good sensitivity (0.86) and poor specificity (0.52) and to overestimate total sleep time and sleep efficiency. We attribute the low specificity for wake described in our study and by others to the inability of the Fitbit to detect arousals, which are coded by PSG as changes in EEG. Although part of the attractiveness of the Fitbit is its less invasive nature and the absence of electrodes connected to the child, it is these electrodes that measure EEG changes from sleep to wake.
The Fitbit’s inability to differentiate restless sleep from wakefulness and quiet wake from sleep would appear to limit its utility in the assessment of SDB. However, in this study Fitbit does show promise in detecting movement. Our finding of a correlation between PSG and Fitbit recorded minutes of movement suggests that a child with restless sleep will be detected by the Fitbit, but with lower sensitivity than by a sleep study. Even though the Fitbit underestimated the total minutes of movement during the sleep period, the discrepancy was consistent. That is, the more minutes of movement detected by PSG, the more minutes of movement detected by the Fitbit. It is important to clarify that a standard overnight polysomnogram will not typically emphasize or report movement as characterized in this study. For the purposes of this investigation, all information indicating movements during the PSG was collected manually and compared to Fitbit.
Although there is a paucity of literature regarding Fitbit, there is a wealth of information comparing actigraphy to PSG in adults and children. A systematic review identified 228 papers describing the use of actigraphy to measure some aspect of sleep.14 Similar to findings with Fitbit, the studies comparing actigraphy to PSG reported consistently high sensitivity (82.2-90.1) for sleep and low specificity (50.9 – 72.8) for wake. This similarity is not surprising given the comparable accelerometer technology used in each to detect sleep and wake by tracking frequency and intensity of movement. Actigraphy studies performed in SDB populations have focused primarily on the usefulness of actigraphy-estimated TST and on combining this with tests of respiratory function in order to improve the accuracy of the calculated AHI.15 Other studies have used actigraphy to estimate TST and sleep patterns in patients with SDB without comparison to a sleep standard such as PSG.14 To date, there have been no studies that have proposed actigraphy alone as a method to determine the presence of SDB. However, Pillar et al.16 found significant correlation between the arousal index estimated from the combined use of actigraphy and peripheral arterial tonometry and the arousal index generated by PSG. Most recently, Behar et al.17 demonstrated that an automated OSA screening application for smartphones that derives data from actigraphy and photoplethysmography was accurate in classifying the severity of OSA.
Presently, very few investigators have evaluated actigraphy as a means to measure movements during sleep or have studied this in the context of SDB. One study has explored the temporal relationship between PSG arousal events and actigraphy-measured body or limb movements and concluded that actigraphy-based measurements could identify arousal events that may have greater impact on sleep quality.18 The same authors performed a manual coding of movements during sleep and wake periods in children with restless sleep and compared this to actigraphy-measured movements. They determined that they could identify characteristics of actigraphy movements specific to sleep and wake and presumed that this analysis could improve the performance of actigraphy and reduce false wake detections. Lastly, in an attempt to enhance the performance of conventional actigraphy, researchers used multisite accelerometry with motion recorded from not only the wrist but also the thorax and ankle and demonstrated a significant performance benefit and improvement in both the specificity and sensitivity of actigraphy.18
Our data also appear to support a potential trend towards an age-based discrepancy between Fitbit and PSG-recorded movements. It has previously been shown6, 19 that sleep architecture differs between pre-school aged children (3-5 years old) and school aged children (7-11 years old) with OSA. In these studies, younger children demonstrated fewer awakenings and better maintenance of sleep efficiency.6 The data presented here show a statistically significant difference in the agreement of Fitbit and PSG between children younger and older than 6 years of age. The reasons for such a potential difference are unclear; however, it is possible that the Fitbit can more accurately detect changes in movement in older children than younger children due to their greater sleep disturbance. Additional studies are required to clarify this phenomenon.
Current guidelines regarding milder forms of SDB and PS are lacking. Our knowledge that despite lack of intermittent hypoxia these children can suffer adverse consequences on their behavioral, cognitive and emotional health suggests that detection of those at risk is imperative. The limitations of PSG, including its cost, intrusiveness and inability to measure variations in sleep patterns and architecture over time may diminish its predictive value for these children. Because the correlation between PS and neurocognitive impairments is thought to be due to sleep disruption or fragmentation, there is a significant need to identify other assessments of sleep quality in this population. Our novel findings of a significant correlation between Fitbit and PSG related movements highlight the possibility that Fitbit measurements of movement might be used as a means to evaluate sleep disturbance in children with SDB. Furthermore, the ability to obtain several nights of data regarding sleep quality within a child’s natural environment and its cost-effectiveness make the Fitbit an attractive, ecologically valid instrument that may have clinical utility in screening for children with SDB and poor sleep. Perhaps coupled with other means of measuring respiratory parameters, such as peripheral arterial tonometry or pulse oximetry, the wrist-band accelerometer may have even greater function in studying SDB populations.
The primary limitation of the current study is the small number of participants. The small sample size of this pilot study could have affected our ability to make a more accurate comparison between the Fitbit and PSG, particularly with regard to movement. There was also a very narrow range of AHI values, which prevented us from describing any correlation between severity of SDB and Fitbit-detected movements. This information would be critical in understanding whether the Fitbit can assist in identifying those children at greater risk for SDB and neurocognitive impairments.
This pilot study contributes to the growing body of evidence that supports the use of commercial based accelerometers such as Fitbit to evaluate sleep patterns and quality in children. It does appear that Fitbit is as effective in detecting sleep and movement as currently available actigraphy devices. Further studies will be necessary to establish and validate its use as a screening tool in children with SDB.
The authors would like to thank the children and families who participated in this study. We also thank the staff of Children’s Hospital Los Angeles Sleep Lab for their assistance and Sue Dolbee in particular for her assistance extracting sleep study data. This study was supported by the Division of Otolaryngology – Head and Neck Surgery at Children’s Hospital Los Angeles.