Introduction
The best documented, most successful, and most widely used treatment for ADHD is medication, which shows a robust effect in group data, with placebo-controlled effect sizes (ESs; Cohen’s d; Cohen, 1988) of 0.7 to 1.5 for methylphenidate and amphetamine (Arnold, 2004) and 0.7 for atomoxetine (Michelson et al., 2002). However, even when medication is administered in a carefully crafted algorithm with another established treatment for ADHD, behavior modification, 32% of children do not fully benefit from this optimal combination treatment (Swanson et al., 2001). In addition, an unknown percentage of families refuse to try these effective medications even though their children might benefit. Therefore, additional complementary and/or alternative interventions are greatly needed.
One possibility is neurofeedback (NF; formerly called electroencephalographic [EEG] biofeedback and occasionally referred to as neurotherapy), which trains the brain, via operant conditioning, to improve its self-regulation by providing real-time video/audio information about its electrical activity measured from scalp electrodes. The theoretical foundation for NF treatment of ADHD rests on the following: the idea that brain waves can be consciously modified (Kamiya, 1968); research showing excessive EEG theta activity (characteristic of a drowsy/inattentive state) and decreased beta activity (characteristic of an awake/attentive state) in patients with ADHD compared with normal controls (Monastra, 2005); neuroimaging studies, including positron emission tomography (PET) and single-photon emission computed tomography (SPECT), demonstrating a neurophysiological basis of ADHD (Clarke, Barry, McCarthy, & Selikowitz, 2001; Swanson & Castellanos, 2002); and studies of EEG and slow cortical potential dysfunctions and their relationship to underlying thalamocortical mechanisms (Barry, Clarke, & Johnstone, 2003) and EEG changes associated with a positive medication response (Song, Shin, Jon, & Ha, 2005).
The first publication on NF for ADHD was by Lubar and Shouse (1976), who reported significant IQ increases and behavioral improvements in an ABAB design. Since then, numerous case studies, open trials, and partially controlled studies have been published or presented at conferences, most of which have suffered from significant methodological limitations. To date, 15 studies have used a randomized controlled design, including 8 published studies (7 in peer-reviewed journals and 1 book chapter; deBeus & Kaiser, 2011; Gevensleben, Holl, Albrecht, Schlamp, et al., 2009; Gevensleben, Holl, Albrecht, Vogel, et al., 2009; Holtmann et al., 2009; Lansbergen, van Dongen-Boomsma, Buitelaar, & Slaats-Willemse, 2011; Leins et al., 2007; Lévesque, Beauregard, & Mensour, 2006; Linden, Habib, & Radojevic, 1996; Perreau-Linck, Lessard, Levesque, & Beauregard, 2010) and 6 presented at conferences (Fine, Goldman, & Sandford, 1994; McGrady, Prodente, Fine, & Donlin, 2007; Orlandi & Greco, 2004; Palsson, Pope, Ball, Turner, & Nevin, 2001; Picard, Moreau, Guay, & Achim, 2006; Urichuk et al., 2009). Three of these were reported after the study reported here was initiated. Most studies reported significant reductions in ADHD symptoms compared with the control group and 4 showed neurophysiological changes specifically associated with NF. In the 9 studies that reported ESs (Cohen’s d), for overall ADHD symptoms, there was a posttreatment mean ES of d = 0.67, considered a strong medium ES, between NF and the control condition. These results are promising but not conclusive. Only 5 of the 15 studies used blinding and sham-NF designs. Three of those five did not demonstrate differences between the real NF and sham-NF groups (Lansbergen et al., 2011; Perreau-Linck et al., 2010; Urichuk et al., 2009). 
Furthermore, only two of the double-blind, sham-controlled randomized studies have undergone peer-reviewed publication (Lansbergen et al., 2011; Perreau-Linck et al., 2010), and neither was able to show superiority of NF over sham (however, a third, reported in a book chapter [deBeus & Kaiser, 2011], did show NF superiority in a secondary analysis). In addition, studies varied in the number and frequency of NF treatments given (18-45 and 1X-5X/week, respectively); hence, it is not clear how many treatments are needed or how frequently they should be given. Finally, some NF experts questioned the feasibility of a sham placebo treatment because of its potential negative effect on recruitment, retention, and blinding.
With numerous open and partially controlled studies suffering design flaws, and with promising results from a glamorous treatment involving intense involvement and commitment that invites nonspecific placebo response, rigorous testing for a specific effect is greatly needed, especially considering NF’s expense in time and money. Before proposing a more definitive large-scale randomized controlled trial (RCT), this pilot study explored the feasibility of a double-blind sham-controlled trial of NF for 6- to 12-year-olds with ADHD. Three primary questions were based on a review of current evidence.
Feasibility of double-blind, sham-controlled design: We expected to recruit 36 participants in two school years (to get teacher ratings), to achieve adherence and retention above 85%, and to find that child and parent post hoc guesses regarding treatment assignment were no better than chance.
Advisability of two versus three treatments/week: This had two components: (a) palatability/adherence—whether families preferred treatment 3X/week or 2X/week as shown by attendance, satisfaction ratings, and choice of treatment frequency midway through the trial and (b) relative efficacy of treatment frequency—whether the preferred frequency would be as effective as the nonpreferred frequency, when randomly assigned to 24 treatments, as shown in graphs of clinical outcomes.
Necessary number of treatments: The number of treatments at which improvement stabilized, as shown in tables and graphs of outcomes.
Method
Participants
Participants were recruited by advertisement and clinical referral. They were unmedicated 6- to 12-year-olds, with rigorously diagnosed Diagnostic and Statistical Manual of Mental Disorders (4th ed.; DSM-IV; American Psychiatric Association, 1994) ADHD determined by a licensed clinician with the aid of the Children’s Interview for Psychiatric Syndromes: Child/Parent Forms (2000; Weller, Weller, Rooney, & Fristad, 1999a, 1999b). In addition to the categorical diagnosis, an item-mean score of ≥1.5 on a 0 to 3 metric was required on the parent/teacher Swanson, Nolan, and Pelham (SNAP) Rating Scales–IV (Swanson & Carlson, 1994). Exclusion criteria were IQ < 80, mental age < 6, comorbid disorder requiring psychoactive medication, medical disorder requiring medication that had psychoactive effects, >5 previous NF treatments, antipsychotic medication within 6 months prebaseline, fluoxetine/atomoxetine within 4 weeks prebaseline, stimulant within 1 week prebaseline, or any other psychotropic medication within 2 weeks prebaseline.
Design
Children who passed screening were twice randomized: in a 2:1 ratio to active NF (n ≥ 24) versus sham NF (n ≥ 12) and simultaneously in a 1:1 ratio to 2X versus 3X/week treatment frequency (≥18 in each frequency, approximately 12 active and 6 sham) for forty 45-min treatments. Randomization to treatment was balanced on the child’s premedication status. NF was administered via a single-channel CZ electrode placement and a reference electrode on each ear, with feedback to decrease theta/alpha and increase beta, including sensorimotor rhythm EEG activity.
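The allocation scheme can be illustrated with a permuted-block sketch. The source states only the ratios (2:1 active to sham, 1:1 for 2X to 3X/week) and that randomization was balanced on premedication status; the block size of 6, the seed, the function names, and the reading of premedication status as a yes/no prior-medication stratum are all assumptions made for illustration, not the study's documented procedure.

```python
import random

# One block of 6 preserves both ratios at once (assumed block size):
# 4 active : 2 sham (2:1) and 3 at 2X : 3 at 3X (1:1).
BLOCK = [
    ("active", "2X"), ("active", "2X"),
    ("active", "3X"), ("active", "3X"),
    ("sham", "2X"), ("sham", "3X"),
]

def make_allocator(seed):
    """Return a function that assigns (arm, frequency) within strata."""
    rng = random.Random(seed)
    queues = {}  # one queue of pending assignments per stratum

    def allocate(prior_medication):
        # prior_medication (bool) is a hypothetical stratification flag.
        queue = queues.setdefault(prior_medication, [])
        if not queue:  # start a freshly shuffled block when one runs out
            block = BLOCK[:]
            rng.shuffle(block)
            queue.extend(block)
        return queue.pop()

    return allocate

# Hypothetical usage: 36 children, alternating strata for illustration.
allocate = make_allocator(seed=2007)
assignments = [allocate(prior_medication=(i % 2 == 0)) for i in range(36)]
n_active = sum(1 for arm, _ in assignments if arm == "active")
n_3x = sum(1 for _, freq in assignments if freq == "3X")
```

With complete blocks in each stratum, this kind of scheme reaches the targeted 24 active and 12 sham, with 18 per frequency, regardless of the shuffle order.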
Elements of NF were provided via a commercially available training device, SmartBrain by CyberLearning Technology, LLC (www.smartbraintech.com). This device utilized off-the-shelf videogames (Sony PlayStation and MS Xbox) via use of an interface that modulated input to the videogame hand-controller based on EEG activity. It was selected because it was the most frequently used technology in RCTs of NF for ADHD at the time of the grant application in 2007, appearing in three of the nine available publications and conference presentations (deBeus & Kaiser, 2011; Orlandi & Greco, 2004; Palsson et al., 2001), and was the only device used in a sham-controlled small RCT (20); it was developed using NASA technology with some supporting science behind it (Alan Pope was a consultant to the study); and it was interactive, engaging, and, unlike many other devices, did not require a “NF coach” to guide trainees.
In the active condition, children used a game controller for normal gaming functions, but its responsiveness (speed, control, and vibration) was contingent on the child’s real-time EEG activity. Reinforcement was provided for EEG theta–beta power ratio below a threshold that was set minute-to-minute by fuzzy logic based on the immediately preceding EEG. The sham-NF condition appeared identical to the active condition in all aspects (equipment, duration, frequency, and videogame choices) except that the interface module was preprogrammed to give random feedback not contingent on the child’s EEG. To blind staff to treatment condition, the SmartBox interface devices were independently preprogrammed by an off-site consultant who had no interaction with participants or data (analogous to prepackaged randomized medication).
To examine the palatability of treatment session frequency (part of Question 2, “Advisability of two vs. three treatments/week”), at Treatment 24, participants and their parents were given the option to continue with or change their initial weekly frequency assignment.
Treatment fidelity was monitored and confirmed by the supplier of the equipment through two in-person observation visits, review of videotapes of treatment sessions throughout the treatment phase of the study, and phone consultations as needed.
Measures
To make the test of recruitment and retention realistic, an assessment battery similar to the burden to be expected in a definitive RCT was used. Treatment outcome was measured several ways using reliable and valid measures standard in ADHD treatment outcome research. Changes in ADHD symptoms were tracked from baseline every three (for parents) and six (for teachers) treatment sessions via a rating of the 18 DSM-IV ADHD symptoms, SNAP-IV on a 0 to 3 metric (Swanson & Carlson, 1994). The primary clinical outcomes were the ADHD symptoms (an average of 18 items of SNAP-IV) rated by teachers and parents.
Major assessments were conducted immediately before Treatment 1; immediately after Treatments 12, 24, and 40; and at a 2-month follow-up. These assessments included the SNAP; Conners’ Parent/Teacher Rating Scales–Revised: Long Version (Conners, 2002a, 2002b); Behavior Rating Inventory of Executive Function (Gioia, Isquith, Guy, & Kenworthy, 2000); parent- and teacher-rated Impairment Rating Scale (Fabiano et al., 2006); clinician-completed Clinical Global Impression Scale (Guy, 1976); Wechsler Individual Achievement Tests–2nd Edition, Abbreviated (Wechsler, 2001); Wide-Range Achievement Tests, 4th Edition, Progress Monitoring Version (Roid & Ledbetter, 2005); a timed math test (Arnold et al., 2004); Wechsler Abbreviated Scale of Intelligence (Wechsler, 1999); 7 Brain Resource Center computer-based normed neuropsychological tests (Paul et al., 2005); and changes in EEG activity (single-channel [CZ] EEG).
Treatment frequency preference was determined by the participant’s decision to switch frequencies at Treatment 24 and by stated preference on a consumer satisfaction questionnaire (CSQ), parent and child. The CSQ, administered at Treatments 24 and 40, also included questions to examine blindness to treatment assignment. Any changes in concomitant treatment/educational services and potential adverse effects of NF were recorded at each treatment session and at the 2-month follow-up.
Data analysis
The data analyses in this pilot study were mainly descriptive; this study was not powered to test a definitive difference between treatment and control. The changes over time of each clinical variable were summarized using SAS (version 9.2). Except for baseline measurements, data analyses included participants who had at least 12 treatment sessions. The few missing data were filled using the last observation carried forward (LOCF) principle or using the data at the nearest time point. Pre–post ESs of clinical outcomes for each group were calculated as the estimated difference between baseline and posttreatment divided by the standard deviation of the baseline measurement. ES of treatment versus control was estimated as the difference of the pre–post changes for the two groups divided by the pooled standard deviation of the changes. Sensitivity analyses used mixed models for repeated measures that assumed data missing at random. Because few data were missing, the sensitivity analyses yielded results similar to those of the LOCF approach and are thus not reported here.
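The ES definitions and the LOCF rule above can be written out as a minimal sketch. The study's analyses were run in SAS; the Python below is only an illustration, the function names and sample values are hypothetical, and the sign convention (positive = improvement on a lower-is-better scale) and the standard degrees-of-freedom-weighted pooling are assumptions not stated in the text.

```python
from math import sqrt
from statistics import mean, stdev

def locf(series):
    """Replace each missing value (None) with the last observed value."""
    filled, last = [], None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled

def pre_post_es(baseline, post):
    """(baseline mean - post mean) / SD of baseline; positive = improvement."""
    return (mean(baseline) - mean(post)) / stdev(baseline)

def between_group_es(changes_a, changes_b):
    """Difference in mean pre-post change / pooled SD of the changes."""
    na, nb = len(changes_a), len(changes_b)
    pooled_var = ((na - 1) * stdev(changes_a) ** 2 +
                  (nb - 1) * stdev(changes_b) ** 2) / (na + nb - 2)
    return (mean(changes_a) - mean(changes_b)) / sqrt(pooled_var)

# Hypothetical SNAP item-mean scores (0-3 metric), not study data.
ratings_over_time = locf([2.1, None, 1.8, None])  # one child's visits
baseline = [2.0, 3.0, 4.0]
post = [1.0, 2.0, 3.0]
es_within = pre_post_es(baseline, post)
es_between = between_group_es([1.0, 2.0, 3.0], [0.0, 1.0, 2.0])
```

Dividing the within-group change by the baseline SD, rather than the SD of change scores, keeps the pre–post ES on the scale of the original measure, which is why the two formulas use different denominators.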
Results
Recruitment to double-blind study
In less than 2 years, 39 children were randomized (age 8.9 ± 1.7 years, 79.5% male, 87.2% Caucasian), 26 to active treatment, 13 to placebo, 20 to 3X/week, and 19 to 2X/week (Figure 1). The only notable baseline difference between randomized groups was that there were proportionately twice as many girls in the placebo group (31%) as in the active group (15%; Table 1).

Figure 1. CONSORT diagram.
Table 1. Sample Characteristics.
Note: SNAP = Swanson, Nolan, and Pelham Rating Scales–IV of ADHD; EEG = electroencephalographic; CGI-S = Clinical Global Impression–Severity Scale.
Retention was 92% at Treatment 21 and 87% at Treatment 40. Three of 13 (23%) placebo participants dropped out: 1 was lost to follow-up at Treatment 4, and 2 left to pursue medication (1 at Treatment 6 and 1 at Treatment 21). Two of 26 (8%) in active treatment dropped out: 1 at Treatment 4 to pursue medication and 1 at Treatment 24 due to travel distance and poor grades. One additional participant assigned to active treatment did not return for follow-up. One participant started fluoxetine at Treatment 33, which did not appear to affect the outcome trajectory. Between Treatment 40 and 2-month follow-up, six participants resumed ADHD medication, all of them originally assigned to active treatment. No other participants took psychoactive medication during the study. After Treatment 40, regardless of randomly assigned frequency, child and parent ratings of NF’s ease of use were high, 6.0 and 5.8, respectively, on a 0 to 7 scale (0 = not at all to 7 = very, with 3 to 4 being neutral); and 21/34 (62%) children and 26/34 (76%) parents would recommend NF to others.
Blinding outcome
Of 34 participants at Treatment 40, 35% of children and 29% of parents said that they did not know which treatment they had been assigned to and declined to guess. Only 32% of children and 24% of parents guessed correctly, with 32% and 47%, respectively, guessing incorrectly.
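As a rough check of these guess rates against chance, an exact two-sided binomial test can be sketched. The counts below are back-calculated from the reported percentages of 34 completers (11 correct and 11 incorrect children; 8 correct and 16 incorrect parents) and are therefore an assumption, as is the choice to exclude those who declined to guess. This is not the association test the study itself reports.

```python
from math import comb

def binom_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value: total probability of all
    outcomes no more likely than the observed count k."""
    pmf = [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]
    cutoff = pmf[k] * (1 + 1e-9)  # small tolerance for float ties
    return min(1.0, sum(q for q in pmf if q <= cutoff))

# Counts back-calculated from reported percentages (assumption).
child_correct, child_guessers = 11, 22    # 32% correct, 32% incorrect of 34
parent_correct, parent_guessers = 8, 24   # 24% correct, 47% incorrect of 34

p_child = binom_two_sided_p(child_correct, child_guessers)
p_parent = binom_two_sided_p(parent_correct, parent_guessers)
```

Under these assumed counts, neither p-value falls below .10, which agrees with the interpretation that guesses were no better than chance.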
Frequency advisability outcome (2X vs. 3X/week)
Child and parent satisfaction was high for both randomly assigned frequencies, 5.36 for each on a 0 to 7 scale (0 = not at all satisfied to 7 = very satisfied, with 3-4 being neutral). There was a slight tendency, not statistically significant, for those in active treatment to be more satisfied with 3X/week (5.62 ± 1.27 vs. 5.21 ± 1.80) and those in the placebo treatment to be more satisfied with 2X/week (5.40 ± 1.07 vs. 5.08 ± 1.73). More importantly, at Treatment 24, when self-selected choice of frequency was allowed, the proportion choosing to switch from 2X/week to 3X/week (7/16, 44%) was twice the proportion choosing to switch from 3X/week to 2X/week (4/18, 22%). The symptom outcome at Treatment 24 was at least as good for 3X/week as for 2X/week on both parent and teacher ratings (Figure 2).

Figure 2. Relative efficacy of 2X versus 3X/week active NF treatments by teacher ratings (upper panel) and parent ratings (lower panel) on the 18 DSM-IV ADHD symptoms on the SNAP (0-3 metric, lower score better).
Necessary duration of treatment, asymptote of treatment effect
As shown in Figure 3 for parent-rated symptoms, there did not appear to be any additional improvement after Treatment 24.

Figure 3. Asymptote of active NF treatment effect by Treatment 24 on the 18 DSM-IV ADHD symptoms rated by parents on SNAP (0-3 scale, lower score is better, ratings every three treatments).
Comparison of active to sham treatment
The clinical and neuropsychological outcomes in general showed no apparent advantage of active treatment over placebo. In fact, the sham placebo treatment showed nominally better results on many measures (e.g., Figure 4). Both randomized treatments showed a large significant pre–post ES of improvement by Treatment 24 on parent SNAP ratings of ADHD symptoms, especially inattention, but there was no advantage for active treatment as supplied by the CyberLearning technology used in this pilot.

Figure 4. Outcome trajectories through 40 treatments and 2-month follow-up for parent-rated inattention (top left panel), hyperactivity-impulsivity (top right panel), and all 18 ADHD symptoms (lower left panel) and teacher-rated all 18 ADHD symptoms (lower right panel) on SNAP (0-3 scale, lower score is better).
Safety data, as expected, were unremarkable, with no adverse events attributable to treatment and essentially no differences between NF and placebo.
Comment
The three feasibility questions of this pilot study were successfully answered:
It is possible to conduct a double-blind NF trial with good recruitment and retention and successful blinding of parents and children. At the end of treatment, twice as many parents guessed the assigned treatment incorrectly as correctly. The children did a little better, with equal percentages of correct and incorrect guesses. Neither result differs significantly from chance guessing (50/50), and guesses were not associated with the assigned treatment group (p > .10). It should be noted that the parents, who provided much of the assessment data, were the more important ones to blind, and their guesses were worse. Almost one third of both parents and children did not have enough inkling to even take a guess. Thus, blinding was successful. Furthermore, it did not seem to detract from participant retention, with 87% completing all 40 treatments and almost 92% completing >20 treatments (see duration of treatment below). This suggests that a large double-blind sham-controlled RCT is feasible.
There was also a reasonably clear answer to the desirable treatment frequency. Although both randomly assigned 2X/week and 3X/week had high satisfaction ratings, the families “voted with their feet” for 3X/week by choosing it at Treatment 24 when they were allowed to self-select. Importantly, the results with 3X/week were at least as good (nominally better) at Treatment 24 as with 2X/week. Therefore, it appears that an RCT could use 3X/week frequency to shorten the trial, save expense, and fit more participants within the school year to obtain teacher classroom ratings and parent ratings of homework time.
The duration of treatment (number of treatments) was the third feasibility question. Graphed data from parent ratings of active treatment suggest that 30 treatments should be enough. The effect on parent ratings of ADHD symptoms actually plateaued by Treatment 24. Reducing the trial to 30 treatments should, like the use of 3X/week frequency, shorten the trial, reduce costs, and allow more participants to fit within the academic year window. Taken together, these two findings, 30 treatments at a frequency of 3X/week, could shorten a trial to 10 weeks, half of the duration of this pilot trial. However, it is possible that a different method of NF would show a longer slope of improvement. In fact, many NF experts feel that manually adjusted thresholds that remain fixed for periods of time work better than the fuzzy-logic moment-to-moment adjustments of the CyberLearning technology that we used. Therefore, we do not have as much confidence in 30 treatments showing the maximal effect as we do in the other two feasibility findings.
There are many possible reasons for our clinical results differing from the more positive reports of unblinded trials. The children were diagnosed in a rigorous research fashion by a doctoral-level clinician using a structured diagnostic interview and DSM-IV criteria. The severity of symptoms on the SNAP was in the expected range for a sample including inattentive type, but there was some unintended selection for families willing to give up medication for 5 months, which would tend to select for milder severity, unresponsiveness to medication, or prejudice against medication. This could differentiate our sample from the others, which generally allowed medication during NF treatment, with withdrawal at baseline and end of treatment for assessments. However, a descriptive examination of subgroups based on prior medication did not identify a subgroup of better responders. This sample was also a bit younger than those in some of the other reports, going down to age 6 years, and we observed that a couple of 6-year-olds had some difficulty with the videogame used as the medium. It might be wise for a large definitive trial to have a sample a couple of years older. Most importantly, our sample was not selected for high theta–beta ratios at baseline even though the treatment was supposed to work by lowering theta–beta ratios. The baseline theta–beta ratios were intermediate between those for ADHD and normal children reported by Monastra (Monastra et al., 1999, 2005; Snyder & Hall, 2006). However, our theta–beta ratios were calculated from an average of four conditions: eyes closed, eyes open, silent reading, and listening to another read, whereas Monastra’s (41, 42) reported theta–beta ratios for ADHD and normal controls were calculated from eyes open, reading, listening, and writing/drawing. Furthermore, Monastra used 13 to 21 Hz for beta power, whereas we used 12 to 21 Hz.
Because including 12 Hz in the denominator would lower the ratio and Monastra (personal communication, March 2011) noted that writing/drawing elicited the highest theta–beta ratio, this difference in EEG sampling could account for some of the discrepancy between his ADHD means and ours. Future samples should be selected for high theta–beta ratio if theta–beta downtraining is used as treatment.
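The effect of the beta band definition on the ratio can be illustrated numerically. In the sketch below, the 1/f-shaped spectrum, the 4 to 8 Hz theta band, the 1-Hz bins, and the function names are all illustrative assumptions; only the 12 to 21 Hz and 13 to 21 Hz beta definitions come from the text.

```python
def band_power(spectrum, low_hz, high_hz):
    """Sum power over all frequency bins in [low_hz, high_hz]."""
    return sum(p for f, p in spectrum.items() if low_hz <= f <= high_hz)

def theta_beta_ratio(spectrum, beta_low_hz, beta_high_hz=21, theta=(4, 8)):
    """Theta power divided by beta power for a given beta band."""
    theta_power = band_power(spectrum, theta[0], theta[1])
    beta_power = band_power(spectrum, beta_low_hz, beta_high_hz)
    return theta_power / beta_power

# Hypothetical spectrum: power falls off as 1/f in 1-Hz bins (not study data).
spectrum = {f: 100.0 / f for f in range(1, 31)}

ratio_12_to_21 = theta_beta_ratio(spectrum, beta_low_hz=12)  # band used here
ratio_13_to_21 = theta_beta_ratio(spectrum, beta_low_hz=13)  # Monastra's band
```

Because the 12-Hz bin adds power only to the denominator, the 12 to 21 Hz definition always yields the smaller ratio for the same recording, whatever the shape of the spectrum.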
Treatment fidelity was monitored by three in-person visits for training and monitoring, teleconferences, review of videotapes of treatment sessions, and review of downloaded data from the SmartBox interfaces by the supplier of the equipment, CyberLearning Technology. The company president repeatedly confirmed that the treatments appeared to be administered accurately. Therefore, we have reasonable confidence that treatments were faithful to protocol. However, it is still possible that the fuzzy logic used in this treatment was not as effective as the personalized manual adjustment of reinforcement threshold favored by most NF experts. Therefore, a different method of NF may have a larger effect than found here.
Limitations
This pilot feasibility trial has many limitations, including small sample size (39 randomized, 34 completers), self-selection for families willing to give up or delay medication for 5 months, and failure to select for high theta–beta ratios. Our mean theta–beta ratio of 4.8 is appreciably less than the theta–beta ratios of 6.6 to 8.5 reported by Monastra and Snyder for 6- to 11-year-olds with ADHD (Monastra et al., 1999; Snyder & Hall, 2006). The inclusion of some with low theta–beta ratio could have obscured a signal from those with high theta–beta ratio; however, examination of response by baseline theta–beta ratio did not support that possibility. The biggest limitation was the choice of NF technology, which used fuzzy logic to alter the reinforcement threshold from minute to minute, adapting the threshold to just-completed performance and not requiring focus on the NF training itself. Although this seemed a good choice at the time and was derived from technology used by NASA to train astronauts, most NF experts question its effect; they recommend manual changing of threshold and focusing on the EEG as a task rather than working indirectly through a videogame.
Conclusions and Recommendations
A well-blinded large RCT of NF using a sham control of equal intensity and duration is feasible and needed. The results of this pilot suggest that three treatments per week would be feasible, efficient, and palatable. A threshold of theta–beta ratio should be set as an inclusion criterion if theta–beta downtraining is used. The planning and execution of the RCT should involve both mainstream scientists (to ensure credible scientific rigor) and NF experts/advocates (to ensure credible and rigorous treatment). It will be important for all stakeholders to have input so that the results, whatever they are, will be credible to all. Such planning has already been undertaken by a group formed at a 2010 CHADD symposium, with weekly teleconferences, which we hope will lead to a large, multisite, double-blinded RCT of NF for ADHD.
Footnotes
Authors’ Note
Free use of SmartBrain neurofeedback equipment was supplied by CyberLearning Technology, LLC (Domenic Greco, PhD, President). Dr. Greco and Ms. Lindsay Greco trained staff in the use of the equipment and monitored treatment fidelity. Alan Pope, PhD, consulted in the early stages of the study.
Declaration of Conflicting Interests
The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Dr. Arnold receives or has received research support and/or consulting honoraria from Lilly, Shire, Curemark, Neuropharm, Noven, Organon, Seaside, and AstraZeneca. Other authors do not have any financial interests to disclose.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by National Institute of Mental Health Award R34 MH080775 and Award UL1RR025755 from the National Center for Research Resources (The Ohio State University Center for Clinical and Translational Science).
