The adherence to methodological rigor in the development of psychosocial intervention procedures has distinguished behavior therapy from other treatment modalities for over 40 years (Barlow, Hayes, & Nelson, 1984). As a consequence, a number of behavior therapy procedures have been recognized as empirically supported treatments within clinical psychology (Chambless et al., 1998; Chambless et al., 1996), and the mental health profession in general (Barlow, 1993).
It is in this context that the registration packet for the 1999 and 2000 annual conventions of the Association for Advancement of Behavior Therapy (AABT) contained promotional brochures for Neurotherapy. Neurotherapy is a form of behavior modification that uses electroencephalographic (EEG) biofeedback technology to increase voluntary control over the amplitude and pattern of various brain wave frequencies. Although the rationale for each specific type of Neurotherapy might vary according to purpose (e.g. the treatment protocols differ for depression and ADHD), the general rationale for Neurotherapy remains the same. The general rationale is that "neuronal self-regulation" can be obtained through operant conditioning of a "desired" physiological state, i.e. by delivering a reinforcer contingent upon a particular neurophysiological response. Thus, the general procedure is also called neurofeedback.
Research clearly shows that brain wave activity can be altered through various forms of biofeedback (Lubar, 1997). However, the modification of such brain wave activity is alleged to have salubrious effects on a range of psychological disorders and their symptoms. These include undercontrolled disorders of childhood, substance use disorders, anxiety disorders, mood disorders, and dissociative disorders (Abarbanel, 1995; Evans & Abarbanel, 1999; Lubar, 1997).
The electroencephalogram (EEG) has many legitimate diagnostic and research uses (see Hugdahl, 1995). Rigorous experimental research has yielded unambiguous findings for some applications of neurofeedback (e.g., Mulholland, Goodman, & Boudrot, 1983). For example, teaching treatment-refractory epileptics to suppress seizures shows considerable promise (see Whitsett, Lubar, Holder, Pamplin, & Shabsin, 1982). Nonetheless, when Neurotherapy is extended to psychological disorders, the experimental procedures and criteria for judging the efficacy of neurofeedback are the same as those for the evaluation of psychosocial treatments.
Methodological Issues in Evaluating Efficacy
We have examined the peer-reviewed published research on the efficacy of Neurotherapy for ADHD, substance dependence, anxiety disorders, mood disorder, and dissociative disorder. We evaluated a number of procedural controls (Foa & Meadows, 1997) that are considered essential for internal validity. These controls included the unbiased (random) assignment to control conditions and use of standardized measures of target symptoms for the disorders. The assessment of the change in brain wave activity is necessary to determine if self-regulation of neuronal activity is altered by contingent feedback. However, altered brain wave activity is an insufficient index of treatment efficacy for the various psychological disorders to which it is applied. Target symptoms for those disorders must also be assessed with additional standardized and validated measures.
We also considered the nature and specificity of experimental control conditions. Guidelines for the determination of treatment efficacy (Chambless et al., 1998; Chambless et al., 1996) suggest that comparison of an active treatment to a wait list control condition is necessary for the determination of treatment efficacy. However, Herbert (2000) has criticized the wait list control as insubstantial, as the favorable effects of a given treatment relative to a wait list can only indicate that such treatment is more efficacious than nothing at all. Chambless and Hollon (1998) have suggested that a treatment can be considered both efficacious and specific to a disorder if it is compared to an experimental condition that controls for the nonspecific factors of that treatment. These factors can include expectation of improvement engendered by a plausible treatment rationale, effort justification, attention by a credible professional, therapy allegiance of the professional, etc. Such experimental tests may include comparison to conditions that include nonspecific or "placebo" factors. Other experimental tests can incorporate dismantling designs that test mechanisms of action through component-control comparisons. See Lohr et al. (1999) for a more extensive discussion of the control of nonspecific treatment factors in the evaluation of treatment efficacy.
Thus, it has been suggested (Chambless & Hollon, 1998; Herbert, 2000) that there are three increasingly stringent qualifications for empirically supported treatment. The first is "Possibly Efficacious" treatment and denotes a treatment that has been shown to be better than no treatment (or wait list), but the findings have not been as yet replicated by independent researchers. "Probably efficacious" treatments are those for which two studies conducted by independent research teams show that the treatment is better than no treatment (or wait list). "Efficacious and specific" treatments are those in which treatment efficacy has been established relative to a pill or psychological placebo controlling for nonspecific treatment factors. This latter category was previously known as "Well-established" treatment (Chambless et al., 1998; Chambless et al., 1996).
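The three tiers just described amount to a simple decision rule, which can be sketched in code. The sketch below encodes only the simplified summary given in the preceding paragraph, not the full published criteria; the `Study` fields, the control labels, and the "Not established" fallback are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Study:
    team: str                 # label for the research team (hypothetical field)
    control: str              # "wait_list", "no_treatment", or "placebo"
    treatment_superior: bool  # did the treatment outperform the control?


def classify(studies):
    """Return the highest tier supported by the listed studies.

    Encodes only the simplified three-tier summary in the text above.
    """
    wins = [s for s in studies if s.treatment_superior]
    # Superiority over a pill or psychological placebo controls for
    # nonspecific factors, so it supports the most stringent tier.
    if any(s.control == "placebo" for s in wins):
        return "Efficacious and specific"
    # Otherwise count the independent teams showing superiority over
    # no treatment or a wait list.
    teams = {s.team for s in wins if s.control in ("wait_list", "no_treatment")}
    if len(teams) >= 2:
        return "Probably efficacious"
    if len(teams) == 1:
        return "Possibly efficacious"
    return "Not established"  # hypothetical label, not one of the three tiers
```

Under this sketch, a single unreplicated wait list comparison yields "Possibly efficacious", and no number of wait list comparisons alone can reach the most stringent tier.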
Review of Treatment Outcome Research
Our review of Neurotherapy encompasses those group design experiments that have been published in peer-reviewed scholarly journals. The review does not include outcome studies presented at professional meetings. Such studies (e.g., Cartozzo, Jacobs, & Gevirtz, 1995; Manchester, 1995; Manchester, Allen, & Tackiki, 1994; Othmer & Othmer, 1995; Scheinbaum, Zecker, Newton, & Rosenfeld, 1995) do not necessarily undergo a rigorous peer-review process, or they are not available through libraries or other traditional archives for detailed review and critical analysis. In addition, we did not evaluate experiments that included research analogue participants who did not manifest clinical symptoms (Rasey, Lubar, McIntyre, Zofutto, & Abbot, 1996; Rosenfeld, Cha, Blair, & Gotlib, 1995). We also excluded case reports or case series reports (Baehr, Rosenfeld, & Baher, 1997; Baehr, Rosenfeld, Baher, & Earnest, 1999; Barabasz & Barabasz, 1996; Boyd & Campbell, 1998; Fahrion, Walters, Coyne, & Allen, 1992; Thomas & Sattlberger, 1997; Wadhwani, Radvanski, & Carmody, 1998) that do not conform to time-series experimental designs (Barlow et al., 1984). In none of the disorders reviewed was there a sufficient number of time-series experimental designs (Lubar & Lubar, 1984; Lubar & Shouse, 1976; Lubar & Shouse, 1977) to meet the criteria for specific and efficacious (well-established) treatment (Chambless et al., 1996; Chambless & Hollon, 1998). Lastly, we excluded articles that did not include at least a no-treatment or wait list control condition. There are several uncontrolled studies investigating Neurotherapy for ADHD (Alhambra, Fowler, & Alhambra, 1995; Fenger, 1998; Lubar, Swartwood, Swartwood, & O'Donnell, 1995; Nall, 1973; Tansey, 1991; Thompson & Thompson, 1998), substance dependence (Peniston & Kulkosky, 1999; Peniston, Marrinan, Deming, & Kulkosky, 1993; Saxby & Peniston, 1995), and anxiety disorder (Peniston et al., 1993).
Rossiter and La Vaque (1995) compared children with a diagnosis of ADHD who were non-randomly assigned to either Neurotherapy or psycho-stimulant medication, though some of the children who received Neurotherapy also received psycho-stimulant medication at the time of pre-Neurotherapy baseline assessment. In addition, some children receiving each treatment also received ancillary treatments including behavior modification and supplemental educational work. Assessment of treatment effects included the Test of Variables of Attention (TOVA) and the Behavior Assessment System for Children (BASC). Multiple one-tailed t-tests that were not alpha-protected revealed that the children receiving Neurotherapy showed improvements on attentiveness, impulsivity, and processing speed. Similar t-tests revealed improvements on the BASC. The same analyses on the medication group showed similar changes, though there were no differences between the groups. While the statistical analyses suggest improvements due to Neurotherapy, the statistical analyses conducted increase the likelihood of Type 1 error (false positive effects).
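The inflation of Type 1 error from multiple uncorrected tests can be illustrated with a short calculation. The sketch below assumes that the tests are independent and each is run at a nominal alpha of .05; the test counts shown are hypothetical and are not taken from Rossiter and La Vaque (1995). The Bonferroni adjustment is included as one common form of alpha protection, not as the only appropriate one.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive across n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test alpha that keeps the familywise rate at or below alpha."""
    return alpha / n_tests


# Hypothetical numbers of tests; independence is an assumption.
for k in (1, 5, 10, 20):
    print(k, round(familywise_error_rate(0.05, k), 3))
```

With ten independent tests at .05 each, the chance of at least one spurious "significant" result is roughly 40%, which is why unprotected multiple t-tests weaken any claim of improvement.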
The use of non-standardized measures with minimal ecological validity is also a major concern. For example, dependent variables included an unspecified questionnaire completed by parents assessing symptoms and functioning. In addition, the TOVA (formerly known as the Minnesota Computer Assessment) is of questionable ecological validity as a measure of treatment efficacy. The TOVA is a continuous performance test (CPT) that is intended to be an index of attention. While the research evidence indicates that CPT measures consistently discriminate ADHD children from normal children (Barkley, 1991), they inconsistently discriminate ADHD children from psychiatric control children (Halperin, Matier, Bedi, Sharma, & Newcorn, 1992; Halperin, Newcorn, Matier, Sharma, McKay, & Schwartz, 1993; Plizka, 1992). Moreover, the correlation of CPT scores and parent and teacher ratings of ADHD symptoms is moderate at best (DuPaul, Anastopoulos, Shelton, Guevremont, & Metevia, 1992; McGee, Clark, & Symons, in press; Nigg, Hinshaw, & Halperin, 1996). Thus, it is questionable whether the TOVA is a standardized measure that has ecological validity as a target symptom of treatment efficacy. Moreover, the BASC is an assessment for a number of ADHD symptoms that has yet to demonstrate adequate standardization and ecological validity as an efficacy measure for ADHD treatment studies.
Statistical analyses and dependent variables aside, several procedural and design flaws are the most serious threats to the internal validity of the study. Non-random assignment to experimental conditions, collateral treatment procedures, and the absence of a wait list control condition make it impossible to attribute measured change on any variable to the effect of Neurotherapy. Lastly, the comparison of Neurotherapy to standard pharmacological treatment does not allow conclusions about the efficacy of Neurotherapy per se (Borkovec & Castonguay, 1998). Only if two active treatments are compared to a wait list control condition in the same outcome experiment can unambiguous conclusions be drawn about the efficacy of either treatment.
Linden, Habib, and Radojevic (1996) randomly assigned children and adolescents diagnosed with either Attention Deficit Disorder/ADHD or Learning Disability to either 45 sessions of Neurofeedback or a wait list control. Treatment efficacy measures were Kaufman Brief Intelligence Test scores and the IOWA-Conners Behavior Rating Scale. Analyses of variance on intelligence test scores revealed that the required interaction between treatment condition and pre-post assessment failed to reach statistical significance. Nonetheless, the authors conducted a subsequent comparison of pre-post scores in the Neurofeedback condition and found a statistically significant difference. The authors also reported a statistically significant pre-post change in Inattention scores in the Neurofeedback condition, but it is unclear whether the necessary interaction between treatment condition and pre-post assessment was statistically significant. There was no difference in Aggressive/Defiant pre-post scores for the Neurotherapy condition, and there was no difference between the Neurotherapy and wait list condition on hyperactive behaviors at post-treatment. These statistical anomalies are important because they increase the likelihood of Type 1 inference errors. Thus, the erroneous and ambiguous nature of the statistical analyses makes it extremely difficult to conclude that Neurotherapy had causal influence upon valid measures of treatment efficacy. Moreover, the use of only a wait list control condition makes it impossible to rule out nonspecific factors that could be responsible for the apparent change in the outcome measures.
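The distinction between a within-group pre-post test and the treatment-by-time interaction can be made concrete with a small worked example. The scores below are invented for illustration and do not come from Linden et al. (1996); they are constructed so that both groups improve by about the same amount. In a two-group pre-post design, the interaction test is equivalent to an independent-samples t-test on change scores, which is what the second function computes.

```python
from math import sqrt
from statistics import mean, variance


def paired_t(pre, post):
    """Within-group paired t statistic for pre-post change."""
    diffs = [b - a for a, b in zip(pre, post)]
    return mean(diffs) / sqrt(variance(diffs) / len(diffs))


def change_score_t(pre_a, post_a, pre_b, post_b):
    """Pooled-variance t statistic comparing change scores between groups.

    In a 2 (group) by 2 (time) design this tests the interaction.
    """
    da = [b - a for a, b in zip(pre_a, post_a)]
    db = [b - a for a, b in zip(pre_b, post_b)]
    na, nb = len(da), len(db)
    pooled = ((na - 1) * variance(da) + (nb - 1) * variance(db)) / (na + nb - 2)
    return (mean(da) - mean(db)) / sqrt(pooled * (1 / na + 1 / nb))


# Hypothetical symptom scores (lower is better); both groups improve.
treat_pre = [50, 52, 48, 51, 49, 53, 47, 50]
treat_post = [46, 47, 45, 46, 46, 48, 43, 47]
wait_pre = [51, 49, 50, 52, 48, 50, 53, 47]
wait_post = [47, 46, 45, 49, 44, 47, 48, 43]

t_within = paired_t(treat_pre, treat_post)
t_interaction = change_score_t(treat_pre, treat_post, wait_pre, wait_post)
print("within-group t:", round(t_within, 2))
print("interaction t:", round(t_interaction, 2))
```

In these invented data the within-group statistic for the treated group is large, yet the interaction statistic is close to zero because the wait list group changed just as much. A significant pre-post change in the treated group alone therefore cannot be attributed to the treatment.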
Peniston and Kulkosky (1989) randomly assigned alcohol dependent subjects to either Neurotherapy or standard group therapy with psycho-educational lectures for alcohol abstinence. These two groups were compared to a non-dependent control group on Beck Depression Inventory scores and brain wave activity before and after treatment. The results showed that the Neurotherapy subjects produced greater changes in brain wave function and lower depression scores than the other two groups. In addition, there was greater abstinence maintenance following treatment in the Neurotherapy group than in the standard therapy group. Although it appears that Neurotherapy resulted in specific treatment effects, caution must be exercised in drawing such a conclusion. First, it is uncertain as to how change in depression scores qualifies as a normative standardized measure for modification of substance dependence. Second, the assessment of abstinence was conducted with research participants and spouses who were not blind to treatment condition. Third, as with Rossiter and La Vaque (1995), the comparison of Neurotherapy to standard group therapy without a wait list control condition provides no information about the efficacy of Neurotherapy per se. Thus, the findings of Peniston and Kulkosky (1989) are only suggestive of the efficacy of Neurotherapy for substance dependence.
Peniston and Kulkosky (1990) subsequently reported on the same subjects who also had completed the 16 Personality Factor (16-PF) test and the Millon Clinical Multi-Axial Inventory (MCMI). The results indicated that on 4 of the 16-PF scales there were statistically significant interactions between treatment condition and pre-post assessment. Subjects in the Neurotherapy group reported increases on the Cool-Warm, Concrete-Abstract, Shy-Bold, and Practical-Imaginative scales. The data also revealed that on 4 of 20 MCMI pathological traits, there were statistically significant interactions between treatment condition and pre-post assessment. Subjects in the Neurotherapy group reported reductions in Dysthymia, Paranoia, Borderline, and Schizoid characteristics. As with Peniston and Kulkosky (1989), however, it is uncertain as to how change on a small proportion (22%) of personality dimensions following Neurofeedback qualifies as normative standardized measures for modification of substance dependence. In addition, the same experimental design and procedure limitations as Peniston and Kulkosky (1989) make it difficult to attribute these apparent changes to Neurotherapy.
Rice, Blanchard, and Purcell (1993) randomly assigned subjects meeting the criteria for Generalized Anxiety Disorder to either a wait list, pseudomeditation, frontal electromyographic (EMG) biofeedback, alpha enhancement biofeedback, or alpha suppression biofeedback. The pseudomeditation group was included as an attention-placebo to control for nonspecific factors such as treatment credibility. Treatment efficacy was assessed with the State-Trait Anxiety Inventory (STAI), the Welsh Anxiety Scale (WAS), and the Psychosomatic Symptom Checklist (PSC). The results showed only the effect of pre-post treatment assessment on all three outcome variables. Despite the absence of the required interaction, post hoc within group analyses showed that all treatment conditions showed reductions in Trait Anxiety and PSC scores relative to the wait list control. Post hoc analyses of WAS scores showed reductions in only the EMG and alpha enhancement conditions. Thus, it appears that the bulk of the apparent treatment effects are not due to the specific effects of EEG biofeedback, but rather to nonspecific placebo factors.
Vanathy, Sharma, and Kumar (1998) randomly assigned subjects meeting diagnostic criteria for Generalized Anxiety Disorder to either a wait list, alpha neurofeedback, or theta neurofeedback. The two treatment conditions provided feedback for alpha or theta augmentation along with beta suppression. The State-Trait Anxiety Inventory was completed by all subjects and the Hamilton Anxiety Rating Scale (HARS) was administered by an interviewer blind to experimental condition. Multiple t-tests that were not corrected for Type 1 error revealed that State anxiety scores and HARS scores were lower in the two treatment conditions relative to the wait list control. Trait anxiety scores were not significantly different. Comparison within conditions showed reductions in Trait anxiety in both treatment conditions, reductions in State anxiety in theta training only, and reductions in HARS scores in alpha training only. While these results appear to suggest the effect of both types of neurofeedback, the analysis of alpha and theta showed no change in power or microvolt levels in individual EEG bandpass following either type of feedback. Thus, it appears that the symptom measures changed despite the fact that the neurological target of Neurofeedback did not. We conclude that even if there were real changes in anxiety symptoms, they cannot be attributed to the effect of neurofeedback alone. Instead, any such changes were more likely due to nonspecific treatment factors in the neurofeedback procedure. This interpretation is strengthened by the findings of Wenck, Leu, and D'Amato (1996) who found that anxiety is reduced following thermal and electromyographic biofeedback relative to a no treatment control condition.
Peniston and Kulkosky (1991) examined the efficacy of neurofeedback for combat-related Posttraumatic Stress Disorder (PTSD) using experimental procedures similar to Peniston and Kulkosky (1989, 1990). Combat veterans were randomly assigned to one of two treatment conditions. One involved standard Veterans Administration treatment that included pharmacological intervention, individual therapy, and group therapy. The other condition included pharmacological intervention, eight sessions of temperature biofeedback, and thirty 30-minute sessions of neurofeedback. After one week of daily practice of neurofeedback, drug dosage was gradually reduced at the request of the individual participants. Treatment outcome was assessed with a number of Minnesota Multi-Phasic Personality Inventory (MMPI) scales at pre- and post-treatment. The analysis of variance revealed statistically significant reductions in the neurofeedback condition on nine of ten MMPI clinical scales and on the PTSD scale. The control condition showed reductions on only one of the clinical scales and no reduction on the PTSD scale. However, any conclusions about the efficacy of neurofeedback are qualified by a number of methodological limitations. The confound of pharmacological treatment and temperature biofeedback in the neurofeedback condition makes it difficult to attribute any differential effect between the groups to neurofeedback per se. Most important, the comparison of neurofeedback to standard group therapy without a wait list control condition provides no information about the efficacy of neurofeedback per se (cf. Borkovec & Castonguay, 1998; Peniston & Kulkosky, 1989; Rossiter & La Vaque, 1995). The same design flaw also makes it difficult to rule out other nonspecific factors (credibility, effort justification, allegiance effects, etc.) as responsible for the differences between treatment conditions.
Thus, the findings of Peniston and Kulkosky (1991) are only suggestive of the efficacy of neurofeedback for PTSD. There have been no subsequent published reports replicating the findings of Peniston and Kulkosky (1991).
Mood disorder and other disorders
The application of a specific form of neurofeedback for depression is theoretically consistent with neurological correlates of mood disorder. The experience of positive, approach-related emotion has been associated with relative left-frontal and anterior temporal activation, while the experience of negative, withdrawal-related emotion has been hypothesized to be associated with higher relative activation of the right frontal and anterior temporal regions (Ahern & Schwartz, 1985; Davidson, 1995; Heller, 1990; Tomarken, Davidson, & Henriques, 1990). The most extensive literature supporting this hypothesis concerns relationships between emotional responses, emotional style, and anterior activation of the electroencephalogram (EEG; see Davidson, 1995; Reid, Duke, & Allen, 1998, for reviews).
Other relevant EEG research has focused on the hypothesis that resting anterior asymmetry relates to a stylistic tendency toward emotional reaction that is present from infancy and early childhood (Davidson & Fox, 1989; Fox, Rubin, Calkins, Marshall, Coplan, Porges, & Long, 1995) and persists throughout the lifespan. Left frontal activation has thus been related to personality traits associated with decreased vulnerability to depression (Harmon-Jones & Allen, 1997; Sutton & Davidson, 1997; Tomarken & Davidson, 1994; Kline, Allen, & Schwartz, 1998; Kline, Blackhart, & Schwartz, 1999; Kline, Knapp-Kline, Schwartz, & Russek, in press). In addition, decreased relative left frontal activation is associated with increased vulnerability to depression (Henriques & Davidson, 1991; Davidson, 1995). Studies supporting this hypothesis predict subsequent emotional behavior on the basis of anterior asymmetries recorded during baseline conditions, or correlate anterior asymmetries with personality traits or vulnerability to depression. For example, Davidson and Fox (1989) reported that infants’ responses to distress associated with maternal separation could be predicted from previous baseline measures of frontal EEG asymmetry. Tomarken et al. (1990) reported that participants with greater relative right than left frontal EEG activation at baseline reported more negative emotions in response to negatively valenced films.
Other work has documented psychometric properties of the resting anterior asymmetry (RAA). It has shown a high degree of temporal stability (test-retest reliability) in both non-depressed (Tomarken, Davidson, Wheeler, & Kinney, 1992) and depressed (Hitt, Allen, & Duke, 1995) individuals, and appears to distinguish depressed from non-depressed individuals both during episode and during remission (Allen, Iacono, Depue, & Arbisi, 1993; Henriques & Davidson, 1991).
Baehr, Rosenfeld, Baher, and Earnest (1999) have employed these findings in applying neurofeedback to 5 depressed patients, with preliminary results that they termed "promising." However, in much of the neurofeedback literature, the promise of an alternative technique is often taken as sufficient grounds for the widespread promulgation of the technique as "empirically supported." Thus, even though frontal asymmetry neurofeedback seems a novel and timely idea, there is at present insufficient evidence to conclude that it is an empirically supported treatment.
We have found no controlled peer-reviewed studies assessing the efficacy of neurofeedback in dissociative disorders, although neurofeedback has been suggested as clinically appropriate for these disorders (Brownback, Mason, & Mason, 1999; Manchester et al., 1994).
Summary of efficacy studies
The only study to provide minimally adequate procedural and experimental controls in the treatment of valid symptoms of attention deficit disorders is Linden et al. (1996). However, erroneous and ambiguous statistical analyses make it impossible to conclude that change actually occurred. In addition, the wait list control condition cannot rule out the influence of nonspecific treatment factors rather than the specific effect of Neurotherapy. The same conclusions have been reached by Waschbusch and Hill (in press).
The findings of Vanathy et al. (1998) suggest that both alpha and theta neurofeedback (with beta suppression) may result in reduction of anxiety in Generalized Anxiety Disorder. However, the fact that neurofeedback failed to change either alpha or theta in the predicted direction suggests the change in measured anxiety was an effect of nonspecific factors for which the wait list condition could not control. Further, the findings of Rice et al. (1993), who controlled for nonspecific placebo factors (pseudomeditation and EMG biofeedback), strongly suggest that EEG biofeedback does not have a specific effect in reducing symptoms of Generalized Anxiety Disorder.
We have found that the evidence for the efficacy of Neurotherapy for psychological disorders is generally limited by the use of outcome measures that have questionable psychometric and ecological validity. More important, the experimental control conditions are sufficiently weak so that the criteria for efficacious treatments (Chambless et al., 1998; Chambless et al., 1996) have not yet been met for ADHD, substance dependence, anxiety disorders, mood disorders, or dissociative disorders. At the present time, the most optimistic conclusion is that Neurotherapy might meet the criteria for "Possibly Efficacious" treatment for Generalized Anxiety Disorder (Vanathy et al., 1998). However, methodological and statistical problems render even this conclusion questionable, and the findings of Rice et al. (1993) suggest that nonspecific treatment factors are responsible for symptom reduction.
Suggestions for Future Research
Herbert's (2000) critique of the criteria for empirically supported treatments (Chambless et al., 1998; Chambless et al., 1996) is based on the need for strong tests of the specific effects of any treatment procedure. Such tests are those that provide for the experimental disconfirmation of the putative pathological conditions that are treated, or the therapeutic mechanisms that derive from the procedures (Borkovec & Castonguay, 1998; Hazlett-Stevens & Borkovec, 1998). The inclusion of such control conditions in Neurotherapy treatment research is critical as some neurofeedback efficacy studies have shown that changes in clinical symptoms can occur independent of the changes in brain waves (Fenger, 1998; Vanathy et al., 1998). It would appear that Neurotherapy techniques, and the neurophysiological theory upon which they are based, are ideal candidates for such strong experimental tests. Despite over 20 years of research on neurofeedback, however, it appears that nearly all of the experimental tests published in peer-reviewed journals are surprisingly weak. Only a very small proportion of such studies provide even the minimal wait list control comparison (Linden et al., 1996; Rice et al., 1993; Vanathy et al., 1998), and only one (Rice et al., 1993) has manipulated specific and nonspecific features of Neurotherapy treatment. Future research on the efficacy of Neurotherapy could include component comparisons or yoked-control procedures. The former might include comparison of augmentation feedback vs. augmentation combined with suppression feedback of specific frequencies, depending on the theory of the disorder (e.g., ADHD). The latter could include the comparison of a standard Neurotherapy protocol with a false-feedback control condition, or EMG biofeedback (cf. Rice et al., 1993).
Moreover, the comparison of Neurotherapy to alternative treatments without the use of a wait list control (Rossiter & La Vaque, 1995; Peniston & Kulkosky, 1989, 1990, 1991) is not informative, as it cannot provide evidence for the efficacy of Neurotherapy per se (Borkovec & Castonguay, 1998).
If the proponents of Neurotherapy wish to promote their treatment as specific and efficacious, they must do so on the basis of efficacy experiments that provide strong experimental tests. For example, it appears that depressive symptoms and the anterior asymmetry that accompanies them could be a productive focus for the test of neurofeedback theory and efficacy. Efficacy studies that provide wait list controls, nonspecific factor controls, and controls for the modification of specific brain wave functions could be strong experimental tests of the putative pathological processes and their therapeutic modification. However, we suggest that until such tests are forthcoming, behavior therapists should be cautious about the efficacy of Neurotherapy, and AABT should be more circumspect about participation in the promotion of Neurotherapy.