Abstract
Introduction
ADHD is characterized by pervasive and impairing levels of inattention and/or hyperactivity/impulsivity (H/I; American Psychiatric Association [APA], 2013). The worldwide prevalence of the disorder is estimated at around 5% in school-aged children (Polanczyk, de Lima, Horta, Biederman, & Rohde, 2007). Follow-up studies have shown that up to 65% of individuals with ADHD present with impairing symptoms in adulthood (Faraone, Biederman, & Mick, 2006).
ADHD is a major public health issue and imposes an enormous burden on society. The average annual incremental costs of ADHD have been recently estimated at US$143 to US$266 billion in the United States (Doshi et al., 2012). In Europe, annual national costs range between €1,041 and €1,529 million (Le et al., 2013).
The management of ADHD is multimodal, including pharmacological and non-pharmacological strategies. Psychostimulants, that is, methylphenidate and amphetamine derivatives, are considered the first-line intervention for the treatment of ADHD in children and adolescents in several countries, including the United States (Pliszka & AACAP Work Group on Quality Issues, 2007), and for severe cases in others, such as the United Kingdom (Kendall, Taylor, Perez, & Taylor, 2008). Despite the availability of several formulations of psychostimulants, methylphenidate immediate-release (MPH-IR) continues to be the most used medication for ADHD in many countries (Brault & Lacourse, 2012; Hodgkins, Sasane, & Meijer, 2011; Pottegard, Bjerregaard, Glintborg, Hallas, & Moreno, 2012; Zoega et al., 2011).
Meta-analyses have demonstrated the efficacy of MPH-IR for children, adolescents, and adults in short-term (defined here as ≤12 weeks) treatment studies (Faraone, 2009; Faraone & Glatt, 2010). However, efficacy in the longer term (≥12 weeks) is less clear. Indeed, to date there is only one systematic review of long-term treatment with psychostimulants for ADHD, published more than a decade ago (Schachar et al., 2002).
To evaluate the long-term (>12 weeks) effects of MPH-IR, and to possibly confirm the efficacy established in short-term meta-analyses, we performed an updated systematic review of the literature and meta-analysis, including published and unpublished studies available up to April 2014. We pooled data regarding the efficacy of MPH-IR and we explored the feasibility of pooling data regarding adverse events, tolerability, or adherence. In addition, we performed a meta-regression analysis to assess if different factors (i.e., age, study quality, and treatment length) influenced the heterogeneity of the results.
Method
Selection Criteria
Study type
We included peer-reviewed randomized clinical trials, open clinical trials, or studies in naturalistic settings lasting more than 12 weeks. Different study designs were included because of the paucity of long-term randomized clinical trials in this field. This choice was supported by the results of a previous study (Sprafkin & Gadow, 1996) that compared information about stimulant drug effects generated in an uncontrolled setting with that generated in a highly controlled research protocol involving placebo and double-blind conditions. The authors found that a rigorously controlled protocol generates data very similar to those of an uncontrolled trial in a community-based outpatient service.
Although the threshold of 12 weeks is arbitrary, it was based on definitions from previous systematic reviews in the field (Ferguson, 2000; Greenhill et al., 2002; Huang & Tsai, 2011; Jadad et al., 1999; Schachar et al., 2002; Van de Loo-Neus, Rommelse, & Buitelaar, 2011); we note that studies of 12 weeks or more have not been explicitly explored in previous reviews (Faraone, 2009; Faraone, Biederman, Spencer, & Aleardi, 2006; Faraone & Buitelaar, 2010).
To avoid Type I errors, articles were excluded if they had fewer than 20 patients in each arm (Kraemer, Gardner, Brooks, & Yesavage, 1998; Turner, Bird, & Higgins, 2013).
Population
We included studies assessing children (aged 6-18 years) with ADHD, diagnosed according to Diagnostic and Statistical Manual of Mental Disorders (3rd ed. [DSM-III], APA, 1980; 3rd ed., rev. [DSM-III-R], APA, 1987; 4th ed. [DSM-IV], APA, 1994; or 4th ed., text rev. [DSM-IV-TR], APA, 2000) criteria. IQ lower than 70 and neurological comorbidities were exclusion criteria. We also excluded studies in which participants were concomitantly treated with medications other than MPH-IR.
Search Methods for Identification of Studies
Electronic searches
PubMed, EMBASE, Cochrane database, ISI Web of Knowledge, PsycINFO (American Psychological Association), and the Latin American and Caribbean Health Science Literature Database (LILACS) were searched from 1980 to April 2014. Online Appendix A provides a detailed description of the terms and syntax used in the electronic search for each database. We included published papers as well as papers accepted for publication but not yet published in the printed version of the journal issue (i.e., online first and/or ahead of print).
Searching other resources
Unpublished clinical trials were searched for in trial register websites: World Health Organization International Clinical Trials Registry Platform (WHO ICTRP), Centre for Reviews and Dissemination, ClinicalTrials.com, ClinicalTrials.gov, Current Controlled Trials, and TrialsCentral.org. Researchers and pharmaceutical companies that focused on ADHD research and treatment were also contacted to inquire about ongoing clinical trials and additional information.
Identification and selection of studies
Reference lists from selected articles and systematic reviews were also systematically screened to minimize the chance of missing any relevant study. Potential studies that fulfilled the inclusion criteria were selected for a detailed review. In the absence of an abstract, or if no adequate information was found, we assessed the full text. If a sample was described in more than one article, preference was given to the article with the most complete data and/or the largest sample size. The abstracts of potential interest were selected by one author (C.R.M.M.), and all these abstracts were reviewed by two authors (C.R.M.M. and G.V.P.). Agreement between the two reviewers on inclusion was kappa = 0.8.
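The agreement statistic reported above can be illustrated with a minimal sketch. The paper does not state which kappa variant was computed; the code below assumes Cohen's kappa for two raters, and the include/exclude decisions are hypothetical values chosen only for illustration.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each label the same items."""
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal proportions.
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    # Undefined when expected == 1 (both raters use a single category).
    return (observed - expected) / (1.0 - expected)


# Hypothetical decisions (1 = include, 0 = exclude), not the reviewers' data.
reviewer_1 = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
reviewer_2 = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
kappa = cohens_kappa(reviewer_1, reviewer_2)
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is lower than the simple proportion of matching decisions.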
Data extraction
From each selected study, we extracted the following data: country where the study was conducted, length of MPH-IR use, age (mean and SD), ADHD diagnostic criteria, number (%) of males, mean total daily dose at endpoint (mg/kg/day), number of patients at the baseline and endpoint, rating scale used as primary outcome, mean/SD of the rating scale score at baseline and last observation, and data on adherence and adverse events provided by rating scales. To assess the effect of the treatment detected by different raters, we collected separately the outcome measures from teachers and parents, both for inattentive and H/I scores. Authors were contacted if data were missing, incomplete, or unclear.
Data extraction was performed independently by two authors, C.R.M.M. and A.C. (κ = 0.9). Disagreements in data extraction were evaluated and resolved after discussion with the senior author (L.A.P.R.).
Assessment of study quality
The methodological quality of the selected articles was assessed according to a checklist designed for both randomized and non-randomized studies (Downs & Black, 1998).
Data Analysis
Effect sizes were calculated as the standardized mean difference (SMD) from the primary outcome measure, based on pre- and post-treatment values of each study, divided by the pooled SD (Cohen, 1992). Each study was weighted according to the number of participants. As not all studies clearly defined a primary outcome, we chose the data collected with ADHD rating scales accepted by international guidelines (American Academy of Pediatrics, Subcommittee on Attention-Deficit/Hyperactivity Disorder, & Committee on Quality Improvement, 2001; Kooij et al., 2010; National Institute for Health and Clinical Excellence [NICE], 2009; Pliszka & AACAP Work Group on Quality Issues, 2007; Subcommittee on Attention-Deficit/Hyperactivity Disorder, Steering Committee on Quality Improvement and Management, 2011; Turgay et al., 2005; see supplementary material in online Appendix B for further details).
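As an illustration of the pre/post effect size described above, the following is a minimal sketch. It assumes that the pooled SD is the root mean square of the baseline and endpoint SDs, which the text does not specify, and it uses hypothetical scores. The sign is chosen so that a reduction in symptoms gives a positive SMD.

```python
import math


def pre_post_smd(mean_pre, sd_pre, mean_post, sd_post):
    """Standardized mean change: (baseline - endpoint) / pooled SD.

    Assumption: pooled SD = sqrt((sd_pre^2 + sd_post^2) / 2).
    """
    pooled_sd = math.sqrt((sd_pre ** 2 + sd_post ** 2) / 2.0)
    return (mean_pre - mean_post) / pooled_sd


# Hypothetical rating-scale scores, not taken from any included study.
smd = pre_post_smd(mean_pre=2.0, sd_pre=0.5, mean_post=1.5, sd_post=0.5)
```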
The I² statistic was used to assess heterogeneity among studies. It ranges from 0% to 100% and expresses the percentage of variability across studies that is attributable to heterogeneity rather than chance; values between 25% and 50% indicate low heterogeneity, those between 50% and 75% moderate heterogeneity, and those above 75% high heterogeneity (Higgins, Thompson, Deeks, & Altman, 2003). Considering the large variability that might be present across studies due to different methodological approaches, the effect of treatment was analyzed with the DerSimonian and Laird method of random-effects meta-analysis (DerSimonian & Laird, 1986).
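The pooling step can be sketched as follows. This is a generic textbook form of the DerSimonian and Laird estimator with inverse-variance weights, not the STATA routine used for the analyses; the function name and the assumption that each study supplies an effect size and its sampling variance are ours.

```python
def dersimonian_laird(effects, variances):
    """Random-effects pooling (DerSimonian-Laird) with Cochran's Q and I^2."""
    k = len(effects)
    w = [1.0 / v for v in variances]  # fixed-effect (inverse-variance) weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Method-of-moments estimate of between-study variance, truncated at 0.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = (1.0 / sum(w_re)) ** 0.5
    return {
        "pooled": pooled,
        "se": se,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
        "tau2": tau2,
        "q": q,
        "i2": i2,
    }


# Hypothetical effect sizes and variances, for illustration only.
result = dersimonian_laird([0.0, 2.0], [1.0, 1.0])
```

When Q does not exceed its degrees of freedom, the estimated between-study variance is zero and the result coincides with the fixed-effect estimate.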
To identify whether the results were affected by a particular study, a sensitivity analysis was performed using the leave-one-out jackknife method, whereby one study at a time was removed and the pooled effect size was recalculated (Saltelli, 2002; Wu, 1986).
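The leave-one-out procedure can be sketched as below. For brevity, the statistic recomputed after each removal is I²; any pooling function with the same signature could be passed instead. The study labels, effect sizes, and variances are hypothetical.

```python
def i_squared(effects, variances):
    """I^2 (in percent) from Cochran's Q with inverse-variance weights."""
    w = [1.0 / v for v in variances]
    mean = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - mean) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    return max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0


def leave_one_out(labels, effects, variances, stat=i_squared):
    """Recompute `stat` with each study removed in turn."""
    results = {}
    for i, label in enumerate(labels):
        kept_effects = effects[:i] + effects[i + 1:]
        kept_variances = variances[:i] + variances[i + 1:]
        results[label] = stat(kept_effects, kept_variances)
    return results


# Hypothetical studies: "D" is an outlier that drives the heterogeneity.
labels = ["A", "B", "C", "D"]
effects = [1.0, 1.0, 1.0, 3.0]
variances = [0.1, 0.1, 0.1, 0.1]
loo = leave_one_out(labels, effects, variances)
```

In this toy example, removing the outlying study "D" brings I² down to zero, whereas removing any other study leaves it high.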
In addition, we conducted a random-effects meta-regression analysis using the pooled SMD of treatment effect for inattention and H/I according to parents’ and teachers’ evaluations. The covariates assessed were age, treatment length, study design, and study quality. We chose them based on a conceptual review of the literature and their availability in the included studies. For the final model of the multivariate meta-regression, we selected the covariates that were associated with the symptom scores at a liberal threshold of p ≤ 0.2 (Maldonado & Greenland, 1993) in the univariate analyses. The between-study variance, τ² (residual), was estimated by Restricted Maximum Likelihood (REML).
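For reference, a random-effects meta-regression of this kind is usually written in the following general form. The notation is ours and is given as a sketch of the standard model, not as the exact specification fitted in STATA: y_i is the SMD of study i, x_ij are the covariates, v_i is the within-study sampling variance, and τ² is the residual between-study variance estimated by REML.

```latex
y_i = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij} + u_i + \varepsilon_i,
\qquad u_i \sim N(0, \tau^2), \quad \varepsilon_i \sim N(0, v_i)
```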
We also explored the feasibility of conducting a meta-analysis of adherence to treatment and adverse events rates.
The STATA 11.0 software (StataCorp, 2009) was used for all analyses.
Results
The bibliographic search across electronic databases initially retrieved 15,857 records. After exclusion of duplicates and after discarding studies that, based on the title, did not clearly meet our inclusion criteria, 4,498 abstracts were selected for abstract review. The contact with ADHD experts, manual search of the relevant reviews (37 articles), and the clinical trials databases retrieved no abstracts, while pharmaceutical companies provided 12 additional abstracts. This first review was performed by one author (C.R.M.M.), resulting in the inclusion of 175 abstracts. Two authors (C.R.M.M. and G.V.P.) reviewed these 175 abstracts, and most of them were excluded due to lack of information, even after attempts to contact the authors (when the e-mail address was available), resulting in 28 studies potentially pertinent for the present meta-analysis. After careful examination of the full text by two authors (C.R.M.M. and A.C.), 21 studies were discarded due to exclusion criteria (Supplementary Table 1, online Appendix C). Retained studies included 444 children and adolescents, 125 of whom did not complete the treatment study. The PRISMA (Moher, Liberati, Tetzlaff, & Altman, 2009) flowchart for literature search and article selection is presented in Figure 1.

Figure 1. Flowchart.
Regarding the primary outcome measure, two scales were used by more than one study: the Turgay DSM-IV-Based Child and Adolescent Behavior Disorders Screening and Rating Scale (T-DSM-IV-S-H; Ercan, Amado, Somer, & Çıkoğlu, 2001), and the Swanson, Nolan, and Pelham–IV (SNAP-IV) Questionnaire (Swanson et al., 2001). In one study (Klein, Abikoff, Hechtman, & Weiss, 2004), data were collected only for the first year, as patients were switched to placebo in the second year. Only three studies had available data for parents and teachers in all ADHD symptom domains: the Multimodal Treatment of ADHD (MTA) study (MTA Cooperative Group, 1999), Ercan et al. (Ercan, Ardic, Kutlu, & Durak, 2014; Ercan, Varan, & Deniz, 2005), and Yildiz et al. (Yildiz Oc et al., 2007). In addition, seven studies (Chazan et al., 2011; Ercan et al., 2014; Ercan et al., 2005; Klein et al., 2004; MTA Cooperative Group, 1999; Schachar, Tannock, Cunningham, & Corkum, 1997; Tsai et al., 2013; Yildiz Oc et al., 2007) provided data for the domain H/I rated by parents (only). For the domain “inattention,” five studies (Chazan et al., 2011; Ercan et al., 2014; Ercan et al., 2005; MTA Cooperative Group, 1999; Tsai et al., 2013; Yildiz Oc et al., 2007) provided information from parents and four (Chazan et al., 2011; Ercan et al., 2014; Ercan et al., 2005; MTA Cooperative Group, 1999; Yildiz Oc et al., 2007) provided information from teachers. The studies’ characteristics are summarized in Table 1.
Table 1. Study Characteristics of the Seven Immediate-Release Methylphenidate Clinical Trials Included.
Note. MPH = methylphenidate; IOWA-C (H) = IOWA Conners Rating Scale (hyperactivity); b.i.d. = twice daily; DB = double blind; RCT = randomized clinical trial; DSM-III-R = Diagnostic and Statistical Manual of Mental Disorders (3rd ed., rev.; American Psychiatric Association, 1987); SNAP-IV = Swanson, Nolan, and Pelham–IV; t.i.d. = 3 times a day; DSM-IV = Diagnostic and Statistical Manual of Mental Disorders (4th ed.; American Psychiatric Association, 1994); CPRS-H = Conners Parent Rating Scale–Hyperactivity Index; CTRS-H = Conners Teacher Rating Scale–Hyperactivity Index; N/I = not informed on mg/kg/day; T-DSM-IV-S-H/I = Turgay DSM-IV-Based Child and Adolescent Behavior Disorders Screening and Rating Scale (hyperactivity); Prosp/Nat = prospective/naturalistic; H/I = hyperactivity/impulsivity.
The pooled SMDs for H/I evaluated by parents and teachers were, respectively, 1.12 (95% confidence interval [CI] = [0.85, 1.39]) and 1.25 (95% CI = [0.7, 1.81]). For inattention, the SMD for parents was 0.96 (95% CI = [0.60, 1.32]), and, for teachers, 0.98 (95% CI = [0.09, 1.86]). There was significant heterogeneity across studies even when inattention and H/I and data from parents and teachers were analyzed separately, as shown by the forest plots (Figures 2 and 3). Therefore, we conducted a sensitivity analysis excluding one study at a time. For parent inattention, excluding each study in turn did not decrease heterogeneity substantially. For parent H/I, the exclusion of MTA (MTA Cooperative Group, 1999) modified the heterogeneity from moderate to low. Last, for teachers’ inattention and H/I, the exclusion of MTA (MTA Cooperative Group, 1999) decreased the heterogeneity from high to low and high to moderate, respectively.

Figure 2. Forest plot of inattention rating scale measures according to parents’ and teachers’ evaluations.

Figure 3. Forest plot of hyperactivity/impulsivity rating scale measures according to parents’ and teachers’ evaluations.
Regarding the meta-regression, we found that (a) for parents’ inattention, only paper quality was associated with inattentive scores at p ≤ .2; (b) for teachers’ inattention, paper quality and study design were associated with inattentive scores at p ≤ .2, but neither remained significant in the multivariate analysis; (c) for parents’ H/I, paper quality and study design were associated with H/I scores at p ≤ .2; however, neither remained significant in the multivariate analysis; (d) for teachers’ H/I, mean age, paper quality, and study design were associated with H/I scores at p ≤ .2, and again none remained significant in the multivariate analysis (see Table 2).
Table 2. Covariates and Information Source: Effect on Standardized Mean Difference in Meta-Regression Analysis.
Note. Obs1: For parents and teachers, the covariates in inattention symptoms were not included due to an insufficient number of observations. Obs2: The estimate of the between-study variance was obtained by REML. CI = confidence interval; REML = Restricted Maximum Likelihood.
*I²(residual) = 32.28; p = .22. **I²(residual) = 0; p = .25.
Adherence to treatment and adverse events were not clearly reported in the majority of papers. Three studies measured adverse events with a rating scale, and reported explicitly the existence of dropouts due to adverse events (Ercan et al., 2005; MTA Cooperative Group, 1999; Schachar et al., 1997; Wang et al., 2011). One study (Klein et al., 2004) reported 10 dropouts from the MPH-treated group in the first year of treatment, but no additional data were provided.
With regard to adherence, the MTA (MTA Cooperative Group, 1999) reported that missing data were addressed with Last Observation Carried Forward (LOCF) analyses, and only Chazan et al. (2011) used a mixed-effects model approach to deal with missing data.
Therefore, due to the heterogeneity of data presented in the papers, and the lack of information in most of the studies, it was not possible to combine outcomes on adverse events, tolerability, or adherence.
Discussion
To our knowledge, this is the first meta-analysis that assessed the efficacy of MPH-IR for childhood ADHD treatment in a long-term regimen (≥12 weeks). A robust effect of MPH for both parents’ and teachers’ reports was found for inattentive as well as H/I symptoms. Previous reviews have clearly documented the efficacy of stimulants for ADHD in the short-term (Faraone, 2009; Faraone, Biederman, Spencer, & Aleardi, 2006; Faraone & Buitelaar, 2010). In line with previous meta-analyses of combined psychosocial and pharmacological treatments (Majewicz-Hefley & Carlson, 2007), and of stimulants for ADHD (Faraone, 2009), we found a large effect size for MPH-IR.
Unlike many systematic reviews and meta-analyses (Faraone, 2009; Faraone, Biederman, Spencer, & Aleardi, 2006; Faraone & Glatt, 2010), our study analyzed the core symptoms from rating scales, as well as the raters’ evaluations, separately. This confers strength to our analyses and allows more accurate comparisons between raters. In a previous meta-analysis assessing MPH and psychosocial treatments (Van der Oord, Prins, Oosterlaan, & Emmelkamp, 2008), the authors found significantly higher effect sizes for teacher ratings than for parent ratings. The robust effect sizes for both parents and teachers in our study were independent of the symptomatic dimension assessed. This reinforces the notion that both home and school environments provide important and complementary data from different moments of the day. Also, it is important to bear in mind that variability of effect sizes for MPH effects by different information sources might reflect the noise introduced by using different instruments to gather data from the same information source among different studies.
As stated by Higgins and Thompson (2002), I² values exceeding 56% could “induce considerable caution,” and values below 31% “might cause little concern.” Based on this, we conclude that heterogeneity for parents’ ratings was around the expected values, whereas that for teachers’ ratings exceeded the cutoff proposed by Higgins and Thompson. However, the exclusion of the MTA from the analysis produced a robust change in the heterogeneity of the scores only for parents’ H/I (from I² = 64.9%, p = .009 to I² = 7%, p = .372).
Our findings must be interpreted in the light of several limitations. First, despite the large effect sizes that we found, this meta-analysis was based on a limited number of studies (n = 7). Therefore, caution should be used in interpreting its results and generalizing them to daily clinical practice. Second, we did not include exclusively randomized placebo-controlled studies; rather, we allowed the inclusion of uncontrolled clinical trials. However, limiting our analysis to randomized clinical trials (RCTs) would have more than halved the number of studies included in the meta-analysis, thus generating uncertainty regarding our results. Importantly, we noted that no difference between a rigorous double-blind placebo-controlled design and a study conducted in an outpatient community to evaluate the MPH dose effect in ADHD patients has been reported (Sprafkin & Gadow, 1996). Moreover, meta-analysis of observational studies without control groups is a common procedure that is recommended by the Cochrane group to provide evidence of effects for long-term outcomes (Reeves, Deeks, Higgins, & Wells, 2011). In addition, as shown in sensitivity analyses of previous studies, the inclusion of open clinical trials does not alter the effect size (Balk et al., 2002; Hanwella, Senanayake, & de Silva, 2011). Third, we did not use the state-of-the-art tool to evaluate bias, that is, the Cochrane risk of bias assessment tool (Higgins, Altman, & Sterne, 2011), or the Jadad instrument, which remains one of the most widely used scales to assess bias (Jadad et al., 1996). However, as only three of the studies included here were RCTs, neither the Cochrane risk of bias assessment tool nor the Jadad instrument would have been applicable to all studies. Fourth, neither of the reviewers (C.R.M.M. and A.C.) who performed the data extraction and revision was blind to paper identification. However, blinding has been tested and shown to be unnecessary for conducting a meta-analysis (Berlin, 1997).
Fifth, we were not able to collect information about missing data from papers. However, we noted that a meta-analysis on stimulants for children (Faraone & Buitelaar, 2010) demonstrated no significant difference with or without the LOCF approach. Sixth, we could not combine data on adverse events during MPH-IR treatment. In this regard, we note, however, that a recent expert-consensus paper (Cortese et al., 2013) concluded that most adverse events during treatment with ADHD drugs are manageable without discontinuing the treatment. Seventh, the small number of studies included in this meta-analysis may have compromised the power to detect true heterogeneity. However, one should bear in mind that the best approach to deal with this is to use the I² statistic, as we did in our manuscript (Higgins et al., 2003). Last, we did not provide the treatment effect according to MPH-IR daily dose due to the small number of papers containing these data (five studies).
Despite the limitations mentioned, our findings provide robust evidence for the efficacy of MPH-IR in treating children and adolescents with ADHD for longer than 12 weeks. Our data support the notion that investing in one of the least expensive pharmacological treatments for ADHD can be highly effective. Our findings certainly need to be tested in larger samples. In addition, further comparative meta-analyses are needed to establish the pooled head-to-head long-term efficacy of available drugs for ADHD. Thus, this study may be the first step in encouraging decision makers and grant funders to invest in head-to-head long-term studies comparing MPH-IR and the new generation of ADHD medications, as was done for second- and third-generation antipsychotics (Fraguas et al., 2011; Shaw et al., 2006).
Footnotes
Declaration of Conflicting Interests
The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Dr. Carlos Renato Moreira Maia receives financial research support from Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq). During the elaboration of this work he received financial research support from Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), has served as speaker to Novartis, developed educational material to Novartis, and received travel awards from the Health Technology Assessment Institute (IATS), Universidade Federal do Rio Grande do Sul (UFRGS), and travel and registration support to the Fourth World Congress on ADHD from the World Federation of ADHD. Dr. Samuele Cortese has served as scientific consultant for Shire Pharmaceuticals from June 2009 to December 2010. He has received support to attend meetings from Eli-Lilly and Co in 2008 and from Shire in 2009-2010. He receives royalties from Argon Healthcare Italy (educational website on ADHD). There are no further conflicts of interest. Graduate student Arthur Caye receives financial research support from Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), and travel support from Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP). Graduate student Thomas Kuhn Deakin has no conflict of interest. Prof. Guilherme Vanoni Polanczyk has served as a speaker and/or consultant to Eli-Lilly, Novartis, Janssen-Cilag, and Shire Pharmaceuticals; developed educational material to Janssen-Cilag; received travel awards from Shire for taking part in two scientific meetings; receives authorship royalties from Manole Editors; and received unrestricted research support from Novartis. Prof. Carisi Anne Polanczyk has no conflict of interest. Prof. Augusto Paim Rohde was on the speakers’ bureau/advisory board and/or acted as consultant for Eli-Lilly, Janssen-Cilag, Novartis, and Shire in the last 3 years. 
The ADHD and Juvenile Bipolar Disorder Outpatient Programs chaired by him received unrestricted educational and research support from the following pharmaceutical companies in the last 3 years: Abbott, Eli-Lilly, Janssen-Cilag, Novartis, and Shire. He also receives authorship royalties from ArtMed and Oxford Press.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was partially supported by research grants from National Council for Scientific and Technological Development (CNPq, Brazil; Edital MCT/CNPq/CT-Saúde/MS/SCTIE/DECIT Nº 067/2009) [Bolsa de Produtividade em Pesquisa to G.V.P., L.A.P.R.], and Hospital de Clinicas de Porto Alegre (HCPA), Porto Alegre, Brazil.
