Abstract
The collection of articles in this special issue both raises the bar and inspires new thinking about the design and methodology concerns that influence drug use/abuse research. Thematically speaking, the articles focus on issues related to missing data, response formats, strategies for data harmonization, propensity scoring methods as an alternative to randomized control trials, integrative data analysis, statistical corrections to reduce bias from attrition, challenges faced in conducting large-scale evaluations, and employing abductive theory of method as an alternative to the more traditional hypothetico-deductive reasoning. Collectively, these issues are of paramount importance as they provide specific means to improve our investigative tools and refine the logical framework we employ to examine the problem of drug use/abuse. Each of the authors addresses a specific challenge, explains how it affects our current research efforts, and then outlines remedies that can advance the field. To their credit, they have included issues that affect both etiology and prevention, thus broadening our horizons as we learn more about developmental processes causally related to drug use/abuse and intervention strategies that can mitigate developmental vulnerability. This is the essential dialogue required to advance our intellectual tool kit and improve the research skills we bring to bear on the important questions facing the field of drug use/abuse. Ultimately, the goal is to increase our ability to identify the causes and consequences of drug use/abuse and find ways to ameliorate these problems as we engage the public health agenda.
All life is problem solving
Buddhist Leanings
I went through a certain “stage” in my life, very different from a Piagetian cognitive developmental stage, but still a stage that was very informative and helped to crystallize my identity. For a while, I was deeply immersed in Buddhist readings (Suzuki, 1949) and Vedic writings, and I coupled this effort with an existential search for meaning (Frankl, 1959). To be quite frank, I credit this brief hiatus from scientific reasoning for a spiritual awakening that drove me to pursue a career in psychology. During the course of this experience, I read a book by the Indian philosopher and spiritual teacher Jiddu Krishnamurti. The book was titled Think on These Things (1964), and it contained a collage of personal narratives addressing several fundamental concerns regarding the basic composition of the “good” life. In his own contemplative nontheosophical style, Krishnamurti discussed education, love, freedom, ambition, beauty, conformity, and self-discipline, to name a few topics. The book is truly a reflection outlining his own psychological awakening, cast in terms of life’s fundamental challenges: Who are we, what is our purpose, and what should we be doing? Thinking back to when I read this influential book, I believe that his authorial goal was to capture the reader’s imagination and make him or her realize that the way we think about things provides a pretext for how we live and make sense of our world.
Recounting the influence of this book serves merely as backdrop to my present role as a guest editor of this special issue, but it gives some credence to the issue’s focus on design and methodology. Charged with the task of putting together a special issue, I took it upon myself to accomplish two things in particular. First, I wanted the contributors to tackle some important topics that allude to who we are as scientists (i.e., our identity) and especially how we conceptualize problems. Second, and harking back to my overtures to carefully think about these things, I wanted the contributing authors to examine some of the more trenchant issues we face studying drug use/abuse but from a different perspective. Finding ways to advance our knowledge base with different tools is what prompted me to ask contributors to “free wheel” a bit and think outside the proverbial box (i.e., recall I asked them to think on these things). I then cautioned them to direct their energies and focus toward design and methodological issues that are of paramount importance to research in drug use/abuse.
The Future Is Here Now
Design and methodology are both critical and fundamental to how we address psychological issues; they are no doubt the foundation upon which our experimental rigor rests (Cook & Campbell, 1979). The design and methodological tools we possess in our scientific tool kit allow us to exert rigorous control, statistical and otherwise, and to identify the causal processes responsible for behavior change. In the case of alcohol and drug etiology, we need methods that enable us to partition complex relations reflecting personality, peer, family, and community influences, all of which are eventual targets of intervention. Prevention scientists benefit from the careful scrutiny of causal models, adapting them to create logic models containing specific intervention postulates. Regardless of whether our focus rests with etiology or prevention, we test and refine these models with statistical tools based on rigorous methods and study designs that enable us to infer causality. Along the way, we test confounders, manipulate variables through transformation, prune irrelevant variables, refine our understanding of the causal processes that set vulnerability into motion, and eventually achieve parsimonious explanations of the causes, prevention, and treatment of drug use/abuse.
The field of prevention science is well aware that it faces its own trenchant issues (e.g., Greenberg, 2008). Take, for instance, a program evaluation that requires parsing variance in the dependent measure into the portion attributed to the school or classroom and the portion ascribed to the individual. Statistical innovations like hierarchical linear modeling (Raudenbush, 1997) allow us to deal with the phenomenon of clustering, where students are nested within classrooms or schools. Although clustering estimates are somewhat small as evidenced in smoking (e.g., Murray et al., 1994) and drug prevention trials (Scheier, Griffin, Doyle, & Botvin, 2002), failure to address them increases the risk of making a Type I error and biasing parameter estimates. Consider a second example, where we examine program effects on “trends” in drug use using growth modeling. One can recall that, up until recently, program efficacy was examined either by contrasting group means between experimental conditions using analysis of covariance (ANCOVA) or by hypothesizing “static” influences using fixed-effect linear regression or some type of structural path model. Advances in methodology now provide the tools to contrast slope trajectories between treated and control students with a myriad of ways to model baseline intercepts (e.g., Mason, Kosterman, Hawkins, Haggerty, & Spoth, 2003; Taylor, Graham, Cumsille, & Hansen, 2000).
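To make the clustering concern concrete, the following minimal Python sketch computes the familiar design effect, 1 + (m - 1) × ICC, and the effective sample size it implies. The function names and the illustrative inputs (1,000 students, 25 per classroom, an intraclass correlation of 0.02) are assumptions chosen for illustration; they are not values taken from the trials cited above.

```python
def design_effect(cluster_size: float, icc: float) -> float:
    """Variance inflation due to clustering: 1 + (m - 1) * ICC."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    return 1.0 + (cluster_size - 1.0) * icc


def effective_sample_size(n_total: float, cluster_size: float, icc: float) -> float:
    """Number of independent observations the clustered sample is 'worth'."""
    return n_total / design_effect(cluster_size, icc)


if __name__ == "__main__":
    # Hypothetical trial: 1,000 students, 25 per classroom, ICC = 0.02.
    deff = design_effect(25, 0.02)
    n_eff = effective_sample_size(1000, 25, 0.02)
    print(f"design effect = {deff:.2f}, effective n = {n_eff:.0f}")
```

Even a seemingly small intraclass correlation shrinks the effective sample noticeably once clusters are large, which is why standard errors that ignore nesting tend to be too narrow.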
This shift in our emphasis (and hypotheses) resulted from methodological improvements and statistical innovations that have made it possible to examine the timing of intervention effects and do this in the context of dynamic features of development. Indeed, as a field, we have witnessed greater emphasis on developmental epidemiology (Kellam, Koretz, & Mościcki, 1999), and this has provided impetus for investigators to model program effects on developmental trajectories of target risk mechanisms and behavior across time. Investigators employing these techniques are encouraged to possess a priori hypotheses regarding differential rates of growth in target risk mechanisms or outcome behaviors following intervention exposure. Furthermore, these methods require, at a minimum, preintervention (baseline), postintervention, and follow-up measures. Collecting additional waves of data generally boosts power (Petras, 2016) and provides a means to assess curvature and perturbation or decay of effects (e.g., Park et al., 2000). Worth noting, however, is the considerable gap in time between the introduction of drug prevention efficacy trials (1970s) and the application of latent growth modeling for program evaluation (1990s). In these intervening years, the tool kit we use to assess program efficacy expanded considerably. Given another decade or more, we will most assuredly witness the introduction of methodological approaches of increasing complexity that will influence the way we go about our business in years to come.
All Things Being Equal
My overtures to the contributing authors were not by any stretch of the imagination a “blind” or unsupported request. Searches of the general behavioral science literature indicate that numerous fields have copiously addressed design and methodological issues (e.g., Clarke, 1995; Ledgerwood, 2014). These efforts have been extended to include studies of sexual risk (Schroder, Carey, & Vanable, 2003) and, closer to home, issues related to evaluating prevention programs (Massetti, Simon, & Smith, 2016). Although the conceptual foundations of these studies may vary, they provide a template upon which to understand how different fields address their methodological challenges and the ways in which they formulate their concerted responses.
As part of meeting this challenge, we revisit the sanctity of the randomized control trial (RCT), which has been called the “gold standard” in experimental research (e.g., Cartwright, 2007; Sanson-Fisher, Bonevski, Green, & D’Este, 2007). We use RCTs for the level of control they offer by equating extraneous factors (systematic sources of bias are made random) between experimental conditions that could influence treatment outcomes (e.g., what Cook and Campbell called probabilistic equivalence). Confounding is, to say the least, messy, and RCTs provide a strong research design that achieves balance of pretreatment covariates and elevates our ability to infer causation by producing unbiased estimates of treatment effects (e.g., Rubin, 1974).
There will, however, be situations where an RCT is not feasible, whether for ethical or design considerations or because it is cost prohibitive. In some cases, particularly when working with marginalized or racial/ethnic populations, RCTs may not be deemed culturally acceptable (Henry, Tolan, Gorman-Smith, & Schoeny, 2017). Moreover, under real-world conditions, and given choices regarding adopting an evidence-based treatment, many communities or schools will summarily reject the concept of random assignment. Holding back effective treatments is not in the best interests of public policy, particularly with regard to schools facing mounting pressures to eradicate behavioral problems and vying for effective “health promotion” programs. It is not uncommon, then, even with RCT designs, that contamination (i.e., diffusion of treatments) and compensatory rivalry will come into play, biasing findings in ways that are beyond the researcher’s control. According to Rubin (2005), contamination is a violation of the stable unit treatment value assumption, which posits that for RCTs to work, the actions of one set of units (treatment) cannot affect the actions of the other (control group). In light of the potential for this to happen in the “real world,” we need to shift our mind-set to alternative strategies that can yield the same level of statistical control and experimental rigor.
There are other examples of challenges that have produced thinking outside the box. Null hypothesis testing has been scrutinized by several authors (Cohen, 1994; Krantz, 1999; Loftus, 1996) and yielded debate about practical versus statistical significance (Jacobson & Truax, 1991). The conceptual foundation behind utilizing discrete diagnostic “classification” as the foundation of psychiatric nosology has also been questioned (Meehl, 1995) as has the issue of dimensional versus categorical classification (Muthén, 2006). There have been criticisms of dichotomizing meaningful quantitative measures, a practice that discards important information and biases correlations (MacCallum, Zhang, Preacher, & Rucker, 2002). Dichotomization is frequently used to transform the highly skewed frequency of drug use measures encountered with relatively young samples prior to drug use debut. Other possible methodological challenges include the effects of reporting bias, sample selection bias, participation bias, recall effects, survey methodology (ACASI vs. paper and pencil), and measurement concerns (i.e., reliability and temporal measurement invariance). In each case, these illustrations remind us that the way we “think” or perceive the problem can be a shackle to real discovery.
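The cost of dichotomization can be illustrated with a small simulation. The sketch below, written under assumed settings (a true correlation of 0.5, 20,000 simulated cases, a fixed random seed, and a median split), compares the correlation obtained with a continuous predictor to the correlation obtained after that predictor is split at its median. All function names and parameter values are hypothetical and serve only to illustrate the point.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_median_split(n=20000, rho=0.5, seed=1):
    """Return (r with continuous x, r with median-split x) for simulated data."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        xs.append(x)
        # y correlates with x at approximately rho in the population.
        ys.append(rho * x + math.sqrt(1.0 - rho ** 2) * noise)
    cut = sorted(xs)[n // 2]                     # approximate sample median
    x_split = [1.0 if x > cut else 0.0 for x in xs]
    return pearson(xs, ys), pearson(x_split, ys)


if __name__ == "__main__":
    r_continuous, r_split = simulate_median_split()
    print(f"continuous r = {r_continuous:.3f}, median-split r = {r_split:.3f}")
```

Under bivariate normality, a median split is expected to shrink the correlation to roughly 80% of its original size, so the second value printed should be visibly smaller than the first.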
How We Got Here
The focus on design and methodology as it influences drug use/abuse research stems from my own long-standing professional interests in studies of drug etiology (Scheier, 2010) and prevention (Scheier, 2015). However, there is more to that story as well. The history of studies of drug use/abuse and the broader issues that surround studies of deviant behavior in general is rife with examples of design or methodological challenges. For one thing, it would be ethically reprehensible to experimentally assign people to become drug users; thus, we have to implement alternative designs that allow us to examine the natural developmental course of drug use. This has been a strong suit for etiology over the years, evidenced by numerous naturalistic longitudinal studies of youth at different ages, taking place in many different settings using samples with heterogeneous racial composition (Scheier, 2001). Statistically speaking, low base rates for self-reported drug use in youthful populations have also caused some consternation. Solutions have included negative binomial or Poisson regression, zero-inflated count models (Buu, Li, Tan, & Zucker, 2012), and two-part semicontinuous models (Olsen & Schafer, 2001), all techniques that are robust to highly skewed distributions (e.g., Atkins & Gallop, 2007).
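As a brief illustration of why zero-inflated count models suit low base rate data, the sketch below writes out the probability mass function of a zero-inflated Poisson distribution: a proportion of respondents are “structural” nonusers who always report zero, and the remainder follow an ordinary Poisson process. The function name, parameter names, and example values are assumptions made for illustration and are not drawn from the cited studies.

```python
import math


def zip_pmf(k: int, lam: float, pi_zero: float) -> float:
    """Zero-inflated Poisson probability of observing count k.

    lam     -- Poisson mean among respondents 'at risk' of any use
    pi_zero -- proportion of structural zeros (never-users)
    """
    if k < 0:
        raise ValueError("k must be a non-negative integer")
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # Zeros arise from structural nonusers and from at-risk youth
        # who simply reported no use in the recall window.
        return pi_zero + (1.0 - pi_zero) * poisson
    return (1.0 - pi_zero) * poisson


if __name__ == "__main__":
    # Hypothetical sample: 60% structural nonusers, mean of 2 occasions otherwise.
    for k in range(5):
        print(k, round(zip_pmf(k, lam=2.0, pi_zero=0.6), 4))
```

The excess mass at zero is modeled explicitly rather than being forced into a single Poisson mean, which is the feature that makes this family attractive for skewed self-report frequencies.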
In dealing with the prevention side of the equation, we face similar challenges. A handful of examples include dealing with attrition and selection effects (Hill, Rosenman, Tennekoon, & Mandal, 2013), noncompliance in treatment studies (Jo, Ginexi, & Ialongo, 2010) as well as prevention trials (Jo, 2002; Little & Yau, 1998; Stuart, Perry, Le, & Ialongo, 2008), assessing fidelity (e.g., Hill, Maucione, & Hood, 2007; Lee et al., 2008), and implementation concerns (e.g., Crowley, Coffman, Feinberg, Greenberg, & Spoth, 2014; Payne & Eckert, 2010), all factors that influence going to scale (e.g., Botvin & Griffin, 2010).
Despite these scientific challenges, there are numerous examples of how our colleagues’ design alternatives and methodological enhancements have become a beacon for other behavioral scientists. This includes work examining the validity and accuracy of self-report (e.g., Del Boca & Darkes, 2003), application of missing data estimation methods using the three-form planned missing design (e.g., Graham, Taylor, Olchowski, & Cumsille, 2006), advanced statistical treatment of nonresponse (e.g., Enders, 2010), addressing clustering that occurs in group-randomized trials (e.g., Murray & Hannan, 1990), alternative randomized designs (e.g., Brown & Liao, 1999), person-centered classificatory strategies to test program effects with subgroups (e.g., Lanza & Rhoades, 2013), and the decomposition of program effects using mediation analyses (e.g., MacKinnon, Taborga, & Morgan-Lopez, 2002; Scheier, Botvin, & Griffin, 2001), to name a few. To me, as important as these challenges are to both etiology and prevention, they only scratch the surface. It should be clear that there are even greater challenges that await us, and this special issue is devoted to sketching out a few of these challenges in a rudimentary, conceptually driven manner.
Article Anthology
With this in mind, the articles in this special issue address different design issues and methodological challenges that influence drug use/abuse research. In the first article, Mason and colleagues bring into question the hegemony of hypothetico-deductive reasoning and offer a completely different view of the scientific reasoning that we apply to support logical inferences. Their alternative is the abductive theory of method, which has strong roots in philosophical thinking regarding theories of evidence. Unlike hypothetico-deductive reasoning, which deduces from cause to effect, the abductive theory of method works “backward” from the problem to find the best fitting explanation(s). Included in their wonderful article is a challenge to the supremacy of the RCT. For a variety of reasons, the RCT does not always offer the best experimental approach or the certitude required to resolve several issues at hand (i.e., studies of interventions seeking policy shifts or environmental interventions involving whole communities). After they remind us of the various caveats that come along with design and methods, they offer several recommendations to push the event horizon, with each suggestion building a composite of reasoning and understanding regarding why a prevention program or intervention works.
In the second article, Hansen and colleagues lay the foundation for data harmonization and manipulations that will eventually provide a means to create faux “virtual” controls. Their motivation for integrating data is the tremendous cost associated with collecting data on concurrent control students (or schools) that are part of drug prevention trials. Controls represent the counterfactual or potential outcomes that would occur in the absence of an intervention. Finding ways to reasonably and cost-effectively create synthetic comparators (Hansen, Derzon, & Reese, 2014) creates cost economies that would have ripple effects in the prevention industry, to say the least. Synthetic comparator controls are derived from extant data that can be used to evaluate already disseminated drug prevention studies or local programs where funding is not sufficient to collect data on both treated and control youth. However, we currently lack systematized strategies to engage data harmonization procedures, leading Hansen and colleagues to produce a workable framework and then test its heuristic utility. Their article elaborates several crucial considerations including the necessary steps to create “conceptual” concordance between items and scales from different studies. They also apply Random Forest imputation procedures to handle missing data and use receiver operating characteristic curves to evaluate model performance. The end result of their data harmonization efforts is to achieve parsimony and efficiency with an eye cast toward cost-effectiveness.
The past few years have seen increasing discussion, acrimonious at times, regarding the replicability crisis in psychology (Maxwell, Lau, & Howard, 2015), and this has been extended to other disciplines as well (e.g., Goodman, Fanelli, & Ioannidis, 2016; Valentine et al., 2011). A good deal of this discussion revolves around the need for clearer guidelines in reproducing study findings, particularly noting the differences between “exact” and conceptual replications and the manner in which we can discover regularity in human behavior (Stroebe & Strack, 2014). Clearly, this debate applies to intervention effects that may be considered “specious” if not replicated when moving from efficacy to effectiveness trials. Scientists have a much greater chance to detect reliable and consistent behavior patterns, as well as to test theory more judiciously, if they “pool” existing data sets, overcoming the limitations that may be ascribed to smaller underpowered studies when examined in a stand-alone manner. In the next article, Curran and colleagues highlight this important discussion over replication and then illustrate the many benefits of integrated data analysis (IDA). They typify its application by examining youth polydrug use with multiple longitudinal samples delineated based on whether a child’s parent was an alcoholic. In this particular case, they ask whether polysubstance use is theoretically and psychometrically consistent when assessed as a latent construct across several studies.
The studies share similar design profiles (i.e., following children of alcoholics longitudinally) and are all concerned with the deleterious effects of an alcoholic parent on child outcomes. However, they don’t contain the exact same observed measures of polydrug use, requiring strategies aimed at achieving data harmonization, weighting, and imputation. The fact that three studies are used provides a rigorous means to test the “idiosyncrasies” of the study, its population, design, and setting versus the staying power of the ideas being tested (i.e., theoretical representations). The authors outline the numerous strengths of IDA (i.e., stretched cohorts increasing the age span studied) and also note its limitations, because it is not a panacea in some cases (i.e., measurement differences between studies). Still, it is a very important tool in the arsenal as we combat detractors who suggest prevention science consists of a smattering of small studies, producing small effect sizes and failing to uncover reproducible behavioral patterns that demonstrate the lasting utility of our efforts.
In the fourth article, Chang and Little synthesize three design innovations: a multiform questionnaire protocol using planned missingness, visual analog scaling (VAS), and the retrospective pretest–posttest design. From an evaluation point of view, blending these design tactics is quite novel for several reasons, not the least of which is cost-efficiency. Their article includes a wonderful exposition on the fundamentals of missing data and the use of multiform protocols. The cost-efficiencies represent one advantage; reducing negative emotions from participants faced with burdensome surveys is another. The overall gains include increased power and precision in the parameter estimates. The second offering is visual analog scaling, perhaps the strongest response to the multi-item Likert-type scales (e.g., Guyatt, Townsend, Berman, & Keller, 1987), offering a surefire alternative to the categorical scale problem. Visual analog scales include extreme anchors with clear interval-level demarcations (60% is twice 30%), a property that is not available with categorical Likert-type scales (a response of 4, or “agree,” is not twice a response of 2, or “disagree”). Moreover, the VAS is perhaps more sensitive (i.e., responsive) in detecting clinically significant behavior “changes” over time (Hasson & Arnetz, 2005). The final piece in the equation is the retrospective pretest–posttest design, which also provides cost-efficiencies (eliminating the pretests entirely) and avoids response shift bias (Howard, 1980). Asking respondents at posttest how they feel “at this time” or currently and how they felt “when they started the study” (pretest) levels the playing field with respect to their “frame of reference” or the internal standard respondents use to gauge their attitudes or behavior. This approach also eliminates any bias that may arise from overestimation of one’s capabilities, behavior, or attitudes, producing a more veridical estimation.
If there is any change worth noting, respondents can answer based on their own true assessment, without having to oscillate back and forth searching for a frame of reference (respondents need “anchors” to set their mind on some internal comparison; Jacowitz & Kahneman, 1995), which is lacking in a true pretest. Failure to address response shift bias poses a threat to internal validity. The authors then bring their methodological innovations together in two excellent empirical illustrations that draw on longitudinal data.
In the fifth article, Rhew and colleagues examine program effects of a relatively large-scale community-based drug prevention intervention using inverse probability weighting (IPW) to adjust for students leaving their school catchment area during the course of the longitudinal study. No RCT can be completely pristine, and students may exit a study for a variety of reasons, some academic (i.e., performance), some based on socioeconomic factors (i.e., parent’s earning potential, job relocation), and some extraneous (i.e., neighborhood factors compel parents to move). Regardless of the prime reason for relocating, failing to address subject loss (especially if this loss differs by condition) threatens the internal validity of the study and can bias findings. Earlier multilevel analyses using the “intention-to-treat” method provided evidence of favorable intervention effects. The reanalysis using IPW considers “exposure” as a major factor in prevention outcomes. This provides a more sophisticated means of grappling with the basic tenets of prevention science (i.e., “how much dose is enough?”) and also creative ways to handle missing data. The article discusses additional strengths of the IPW approach including maintaining the integrity of the RCT design by avoiding biases that arise from postrandomization data manipulations. Overall, the authors report there are clear benefits to youth staying in the communities assigned to the experimental condition, which gave them more exposure to prevention activities and was associated with better behavioral outcomes compared to control youth who also stayed in their community.
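The logic of inverse probability weighting for attrition can be shown with a deliberately simplified sketch. It is not the model used by Rhew and colleagues: here the probability of remaining in the study is estimated as the retention rate within a hypothetical risk stratum rather than from a fitted regression model, and dropout is assumed to be random within each stratum. The data values, stratum labels, and function names are all invented for illustration.

```python
from collections import defaultdict


def completer_mean(records):
    """Unweighted mean among participants with an observed outcome."""
    observed = [y for _, y in records if y is not None]
    return sum(observed) / len(observed)


def ipw_mean(records):
    """Mean outcome with completers weighted by 1 / P(retained | stratum).

    records -- list of (stratum, outcome); outcome is None for dropouts.
    """
    totals = defaultdict(int)
    retained = defaultdict(int)
    for stratum, outcome in records:
        totals[stratum] += 1
        if outcome is not None:
            retained[stratum] += 1

    weighted_sum = 0.0
    weight_total = 0.0
    for stratum, outcome in records:
        if outcome is None:
            continue
        p_retained = retained[stratum] / totals[stratum]
        weight = 1.0 / p_retained          # completers stand in for similar dropouts
        weighted_sum += weight * outcome
        weight_total += weight
    return weighted_sum / weight_total


if __name__ == "__main__":
    # Hypothetical data: high-risk youth report more use and drop out more often.
    data = (
        [("low_risk", 1.0)] * 4
        + [("high_risk", 3.0)] * 2
        + [("high_risk", None)] * 2
    )
    print("completers only:", round(completer_mean(data), 3))
    print("IPW estimate:   ", round(ipw_mean(data), 3))
```

In this toy example the completers-only mean understates drug use because high-risk youth are underrepresented after attrition, whereas the weighted estimate restores each stratum to its original share of the sample.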
In the sixth article, Tein and colleagues tackle the perennial problem of high rates of dropout encountered with randomized community-based interventions. In their case, they offered court-mandated parenting classes to divorced parents in an effort to improve parenting skills (e.g., ability to navigate divorce with reduced parent conflict) and improve their children’s behavioral outcomes. The problem is that many parents signed up and then showed up for at most one or two classes. In some cases, parents failed to show up at all and basically dropped out of the study. The subject loss undermines power and also makes it hard to discern intervention efficacy. To develop a methodology that can resolve this issue, the authors hypothesized that some exposure to the “active ingredients” is better than nothing. However, the study design called for an active control group receiving some form of intervention. To rectify this situation, the authors applied a propensity scoring method, also using IPW, to reintroduce dropout parents back into the analytic scheme. Essentially, they demarcated parents who attended any sessions, those administered the “active control” condition, and parents in any condition who attended no sessions at all. With this postrandomization strategy in hand, they used propensity score methods to “balance” covariates between the conditions, essentially preserving the ability to make strong causal inferences regarding program effects. In so doing, they effectively reduced the high dimensionality of the covariates (one scalar weight vs. numerous covariates), balancing the different experimental groups on preexisting conditions that could confound outcomes. By reducing bias that might exist between conditions, even after program implementation, the strategy offers a modicum of precision to ascertain whether the intervention “worked,” particularly when contrasting parents with some intervention exposure (i.e., a dose effect) versus those who had nothing.
Shifting gears somewhat, in the final article in this special issue, Derzon recounts the many pitfalls we encounter conducting large-scale evaluations and provides potential remedies that we have at our fingertips. One formidable issue in evaluation science is the inclusion of “macro” or host setting measures that affect how well a program is implemented. Derzon calls these the “activities, contexts, and characteristics of the settings implementing the intervention,” but more generally, they are the “conditions of implementation.” The quandary faced by an evaluator is “what happens when we implement the same program in two different settings and obtain two different results?” In order to formulate an answer, it is essential that we first recognize that macro or contextual factors can and do affect implementation and trickle down to the level of individual behavior change. Implementation research repeatedly shows that teachers lacking “buy-in,” feeling that their training was insufficient, or believing the school organizational climate failed to support the intervention don’t deliver the same precise curriculum as teachers who possess greater passion and zeal (e.g., Pas, Waasdorp, & Bradshaw, 2015; Wanless, Patton, Rimm-Kaufman, & Deutsch, 2013). Recognizing the sheer impact of these setting-level factors can help to improve our understanding of how to maximize positive program outcomes (Durlak & DuPre, 2008). Requirements in the process of evaluating setting influences (and implementation quality) include obtaining reliable and valid measures that can be used to monitor organizational and training factors that may inhibit program delivery (i.e., moderate program effects).
Also, finding the right balance between qualitative and quantitative methods, use of existing archival data (i.e., putting a finger on the pulse of organizational factors), and implementing a sophisticated design (e.g., rapid cycle evaluations that are used by hospitals) to tease apart the myriad of different setting effects are all factors that can improve precision in estimating program effects (and intersite differences). Truth be told, if a program evaluation entails 30 schools or 30 communities, each site is really an intervention of its own, an independent “trial” with its own set of crises revolving around delivering the intervention with fidelity.
Conclusion
The articles in this special issue represent a vanguard of design and methodological approaches that are set to push the event horizon in prevention science for years to come. Importantly, the articles suggest cutting-edge techniques that can provide unique insight into the questions we pose about etiology and prevention. This is the imprimatur of those emphasizing methodology, and their mission involves providing answers to questions about very deep and profound issues, in our case the causes, consequences, and prevention of drug use/abuse. This is no easy task, as I have reiterated in several places. It should be patently clear to readers that only when we consider the challenges laid bare in the special issue articles will we ultimately avail ourselves of the tools required to pursue scientific resolutions to very important human problems. The articles in this special issue go a long way toward reinforcing that our techniques require refinement, that we need new ways of seeing things, and that the balance of our work remains in front of us.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
