Abstract
We sought to identify, examine, and summarize empirical literature focused on early childhood behavior interventions examined using single case research designs (SCD) and published between 2001 and 2018. Using systematic procedures, 28 studies that met established inclusion criteria were identified, reviewed, and compared with respect to general and social validity assessment characteristics of SCD studies on behavior interventions for young children with problem behavior. The findings of the current review suggest: (a) promoting implementation fidelity through implementation support to improve social validity outcomes, (b) providing guidelines for timing and frequency of social validity assessment, and (c) developing social validity assessment tools designed to assess each of the social validity dimensions (i.e., goals, procedures, and outcomes).
Young children who engage in behaviors that adults perceive as challenging, such as physical aggression, property destruction, tantrums, and prolonged crying, often experience negative developmental, social, and behavioral outcomes (Brauner & Stephens, 2006; Dunlap, Ester, Langhans, & Fox, 2006; Powell, Dunlap, & Fox, 2006). Because of the negative trajectory associated with these problem behaviors in young children, national organizations and researchers have emphasized providing early intervention using evidence-based practices in toddler and preschool years (Conroy, Dunlap, Clarke, & Alter, 2005; Fox, Dunlap, & Powell, 2002; Hemmeter, Snyder, Fox, & Algina, 2016; Wood, Blair, & Ferro, 2009). One factor affecting the use of evidence-based interventions is whether the interventions are socially valid (Horner et al., 2005). Many researchers have stressed that when the interventions are socially valid, they are feasible and result in long-lasting reductions of problem behavior, leading to improvement, generalization, and maintenance of child outcomes (Moes & Frea, 2002; Oono, Honey, & McConachie, 2013). Therefore, the field has emphasized the importance of promoting the use of socially valid interventions that are feasible, effective, and sustainable in typical natural settings (Gerow et al., 2018; Horner et al., 2005; Schwartz & Baer, 1991; Spear, Strickland-Cohen, Romer, & Albin, 2013).
Baer, Wolf, and Risley (1968), who introduced the concept of social validity as an essential feature of applied behavior analysis (ABA), emphasized the socially important goals and outcomes of intervention in discussing being “applied” and “effective” within the dimensions of ABA. The researchers indicated that behaviors targeted for intervention must be important to the person and society and that intervention should result in socially significant behavior change. In reaffirming the emphasis on social validity, Wolf (1978) discussed three dimensions of social validity: (a) social importance of goals (socially important dependent variable), (b) social acceptability of procedures (practical and cost-effective implementation of the independent variable; implementation of the independent variable over an extended time period by typical intervention agents), and (c) social importance of outcomes (socially important magnitude of change in the dependent variable). Researchers have employed parts or all of these dimensions of social validity in designing, implementing, and evaluating interventions. These dimensions are essential to the decision-making process regarding what behavior should be changed, how it should be changed, how much the behavior should be changed, and how we will know it was effective (Gresham & Lopez, 1996). Researchers have involved consumers (i.e., teachers and parents) in deciding intervention target behaviors and routines, selecting or modifying the intervention procedures, and assessing the acceptability of the intervention process and outcomes (Strain, Barton, & Dunlap, 2012).
Although efforts have been made to use socially valid interventions and to expand methods for assessing social validity, there are still a limited number of studies reporting social validity (Ledford, Hall, Conder, & Lane, 2016; Snodgrass, Chung, Meadan, & Halle, 2018), and there is still a need for guidelines on how to assess social validity. For example, Ledford et al. (2016) reviewed 109 single case design (SCD) studies reported in 54 articles on social skills interventions for young children with autism spectrum disorder (ASD) published between 1994 and 2013 and found that less than half of the studies reported social validity data. Snodgrass et al. (2018) found that only 26.8% (n = 115) of 429 articles using SCD published in six special education journals between 2005 and 2016 reported a social validity assessment, and that of those articles with social validity assessment information, only 6.5% (n = 7) reported all three dimensions of social validity (i.e., goals, procedures, and outcomes).
These findings concerning the lack of social validity assessment were similar to those reported in earlier reviews of social validity (Carr, Austin, Britton, Kellum, & Bailey, 1999; Conroy et al., 2005; Hurley, 2012; Kennedy, 1992), indicating that the rate of studies reporting social validity has not increased. Ledford et al. (2016) suggested that the dearth of social validity assessments in SCD studies might be due to the lack of guidance and methods for conducting the assessments, the cost and time needed to complete the assessments, and page limits set by journal policies. Ideally, social validity assessment results lead to the design and development of more feasible and effective interventions and inhibit the development of interventions that are likely to fail in the real world (Schwartz & Baer, 1991).
Accordingly, researchers have suggested several methods of assessment to address all three dimensions of social validity, such as subjective assessment, objective assessment, and social or normative comparison. Subjective assessment typically involves gathering opinions about acceptability of and satisfaction with the intervention from consumers following the intervention (Hawkins, 1991; Turan & Meadan, 2011). Rating scales (Reimers, Wacker, & Cooper, 1991; Witt & Martens, 1983), questionnaires with open-ended questions (Gresham & Lopez, 1996), interviews (Spohn, Timko, & Sainato, 1999), and focus groups (Leko, 2014) have been used to gather consumer opinions. Given that assessments are conducted by asking consumers or implementers for their opinions about the intervention, these assessment methods are considered subjective in nature even though rating scales may utilize systematic assessment methods (Hurley, 2012).
Several researchers have suggested employing objective social validity assessments (Gresham & Lopez, 1996). For example, to assess intervention acceptability, Hanley (2010) suggested directly assessing choices of intervention options with direct consumers (e.g., participating children), which could be assessed using a simultaneous treatment design. Additional objective social validity assessments include assessing consumers’ continuous use of interventions after intervention implementation support is removed, comparing the target child’s behavior change with that of typically developing peers (normative comparison), having masked raters, who are unaware of the conditions being implemented, rate target behavior change and the extent to which the procedures implemented are beneficial, and examining maintenance of behavior change (Barton, Reichow, Schnitz, Smith, & Sherlock, 2015; Ennis, Jolivette, Fredrick, & Alberto, 2013). Yet, these objective methods are seldom used (Spear et al., 2013).
In addition, not only how but also when to assess the social validity has been discussed in the literature. Although social validity has sometimes been conceptualized as an outcome, it has also been conceptualized as a process (Finney, 1991; Schwartz & Baer, 1991), suggesting that assessing the three dimensions of social validity can occur at various points in time during development, implementation, and testing. Researchers also have assessed consumers’ perceived acceptability of the identified intervention goals and procedures prior to implementing the intervention or during the intervention phase to ensure that consumers perceive the intervention procedures to be feasible and easy to implement (Strain et al., 2012). This preintervention buy-in or acceptability of intervention procedures has been discussed as a potential factor that may affect implementation fidelity; that is, if interventions are acceptable, they are more likely to be implemented with fidelity (Lane, Beebe-Frankenberger, Lambros, & Pierson, 2001; Miramontes, Marchant, Allen-Heath, & Fischer, 2011; Witt & Elliott, 1985).
However, most researchers measure social validity of goals and procedures without natural change agents before or during intervention implementation (Spear et al., 2013). In fact, researchers have paid little attention to assessing the social significance of goals and social acceptability of procedures (Gresham & Lopez, 1996; Snodgrass et al., 2018). Spear et al. (2013) reported that of the 22 SCD studies on behavior interventions for students with emotional and behavioral disorders, none met all four social validity assessment quality indicators identified by Horner et al. (2005), which reflect Wolf’s (1978) three dimensions of social validity. Spear et al. also reported that only one study included practitioner input during the intervention design. Similarly, Snodgrass et al. (2018) found none of the subgroup of articles that addressed all three dimensions of social validity assessment (n = 28) conducted all six steps of the scientific method in employing social validity assessment. Snodgrass et al. suggested that social validity assessments should employ a scientifically rigorous process that includes all six steps of a scientific method: (a) research question, (b) literature review, (c) hypothesis, (d) data collection method, (e) hypothesis testing, and (f) analysis procedures.
Previous examinations of social validity provide valuable information regarding social validity measurement, including the overall prevalence, the frequency with which researchers use different methods, and the presence of quality indicators. However, current research does not clearly delineate the extent to which researchers or practitioners make adaptations to interventions based on social validity measurement. Furthermore, there are no clear guidelines regarding how social validity assessment results should be reported. In general, the social validity assessment results are reported using a brief summary statement (Snodgrass et al., 2018). We extended the literature on social validity by systematically reviewing SCD studies on early childhood behavioral interventions. Given that none of the systematic reviews on social validity focused on behavioral interventions for children with problem behavior, we aimed to examine the general and social validity assessment characteristics of SCD studies on behavioral interventions for this population. Specific research questions were: (a) what are the common characteristics of SCD studies on behavior interventions for young children with problem behavior, (b) how prevalent is it for the three dimensions (i.e., goals, procedures, and outcomes) of social validity to be assessed and to what extent are they assessed, and (c) what types of social validity assessment methods are commonly used?
Method
Article Search
Article search procedures were performed to identify articles published between 2001 and 2018 that focused on behavior interventions for young children. The initial search involved online databases: Academic Search Premier, PsycINFO, and Web of Science. First, a key word search of abstracts was conducted using the following key words: practice, intervention, treatment, support, strategy, therapy, program, procedure, and approach, in conjunction with such key words as problem behavior and challenging behavior. Second, the following key words were searched in the full text of articles: infant, toddler, preschooler, and young child, in conjunction with disability, disabilities, and delay. The initial search resulted in 698 articles (453 from Academic Search Premier, 161 from PsycINFO, and 163 from Web of Science, minus 79 duplicated studies). In the second search phase, 36 additional studies were identified. This two-phase article search resulted in a total of 734 studies as being potentially relevant.
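The search tallies reported above can be checked with simple arithmetic. The following Python sketch is illustrative only: the variable names are assumptions introduced here, and only the counts are taken from the text.

```python
# Hits per database, as reported in the text.
database_hits = {
    "Academic Search Premier": 453,
    "PsycINFO": 161,
    "Web of Science": 163,
}
duplicates_removed = 79       # duplicated studies reported in the text
second_phase_additions = 36   # studies identified in the second search phase

# Unique records from the initial database search.
initial_unique = sum(database_hits.values()) - duplicates_removed

# Records carried forward to screening after both search phases.
total_candidates = initial_unique + second_phase_additions

print(initial_unique)    # 698
print(total_candidates)  # 734
```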
Article Selection Procedures
To select articles for final review, we used a four-step screening process with the 734 articles using the following inclusion criteria: (a) published in peer-reviewed journals, (b) included child participants who have a diagnosed disability or developmental delay, or are at risk for disability in social-emotional development due to problem behavior, (c) included at least one child aged 6 or under, (d) employed an SCD, (e) implemented an intervention to address problem behavior, (f) reported qualitative or quantitative social validity assessment data, and (g) was written in English. To determine SCD studies for analysis, three SCD features suggested by the What Works Clearinghouse (WWC; 2017) were used. The features consisted of (a) an individual case (single participant or a cluster of participants) as the unit of intervention and unit of data analysis, (b) the individual case providing its own control for purposes of comparison, and (c) repeated measurement of outcome variable within and across different conditions or levels of the independent variables.
First, the titles of the identified 734 articles were reviewed, resulting in the elimination of 48 studies. Studies that contained explicitly unrelated terms in the title, such as meta-analysis, randomized trial, validation of scale, and cohort study, were excluded. Second, review of the article abstracts resulted in the elimination of an additional 509 articles, leaving 177 articles. Third, the Method sections of each of the remaining 177 studies were reviewed, resulting in the exclusion of 62 articles. Finally, the full text of the remaining 115 studies was reviewed, during which 87 studies were excluded for the following reasons: (a) did not include child participants aged between 0 and 6 (n = 14), (b) did not report any social validity assessment data (n = 56), (c) did not include a child with problem behavior (Galensky, Miltenberger, Stricker, & Garlinghouse, 2001; Healey, France, & Blampied, 2009; Jolstead et al., 2017), (d) included typically developing children whose status of at risk for social-emotional difficulties was not confirmed using a social-emotional screening or assessment tool to further determine the child’s needs for behavior intervention (Galensky et al., 2001; Healey et al., 2009; McLaren & Nelson, 2009; Murphy, Theodore, Aloiso, Alric-Edwards, & Hughes, 2007; Rispoli et al., 2015; Sawyer, Crosland, Miltenberger, & Rone, 2015), (e) did not employ an SCD, or (f) the unit of analysis was not individuals, but rather classroom (Benedict, Horner, & Squires, 2007; Carter & Van Norman, 2010; Hemmeter, Hardy, Schnitz, Adams, & Kinder, 2015; Hemmeter, Snyder, Kinder, & Artman, 2011; Stormont, Smith, & Lewis, 2007). The percentage of studies that addressed social validity was 51.4%. At the conclusion of this four-step screening process, 28 articles were identified for in-depth analysis (Figure 1).

Figure 1. Flowchart of study selection process.
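The four screening steps described above can be traced as a running tally of remaining articles. This is a minimal sketch rather than the authors' procedure; the function name `screening_trail` is hypothetical, and only the exclusion counts come from the text.

```python
def screening_trail(start, exclusions):
    """Return (step, remaining) pairs after applying each exclusion count in order."""
    remaining = start
    trail = []
    for step, excluded in exclusions:
        remaining -= excluded
        trail.append((step, remaining))
    return trail


# Exclusion counts for each screening step, as reported in the text.
steps = [
    ("title review", 48),
    ("abstract review", 509),
    ("Method section review", 62),
    ("full-text review", 87),
]

trail = screening_trail(734, steps)
for step, remaining in trail:
    print(f"{step}: {remaining} remaining")
# title review: 686 remaining
# abstract review: 177 remaining
# Method section review: 115 remaining
# full-text review: 28 remaining
```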
Coding Procedures
General characteristics of individual studies
To identify the general characteristics of the selected 28 studies, the authors coded the following variables: (a) author and year, (b) number of child participants, (c) child gender, (d) child age, (e) type of disability or developmental delay, (f) type of intervention, (g) intervention implementer, (h) design, (i) setting, (j) reporting of treatment fidelity, (k) dependent measure, and (l) intervention implementation support (i.e., frequency, duration, and method of implementation support). These variables were analyzed to gather information about the child participants’ characteristics, evidence-based or promising behavior interventions used for young children, the contexts under which the behavior intervention was implemented, and study design and behaviors targeted to evaluate the intervention outcomes.
Analysis of social validity assessments
Based on previous review studies on social validity (Hurley, 2012; Snodgrass et al., 2018; Strain et al., 2012), 10 variables were coded: (a) inclusion of a research question about social validity, (b) inclusion of literature review on social validity, (c) presence of measurement of three dimensions of social validity (goal, procedures, outcomes), (d) social validity assessment method (questionnaire–validated scale, questionnaire–author modification of validated scale, questionnaire–self-developed, questionnaire–self-developed with open-ended questions, interview, blind rating, normative comparison), (e) type of assessment tool (self-developed, validated tool, adapted from validated tool), (f) response method (verbal, paper, email, observation), (g) frequency of social validity assessment, (h) social validity assessment respondents (direct consumer, indirect consumer, immediate community member, extended community member), (i) data reporting method (summary statement only, raw data, descriptive statistics, parametric statistics, qualitative), and (j) presence of intervention revision based on feedback from social validity assessment participants.
For social validity assessment respondents, coding categories were developed based on the categories discussed by Schwartz and Baer (1991). Direct consumers are individuals who are directly involved in the intervention, and indirect consumers are individuals who are strongly affected by the effects of intervention. Immediate community members are individuals who interact with direct and indirect consumers, and extended community members are individuals who may never have direct contact with intervention consumers.
Interrater Reliability
Interrater reliability was assessed during article screening and selection and during coding. During the fourth step of the selection process, review of full text, the authors applied the inclusion and exclusion criteria to the remaining 115 articles under consideration for in-depth data analysis. Interrater reliability was calculated as a percentage of agreements by dividing the number of agreements by the number of possible agreements. The interrater reliability in selecting the final 28 articles was 98.3%. Discrepancies between coders were resolved through discussion and consensus to reach 100% agreement. The first author coded the selected 28 articles, and an undergraduate student in special education who was naïve to the purpose of the review and who received training on using the coding form independently coded 32.1% (n = 9) of the articles selected at random. The interrater reliability for coding was 92.3% (range = 88.9%-100%) across variables. The disagreements were resolved to 100% agreement.
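The percentage-of-agreement calculation described above can be sketched as follows. The function name `percent_agreement` and the example decisions are hypothetical: the text reports 98.3% agreement on 115 articles but does not state the number of agreements, so the 113 matching decisions used here are an assumption chosen only because they are consistent with that figure.

```python
def percent_agreement(coder_a, coder_b):
    """Percentage of items on which two coders made the same decision."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must rate the same non-empty set of items.")
    agreements = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * agreements / len(coder_a)


# Hypothetical full-text decisions for 115 articles (assumed, not from the source).
coder_a = ["exclude"] * 87 + ["include"] * 28
coder_b = list(coder_a)
coder_b[0] = "include"    # assumed disagreement 1
coder_b[-1] = "exclude"   # assumed disagreement 2

selection_agreement = round(percent_agreement(coder_a, coder_b), 1)
print(selection_agreement)  # 98.3

# Share of the 28 reviewed articles that were independently double-coded (n = 9).
double_coded_share = round(100.0 * 9 / 28, 1)
print(double_coded_share)  # 32.1
```

Simple percent agreement does not correct for chance agreement; it is shown here only because it is the index the text describes.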
Results
Characteristics of Individual Studies
Table 1 presents descriptive characteristics across studies. The total number of child participants was 74, including 47 boys and 21 girls; no gender information was provided for six children. Age range was 2 to 7 years. Of the 28 studies, 11 (39.3%) included children with ASD and nine studies (32.1%) included children who had a developmental delay in language, communication, or speech development. Four studies (14.3%) included children with developmental delays whose information concerning the developmental domains associated with the delays was not provided. Three studies (10.7%) included children with language disabilities. Five studies (17.9%) included children at risk for disabilities. The most commonly used intervention was positive behavior support (25.0%, n = 7), implemented at home or in the classroom. The second most frequently used interventions were functional communication training (17.9%, n = 5) and function-based intervention (14.3%, n = 4), followed by mindful parenting, self-management or self-monitoring, and visual support, each with 7.1% (n = 2). Other interventions (21.4%, n = 6) included antecedent-based intervention, Parent–Child Interaction Therapy, family-centered Prevent-Teach-Reinforce, and embedded self-determination practices. Most studies provided insufficient information on intervention frequency and duration, with only 35.7% (n = 10) reporting the total duration of intervention, which ranged from 4 days to 52 weeks.
Table 1. Characteristics of Studies.
Note. ASD = autism spectrum disorders; ADHD = attention deficit hyperactivity disorder; PTR = Prevent-Teach-Reinforce; TA = teacher assistant; MBL = multiple baseline; PB = problem behavior; RB = replacement behavior; NS = not specified; ID = intellectual disability; FBI = function-based intervention; FCT = functional communication training; CB = challenging behavior; AB = appropriate behavior; NR = not reported; PBS = positive behavior support; CP = cerebral palsy; UT = use of turtle technique; DD = developmental delay; SS = session; EI = early intervention; SV = social validity; PCIT = parent-child interaction therapy; AT = alternating treatment; SI = social interaction; FCR = functional communication response; IR = independent request; SLD = speech/language disorder; GS = graduate student; K = kindergartners; FA = functional assessment; E = engagement; C = Compliance.
In half of the studies (50.0%, n = 14), classroom teachers were involved as implementer or co-implementer. Families (parents or other members) were involved as implementer in 12 studies (42.9%). Other implementers included practicum student, researcher, graduate student, instructor, early intervention provider, and other staff (paraprofessional and counselor). The most common design to evaluate the interventions was multiple baseline design (75.0%, n = 21), followed by withdrawal design (21.4%, n = 6). One study (3.6%) employed a combination of alternating treatments and multiple baseline designs. With the exception of two studies, all studies were conducted in the classroom (46.4%, n = 13), at home (family routines; 39.3%, n = 11), or both at home and in other settings (7.1%, n = 2; home and classroom or home and community). The other two studies were conducted at a clinic or an early intervention center within a university. Provision of ongoing implementation support to implementers during intervention after initial training varied across studies. In most studies (67.9%, n = 19), implementation support was provided 1 to 2 times during the intervention phase, on a weekly or biweekly basis, 5 to 45 min per session. However, nine studies (32.1%) did not provide specific information about the ongoing implementation support, only provided initial training before intervention implementation, or had the implementer subjectively self-monitor their implementation of the intervention procedures. The most commonly used implementation support methods were in vivo coaching and performance feedback delivered through verbal, graphical, or both verbal and graphical methods.
Assessment of Three Dimensions of Social Validity
Table 2 provides the analysis results on the prevalence and the extent to which the three dimensions of social validity were assessed in behavior intervention studies for young children with problem behavior. The results indicated that 11 studies (39.3%) included a research question on social validity, and five studies (17.9%) discussed social validity literature. Among the three dimensions, fewer than half of the studies (46.4%, n = 13) involved consumers in determining the intervention goals and procedures, and four studies (14.3%) only addressed goals before intervention implementation; this information has been omitted from Table 2 due to space limitations. In examining whether all three dimensions were formally assessed, it was found that only 25.0% (n = 7) of the studies provided the social validity assessment outcomes for goals, whereas 78.6% (n = 22) of the studies provided the assessment outcomes for acceptability of the intervention procedures and 85.7% (n = 24) of the studies provided the assessment outcomes for acceptability of intervention outcomes.
Table 2. Characteristics of Social Validity Assessment.
Note. SV = social validity; AMVS = Author Modification of Validity Scale; BR = blind rating; TARF-R = Treatment Acceptability Rating Form–Revised; DS = descriptive statistics; QR = qualitative report; SD = self-developed; OEQ = open-ended questionnaire; VS = validated scale; TEI-SF = Treatment Evaluation Inventory–Short Form; ARP-R = Assessment Rating Profile–Revised; IRP = Intervention Rating Profile; RD = raw data; SS = summary statement only; BIRS-A = Behavior Intervention Rating Scale–Adapted version; TAI = Therapy Attitude Inventory; SUPS = Subjective Units of Parenting Satisfaction; NA = not applicable; G = goal; P = procedure; O = outcome; Q = questionnaire; TAS = Treatment Acceptability Survey; C = consumer.
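The dimension-level percentages reported in this section follow from dividing each study count by the 28 reviewed studies. The sketch below illustrates that calculation; the helper name `share_of_studies` is hypothetical, and the counts are those stated in the text.

```python
TOTAL_STUDIES = 28  # number of reviewed studies reported in the text


def share_of_studies(count, total=TOTAL_STUDIES):
    """Percentage of studies, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


# Number of studies reporting assessment outcomes for each dimension (from the text).
dimension_counts = {"goals": 7, "procedures": 22, "outcomes": 24}

dimension_shares = {
    dimension: share_of_studies(count)
    for dimension, count in dimension_counts.items()
}
print(dimension_shares)
# {'goals': 25.0, 'procedures': 78.6, 'outcomes': 85.7}
```

Because a single study can assess more than one dimension, these shares are not expected to sum to 100%.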
Commonly Used Social Validity Assessment Methods
The results indicated that subjective consumer questionnaires were the most frequently used social validity assessment method (89.3%, n = 25). Of these 25 studies, 48.0% (n = 12) used a self-developed questionnaire, 12.0% (n = 3) used a self-developed questionnaire with open-ended questions, 12.0% (n = 3) used a questionnaire with author modification of a validated scale, 24.0% (n = 6) used a questionnaire with a validated scale, and 4.0% (n = 1) used only open-ended questions without rating scale items (Brock & Beaman-Diglia, 2018). Six studies (21.4%) conducted an interview (Bailey & Blair, 2015; Blair, Fox, & Lentini, 2010; Fettig, Barton, Carter, & Eisenhower, 2016; Hancock, Kaiser, & Delaney, 2002; McCoy, Morrison, Barnett, Kalra, & Donovan, 2017; Park & Scott, 2009). Some researchers used objective measures such as masked ratings (using videos with masked observers) or a normative comparison. Three studies used masked ratings (Duda, Clarke, Fox, & Dunlap, 2008; Duda, Dunlap, Fox, Lentini, & Clarke, 2004; Dunlap et al., 2006), and one study used a normative comparison, which involved comparing target child behavior with peer behavior (Zimmerman, Ledford, & Barton, 2017). One study used both subjective and objective methods, namely a questionnaire and masked ratings (3.6%; Bailey & Blair, 2015).
Self-developed survey tools were the most common tools used to measure social validity (60.7%, n = 17). Other studies used a validated tool (25.0%, n = 7) or a modified or adapted version of a validated tool (10.7%, n = 3), such as the Treatment Acceptability Rating Form–Revised (TARF-R; Reimers et al., 1991), Treatment Evaluation Inventory–Short Form (TEI-SF; Kelley, Heffer, Gresham, & Elliott, 1989), Therapy Attitude Inventory (TAI; Brestan, Jacobs, Rayfield, & Eyberg, 1999), Subjective Units of Parenting Satisfaction (Stanley & Averill, 1998), and Behavior Intervention Rating Scale–Adapted (BIRS-A; Elliott & Treuting, 1991). Using pencil and paper was the most common response method (85.7%, n = 24). Survey through email was found in one study (Strand & Eldevik, 2018). Observation (Zimmerman et al., 2017), verbal (Fettig et al., 2016), and audiotape recording (Blair et al., 2010) were also used. The frequency of social validity assessment ranged from 1 (67.9%, n = 19) to 8 (3.6%, n = 1). Three studies (10.7%) reported assessing social validity at two different time points, and three studies (10.7%) reported assessing it at three different time points during the SCD study. One study reported assessing social validity at one or two time points, depending on the participant. In a 10-year longitudinal study (Lucyshyn et al., 2007), social validity was assessed at eight different time points over the course of the study.
Direct consumers participated in the social validity assessments of 23 studies (82.1%). Direct consumer roles varied across studies and included parents, teachers, and the child. Indirect consumers participated in five studies (17.9%). Immediate community members such as early intervention providers participated in the social validity assessment of one study (3.6%). Extended community members such as naïve observers participated in assessing social validity in four studies (14.3%). Six studies (21.4%) included multiple evaluators in social validity measurement. Direct and indirect consumer assessment was identified in three studies, direct consumer and extended community member in two studies, and direct consumer and immediate community member in one study. The primary method used to report social validity results was descriptive statistics (71.4%, n = 20). Four studies (14.3%) provided descriptive statistics with qualitative information, and two studies (7.1%) reported descriptive statistics with raw data. Four studies (14.3%) reported only summary statements. Five studies (17.9%) reported the results using a qualitative method. No studies reported a modification or adjustment of the intervention procedures during intervention based on the social validity assessment results.
Discussion
This study examined the general and social validity assessment characteristics of SCD studies that addressed problem behavior in young children who had disabilities or developmental delays or were at risk for disabilities. The focus was to provide recommendations for practice and future research in addressing social validity.
Major Findings and Implications
With regard to the first research question, common characteristics of SCD studies on behavior interventions for young children, the results revealed a relatively large number of SCD studies evaluated evidence-based or promising behavior interventions for young children with problem behavior. It was encouraging that the vast majority of behavior interventions were implemented in the home or early childhood classroom settings by parents (family members) or classroom teachers for young children with diverse needs. Due to space constraints, specifics on behavioral outcomes and implementation fidelity scores for the individual studies are not provided herein, but all reviewed studies reported positive outcomes for the participating children, and all studies with information on implementation fidelity reported high levels of fidelity. These findings support existing evidence that natural change agents can be effective in implementing research-based interventions to address problem behavior in young children (Conroy et al., 2005). Given that more than 90% of the reviewed studies reported positive social validity outcomes, the interventions evaluated in the studies appeared to be acceptable to the natural change agents and effective. Providing implementation support to natural change agents during intervention is imperative to ensure the interventions are being implemented as designed with fidelity and result in positive outcomes. However, the current review offered little evidence that the implementers (natural change agents) received effective implementation support after initial training prior to the intervention. Nine studies (32.1%) did not report the provision of implementation support during intervention implementation, and of these studies, most lacked clear information on the frequency and duration of the support.
With regard to the second research question, the prevalence and extent to which the three dimensions of social validity (goals, procedures, outcomes) were addressed in the behavior intervention literature for young children, the results confirm that the proportion of studies addressing all three dimensions remains low regardless of the target populations and interventions, as found in previous reviews on social validity (Hurley, 2012; Ledford et al., 2016; Snodgrass et al., 2018). Of the reviewed studies, only seven addressed all three dimensions of social validity over the course of the study (Bailey & Blair, 2015; Blair et al., 2010; Duda et al., 2008; Duda et al., 2004; Dunlap et al., 2006; McDaniel & Flower, 2015; Smith, Lewis, & Stormont, 2011). The current study focused on examining whether the researchers not only socially validated the goals and procedures before or during intervention but also assessed them after intervention. It was found that among the three dimensions, the reporting rate was lowest for assessment of intervention goals, both before (or during) and after intervention. Based on the results of the analysis, one probable reason that all three dimensions of social validity are rarely assessed in the field is the lack of validated social validity assessment tools that clearly distinguish the three dimensions.
Concerning the third research question, commonly used social validity assessment methods, although the majority of the reviewed studies used validated social validity assessment tools, none of the tools were designed to assess all three dimensions of social validity. For example, based on a factor analysis, Kelley et al. (1989) reported that the nine-item TEI-SF with a 5-point scale loaded on two factors, “acceptability” and “ethical issues/discomfort”; no items were designed to assess acceptability of “goal.” Similarly, the 10-item TAI, a parent-report scale, is designed to assess only two aspects of intervention: process and outcome (Brestan et al., 1999; Eyberg, Edwards, Boggs, & Foote, 1998); again, the third dimension, goal, is omitted. Likewise, the Behavior Intervention Rating Scale (BIRS) is designed to assess the intervention’s acceptability, effectiveness, and time to effect (Elliott & Treuting, 1991); it does not consider the goal of the intervention. This suggests that developing social validity assessment tools designed to assess all three dimensions might be a way to promote addressing all three dimensions of social validity in designing and implementing effective and feasible interventions and conducting high-quality research.
Although the social validation of intervention goals and procedures can be conducted at the beginning of, during, and after the intervention, the results of the study indicate that, as consistently found and discussed in the social validity literature (Hurley, 2012; Snodgrass et al., 2018), the researchers of behavior interventions for young children with problem behavior also tend to assess social validity after termination of intervention. This implies that the researchers and practitioners missed the opportunity to work with the consumers to shape or refine the goals and procedures throughout the intervention. Even though nine studies (32.1%) in the current review measured social validity more than once, no study reported that the intervention procedures were modified based on consumer feedback during intervention. Given that many researchers in behavioral interventions have consistently argued that social validation of goals and procedures should be used for intervention modifications and implementer training to improve outcomes (Miltenberger, 1990; Schwartz & Baer, 1991; Strain et al., 2012), providing a guideline for timing and frequency of social validity assessment may encourage researchers and practitioners to address all three dimensions of social validity in working with natural change agents in designing and implementing interventions.
We also examined the people who participated in social validity assessments and the types of social validity assessment methods used in the early childhood problem behavior literature. Most of the studies involved direct consumers (interventionists) in assessing social validity, and only four studies (12.1%) involved naïve (blind) observers who could provide more objective information about the intervention outcomes than direct consumers. In addition, only one study used a normative comparison method, in the form of peer comparison (Zimmerman et al., 2017), and only two studies assessed social validity with target children. Given that young children with disabilities have limited developmental skills to respond to subjective social validity assessment, using a normative comparison might be useful in assessing social validity of interventions with target children, as suggested by Hanley (2010). The results of the study also indicate that the reporting methods for social validity assessment results ranged from a summary statement only to reporting the results for each item, which corroborates the findings from a previous systematic review on social validity (Snodgrass et al., 2018). The primary reporting method was using descriptive statistics to summarize the results. Although several studies (n = 7) assessed all three dimensions of social validity, no studies provided separate assessment results for each dimension. Considering that simply describing the results with a brief summary statement or providing information on individual items without integrating the findings does not provide sufficient information for decision making regarding the social validity of an intervention process and outcomes, future researchers should consider providing sufficient information on the social validity assessment results for readers to judge the social validity of any intervention.
Limitations and Conclusion
The research studies reviewed in this article offer evidence that researchers have actively promoted implementation of socially valid evidence-based or promising behavior interventions in natural home and classroom environments for young children with problem behavior. However, a few limitations should be considered when interpreting the results and planning future research. The reviewed studies were limited to studies that used SCDs, excluding group design studies. Furthermore, although databases and reference reviews were used extensively to select the studies, it is likely that the authors overlooked articles that should have been included in the analysis. In addition, due to the limited number of studies reviewed, a more in-depth analysis, which would have allowed the authors to examine additional variables that might moderate social validity outcomes, was not conducted.
Findings from the present review extend the contributions from previous comprehensive or systematic reviews of social validity assessment in SCD studies, which evaluated social competence interventions for preschool children (Hurley, 2012), social skills interventions for young children with ASD (Ledford et al., 2016), and intervention research in special education (Snodgrass et al., 2018). We focused on studies concerning behavior intervention for children with problem behavior published between 2001 and 2018 and provided specific characteristics of individual studies and social validity assessment procedures and outcomes reported in each study. Given that the use of socially valid interventions that are feasible, effective, and sustainable in natural settings is essential to address problem behavior in young children, assessing social validity on the front end may be beneficial. This will allow natural change agents to work with others to make adjustments to the intervention goals and procedures so that they are more meaningful and feasible, and the change agents receive ongoing implementation support as needed. Although providing clear guidelines on how to incorporate social validity in all aspects of intervention may help practitioners effectively work with the typical change agents, the current status of social validity assessment in the field suggests the need for developing a social validity assessment instrument designed to assess all three dimensions of social validity. Researchers and practitioners should address all three dimensions of social validity in designing and evaluating interventions and improving intervention quality and outcomes.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
