Abstract
Evidence is accumulating that early intervention can be effective in improving the skills of young children with autism spectrum disorder. However, the science is hampered by the lack of agreed “gold standard” tools for the measurement of progress and outcome. What is required is a reliable, valid, and sensitive measure of change in the core domains of autism, which can be undertaken blind to group and time. This article explores the use of a promising measure of change, for which reliability, validity, and sensitivity to change over a lengthy period have been previously demonstrated. Pilot data indicate that, despite some sensitivity to change over a short period of time, it does not capture treatment effects more effectively than an existing diagnostic tool. Future directions for the ongoing search are suggested, including consideration of how to achieve sensitivity to differential change as well as to change over time.
Introduction
Autism spectrum disorders (ASD) are neurodevelopmental, lifelong conditions diagnosed using a set of behavioral criteria (American Psychiatric Association, 2013; World Health Organization, 1994), and characterized by persistent deficits in social communication and social interaction, along with restricted, repetitive patterns of behavior, interests, or activities. Children with ASD are typically diagnosed between 2 and 6 years of age (Boyd, Odom, Humphreys, & Sam, 2010; Dababnah, Parish, Turner Brown, & Hooper, 2011). Around 1% of children are recognized at school age to have ASD (Baird et al., 2006), though recent estimates from the United States have risen to near 1.5% (Centre for Disease Control, 2014). The spectrum encompasses individuals with a wide range of intellectual and developmental abilities. In the past decade, there has been an increase in ASD intervention research, with concurrent improvement in the methodological quality of these studies (Charman, 2011; Magiati, Tay, & Howlin, 2012; National Research Council, 2001; Rogers & Vismara, 2008). For young children, the focus of specific intervention approaches is usually on the core social-communication deficits, often using strategies to enhance the quality of reciprocal interaction with parents and caregivers, enabling them to support children’s development of effective communication (Kasari et al., 2005; Oono, Honey, & McConachie, 2013).
It is well recognized that progress in research on effectiveness of early intervention is hampered by the use of highly variable outcome measures (Bolte & Diehl, 2013; Cunningham, 2012; Matson, 2007; Spence & Thurm, 2010; Wolery & Garfinkle, 2002). There are a number of reasons for this state of affairs. First, studies vary in the primary goal of intervention: overall function, reducing particular ASD difficulties, improving child functional skills, or enhancing quality of life for the child and/or family. Each of these implies different intervention strategies and outcome measures. Second, some studies opt to measure proximal outcomes (i.e., close to the target of a particular intervention approach), which generally show better intervention effects than distal (i.e., more functional and generalized) outcomes. However, an emphasis on proximal outcomes can be misleading in relation to the impact of interventions in day-to-day living. The third area for consideration for intervention researchers has to do with external validity. The dilemma here is that subjective (particularly family-reported) measures are those with the greatest external validity, as it is the experience of children and families that interventions particularly want to improve; however, such ratings are prone to expectation and placebo effects within interventions, as parents can rarely, if ever, be blinded. A final challenge is to determine outcome measures that are responsive to change but also stable over the different settings that children experience. Sensitivity in measures is often limited in studies involving heterogeneous samples of children by floor and ceiling effects (floor effects when children are not ready to change, and ceiling effects when they have already mastered the skill). For all these reasons, the search for an appropriate early intervention outcome measure in autism is complex.
Even if the focus of an intervention is on early social communication, an outcome measure needs to combine a series of characteristics which are hard to reconcile. It needs to be directly observed or assessed, to minimize subjectivity, unlike the existing tools which measure progress using parent report (e.g., Cohen, Schmidt-Lackner, Romanczyk, & Sudhalter, 2003; Stone, Coonrod, Pozdol, & Turner, 2003). Given the limited length of some intervention trials (e.g., Kasari, Paparella, Freeman, & Jahromi, 2008; McConachie, Randle, Hammal, & Le Couteur, 2005; Rogers et al., 2012), it should be sensitive enough to pick up small changes in behavior but also robust enough to be readministered in a fairly short period of time.
There are some apparently suitable assessment measures for key skills such as language and adaptive behavior; however, difficulties in these areas are not specific to individuals with autism, and the tools’ validity in ASD may not have been formally explored (Cunningham, 2012; Hudry et al., 2010). To measure the core aspects of autism, one solution might be to use an autism-specific diagnostic tool such as the recently updated Autism Diagnostic Observation Schedule–2 (ADOS-2; Lord et al., 2012). However, expecting a change in the overall diagnostic status may not be reasonable, as has been shown in recent reports of randomized controlled trials (RCTs) of early intervention (Dawson et al., 2010; Green et al., 2010), and researchers have explored alteration of scoring rules in an attempt to increase sensitivity (e.g., Aldred, Green, & Adams, 2008; Green et al., 2010). Furthermore, diagnostic tools such as the ADOS are not designed to show a change over a few months, even with the newer calibrated scores of severity (Gotham, Pickles, & Lord, 2009). Therefore, to be useful, it is likely that the measured behaviors will target change in specific subcomponents of the social-communication and repetitive behaviors assessed in diagnosis.
Directly assessed, specific measures of early communicative and social behaviors are available, such as the Communication and Symbolic Behavior Scales–Developmental Profile (CSBS-DP; Wetherby & Prizant, 2002) and the Early Social Communication Scales (ESCS; Seibert, Hogan, & Mundy, 1982). However, they were developed and standardized on typically developing (TD) young children (CSBS-DP to 24 months, ESCS to 30 months), and are thus likely to create a ceiling effect in scores for many preschool children with ASD.
The same objection of a narrow age-band of applicability applies to many coding schemes for child social-communication skills in the context of directly observed parent–child interaction (e.g., Adamson, Bakeman, Deckner, & Romski, 2009; Feldman, Greenbaum, Mayes, & Erlich, 1997; Kasari, Freeman, & Paparella, 2006; Wan et al., 2012). Other studies bypass this issue by creating novel, bespoke codes for evaluation of direct observation (McConachie et al., in press). However, there are a number of disadvantages. Bespoke coding schemes tend to measure skills very close to the intervention target, and thus fail to evaluate generalization of treatment effect. Without adequate investigation of the measurement properties of a novel coding scheme, a finding of lack of significant change could be attributable either to a lack of intervention effect or to insensitivity of the measure. Finally, the use of novel schemes prevents comparison of results across studies.
This article presents pilot data on the use of a directly observed measure of social interaction skills in young children with ASD, which attempts to address many of the limitations of other measures cited above. The tool used is the Social Orienting Continuum and Response Scale (SOC-RS; Mosconi, Reznick, Mesibov, & Piven, 2009). This tool is an objective, directly observed measure of behavioral change. It addresses the core features of autism in the social-communication domain and was developed for use with an ASD sample from the outset. Its sensitivity to change over a short period of time and to differential change between groups is as yet untested and will be the focus of this investigation.
SOC-RS
The SOC-RS focuses on social orienting behaviors, which are key early emerging impairments in autism. The SOC-RS was developed with a longitudinal sample of children with ASD, aged 2 years at Time 1 and seen again 2 years later at Time 2 (Mosconi et al., 2009). These were matched with two cross-sectional cohorts of chronological-age and gender-matched TD children, one cohort for each time point. The measure uses the standard assessment procedure of the ADOS (Lord, Rutter, DiLavore, & Risi, 1999) as a source of behavioral data. Videos of a full Module 1 or Module 2 ADOS are observed and scored by a trained observer according to the SOC-RS handbook (Mosconi, Fletcher-Watson, McConachie, Reznick, & Piven, 2010). Five behaviors are scored for frequency: social referencing, response to joint attention, initiation of joint attention, orienting to name, and social smiling. The SOC-RS scoring system takes into account the number of opportunities to produce a behavior, not just the behavior frequency alone. For example, a child can only “orient to name” when someone calls his or her name.
The principal output of the SOC-RS is the Social Orienting Composite, calculated by combining z scores for each observed behavior. The SOC-RS can also produce a combined joint attention score, which incorporates both responses to and initiation of joint attention.
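As a rough illustration of the composite described above, the following Python sketch standardizes each item across children and sums the resulting z scores for each child. It is a minimal sketch and not the scoring procedure from the SOC-RS handbook: the function names, the choice to standardize against the pooled sample's own mean and standard deviation, and the three-child data set are all assumptions made for illustration.

```python
from statistics import mean, stdev

ITEMS = ["referencing", "social_smiling", "orienting_to_name",
         "ja_initiation", "ja_responding"]

def z_scores(values):
    """Standardize a list of scores against its own mean and sample SD."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def social_orienting_composite(scores_by_item):
    """Sum each child's z scores across the five items.

    scores_by_item maps item name -> list of per-child scores,
    with children in the same order for every item.
    """
    z_by_item = [z_scores(scores_by_item[item]) for item in ITEMS]
    return [sum(child_z) for child_z in zip(*z_by_item)]

# Hypothetical scores for three children (illustrative only, not study data).
example = {
    "referencing": [10, 20, 30],
    "social_smiling": [1, 2, 3],
    "orienting_to_name": [0, 1, 2],
    "ja_initiation": [2, 4, 6],
    "ja_responding": [1, 3, 5],
}
composite = social_orienting_composite(example)
print(composite)
```

With these hypothetical values every item standardizes to -1, 0, and 1, so the three composites are -5, 0, and 5. Because z scores depend on the reference sample, composites computed this way are only comparable within the sample used for standardization.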
Good inter-rater reliability for each behavior (all intra-class correlations [ICCs] > .78) has been reported, as well as convergent validity between the Social Orienting Composite and the socialization subscale of the Vineland Adaptive Behavior Scale (VABS; Sparrow, Cicchetti, & Balla, 2005; evidence from Mosconi et al., 2009). Mosconi and colleagues also found that the measure discriminated between the ASD and TD samples, both at each time point and in terms of pattern of change over time. These promising data indicate that the SOC-RS may be a good candidate as a sensitive measure of change in key behaviors for preschool children with autism.
In this new study, we, therefore, aimed to determine whether the SOC-RS was sensitive to differential change under two specific conditions. First, does the SOC-RS pick up change over a shorter period of time (7 months rather than 2 years) than in the original SOC-RS development study (Mosconi et al., 2009)? Second, does the SOC-RS measure a substantial treatment effect when comparing groups who have and have not shown behavioral change in response to an intervention?
Method
Design
Our method in this study was to identify a known between-group treatment effect to ascertain whether a new tool, the SOC-RS, would be sensitive enough to detect that effect. We, therefore, compare children predefined as exhibiting progress, or not, in social and communicative development over the period of interest. The groups are small to maximize the difference between the two samples (“extreme groups”). This allows us to model whether the SOC-RS is capable of measuring a differential change between groups, which we can be confident already exists.
Participants
Participants were 20 preschool-aged children (mean age = 36.1 months) with a diagnosis of ASD, confirmed by the Autism Diagnostic Interview–Revised (ADI-R; Lord, Rutter, & Le Couteur, 1994) and the ADOS (Lord et al., 1999). The children were divided into two groups (n = 10 per group; see Table 1), labeled progress and no progress, based on whether they had exhibited improvement in social-communication behaviors over a period of 7 months. The selection process is detailed below. Groups were matched at Time 1 on VABS composite score, MacArthur Communicative Development Inventory (MCDI) score, and ADOS social and communication score. There were no group differences in the average time that the children were visible on camera during each of their ADOS assessments.
Table 1. Characteristics of Selected Groups of Children.
Note. Reduction = less impaired. ADOS = Autism Diagnostic Observation Schedule (Lord, Rutter, DiLavore, & Risi, 1999); CI = confidence interval; VABC = Vineland Adaptive Behavior Scales, Composite (Sparrow, Cicchetti, & Balla, 2005); MCDI = MacArthur Communicative Development Inventory (Fenson et al., 1993); CSI = Communication and Social Interaction algorithm score.
Ratio of children receiving a Module 1 versus a Module 2 ADOS at Time 2. All children received a Module 1 at Time 1.
Participant Selection
The children were selected from an original sample of 51 children involved in a controlled intervention study (see Note 1; McConachie et al., 2005). In that study, the parents of children with ASD, aged 24 to 48 months, had either immediate or delayed access to the “More Than Words” course (Sussman, 1999), a group parent training program that aims to improve parents’ skills in interacting and communicating with their child. The intervention was found to have a positive effect on the immediate-access group relative to the (delayed-access) controls, increasing children’s vocabulary and parents’ use of facilitative interaction strategies. Following the 7-month follow-up assessment, control group parents attended the course, and assessment was conducted after a further 7 months.
We selected children according to whether they exhibited substantial progress or not after their parent(s) attended a “More Than Words” course (see Table 1). Progress was measured by reduction in the child’s score on the ADOS social and communication algorithm (no progress range = +8 to −1; progress range = −15 to −3) and secondarily using vocabulary increase as measured by the MCDI (Fenson et al., 1993; no progress range = −5 to +26 words; progress range = 0 to +265 words).
Procedure
Among other measures, at each assessment visit the child took part in an ADOS (Lord et al., 1999) administered by a trained researcher using standardized assessment procedures and toys. The films (about 30 min per participant, per time point) of these ADOS assessments were then used as the raw data for SOC-RS ratings collected in this study.
Scoring the SOC-RS
The SOC-RS codes behaviors in five categories. The rater codes the number of instances of the following behaviors for all times when the child’s head is visible on screen:
Referencing: episodes in which the child looks at another person’s face.
Social Smiling: events in which the child smiles at an adult with the clear intention of sharing emotion with them.
Orienting to Name: episodes in which the child’s name is called. The ratings distinguish between episodes in which the child responds or does not respond.
Joint Attention Initiation: episodes in which the child directs another person’s attention to an object, for the purposes of sharing attention in that object.
Joint Attention Responding: episodes in which the child is given a joint attention cue by another person. As above, ratings distinguish between episodes in which the child responds or does not respond.
Ratings
Two raters were trained on the SOC-RS using guidelines and advice provided by the measure’s originator, Dr. M. Mosconi. Both raters were blind to time and to group status. Based on a sample of 10 tapes (25% of the full sample), the average of inter-rater ICCs across all five measures of the SOC-RS was .855, with a range from .707 (Orienting to Name) to .966 (Joint Attention Initiation).
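The text does not state which form of the intra-class correlation was used. Purely as an illustration of how agreement between two raters can be quantified, the Python sketch below computes one common variant, the two-way random-effects, absolute-agreement, single-rater ICC, often written ICC(2,1). The choice of variant, the function name, and the ratings are assumptions and are not taken from the study.

```python
from statistics import mean

def icc_2_1(ratings):
    """Two-way random-effects, absolute-agreement, single-rater ICC.

    ratings: list of rows, one per tape; each row holds one score per rater.
    """
    n = len(ratings)       # number of tapes
    k = len(ratings[0])    # number of raters
    grand = mean([x for row in ratings for x in row])
    row_means = [mean(row) for row in ratings]
    col_means = [mean(col) for col in zip(*ratings)]

    # Partition the total sum of squares into tapes, raters, and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Hypothetical frequency counts from two raters on three tapes (not study data).
identical = [[1, 1], [2, 2], [3, 3]]
offset = [[1, 2], [2, 3], [3, 4]]  # rater 2 always counts one more event
icc_identical = icc_2_1(identical)
icc_offset = icc_2_1(offset)
print(icc_identical, icc_offset)
```

In this variant a constant offset between raters lowers the coefficient even when the two raters rank the tapes identically, which is why absolute-agreement forms are often preferred for frequency counts.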
Analysis
The SOC-RS produces continuous variables, based on simple frequency counts, for each of the five items listed above. In addition, a Social Orienting Composite can be constructed by converting each continuous item score into a z score, and then summing these. Two items can also be used to provide categorical scores. For Orienting to Name, categorization is based on the press number at which the child responds, using only the specific ADOS presses for scoring (and ignoring other instances in which the child’s name is used). A dichotomous responder/non-responder code can be calculated for Joint Attention Responding based on whether or not the child responds to one of the early ADOS presses, using eye-gaze and head turn only. Children are scored as “responders” if they respond to one of the early presses, or “non-responders” if they do not respond until later, when pointing cues are added.
Each of these variables was calculated and explored in our analysis. We used ANOVAs on each item score and on the Social Orienting Composite to look for the main effects of time and interactions between group and time. The former result tests whether the SOC-RS is sensitive to change over a short (7-month) period, and the latter reveals whether the SOC-RS is sensitive to differential intervention effects. In addition, chi-square tests were employed to identify differential distributions of participants at each time point in Orienting to Name and Joint Attention Responding categories. Finally, we used correlations to check for relationships between the SOC-RS scores and other established measures, as a test of validity.
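For a design with two groups and two time points, the Group × Time interaction in a mixed ANOVA is equivalent to an independent-samples t test on change scores (Time 2 minus Time 1), with F = t². The Python sketch below illustrates that equivalence. The function name and the six-child data set are hypothetical and are not taken from the study.

```python
from math import sqrt
from statistics import mean, variance

def interaction_from_change_scores(t1_a, t2_a, t1_b, t2_b):
    """Return (t, F, df) for the Group x Time interaction in a 2 x 2 mixed design.

    Uses a pooled-variance t test on change scores; F equals t squared,
    with 1 and df degrees of freedom.
    """
    change_a = [post - pre for pre, post in zip(t1_a, t2_a)]
    change_b = [post - pre for pre, post in zip(t1_b, t2_b)]
    na, nb = len(change_a), len(change_b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(change_a)
                  + (nb - 1) * variance(change_b)) / df
    t = (mean(change_a) - mean(change_b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, t ** 2, df

# Hypothetical scores for two groups of three children (not study data).
t_stat, f_stat, df = interaction_from_change_scores(
    t1_a=[1, 2, 3], t2_a=[2, 4, 6],   # change scores: 1, 2, 3
    t1_b=[1, 2, 3], t2_b=[1, 2, 4],   # change scores: 0, 0, 1
)
print(t_stat, f_stat, df)
```

Framing the interaction as a comparison of change scores also makes clear why its power depends on the variability of individual change within each group, not only on the difference in mean change.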
Results
Effects of Group and Time
ANOVAs on individual SOC-RS items revealed a main effect of Time for Referencing rate only (see Table 2). In addition, we found moderate effect sizes for the joint attention responding and joint attention total items, though these effects did not reach significance. These effects of time compare favorably with the already-established change in MCDI scores, but have smaller effect sizes than for the ADOS social-communication algorithm score. There were no significant interactions between Time and Group for the individual items.
Table 2. Effects of Time on SOC-RS Scores, With MCDI and ADOS for Comparison.
Note. SOC-RS = Social Orienting Continuum and Response Scale; MCDI = MacArthur Communicative Development Inventory; ADOS = Autism Diagnostic Observation Schedule; CI = confidence interval; JA = Joint Attention; CSI = Communication and Social Interaction.
Reduction = less impaired.
A main effect of time was seen for the Social Orienting Composite, with a moderate effect size (d = 0.577), though the interaction of Time and Group failed to reach significance (p = .077; see Table 3 and Figure 1). The mean change (between Time 1 and Time 2) in the Social Orienting Composite was 0.079 for the no progress group and 0.543 for the progress group. The difference in the degree of change between the two groups corresponds to an effect size of d = 0.79, a large effect, indicating that with a larger sample an interaction might be apparent. This is illustrated in Figure 1, showing the Time 1 and Time 2 Social Orienting Composite scores by group. Once again, effect sizes for the Social Orienting Composite are greater than for the MCDI but smaller than that captured by the ADOS social-communication algorithm score.
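To give a sense of what a larger sample might mean here, the sketch below applies the standard normal-approximation formula for comparing two groups, in which the number needed per group is roughly 2(z for alpha + z for power)² / d². The two-sided alpha of .05 and the 80% power target are assumptions chosen for illustration and are not stated in the study. The sketch also assumes that the reported d was computed as the difference in mean change divided by a pooled standard deviation of change scores. The approximation slightly understates what a t-based calculation would give.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided, two-group comparison."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 / d ** 2)

# Values reported in the text for the Social Orienting Composite.
mean_change_diff = 0.543 - 0.079   # progress minus no progress
reported_d = 0.79

# Pooled SD of change implied by those values (an inference, not a reported figure).
implied_pooled_sd = mean_change_diff / reported_d
required_n = n_per_group(reported_d)
print(round(implied_pooled_sd, 3), required_n)
```

Under these assumptions, roughly 26 children per group would be needed to detect an interaction of this size, compared with the 10 per group analyzed here.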
Table 3. Effects of Group and Time on SOC-RS, MCDI, and ADOS Scores.
Note. SOC-RS = Social Orienting Continuum and Response Scale; MCDI = MacArthur Communicative Development Inventory; ADOS = Autism Diagnostic Observation Schedule; JA = Joint Attention; CSI = Communication and Social Interaction.
Lower score = less impaired.

Figure 1. The interaction between Group and Time for Social Orienting Composite score.
Categorical Differences
Categorical scores were created for both the Orienting to Name and Joint Attention Responding items, in accordance with the SOC-RS manual, and as described above. Chi-square tests indicated significant between-group differences in Joint Attention Responding at Time 2 only, χ² = 7.5, p = .006, and in Orienting to Name at Time 2 only, χ² = 4.34, p = .037. This indicates that these categorical variables did detect differential change over time between groups.
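The text reports neither cell counts nor degrees of freedom for these tests. As a hedged check, the Python sketch below shows how a Pearson chi-square statistic is obtained from a 2 × 2 table and how a statistic with 1 degree of freedom converts to a p value. The cell counts are hypothetical, and the assumption of 1 degree of freedom is an inference: it is the value under which both reported statistics reproduce the reported p values.

```python
from math import erfc, sqrt

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def p_value_df1(chi2):
    """Upper-tail p value for a chi-square statistic with 1 degree of freedom."""
    return erfc(sqrt(chi2 / 2))

# Hypothetical counts (not study data): rows = groups,
# columns = responder / non-responder.
chi2_example = chi_square_2x2(8, 2, 2, 8)
p_example = p_value_df1(chi2_example)

# Statistics reported in the text, converted assuming 1 degree of freedom.
p_joint_attention = p_value_df1(7.5)
p_orienting = p_value_df1(4.34)
print(chi2_example, round(p_joint_attention, 3), round(p_orienting, 3))
```

With only 10 children per group, expected cell counts can fall below 5, in which case an exact test is a common alternative to the chi-square approximation.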
Relationships Between the SOC-RS and Other Measures
To check the measure’s validity, a series of correlational analyses were performed to examine how SOC-RS scores related to the other measures collected: Vineland Adaptive Behavior Scales, Composite (VABC), MCDI, and ADOS social-communication algorithm (with a Bonferroni correction to p = .004). Only the Joint Attention Total produced correlations with other measures, as shown in Table 4.
Table 4. Correlations Between SOC-RS Items (Combining Time Points and Groups) and Other Measures.
Note. SOC-RS = Social Orienting Continuum and Response Scale; JA = Joint Attention; VABC = Vineland Adaptive Behavior Scales, Composite; MCDI = MacArthur Communicative Development Inventory; ADOS = Autism Diagnostic Observation Schedule.
Individual Data
One strength of the small size of the groups under analysis is the possibility of exploring individual data. Figure 2 shows degree of change by individual participant in the Social Orienting Composite, ADOS social-communication algorithm, and MCDI expressive vocabulary score. The SOC-RS and MCDI scores have been transformed (multiplied by 10 and divided by 10, respectively) to facilitate their being represented on the same y axis. This illustrates the significant variability among children in the degree to which they exhibit change according to different measures. The SOC-RS scores do not always correspond to the pattern set by the MCDI and ADOS when initially identifying the two groups. At least three children in the no progress group appear to be exhibiting a large degree of improvement measured by the SOC-RS (Participants 4, 6, and 7), and others who were preidentified as having made progress do not have this reflected in their SOC-RS scores (Participants 17 and 20).

Figure 2. Individual data showing degree of change between Time 1 and Time 2 in ADOS, MCDI, and SOC-RS scores.
Discussion
The SOC-RS development study (Mosconi et al., 2009) suggested that this new measure had potential to be used as a sensitive measure of treatment effect in intervention studies with young children with ASD, using blinded rating of behavior observed during assessment with the ADOS and focusing on key social targets of intervention. In this “proof-of-concept” study, the SOC-RS was used to rescore previously collected data from a sample of 20 children with ASD whose parents had taken part in an intervention. These children were identified a priori as having exhibited either a very small or very large change in their behavior between assessments at Time 1 and Time 2 approximately 7 months apart. This degree of change was identified using change in score on the ADOS social-communication algorithm and additionally by change in vocabulary using the MCDI. It was thought that comparison of these two extreme groups provided a good initial test of the suitability of the SOC-RS as an outcome measure. Our specific research questions were, first, whether the SOC-RS picks up change over a shorter period of time (7 months rather than 2 years) than in the original development study and, second, whether it measures a substantial treatment effect when comparing groups who have and have not shown behavioral change in response to an intervention.
In regard to the first question, the Social Orienting Composite score and one individual item (Referencing Rate) were sensitive to change over time—even though the period being studied was only 7 months, compared with 2 years in the original development study (Mosconi et al., 2009). When considering detection of treatment effect, again there were some significant results. Children in the progress group were more likely at Time 2 (and not at Time 1) to be categorized as “Responders” during both the joint attention and response to name presses of the ADOS. This effect indicates that the categorical scores which can be extracted from the SOC-RS may be more sensitive to differential change than the continuous variables which comprise the overall composite score.
Nevertheless, overall the SOC-RS was not sensitive to group by time interactions, and thus does not seem suitable as a measure of treatment effect. This lack of sensitivity is particularly striking, given the fact that our selection of groups means that we can be confident that a differential between-group effect over time does exist. Even when considering effect sizes for non-significant findings (given the small sample used in this study), we find that the SOC-RS scores are no more sensitive to change than the ADOS social-communication algorithm used in our original definition of the progress/no progress groups. Overall, evidence was not found that the SOC-RS could become a useful new measure of intervention efficacy.
One possible reason for this lack of sensitivity is that the SOC-RS draws on footage from a standardized assessment procedure (the ADOS), which may limit variability in the range of spontaneous social-communication behaviors observed. In addition, it measures only five dimensions of social and communication behavior. A more comprehensive measure assessing a broader range of spontaneous social and communication skills could have a greater chance of detecting subtle differences in change over time between groups.
Limitations of the Study
This proof-of-concept study analyzed data from a very small sample of children with ASD. As early intervention trials gain in methodological quality, it is becoming increasingly rare for studies to work with samples this small; therefore, it could be argued that this was an unfair test of the SOC-RS’s measurement sensitivity or suitability for use in intervention trials. However, our preselection of groups showing progress and no progress over the period being studied means that we can be confident that a treatment effect does exist, and that the SOC-RS failed to identify it sufficiently. Intervention and control groups in trials normally are found to have overlapping distributions of change scores, and some control group participants may make substantial progress. Therefore, it is crucial that we can be confident that the chosen measurement tool is capable of detecting small but meaningful treatment effects, when studies are appropriately powered.
There is evidence of a difference in age between the two groups analyzed in this report, with a mean difference between groups of about 5 months. We considered whether the lack of SOC-RS effects could be partially attributable to this age difference. However, the original development of the SOC-RS and its use with a longitudinal sample studied at 2 and 4 years of age suggest that age effects should not have played a major part in our findings.
Strengths and Limitations of the SOC-RS
Are we closer to finding a reliable and valid early intervention outcome measurement tool in autism? The SOC-RS was selected for investigation because it meets some of the requirements. The behaviors targeted represent a concise summary of the key social and communicative skills, which are impaired in very young children with ASD, and there is no additional burden on the child where ADOS is already being used in assessment before and after intervention. The SOC-RS has been shown in its development study (Mosconi et al., 2009) to be a valid and reliable approach to directly measuring skill change over time in young children.
However, this current investigation using an extreme groups approach did not find that the SOC-RS showed greater sensitivity to change than the ADOS diagnostic social-communication algorithm in these key proximal intervention outcomes. Given that the SOC-RS uses the ADOS as a sample for coding, this finding has strong pragmatic implications. In clinical practice, where resources are limited, the time needed to derive SOC-RS codes from ADOS footage may not add value to evaluations of longitudinal individual change or intervention effect. The time-intensive nature of the process of training in the SOC-RS further limits its practicality as both a research and a clinical tool. Finally, the SOC-RS, by using the ADOS as a data sample, does not provide a completely naturalistic measure of social and communication behaviors.
Future Directions
So the search continues. One approach currently in development by Lord and colleagues is the Brief Observation of Social and Communication Change (BOSCC, formerly known as the Autism Diagnostic Observation Schedule–Change [ADOS-C]; Carr, Colombi, MacDonald, & Lord, 2011; Colombi, Carr, MacDonald, & Lord, 2011). This measure applies a subset of ADOS-style ratings, with an extended scale range, to play-based interaction between an adult and the young child with ASD. The use of this more naturalistic setting may provide the key to sensitive measurement of treatment effect, by permitting the child to engage in a wide range of spontaneous behaviors, rather than constraining their activities. The conclusion of the extensive work required to develop and test a new objective change-measurement tool is eagerly awaited by the autism research community.
A recent systematic review of the measurement properties of tools which monitor progress and outcomes of children below the age of 6 years with ASD has found significant limitations in the scope of available tools, while presenting evidence on those which appear the most robust (McConachie et al., in press). Significant development work on measurement of outcomes is still required to advance the field of early autism research and to allow meta-analysis across studies to support the strength of conclusions about which are the most effective interventions.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by Flexibility and Sustainability Funding awarded by Northumberland Tyne & Wear NHS Foundation Trust.
