Review of Methods to Equate Target Sets in the Adapted Alternating Treatments Design

Abstract

The adapted alternating treatments design is a commonly used experimental design in skill acquisition research. This design allows for the evaluation of two or more independent variables on responding to unique target sets. Equating target sets is necessary to ensure a valid comparison of the independent variables. To date, there is little guidance on best practice when equating target sets and it is unclear how researchers have done so previously. We reviewed the reported methods used to equate target sets in articles published using the adapted alternating treatments design in five behavior-analytic journals. Just over half of the studies published using the adapted alternating treatments design reported any method to equate target sets and the methods varied considerably. Alternative methods, such as random assignment, were prevalent. Considerations for best practice and avenues for future research are discussed.

Keywords

adapted alternating treatments design, experimental design, logical analysis, single-subject designs

The adapted alternating treatments design (AATD) is a commonly used single-subject experimental design for instructional research (Sindelar et al., 1985). Despite sharing the same design logic as the alternating treatments design (Barlow & Hayes, 1979), the AATD is used when the target behavior is non-reversible; that is, when the dependent variable is unlikely to return to baseline levels after the independent variable is withdrawn. As a result, the AATD is commonly used in comparative studies examining the efficiency of acquisition under two or more conditions (Holcombe et al., 1994; Wolery et al., 2014).

In the AATD, two or more independent variables are compared, typically to determine which procedure is most efficacious or efficient in promoting levels of responding that reach the mastery criterion (see Holcombe et al., 1994). Control is demonstrated through differentiated levels of responding between conditions or by including a pre-treatment baseline phase or control condition. An initial baseline phase allows for comparisons to be made between each condition and responding prior to intervention, yet evaluating the effects of the independent variable on the dependent variable in the AATD, as in the alternating treatments design (ATD), requires differentiated levels of responding across conditions. As a result, including a no-treatment control condition may reduce threats to internal validity such as history or carryover effects (Wolery et al., 2014). The AATD may also be embedded in other experimental designs to increase experimental control, such as the multiple-probe design (Horner & Baer, 1978). Incorporating additional baseline phases, control conditions, or experimental designs may increase, but does not ensure, the validity of experimental findings when using the AATD.

The defining feature of the AATD is the assignment of unique target sets to each independent variable. This feature reduces the likelihood of multiple treatment interference, a common threat to the internal validity of the alternating treatments design (Barlow & Hayes, 1979). Specifically, acquisition under one condition should not influence acquisition under another. Take, for example, learning another language. Learning the Italian word for dog (cane) should have no effect on the acquisition of the Italian word for rat (ratto). As with the ATD, the rapid alternation between conditions requires that responding come under control of the independent variables being manipulated by the experimenter and not other sources of control (Barlow & Hayes, 1979). This may require counterbalancing other experimental variables such as requiring therapists to vary across conditions or ensuring that the timing or sequencing of conditions alternates unsystematically. Because unique target sets are assigned to each condition in the AATD, similar counterbalancing is required to ensure that the target sets do not vary in difficulty.

Equating target sets for difficulty is paramount to the validity of the AATD (Holcombe et al., 1994; Sindelar et al., 1985). Difficulty is defined as the rate at which a set of target responses would be acquired should the same independent variable be applied to the relevant stimulus sets (Holcombe et al., 1994). When steps are not taken to ensure that target sets are sufficiently equated, it may not be possible to conclude whether any changes in the dependent variable are a result of the targeted independent variable(s) or a difference in target difficulty. Because the difficulty of each target set usually is not assessed experimentally and may differ based on participants’ history, it is commonly recommended that experimenters use and report methods of logical analysis to equate targets (Sindelar et al., 1985; Wolery et al., 2014). Sindelar et al. (1985) recommend that experimenters equate stimuli by considering baseline rates, conducting post-hoc verification, and using logical analysis. Unfortunately, the authors do not further describe these methods of equating stimuli and the articles cited as examples were never published. The lack of clarity about what constitutes a logical analysis may be concerning as it is unclear how researchers have equated target sets when using this design. Only recently have additional descriptions of logical analysis been provided.

Wolery et al. (2014) provided example components of a logical analysis using textual stimuli. The authors offered that targets may be equated based on (a) the number of syllables, (b) configuration of the words, (c) initial consonant, (d) part of speech, (e) redundant letters across words, (f) the participant’s knowledge of the words, and (g) the participant’s ability to emit the words. To the authors’ knowledge, the examples provided by Wolery and colleagues (2014) are the only described criteria available for conducting a logical analysis. This is surprising as the AATD was first described over 30 years ago. In addition, other single-subject designs, such as the parallel treatments (Gast & Wolery, 1988) and random stimulus (Matson & Ollendick, 1981; Matson et al., 1983) designs, also require target sets to be of equal difficulty, yet no method for equating has been described. In the absence of guidelines for executing logical analyses, it is unclear how past research has equated target stimuli.

Additional questions regarding the effective use of logical analyses remain. For example, in the description provided by Wolery et al. (2014), it is unclear whether the target stimulus or response should be equated. Because textual stimuli are used in the example, certain components of the logical analysis may involve controlling for features of both the antecedent stimulus and target response. Specifically, because the textual operant includes the antecedent and response sharing point-to-point correspondence (Skinner, 1957), the initial consonant, configuration of the word, and redundant letters are all topographical features of the stimulus, which is not the case for other operants. For example, when the target stimulus is a picture of a cat, none of the variables identified above are physically depicted in the picture as they would be if the target stimulus were the printed word Cat. Equating both the antecedent and response seems logical, yet it is unclear whether researchers have used logical analysis to equate both the stimulus and response components of a learning trial. Moreover, because the methods and criteria for equating stimuli have been inadequately described, there is little information regarding how researchers are using logical analyses, if at all. The current study presents a review of articles published in five behavior-analytic journals that report using an AATD. We describe publication trends in the use of the AATD, identify the components of logical analysis and alternative methods described by authors for equating target sets, and offer additional considerations for the valid use of the AATD.

Method

Article Identification

The focus of this review is on the methods used to equate targets across instructional sets in studies that used an AATD. Studies were identified through a search of five journals that publish applied behavior analytic research, including Journal of Applied Behavior Analysis, The Analysis of Verbal Behavior, Behavior Analysis in Practice, Behavior Modification, and Behavioral Interventions. Studies were identified via a search of each journal using the key terms adapted alternating treatments design and adapted alternating treatment design. This search occurred on July 8, 2018. We also reviewed each edition of the journals since 1985 to identify publications using the AATD that may have been missed by the initial search and conducted a forward citation search using the original article by Sindelar et al. (1985). No articles beyond the initial search were identified using these methods. We then examined each article to ensure that the AATD was used and excluded those that did not use the AATD from the review. Combined single-subject designs were included only if the AATD was used in combination with another design (e.g., multiple-probe design; Horner & Baer, 1978).

Methods to Equate Target Sets

Studies identified in the initial search varied in methods used to equate targets across sets. Data were collected on the presence of the components of logical analysis as described by Wolery et al. (2014). This included controlling for (a) the number of syllables, (b) overlapping sounds, (c) first sounds, (d) novelty or knowledge, and (e) the participant’s ability to emit the target response. Each individual method was coded if the authors endorsed assigning stimuli into separate sets using the given criteria. Data were also collected on methods of logical analysis other than those described by Wolery et al. (2014). These included controlling for (a) the number of letters in the target response, (b) the number of motor responses required to emit the response, and (c) visual properties of the antecedent stimulus (e.g., similar colors or shapes being assigned to separate conditions). Examples and non-examples of these methods are provided in Table 1. Articles were identified as having used logical analysis if any of the above-described methods were identified in the reviewed article. If an article endorsed ensuring that the participants were able to emit the target response, but no other method of equating target sets, the article was not coded as having used logical analysis. We also recorded whether alternative methods to logical analysis were used to assign targets to conditions, as also described by Wolery et al. (2014). These included assigning targets based on normative data (e.g., all targets were selected from a third-grade level oral reading passage), counterbalancing target sets across participants, random assignment, or expert ratings of target difficulty. Readers also recorded whether a target list was provided in the manuscript and whether an initial baseline phase was conducted.

Table 1.

Recommended Methods of Logical Analysis.

Method	Guideline	Examples^a	Non-examples
Number of syllables in each condition	Separate conditions should include a similar number of total syllables.	Condition 1 (2)^b: Cat and Horse Condition 2 (2): Dog and Rat	Condition 1 (2): Cat and Horse Condition 2 (5): Dog and Alligator
Number of syllables in individual targets	Individual targets across conditions should be equated for syllable length so that no condition includes targets with a discrepant number of syllables.	Condition 1: Ladybug (3) and Butterfly (3) Condition 2: Kangaroo (3) and Gorilla (3)	Condition 1: Ladybug (3) and Butterfly (3) Condition 2: Lobster (2) and Alligator (4)
Overlapping sounds
First sound	Targets with the same, or similar, first sound should be assigned to separate conditions.	Condition 1: Cat Condition 2: Cup	Condition 1: Cat Condition 2: Run
Middle sound	Targets with the same, or similar, middle sound should be assigned to separate conditions.	Condition 1: Cat Condition 2: Tap	Condition 1: Cat Condition 2: Mouse
End sound	Targets with the same, or similar, ending sound should be assigned to separate conditions.	Condition 1: Cat Condition 2: Hut	Condition 1: Cat Condition 2: Sun
Rhyme	Targets that rhyme should be assigned to separate conditions.	Condition 1: Cat Condition 2: Rat	Condition 1: Cat Condition 2: Sun
Novelty	Individual targets should be assessed to identify whether other relations are already acquired (e.g., when training tacts, assess the listener relation).	Condition 1: Monarch butterfly Condition 2: Yellow Bumble Bee	Condition 1: Monarch butterfly Condition 2: Danaus plexippus
Visual properties	When the antecedent is a visual stimulus, aspects of the stimulus that are shared between targets should be assigned to separate conditions.	Color, shape, background, shared features
Number of motor responses	Motor responses sharing a similar number of steps or movements.	Condition 1: Place hand on wall, then rub Condition 2: Place hand on knee, then pat	Condition 1: Place hand on wall, then rub Condition 2: Extend pointer finger

Note.

^a Examples include targets that should be equated and assigned to different conditions.

^b The number of syllables is shown in parentheses.
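
The two syllable guidelines in Table 1 can be checked mechanically once syllable counts are available. The following is a minimal sketch under stated assumptions: the function names and the tolerance parameter are introduced here for illustration only, and the syllable counts are entered by hand from Table 1 rather than computed.

```python
# Illustrative sketch only; function names and `tolerance` are assumptions,
# not procedures reported in the reviewed studies.
# Syllable counts are entered by hand from Table 1, not computed.

def set_totals(sets):
    """Return the total number of syllables in each condition."""
    return {name: sum(targets.values()) for name, targets in sets.items()}

def totals_equated(sets, tolerance=0):
    """True if condition totals differ by no more than `tolerance` syllables."""
    totals = set_totals(sets).values()
    return max(totals) - min(totals) <= tolerance

def targets_equated(sets):
    """True if every condition has the same profile of per-target syllable counts."""
    profiles = [sorted(targets.values()) for targets in sets.values()]
    return all(profile == profiles[0] for profile in profiles)

# Example and non-example rows for "Number of syllables in individual targets".
example = {
    "Condition 1": {"ladybug": 3, "butterfly": 3},
    "Condition 2": {"kangaroo": 3, "gorilla": 3},
}
non_example = {
    "Condition 1": {"ladybug": 3, "butterfly": 3},
    "Condition 2": {"lobster": 2, "alligator": 4},
}
```

With these inputs, the example passes both checks, whereas the non-example passes the set-total check (six syllables per condition) but fails the per-target check. This is why the two guidelines are listed separately in Table 1.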

Target Responses and Publication Details

Data were collected on the specific operant targeted in each study. Target skills included listener responses, intraverbals, tacts, mands, echoics, behavior chains, textuals, visual matching, and multiple operants. Multiple operants were recorded if more than one operant was targeted in a single study. We also recorded the year of publication and the journal name for all articles.

Interobserver Agreement

A second reader independently scored 33.8% of the articles. Interobserver agreement (IOA) was then calculated for individual articles via an item-by-item comparison of scores by dividing the number of agreements by the number of agreements plus disagreements, and multiplying by 100. Mean IOA was 96.6% (89.5–100%).
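
The item-by-item agreement calculation described above can be sketched as follows. The function name and the reader codes are hypothetical and are shown only to illustrate the arithmetic; they are not data from this review.

```python
def item_by_item_ioa(primary, secondary):
    """Percent agreement: agreements / (agreements + disagreements) * 100."""
    if len(primary) != len(secondary):
        raise ValueError("Both readers must score the same items.")
    if not primary:
        raise ValueError("At least one scored item is required.")
    agreements = sum(a == b for a, b in zip(primary, secondary))
    disagreements = len(primary) - agreements
    return 100 * agreements / (agreements + disagreements)

# Hypothetical codes for one article (1 = item reported, 0 = not reported).
reader_1 = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
reader_2 = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1]

# Nine agreements and one disagreement across ten items.
ioa = item_by_item_ioa(reader_1, reader_2)
```

Mean IOA would then be the average of these per-article values across the articles scored by both readers.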

Results

Figure 1 depicts the cumulative number of articles published using the AATD by year. A total of 65 articles were published using the AATD between 2006 and 2017 in the reviewed journals. The first article published using the AATD in the reviewed journals was in Behavioral Interventions in 2006, with no additional articles using this design until 2009. After 2011, the number of articles published in Journal of Applied Behavior Analysis using the AATD began to increase, with a total of 36 articles published, the most of any journal. A total of 13 articles were published in The Analysis of Verbal Behavior, eight articles in Behavioral Interventions, six articles in Behavior Analysis in Practice, and two articles published in Behavior Modification.

Figure 1.

The cumulative number of articles published using the AATD in the reviewed journals from 2006 to 2017.

Figure 2 presents the percentage of articles using each criterion of logical analysis. Of the 65 articles published using the AATD, 36 (55.4%) used at least one method of logical analysis to equate target sets. Of these articles, 24 (66.7%) equated target sets based on the number of syllables, 18 articles (50.0%) equated sets based on overlapping sounds, and 10 articles (27.8%) did so based on the participant’s ability to emit the target response. An additional five articles (13.9%) equated sets based on the first sound, and one article (2.8%) equated sets based on the novelty of the response. Also presented in Figure 2 are methods of logical analysis reported in the reviewed articles, but not described by Wolery et al. (2014). Seven articles (19.4%) equated target sets based on the number of letters, seven articles (19.4%) equated stimuli based on visual properties, and five articles (13.9%) equated the target responses based on the number of motor responses.

Figure 2.

Percentage of articles using each method of logical analysis, among articles reporting at least one method of logical analysis. Methods not described by Wolery et al. (2014) are denoted with an asterisk.

Figure 3 shows the percentage of articles that used methods of logical analysis by the number of methods used. Of the 36 articles published using logical analyses, 25 used more than one method for equating targets. Thirteen articles used two methods of logical analysis, most commonly controlling for number of syllables and overlapping sounds (six articles). Ten articles used three methods, the most common of which were the number of syllables, overlapping sounds, and participants’ ability to emit the target response (four articles). Finally, two articles used four methods of logical analysis, both of which controlled for the number of syllables, overlapping sounds, and first sound. The fourth method varied between the two articles and included controlling for the participants’ ability to emit the target response (Haq et al., 2017) or the visual features of the stimuli (Grow et al., 2014).

Figure 3.

Percent of articles reporting at least one method of logical analysis displayed by the total number of methods.

Alternative methods to logical analysis for assigning targets to conditions were common and occasionally combined with logical analysis procedures. Of the 65 articles, seven (10.8%) selected targets from a pool of norm-referenced stimuli. Two of these articles did not use any other method of logical analysis. Ten articles (15.4%) counterbalanced the assignment of target sets across participants, five of which did not use any method of logical analysis for equating target sets. Random assignment was the most commonly used alternative method for assigning targets to conditions, used in 23 articles (35.4%), 12 of which did not include any method of logical analysis. Three additional articles (4.6%) described having an expert rate the difficulty of target sets, with only one also using some method of logical analysis. Target lists were included in 35 articles (53.8%), 15 of which did not report using logical analysis. Finally, 59 articles (90.8%) included an initial baseline phase.

The types of operants targeted in the reviewed studies are shown in Figure 4. Across the reviewed studies, 20 articles targeted multiple operants (30.8%), 12 articles targeted listener responses (18.5%), 11 articles targeted intraverbals (16.9%), and seven articles targeted tacts (10.8%). Echoics, behavior chains, textuals, mands, and visual matching were each reported in four or fewer studies.

Figure 4.

Percent of articles targeting each operant.

Discussion

We reviewed the use of the AATD across five behavior-analytic journals. Although first described in 1985, the first article using this design in the behavior-analytic literature was not published until 2006, with no additional studies until 2009. The cumulative number of articles published using the AATD across these journals began steadily increasing in 2010 with the Journal of Applied Behavior Analysis publishing the greatest number.

Despite the growing use of the AATD, just over half of the articles used logical analysis to equate stimuli. Of those that used logical analysis, 11 articles did so using just a single method with two articles using four methods to equate stimuli. The most commonly used method for equating target sets was controlling for the number of syllables. It is unclear whether this includes the total number of syllables in each condition or individual targets. This distinction is necessary as two sets may contain the same number of syllables, yet they may include individual targets that vary considerably in the number of syllables (see Table 1). The number of syllables is likely related to difficulty; however, other variables are also relevant. Take for example, history with specific targets. If “monarch butterfly” was assigned to one condition and the species name for the same insect, “danaus plexippus,” was assigned to another, the former target would likely be acquired in less time. Although these targets share the same number of syllables, many have encountered “monarch butterfly,” but not “danaus plexippus.” To ensure equal difficulty, combining methods of logical analysis may be necessary.

The most frequently combined methods for equating target sets in the reviewed studies were the number of syllables and overlapping sounds; yet, the effect of overlapping sounds on acquisition is not known. Overlapping sounds may include the first sound (onset), middle sound, end sound, or rhyming sounds (Table 1). The effect of the same first sound, which was used as an equating criterion in five articles in the current review, is not known. It is unclear whether sets that include targets sharing the same first sound may have a facilitative or detrimental effect on acquisition. Without additional evidence, it is likely best practice to assign targets with the same first sound to separate conditions. Further, similar sounds, such as /m/ and /n/ or /p/ and /b/, might also be assigned to unique conditions. In their description, Wolery et al. (2014) suggested equating based on initial consonants instead of initial sounds. It is unclear how these authors equated target words that did not begin with consonants, such as apple or eat. That is, perhaps apple and pear would be assigned to different conditions as the first consonant is “p” for both targets. Moreover, “eat” and “toe” would be assigned to different conditions. Future research might identify whether the initial consonant, sound, or both should be considered when using logical analyses.
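
The initial-consonant reading discussed above can be made concrete with a short sketch. It is an assumption-laden illustration: the function names are hypothetical, the check operates on letters rather than sounds, and it treats every letter other than a, e, i, o, and u (including y) as a consonant.

```python
VOWEL_LETTERS = set("aeiou")  # Assumption: "y" is treated as a consonant.

def first_consonant(word):
    """Return the first consonant letter in `word`, or None if it has none."""
    for letter in word.lower():
        if letter.isalpha() and letter not in VOWEL_LETTERS:
            return letter
    return None

def shared_first_consonant(word_a, word_b):
    """True if both words have a first consonant letter and it is the same."""
    first_a = first_consonant(word_a)
    first_b = first_consonant(word_b)
    return first_a is not None and first_a == first_b
```

Under this orthographic rule, apple and pear would be flagged for assignment to separate conditions, as would eat and toe. Because the rule compares letters, it would not flag words that share an initial sound but not an initial letter (e.g., cat and kite), which illustrates why the choice between consonants and sounds matters.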

Rhyming sounds might also be assigned to separate conditions to ensure equal difficulty. For example, researchers may commonly place words such as “pat” and “bat” in separate conditions (e.g., Cariveau et al., 2016). A facilitative or detrimental effect of rhyming words may depend, in part, on the antecedent stimulus. That is, if the target response is the vocal response “cat” and “rat,” the effects on acquisition may vary if the antecedent is a textual, visual, or auditory stimulus. Aguirre et al. (2019) evaluated the acquisition of intraverbal relations for three participants with ASD across sets of targets that differed based on the degree of overlap in the antecedent verbal stimuli. In the overlapping condition, the antecedent verbal stimuli shared some feature (e.g., “what color is a basketball” “what color is the sun”), while the nonoverlapping condition included targets with no shared features (e.g., “What do you smell with” “what says woof woof”). The authors found that nonoverlapping stimuli produced more rapid acquisition in five of the six comparisons. Future research might consider how the auditory features of the antecedent stimuli, target responses, or both may be equated for difficulty.

Prior research in experimental psychology and speech-language pathology may be informative when considering other aspects of antecedent stimuli that may affect acquisition. Phonological similarity has received some attention in these literatures, often being discussed as similarity neighborhoods (Luce & Pisoni, 1998) or phonological neighborhoods (Munson & Solomon, 2004). Luce and Pisoni (1998) define a similarity neighborhood as a “. . . collection of words that are phonetically similar to a given stimulus word” (p. 4). The authors suggest that the number of overlapping sounds (i.e., phonemes) of a given stimulus and the frequency of other words in the lexicon with similar sounds (i.e., a dense phonological neighborhood) affect accurate responding. This is often restricted to those words that differ by a single phoneme (e.g., hat, cut, and scat for the target word cat). As such, the recommendation to equate target stimuli based on overlapping sounds in logical analyses may be supported by research on phonological similarity. An additional consideration raised in descriptions of phonological neighborhoods is the effect of infrequently used words that are part of dense phonological neighborhoods. That is, acquisition of “scat” may be affected by the number of phonologically similar responses (e.g., cat, hat) in the individual’s repertoire and past exposure to the word itself. This may suggest that behavior analysts should consider the number of overlapping sounds, but also the number of stimuli already controlling responses similar to the target response and the novelty of these stimuli.

Wolery et al. (2014) referred to “the participant’s knowledge of the referent of the word” (p. 325) as one dimension that should be considered when using logical analysis. In the current review, a single study reported ensuring that stimuli were equated based on novelty. Marchese et al. (2012) reported that target materials were “objects comparable in size and novelty (e.g., toys, books, hat) that were paired based on the same number of phonemes” (p. 541). In this description, it is unclear whether the authors defined novelty based on participants’ past exposure to a stimulus or in reference to the objects being small or inexpensive items (e.g., knickknacks). Behavior analysts should consider what methods should be used to best assess novelty. Responding during an initial baseline phase may serve as one assessment of novelty. Of the reviewed studies, 90.8% included an initial baseline phase. As noted previously, this initial baseline allows for a stronger demonstration of internal validity, but would not serve as an assessment of past exposure to the stimuli. That is, a participant may have substantial exposure to some stimulus, but not emit the target response in the presence of that stimulus. Despite the added benefits, including a baseline phase does not ensure equal difficulty across target sets and additional measures of novelty may be required. Some potential methods may include asking a caregiver about a participant’s exposure to a specific stimulus (e.g., “How much have you talked to Susan about Clifford?”) or classes of stimuli (e.g., “How much does Susan know about trees?”), or specifically assessing other relations with the same target stimuli (e.g., listener, tact, or intraverbal relations). Future research should consider the potential effects of a history with particular antecedent stimuli or topography of a response on the efficiency of acquisition. Research on stimulus equivalence (Fienup & Critchfield, 2011; Sidman, 1971) and instructive feedback (e.g., Holcombe et al., 1993; Wolery et al., 1991, 2000) would suggest that certain histories may facilitate acquisition and future research should consider these histories when conducting logical analyses.

Visual properties of a stimulus should also be considered when equating target sets. Seven articles equated target sets based on visual properties; however, the methods for doing so were unclear. Cummings and Carr (2009) reported that targets were “matched in complexity (e.g., number of visual elements in a photograph)” (p. 62). Grow et al. (2014) reported equating visual stimuli by making them “as distinct as possible and using similar visual presentation (e.g., all clothing items presented in isolation on a white background)” (p. 601). Removing irrelevant features of a stimulus by placing each target on a white background is an important arrangement to ensure equal difficulty, yet additional considerations may be warranted. For example, researchers might assign stimuli that share the same color or shape (e.g., ladle and spatula) to separate conditions or, if assigned to the same condition, ensure that similarly colored or shaped stimuli are assigned to the other conditions. Future research should further delineate methods to equate visual stimuli to allow for valid comparisons and replication of procedures.

Ten of the articles using logical analyses also ensured that participants were able to emit the target response (27.8%); however, an additional nine articles described ensuring that participants were able to emit the target response without additional methods of logical analysis. Although this method was identified by Wolery et al. (2014), it is insufficient to equate targets solely based on a participant’s ability to emit a response. That is, participants may be able to emit the vocal responses “cat” and “Felis catus,” but the former response would likely be acquired more quickly than the latter. The participant’s ability to emit a response may be an important requisite for many studies of skill acquisition; however, studies may also include targets that are not currently emitted by participants, such as imitative responses. These studies may depend on the responses not currently being emitted by the participants. In these instances, controlling for the number of motor responses or the topography of a response may be critical to ensuring equal difficulty (Romer et al., 1988).

The ability to emit a response is not the only method of logical analysis that may be insufficient when used alone. In fact, it may be the case that most of the methods described in this review would be inadequate when used as a sole method of equating stimuli. It is promising that of the 36 articles that used logical analysis, 25 did so using more than one method. Nevertheless, just under half of the reviewed articles did not use logical analyses to equate target sets, which may be due to a reliance on other methods such as random assignment.

The random assignment of targets to conditions is recommended when using the random stimulus design (Matson & Ollendick, 1981; Matson et al., 1983); however, this design also requires that all targets be equated before randomly assigning them to conditions. Thus, the articles that randomly assigned targets to conditions without conducting a logical analysis may experience substantial threats to internal validity. Of the 23 articles using random assignment, 11 included target lists, with only three of these articles using some method of logical analysis. This may suggest that target lists may be provided more commonly when methods to equate target sets are unclear or poorly defined, leaving it up to the reader to determine whether the stimuli are adequately equated for difficulty.
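
One way to combine equating with random assignment is to group targets on an equating variable and then randomize within each group. The sketch below assumes syllable count as the only equating variable, which this review suggests is unlikely to be sufficient on its own; the function name, the condition labels, and the hand-entered syllable counts (drawn from Table 1) are hypothetical.

```python
import random
from collections import defaultdict

def assign_within_strata(syllables, conditions, seed=None):
    """Randomly assign targets to conditions within groups of equal syllable count."""
    rng = random.Random(seed)

    # Group targets by the equating variable (here, syllable count only).
    strata = defaultdict(list)
    for target, count in syllables.items():
        strata[count].append(target)

    assignment = {condition: [] for condition in conditions}
    for count in sorted(strata):
        group = strata[count]
        if len(group) % len(conditions) != 0:
            raise ValueError(
                f"Cannot split {len(group)} targets with {count} syllable(s) evenly."
            )
        # Shuffle within the group, then deal targets out to conditions in turn.
        rng.shuffle(group)
        for index, target in enumerate(group):
            assignment[conditions[index % len(conditions)]].append(target)
    return assignment

syllables = {
    "cat": 1, "horse": 1, "dog": 1, "rat": 1,
    "ladybug": 3, "butterfly": 3, "kangaroo": 3, "gorilla": 3,
}
sets = assign_within_strata(syllables, ["Condition 1", "Condition 2"], seed=0)
```

Whatever the random draw, each condition receives two one-syllable and two three-syllable targets, so the sets are matched on this variable while the specific targets in each set are left to chance.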

Target lists were included in the text of 36 articles, a practice that reviewers and editors may commonly recommend when the procedures used to equate target sets are inadequate. Although this review did not include an in-depth analysis of the target sets reported in text, a cursory analysis of these articles illustrates the lack of explicit equating of targets across numerous studies. Moreover, we did not evaluate target lists to determine whether the reported methods of logical analysis were accurately described or whether the target sets were equated across domains not endorsed by the authors. For example, Marchese et al. (2012) taught tacts to four children with autism. The authors reportedly equated targets based on the number of phonemes and randomly assigned these targets to one of two sets. For one participant (Michael), the targets in one condition included whistle, slinky, and octopus, while the second condition included recorder, submarine, and maraca. Although it is unclear how Marchese et al. defined a phoneme, the first condition included targets with a total of seven syllables, while the second condition included nine syllables. It is not known whether these differences in the number of syllables influenced Michael’s acquisition; yet, the condition with the greater number of syllables required three times as many sessions to reach the mastery criterion. In another example, Leaf et al. (2016) taught tacts to three children with autism and randomly assigned targets to one of three conditions. For one participant (Kim), the targets assigned to the three sets included a total of four syllables in the first condition, six syllables in the second condition, and seven syllables in the control condition. Kim met the mastery criterion in the same number of sessions across the two test conditions and did not emit any correct responses during the control condition. Because of the differences in the number of syllables, it is unclear whether the second condition would have reached the mastery criterion more rapidly had methods to equate the target sets been used. Moreover, as the control condition included more syllables than either test condition, comparisons between the test conditions and the control are likely inappropriate, as acquisition in the control condition may have been disproportionately affected by target difficulty. Future research might review those articles that do not use logical analysis, but report target sets, to identify whether target difficulty may have affected responding across conditions.
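
The syllable totals described for the Marchese et al. (2012) target sets can be checked with a short calculation. The Python sketch below is illustrative only: the per-word syllable counts are entered by hand (an assumption of this sketch, not data reported by the original authors), and the names `SYLLABLES` and `total_syllables` are hypothetical.

```python
# Hand-entered syllable counts (an assumption of this sketch,
# not values reported in the original study).
SYLLABLES = {
    "whistle": 2, "slinky": 2, "octopus": 3,
    "recorder": 3, "submarine": 3, "maraca": 3,
}

def total_syllables(targets):
    """Sum the hand-coded syllable counts for one target set."""
    return sum(SYLLABLES[target] for target in targets)

set_one = ["whistle", "slinky", "octopus"]
set_two = ["recorder", "submarine", "maraca"]

totals = (total_syllables(set_one), total_syllables(set_two))
print(totals)  # (7, 9)
```

A check of this kind, run on a reported target list, is one simple way a reader might screen for unequal sets on a single dimension; it says nothing about other dimensions of difficulty.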

One final method to increase the validity of the AATD is counterbalancing target sets. This strategy is commonly recommended for use in the AATD and other single-subject designs (e.g., Gast & Wolery, 1988) and involves assigning identical target sets to different conditions across participants (Wolery et al., 2014). Counterbalancing may be infrequently used in applied studies because it may be difficult to identify two or more participants for whom identical targets may be trained. Allan et al. (2015) were able to counterbalance target sets for the majority of comparisons across four participants; however, the sets for two participants differed by a single target. The target sets for an additional participant were not included in any other participants’ evaluations. These targets may have already been mastered by, or may not have been appropriate for the programming of, the other participants. Although these issues are likely when studies are conducted in therapeutic settings, the potential advantage of counterbalancing is noteworthy. In particular, counterbalancing allows for an additional method of controlling for difficulty when using the AATD. That is, assigning the same set of targets to different conditions across participants allows for greater assurance of equal difficulty, assuming that idiosyncratic results are not observed across participants. Counterbalancing target sets also requires greater analysis of the experimental findings to ensure that any differences observed across participants are a result of the independent variable and not the target sets. Despite the potential strengths of counterbalancing target sets across conditions, this method does not control for participants’ history with specific targets. Like the other methods of logical analysis, counterbalancing should likely be combined with other procedures to ensure target sets are of equal difficulty.
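
The logic of counterbalancing described above can be sketched as a rotation of target sets across conditions from one participant to the next. The Python sketch below is a minimal illustration, not a procedure drawn from any reviewed study; the function name `counterbalance`, the participant labels, the condition names, and the target labels are all hypothetical.

```python
def counterbalance(participants, conditions, target_sets):
    """Rotate target sets across conditions from one participant to the next.

    Returns a dict mapping each participant to a dict of condition -> target set.
    """
    if len(conditions) != len(target_sets):
        raise ValueError("Provide exactly one target set per condition.")
    n = len(conditions)
    plan = {}
    for i, participant in enumerate(participants):
        # Shift the set index by the participant's position so that each
        # set is paired with a different condition for the next participant.
        plan[participant] = {
            condition: target_sets[(j + i) % n]
            for j, condition in enumerate(conditions)
        }
    return plan

# Hypothetical example: two participants, two conditions, two target sets.
plan = counterbalance(
    ["P1", "P2"],
    ["Condition A", "Condition B"],
    [["target 1", "target 2"], ["target 3", "target 4"]],
)
print(plan["P1"]["Condition A"])  # ['target 1', 'target 2']
print(plan["P2"]["Condition A"])  # ['target 3', 'target 4']
```

With as many participants as conditions, each set is paired with each condition exactly once. As noted above, this arrangement presumes that the same targets are appropriate for every participant.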

There are a number of areas for future research. First, the ability to emit a target response may be a necessary requirement for many studies; however, if a novel topography is being targeted, there are additional variables that should be considered. For example, if the targets are motor responses not currently in the participant’s repertoire, the researchers should ensure that the target responses include similar fine or gross motor skills or other aspects of a response that may be related to response effort (e.g., crossing the midline). In addition, responses that include similar components may be assigned to separate conditions (e.g., rub arm and rub leg). Moreover, if the participant is able to emit responses (e.g., holding up one finger) that are similar to one response (e.g., holding up two fingers), but not another (e.g., clasping hands together), these responses may not be of equal difficulty. Thus, future research might report responses that may serve as prerequisites for topographies not currently in a participant’s repertoire.

In addition to the participant’s ability to emit a response, other methods of equating stimuli may be necessary when employing the AATD. For example, when the target response is vocal, assigning target stimuli to separate conditions that share the same number of syllables, first sound, and overlapping sounds may require little additional effort and should be an expectation of studies using the AATD. If the target is a visual stimulus, controlling for visual properties of the stimulus is also warranted. Novelty should also be considered for potential impact on acquisition. Although responding under baseline conditions may be similar, acquisition may be affected by other relations. Finally, preference for particular stimuli may also be related to rate of acquisition, although no study reported preference for targets prior to intervention.
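
One way to combine a simple logical analysis with random assignment is to group targets by a shared feature and then distribute each group randomly across conditions. The Python sketch below stratifies on syllable count only. The function name `stratified_assignment`, the example words, and their hand-entered syllable counts are assumptions made for illustration; other features (e.g., first sound) could be added to the grouping key.

```python
import random
from collections import defaultdict

def stratified_assignment(targets, n_conditions, seed=0):
    """Randomly assign targets to conditions within syllable-count strata.

    targets: dict mapping target name -> hand-coded syllable count.
    Returns one list of target names per condition.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for name, syllables in targets.items():
        strata[syllables].append(name)

    sets = [[] for _ in range(n_conditions)]
    for syllables in sorted(strata):
        group = strata[syllables]
        if len(group) % n_conditions != 0:
            raise ValueError(
                f"Cannot split {len(group)} targets with {syllables} "
                f"syllables evenly across {n_conditions} conditions."
            )
        rng.shuffle(group)  # random order within the stratum
        for i, name in enumerate(group):
            sets[i % n_conditions].append(name)
    return sets

# Hypothetical targets with hand-entered syllable counts.
example_targets = {"apple": 2, "tiger": 2, "banana": 3, "elephant": 3}
assigned = stratified_assignment(example_targets, n_conditions=2)
syllable_totals = [sum(example_targets[t] for t in s) for s in assigned]
print(syllable_totals)  # [5, 5]
```

Which word lands in which condition depends on the random seed, but every condition receives the same number of targets at each syllable count.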

Future research should address some limitations of this review. Because we sought to describe the methods for equating target sets in the AATD, we did not comprehensively review other features of the studies published using this design. Future research might conduct a more thorough content analysis, examining the specific independent variables, participant characteristics, or other features of these articles that may inform future research or clinical work. For example, the AATD may be commonly used in studies with individuals with developmental disabilities. In fact, some studies have used the AATD as an assessment to inform individualized clinical interventions for children with autism spectrum disorder or other developmental disabilities (e.g., Carroll et al., 2018). Greater analysis of these studies may further elucidate the value of the AATD in clinical settings.

We were unable to code articles based on whether the authors equated the target stimuli or the target response. This remains an area that will require greater consideration in the AATD literature. In one example, Ingvarsson and Hollobaugh (2011) targeted intraverbals and reported the antecedent verbal stimulus, but not the target response, across conditions. Other operants, such as tacts or textuals, do not require the authors to distinguish between the antecedent and the response in a target list (e.g., Carroll et al., 2015; Reichow & Wolery, 2011), yet the features of the antecedent stimulus for tacts will require other methods of equating (e.g., by color, size, or shape). In addition, the inclusion of conditional discriminations in the intraverbal or listener response relations should be equated across conditions, consistent with recommendations by Axe (2008) and Sundberg (2016). Future research using the AATD should report methods to equate both antecedent and target stimuli when they differ.

Finally, we limited our review to five behavior-analytic journals and only included articles that identified the AATD as the experimental design used. Although other behavioral and educational journals may also publish studies using the AATD, we chose to include outlets that typically require strong demonstrations of experimental control. Moreover, the AATD may commonly be misidentified as an alternating treatments design or as another design that shares similar features (e.g., parallel-treatments design; Gast & Wolery, 1988). Although including these studies would help answer broader questions about comparative research on skill acquisition, we were interested in describing the AATD and the methods used to equate targets in this design. Including studies that misidentified the AATD, such as when an alternating treatments design with different target sets is reported, may inappropriately underestimate researchers’ use of methods for equating target sets in the AATD. Specifically, it may be unlikely that authors would be familiar with the requirements for equating target sets if they did not identify their design as the AATD. Future research might review the extant literature to determine how alternatives to the AATD have equated targets across conditions.

Conclusion

The findings of the current review suggest that a number of articles using the AATD have done so without adequately equating target sets across conditions. This is troublesome, as the assignment of unique target sets to separate conditions is the defining feature of this design. Although the ideal methods for equating target sets are unknown, it is likely that a combination of methods should be used in logical analyses. Moreover, articles using random assignment without logical analysis should no longer be accepted in behavior-analytic journals unless the target sets are included in the text, allowing reviewers and readers to reasonably determine whether the sets are of equal difficulty. There are many areas that researchers should pursue in refining the use of the AATD, the results of which will undoubtedly improve the quality of research using this design.

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Tom Cariveau

Author Biographies

Tom Cariveau is currently an assistant professor at the University of North Carolina Wilmington. His research interests include methods to increase instructional efficiency for children with developmental disabilities, social behavior, and training in behavior analysis.

Sydney Batchelder is currently a doctoral student at the University of North Carolina Wilmington. Her research interests include the treatment of health behaviors, substance use, and dissemination and cross-discipline expansion of behavior analysis.

Sydney Ball received her master’s degree from the University of North Carolina Wilmington and currently works as a Board Certified Behavior Analyst. Her primary research interest is in the treatment of pediatric feeding disorders.

Astrid La Cruz Montilla received her master’s degree from the University of North Carolina Wilmington and currently works as a Board Certified Behavior Analyst. Her primary research interest is in stimulus equivalence.

References

Aguirre A. A., LeBlanc L. A., Reavis, Shillingsburg M. A., Delfs C. H., Miltenberger C. A., Symer K. B. (2019). Evaluating the effects of similar and distinct discriminative stimuli during auditory conditional discrimination training with children with autism. The Analysis of Verbal Behavior, 35(1), 21–38. https://doi.org/10.1007/s40616-019-00111-3

Allan A. C., Vladescu J. C., Kisamore A. N., Reeve S. A., Sidener T. M. (2015). Evaluating the emergence of reverse intraverbals in children with autism. The Analysis of Verbal Behavior, 31(1), 59–75. https://doi.org/10.1007/s40616-014-0025-8

Axe J. B. (2008). Conditional discrimination in the intraverbal relation: A review and recommendations for future research. The Analysis of Verbal Behavior, 24(1), 159–174.

Barlow D. H., Hayes S. C. (1979). Alternating treatments design: One strategy for comparing the effects of two treatments in a single subject. Journal of Applied Behavior Analysis, 12(2), 199–210.

Cariveau, Kodak, Campbell (2016). The effects of intertrial interval and instructional format on skill acquisition and maintenance for children with autism spectrum disorders. Journal of Applied Behavior Analysis, 49(4), 809–825. https://doi.org/10.1002/jaba.322

Carroll R. A., Joachim B. T., St. Peter C. C., Robinson (2015). A comparison of error-correction procedures on skill acquisition during discrete-trial instruction. Journal of Applied Behavior Analysis, 48(2), 257–273. https://doi.org/10.1002/jaba.205

Carroll R. A., Owsiany, Cheatham J. M. (2018). Using an abbreviated assessment to identify effective error-correction procedures for individual learners during discrete-trial instruction. Journal of Applied Behavior Analysis, 51(3), 482–501. https://doi.org/10.1002/jaba.460

Cummings A. R., Carr J. E. (2009). Evaluating progress in behavioral programs for children with autism spectrum disorders via continuous and discontinuous measurement. Journal of Applied Behavior Analysis, 42(1), 57–71. https://doi.org/10.1901/jaba.2009.42-57

Fienup D. M., Critchfield T. S. (2011). Transportability of equivalence-based programmed instruction: Efficacy and efficiency in a college classroom. Journal of Applied Behavior Analysis, 44(3), 435–450. https://doi.org/10.1901/jaba.2011.44-435

Gast D. L., Wolery (1988). Parallel treatments design: A nested single subject design for comparing instructional procedures. Education and Treatment of Children, 11(3), 270–285.

Grow L. L., Kodak, Carr J. E. (2014). A comparison of methods for teaching receptive labeling to children with autism spectrum disorder: A systematic replication. Journal of Applied Behavior Analysis, 47(3), 600–605. https://doi.org/10.1002/jaba.141

Haq S. S., Zemantic P. K., Kodak, LeBlanc, Ruppert T. E. (2017). Examination of variables that affect the efficacy of instructive feedback. Behavioral Interventions, 32(3), 206–216. https://doi.org/10.1002/bin.1470

Holcombe, Wolery, Gast D. L. (1994). Comparative single-subject research: Description of designs and discussion of problems. Topics in Early Childhood Special Education, 14(1), 119–145.

Holcombe, Wolery, Werts M. G., Hrenkevich (1993). Effects of instructive feedback on future learning. Journal of Behavioral Education, 3(3), 259–285. https://doi.org/10.1007/bf00961555

Horner R. D., Baer D. M. (1978). Multiple-probe technique: A variation on the multiple baseline. Journal of Applied Behavior Analysis, 11(1), 189–196.

Ingvarsson E. T., Hollobaugh (2011). A comparison of prompting tactics to establish intraverbals in children with autism. Journal of Applied Behavior Analysis, 44(3), 659–664. https://doi.org/10.1901/jaba.2011.44-659

Leaf J. B., Townley-Cochran, Mitchell, Milne, Alcalay, Leaf, Taubman, McEachin, Oppenheim-Leaf M. L. (2016). Evaluation of multiple-alternative prompts during tact training. Journal of Applied Behavior Analysis, 49(2), 399–404. https://doi.org/10.1002/jaba.289

Luce P. A., Pisoni D. B. (1998). Recognizing spoken words: The neighborhood activation model. Ear and Hearing, 19(1), 1–36.

Marchese N. V., Carr J. E., LeBlanc L. A., Rosati T. C., Conroy S. A. (2012). The effects of the question “what is this?” on tact-training outcomes of children with autism. Journal of Applied Behavior Analysis, 45(3), 539–547. https://doi.org/10.1901/jaba.2012.45-539

Matson J. L., Ollendick T. H. (1981). The random stimulus design. Child Behavior Therapy, 3(4), 69–75.

Matson J. L., Ollendick T. H., Breuning S. E. (1983). An empirical demonstration of the random stimulus design. American Journal of Mental Deficiency, 87(6), 634–639.

Munson, Solomon N. P. (2004). The effect of phonological neighborhood density on vowel articulation. Journal of Speech, Language, and Hearing Research, 47(5), 1048–1058.

Reichow, Wolery (2011). Comparison of progressive prompt delay with and without instructive feedback. Journal of Applied Behavior Analysis, 44(2), 327–340. https://doi.org/10.1901/jaba.2011.44-327

Romer L. T., Billingsley F. F., White O. R. (1988). The behavior equivalence problem in within-subject treatment comparisons. Research in Developmental Disabilities, 9(3), 305–315.

Sidman (1971). Reading and auditory-visual equivalences. Journal of Speech and Hearing Research, 14(1), 5–13.

Sindelar P. T., Rosenberg M. S., Wilson R. J. (1985). An adapted alternating treatments design for instructional research. Education and Treatment of Children, 8(1), 67–76.

Skinner B. F. (1957). Verbal behavior. Prentice Hall.

Sundberg M. L. (2016). Verbal stimulus control and the intraverbal relation. The Analysis of Verbal Behavior, 32(2), 107–124. https://doi.org/10.1007/s40616-016-0065-3

Wolery, Doyle P. M., Ault M. J., Gast D. L., Meyer, Stinson (1991). Effects of presenting incidental information in consequent events on future learning. Journal of Behavioral Education, 1(1), 79–104.

Wolery, Gast D. L., Ledford J. R. (2014). Comparison designs. In Gast D. L., Ledford J. R. (Eds.), Single case research methodology: Applications in special education and behavioral sciences (pp. 297–345). Routledge.

Wolery T. D., Schuster J. W., Collins B. C. (2000). Effects on future learning of presenting non-target stimuli in antecedent and consequent conditions. Journal of Behavioral Education, 10(2–3), 77–94. https://doi.org/10.1023/A:1016679928480