Abstract
The present study evaluated the effectiveness of two variations of a token economy for reducing disruptive behavior within a general education classroom. One variation involved a group contingency in which tokens were removed contingent on disruptive behavior (response cost), and the other variation involved a group contingency in which tokens were gained according to a differential reinforcement of other behavior schedule. Two elementary school teachers and their students participated. Results indicated that both procedures were effective in reducing the overall number of students disrupting; however, both teachers and students indicated a greater preference for the response cost condition. Implications for the use of these behavior management strategies in the classroom are discussed in terms of effectiveness and ease of implementation.
Disruptive behavior is a common concern among teachers at the middle and high school level (E. Little, 2005; Macciomei, 1999), specifically “talking out of turn” (TOOT) and “hindering other children” (HOC; E. Little, 2005), which we have broadly defined as students speaking while the teacher is instructing, and attempting to engage with other students during independent work activities or engaging in behaviors that are disruptive to other students, respectively.
Due to the nature of disruptive behavior among multiple students in the classroom, it has been suggested that the most reasonable way of handling such behavior is through group contingencies (Axelrod, 1998; Heering & Wilder, 2006; Hulac & Benson, 2010; Macciomei, 1999). Of the three different types of group contingencies (independent, interdependent, and dependent), interdependent group contingencies were the focus of the current study. Litow and Pumroy (1975) first defined interdependent group contingencies as contingencies in which a reward is delivered to each student in the classroom or each member of a group so long as each student or group member, or a predetermined number of students or group members, meets a specified performance criterion. For example, for the class to earn 5 extra min of recess, 80% of students must turn in their homework. If more than 20% of students fail to turn in their homework, then none of the students in the class receive the reward. Given that interdependent group contingencies make each group member dependent on the others to attain the reward, utilizing an interdependent group contingency in a classroom setting may reduce the motivation for students to attend to peers who engage in disruptive behavior. For example, students may be more likely to encourage their peers to be less disruptive and may provide social reinforcement for behavior that helps the group meet its goal (i.e., performance criterion).
The effectiveness of group contingencies in managing classroom behavior has been well established in the literature, starting with the “good behavior game” evaluated by Barrish, Saunders, and Wolf (1969) and Medland and Stachnik (1972). In these studies, students were separated into two groups and competed to earn rewards such as free time and preferred classroom activities (e.g., lining up first for recess, being released first to go home). The team that engaged in the most disruptive behavior during identified classroom activities, such as math or reading time, lost the game and missed out on the reward.
To establish the “good behavior game,” the teacher stated a list of rules that the students were required to follow. Next, when students engaged in disruptive behavior (i.e., broke the rules), a point was written on the board. The group with the fewest points at the end of the game was allowed access to the preferred classroom activity. Results indicated that the “good behavior game” reduced disruptive classroom behavior during both math and reading activities. In addition, it allowed the teacher to administer one consequence based on the behavior of each member of the group, rather than administering consequences for individual students.
In these examples of the “good behavior game,” on-task behavior may have been strengthened through either punishment or reinforcement. Although students accessed preferred activities (positive reinforcement), they were also receiving marks on the board that may have functioned as punishment for disruptive behavior. No analysis was conducted to assess differences between presenting marks on the board as indicators of “good” behavior versus using marks as indicators of disruptive behavior.
Recording marks on a board to indicate that students have earned rewards is consistent with token economy procedures recommended for classroom management (Axelrod, 1998; Hulac & Benson, 2010). Tokens are delivered contingent on a target behavior or the absence of disruptive behavior (e.g., on-task behavior; behavior other than TOOT or HOC) and then exchanged for backup reinforcement (Axelrod, 1998; Cooper, Heron, & Heward, 2007; Macciomei, 1999). Backup reinforcement may include a variety of preferred classroom activities such as extra recess time, early dismissal, or access to additional materials (e.g., stickers and glitter for art time). Within the literature, token economies have often been evaluated at the individual level (Conyers et al., 2004; McGoey & DuPaul, 2000); however, it is also possible to administer group-wide token economies (Cooper et al., 2007; Macciomei, 1999; Ward, 1991) in which earning a token is contingent on the behavior of each member of the group.
Token economies have become more popular in public education systems (Hulac & Benson, 2010) and have frequently been implemented in the classroom to both increase appropriate behavior (Heering & Wilder, 2006; Putnam, Handler, Ramirez-Platt, & Luiselli, 2003) and decrease disruptive behavior (Drabman, Spitalnik, & Spitalnik, 1974; Hulac & Benson, 2010). The educational literature on classroom management now includes recommendations on how to implement token economies for individual students in a classroom setting (Hulac & Benson, 2010; Macciomei, 1999; Ward, 1991) and for the entire classroom (Axelrod, 1998; Hulac & Benson, 2010).
It has been suggested that tokens should be both given for appropriate behavior and removed for inappropriate behavior (Axelrod, 1998; Macciomei, 1999); however, exact procedures may vary. For example, tokens can be given and never removed (Drabman et al., 1974; Hulac & Benson, 2010; Kirby, Kerwin, Carpenedo, Rosenwasser, & Gardner, 2008), or the teacher may give a certain number of points to each student at the beginning of an interval and then deduct those points contingent on unsatisfactory behavior (Sulzbacher & Houser, 1968).
Studies that have directly compared the effectiveness of different variations of token economies administered to children in school settings have produced mixed results. For example, McGoey and DuPaul (2000) compared gain and response cost procedures using individualized token economies for preschool children diagnosed with attention-deficit/hyperactivity disorder (ADHD). Results indicated that the two procedures had a similar effect in decreasing inappropriate behavior in the classroom. However, teacher ratings of the procedures indicated that the response cost procedure was viewed as more acceptable than the gain procedure. This may have been due to ease of implementation. For example, the teachers did not have to “catch the child being good,” as they did during the gain procedure. Instead, teachers were cued to react when children engaged in inappropriate behavior.
In a later study, Conyers et al. (2004) evaluated the effectiveness of response cost and differential reinforcement of other behavior (DRO) token economies, administered individually for preschool children. Results indicated that DRO was initially more effective in reducing disruptive behavior and that response cost was more effective in maintaining lower rates of disruptive behavior.
Although previous research supports the effectiveness of token economies in reducing disruptive behavior, continued evaluation of procedural variations (i.e., response cost procedures vs. reinforcement procedures) is needed. Furthermore, more research is needed on the effectiveness of group token economies, as individualized token procedures may not be realistic in common elementary, middle, or high school classrooms. Educators are typically responsible for 20 or more students, whom they must teach and to whom they must deliver rewards and consequences for appropriate and inappropriate behavior, respectively. It would be more feasible for teachers to set up one classroom-wide token economy (Hayes, 1976).
The present study aimed to directly evaluate the effectiveness of token economies that employed either response-cost-only or reward-only procedures within an educational setting using single-subject research methodology. Each student was exposed to both the response cost and reward conditions. One token economy was established for each group (class) of students. Consequences were contingent on the behavior of each student to further evaluate the effectiveness of interdependent group contingencies in the classroom. Free time was granted in exchange for each token, similar to previous research.
The schedule on which tokens were delivered or removed was reasonable for the teacher to implement and provided adequate opportunities for students to earn tokens (Harlan & Rowland, 2002). A variable interval schedule was used to ensure that opportunities for token loss/gain were equal across conditions and to prevent cuing students to attend to interval duration.
Method
Participants, Setting, and Materials
Students from one fifth- or sixth-grade split general education class and one sixth-grade general education class and their teachers participated in the current study. Teachers and their corresponding students were recruited from the same public elementary school, located in a lower economic region. The classrooms were located directly beside one another on the same wing of the elementary school. The teachers and vice principal reported that the majority of the students were of lower socioeconomic status (SES) and would attend a low-achieving middle school upon graduation.
Sessions were conducted in the students’ regular classroom with all typical classroom materials present. Sessions were conducted at the same time each day, corresponding to the same subject (math for each classroom, unless otherwise noted). Materials used were pens and data sheets, teaching materials appropriate for the subject, a Droid cell phone, dry erase markers, and the white board in the classroom. Data sheets were created to resemble the layout of the classrooms such that data collectors could easily score students’ behavior (described below).
Response Measures, Data Collection, and Interobserver Agreement (IOA)
The dependent variables were the percentage of intervals with students engaged in disruptive behavior, the average number of students engaged in disruptive behavior per session, and the number of tokens earned or retained per session. Momentary time sampling was used to measure the number of students engaged in disruptive behavior (i.e., TOOT, touching other students, out of seat behavior, and/or gesturing) and the number of students engaged in on-task behavior (e.g., students looking at their worksheet, pencils moving, students counting on fingers, students raising their hand to ask questions) at the end of each interval that varied in duration around an average value of 5 min. Observation intervals were separated by a recording interval that was approximately 4 min in duration. After the first observation interval, the timer for subsequent observation intervals began at the end of the 4-min data-recording interval. The researchers collected data using prepared data sheets that mirrored each classroom floor plan arrangement; this allowed for data on individual student behavior to be collected throughout the study. In addition, verbal behavior emitted by the participants was anecdotally assessed. Vocal statements about the procedures (e.g., “That’s not fair!” following the removal of a token) were recorded throughout all conditions. Definitions of disruptive, on-task, and off-task behaviors are provided in Table 1.
Table 1. Definitions of Disruptive, On-Task, and Off-Task Behaviors for Classrooms 1 and 2.
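As an illustration of how the session-level measures described above can be derived from momentary time samples, the following Python sketch computes the average number of students scored as disruptive per interval and the percentage of intervals in which a given student was scored as disruptive. The function names, seat labels, and sample data are hypothetical and are not taken from the study's data sheets.

```python
from statistics import mean


def mean_students_disrupting(session):
    """Average number of students scored as disruptive per interval."""
    return mean(len(interval) for interval in session)


def percent_intervals_disruptive(session, student):
    """Percentage of intervals in which one student was scored as disruptive."""
    hits = sum(1 for interval in session if student in interval)
    return 100 * hits / len(session)


# Hypothetical session: each set holds the seat labels scored as disruptive
# at the momentary sample that ended one observation interval.
session = [{"A1", "B3"}, {"A1"}, set(), {"A1", "C2", "B3"}, {"B3"}]

print(mean_students_disrupting(session))            # 1.4
print(percent_intervals_disruptive(session, "A1"))  # 60.0
```

Storing each interval as a set of seat labels mirrors the floor-plan data sheets, so both the class-level and the individual-level measures can be read from the same record.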
A second observer was present for 69% of all sessions. The number of sessions during which a secondary observer was present was based on availability of the second observer. The second observer received training on data collection and procedural integrity prior to collecting session data, using Behavioral Skills Training (BST). The primary and secondary observers collected data independently. IOA for number of students engaged in disruptive behavior was calculated using a frequency within interval method (Miltenberger, 2012). First, agreement for each interval was calculated by dividing the smaller frequency by the larger frequency and then multiplying by 100. The resulting quotients were then summed and divided by the total number of intervals. Mean agreement across sessions for the number of students engaged in disruptive behavior was 80% (range = 60%-100%) for Classroom 1 and 89% (range = 70%-100%) for Classroom 2. Graphical depictions of the number of students engaged in disruptive behavior are based on data collected by the primary observer.
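The frequency within interval calculation described above can be sketched in Python as follows. The observer counts are hypothetical, and scoring an interval in which both observers recorded zero students as 100% agreement is an assumption, since the text does not state how such intervals were handled.

```python
def frequency_within_interval_ioa(primary, secondary):
    """Mean per-interval agreement (%) between two observers' counts."""
    if len(primary) != len(secondary) or not primary:
        raise ValueError("observers must score the same nonzero number of intervals")
    scores = []
    for a, b in zip(primary, secondary):
        if a == b:
            # Identical counts, including 0 and 0 (assumed to be full agreement).
            scores.append(100.0)
        else:
            # Smaller frequency divided by larger frequency, times 100.
            scores.append(100 * min(a, b) / max(a, b))
    # Sum the per-interval percentages and divide by the number of intervals.
    return sum(scores) / len(scores)


# Hypothetical counts of disruptive students in five intervals.
primary = [4, 2, 0, 5, 3]
secondary = [4, 3, 0, 4, 3]

print(round(frequency_within_interval_ioa(primary, secondary), 1))  # 89.3
```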
Treatment integrity data were collected during each session by the first author. Data were collected on the teacher’s classroom announcement of the contingencies in effect each day. For Classroom 1, the teacher never failed to announce the correct contingency and procedures; thus, treatment integrity was 100%. For Classroom 2, the teacher did not announce the contingency during one out of 19 sessions; thus, treatment integrity was 95%. Treatment integrity data also included data on whether the teacher correctly administered consequences consistent with the stated contingencies for classroom behavior for that day, as instructed by the first author (i.e., removed tokens or gave tokens following text message prompts). A correct implementation was scored if the teacher provided or removed a star when the text message indicated to do so (e.g., “erase one star”). An incorrect implementation was scored if the teacher failed to either give or remove a star when the text message prompt indicated that either was required. The number of correct implementations was divided by correct plus incorrect implementations to obtain a percentage. For Classroom 1, mean treatment integrity across sessions was 99% (range = 80%-100%). For Classroom 2, mean treatment integrity across sessions was 94% (range = 60%-100%).
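The treatment integrity percentage described above is a simple proportion. The short Python sketch below reproduces the announcement figure reported for Classroom 2 (18 of 19 sessions); the function name is hypothetical, and the second call uses made-up prompt counts purely for illustration.

```python
def treatment_integrity(correct, incorrect):
    """Correct implementations divided by correct plus incorrect, as a percentage."""
    total = correct + incorrect
    if total == 0:
        raise ValueError("no implementation opportunities were scored")
    return 100 * correct / total


# Announcement integrity for Classroom 2: 18 sessions announced, 1 missed.
print(round(treatment_integrity(18, 1)))  # 95

# Hypothetical session with five text message prompts, one not followed.
print(treatment_integrity(4, 1))  # 80.0
```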
Social Validity
To assess social validity, teachers and students completed questionnaires at the end of the experimental conditions. Teachers answered questions regarding ease of implementation, preference for procedures, and overall satisfaction with the experiment. These questions were developed by the researchers and are available upon request. Students answered one question regarding preference for procedures.
General Procedures
Experimental Design
A simultaneous multiple treatment design combined with a multiple-baseline across-classrooms design was used (Kazdin, 2011). Each group (class) was first exposed to baseline conditions. After baseline, treatment conditions were implemented and staggered across classrooms. The order of the conditions was selected at the beginning of each week by random number selection. The teacher was informed of which condition to run each day of the experiment.
Baseline
During baseline, typical classroom contingencies were in place. The researcher stood on the far left side of the classroom. At the end of each interval, the researcher collected data on disruptive behavior for approximately 2 min. Immediately following data collection for disruptive behavior, the researcher collected data on on-task behavior for approximately 2 min. Note that during baseline conditions for Classroom 1 (the split fifth- or sixth-grade class), the researchers collected data for the front half of the class for approximately 2 min (1 min disruptive behavior, 1 min on-task behavior) and then collected data for the back half of the class for 2 min (1 min disruptive behavior, 1 min on-task behavior).
Criterion for rewards was determined for each classroom following baseline conditions. For Classroom 1, rewards (described below) were kept (response cost condition) or given (DRO condition) if fewer than three students were engaged in disruptive behavior (i.e., TOOT, touching other students, out of seat behavior, and/or gesturing) at the end of each interval. For Classroom 2, the criterion was set to no more than four students engaged in disruption at the end of each interval.
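The classroom-specific criteria above amount to a threshold check at the end of each interval. A minimal Python sketch follows, in which the dictionary and function names are hypothetical; "fewer than three" for Classroom 1 is encoded as a maximum of two students.

```python
# Maximum number of disruptive students allowed at the end of an interval.
# Classroom 1: fewer than three (at most 2). Classroom 2: no more than four.
MAX_DISRUPTIVE = {1: 2, 2: 4}


def criterion_met(classroom, n_disruptive):
    """True if the class keeps (response cost) or earns (DRO) a star."""
    return n_disruptive <= MAX_DISRUPTIVE[classroom]


print(criterion_met(1, 2))  # True
print(criterion_met(1, 3))  # False
print(criterion_met(2, 4))  # True
print(criterion_met(2, 5))  # False
```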
Response Cost
During the response cost condition, the teacher placed five stars on the white board at the front of the class. The teacher then stated the contingency below or something to a similar effect:
Today after math you will have 5 minutes of extra recess (Classroom 1)/Preferred Activity Time (PAT; Classroom 2) if you stay on task. If you engage in disruptive behavior (DEFINES) then you will lose a star. For each star that you lose, you will lose one minute of extra recess/PAT. Let’s get started.
At the end of each interval, the researcher collected data on disruptive behavior for approximately 2 min. Following data collection for disruptive behavior, the researcher cued the teacher by sending an electronic text message to his or her cell phone, indicating whether the class lost a star. If the criterion to lose a star was met, the teacher approached the white board and erased a star. The teacher made no vocalizations about the consequences.
Gain (DRO)
During the gain condition, procedures were identical to the response cost condition except for the following changes. The teacher did not put the stars on the board at the beginning of the session and stated the contingency below or something to a similar effect:
Today during math you will have an opportunity to earn 5 minutes of extra recess (Classroom 1)/PAT (Classroom 2). For each star that you receive, you will get 1 minute of extra recess/PAT. If you engage in disruptive behavior (DEFINES), you will not receive a star. Let’s get started.
At the end of each interval, the researcher collected data on disruptive behavior for approximately 2 min. Following data collection for disruptive behavior, the researcher cued the teacher by sending an electronic text message to his or her cell phone, indicating whether the class had earned a star. If the criterion to earn a star was met, the teacher approached the white board and drew a star. For Classroom 1, the teacher occasionally asked the student sitting closest to the white board to draw a star on the board for the class. The teacher made no vocalizations about the consequences. The criterion to gain a star was identical to the criterion listed above for the cost condition.
Doubling Rewards
The purpose of this condition was to determine whether increasing the magnitude of rewards available would enhance the effectiveness of either condition. During this condition, procedures were identical to those described above, except that the value of each star was doubled to 2 min; thus, students could earn up to 10 min of extra recess/preferred activity time (PAT).
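To make the star accounting in the two conditions concrete, the Python sketch below tallies the reward minutes for one session. It assumes five criterion checks per session, which is inferred from the five available stars and is not stated explicitly in the text; the function and parameter names are hypothetical.

```python
def session_minutes(condition, criterion_results, minutes_per_star=1, starting_stars=5):
    """Reward minutes at the end of a session.

    criterion_results holds one boolean per interval: True if the class met
    the criterion at that interval, False otherwise.
    """
    if condition == "response_cost":
        # Start with all stars; erase one each time the criterion is not met.
        stars = starting_stars
        for met in criterion_results:
            if not met and stars > 0:
                stars -= 1
    elif condition == "dro":
        # Start with no stars; draw one each time the criterion is met.
        stars = sum(1 for met in criterion_results if met)
    else:
        raise ValueError("condition must be 'response_cost' or 'dro'")
    return stars * minutes_per_star


# Hypothetical session: criterion met at three of five checks.
checks = [True, False, True, True, False]

print(session_minutes("response_cost", checks))             # 3
print(session_minutes("dro", checks))                       # 3
print(session_minutes("dro", checks, minutes_per_star=2))   # 6 (Doubling Rewards)
```

Under the five-check assumption, identical behavior yields the same number of minutes in both conditions, so the conditions differ only in whether stars are framed as lost or gained.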
Results
Figure 1 shows results of the interdependent group contingencies on the percentage of disruptive behavior for Classrooms 1 (in the top panel) and 2 (in the bottom panel). During baseline, Classroom 1 averaged 90% (range = 80%-100%) of intervals with disruptive behavior, and steady responding (i.e., little variability and no downward trend) was observed. During treatment conditions, the percentage of intervals with disruptive behavior decreased across both conditions; however, responding was variable. Classroom 1 averaged 60% (range = 0%-80%) of intervals with disruptive behavior during DRO and 26% (range = 0%-100%) during response cost.

Figure 1. Percentage of intervals with disruptive behavior.
During baseline, Classroom 2 averaged 97% (range = 80%-100%) of intervals with disruptive behavior, and steady responding (i.e., no trend and little variability) was observed. Similar to Classroom 1, the percentage of intervals with disruptive behavior in Classroom 2 decreased when treatment conditions were implemented, although responding was variable. The average percentage of intervals with disruptive behavior was 52% (range = 0%-100%) and 60% (range = 40%-80%) during DRO and response cost, respectively.
Figure 2 depicts the average number of students engaged in disruptive behavior across intervals for each session. During baseline in Classroom 1, the number of students engaged in disruptive behavior followed an upward trend, with an overall average of 7.65 students engaged in disruption across intervals. Once treatment conditions were introduced, the number of students engaged in disruption decreased and remained lower than baseline throughout treatment conditions that utilized the prompt procedure. Classroom 1 averaged 3.21 students engaged in disruptive behavior during the DRO condition and 2.66 students during the response cost condition. Similar results were observed with Classroom 2, where the average number of students disrupting across intervals greatly decreased during treatment conditions relative to the baseline condition. During the baseline condition, an average of 11.38 students were observed to engage in disruptive behavior, and an upward trend was again observed. When treatment conditions were introduced, the number of students engaged in disruptive behavior decreased in both conditions and, with the exception of Session 20, remained lower than baseline. Classroom 2 averaged 5.24 students engaged in disruptive behavior during the DRO condition and 5.48 students during the response cost condition. An effect size was calculated for each condition. Within Classroom 1, DRO conditions had an overall Tau-U = .89 and cost conditions had an overall Tau-U = .95. Within Classroom 2, DRO conditions had an overall Tau-U = .78 and cost conditions had an overall Tau-U = .96.

Figure 2. Average number of students disrupting per session.
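For readers unfamiliar with the Tau family of effect sizes reported above, the Python sketch below computes the basic nonoverlap Tau between a baseline phase and a treatment phase for a behavior targeted for reduction. It omits the baseline trend correction that distinguishes Tau-U, the text does not specify which variant or software was used, and the session values are hypothetical, so the output is not expected to match the reported coefficients.

```python
def tau_nonoverlap(baseline, treatment):
    """Basic Tau for a behavior targeted for reduction.

    Every baseline value is compared with every treatment value. A pair counts
    as improved when the treatment value is lower, worsened when it is higher,
    and is ignored in the numerator when the two values are tied.
    """
    improved = worsened = 0
    for a in baseline:
        for b in treatment:
            if b < a:
                improved += 1
            elif b > a:
                worsened += 1
    return (improved - worsened) / (len(baseline) * len(treatment))


# Hypothetical per-session averages of students disrupting.
baseline = [7.0, 7.5, 8.0, 8.2]
treatment = [3.0, 4.0, 2.5, 7.2, 3.5]

print(tau_nonoverlap(baseline, treatment))  # 0.9
```

Values near 1 indicate that almost every treatment session fell below almost every baseline session.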
For both classrooms, there were no observed differences in on-task behavior across experimental conditions. In other words, neither gain nor cost procedures resulted in an increase in on-task behavior compared with baseline; data for on-task behavior are available from the first author upon request.
In Classroom 2, differential responding between DRO and response cost conditions was not observed; thus, individual student data were also analyzed to assess for any differential effects of DRO and response cost for those students who exhibited relatively high levels of disruption compared with their peers. Within Classroom 2, analysis of individual student data revealed three variations of effect. Figure 3 depicts the percentage of intervals that two students (Tim in the top panel and Natalie in the bottom panel) were engaged in disruption. During baseline, Tim and Natalie averaged 42% (range = 0%-80%) and 64% (range = 0%-100%) of intervals with disruptive behavior, respectively. After experimental procedures were introduced, Tim’s behavior did not appear to be affected by either procedure and Natalie’s behavior was only minimally affected.

Figure 3. Percentage of intervals with disruption for Tim and Natalie.
Figure 4 depicts the percentage of intervals that two students (Paul in the top panel and Sally in the bottom panel) were engaged in disruption. Both Paul and Sally engaged in high levels of disruptive behavior during the baseline condition averaging 75% (range = 60%-100%) and 50% (range = 20%-100%) of intervals, respectively; after the implementation of treatment procedures, a marked decrease in disruptive behavior was observed in both conditions.

Figure 4. Percentage of intervals with disruption for Paul and Sally.
Figure 5 shows data collected for Betsy (top panel) and Jim (bottom panel). During baseline, Betsy engaged in a moderate amount of disruptive behavior, averaging 24% (range = 0%-60%) of intervals. During the DRO condition, disruptive behavior remained the same until stars were doubled. However, data collected for Betsy show that during the response cost condition, disruptive behavior was completely suppressed.

Figure 5. Percentage of intervals with disruption for Betsy and Jim.
During the baseline condition for Jim, disruptive behavior averaged 82% (range = 40%-100%) of intervals. Similar to Betsy, after experimental procedures were introduced, there was a greater decrease in disruptive behavior in the response cost condition relative to the DRO condition. During the DRO condition, disruptive behavior reliably occurred during 40% to 80% of intervals, whereas during response cost, disruptive behavior reliably occurred during 20% to 40% of intervals.
Figure 6 depicts social validity results for Classroom 1 in the top panel and Classroom 2 in the bottom panel. For Classroom 1, 17 students preferred the cost condition, no student preferred the DRO condition, and seven students preferred both conditions equally. The teacher for Classroom 1 indicated a preference for the cost condition. For Classroom 2, 13 students preferred the cost condition, seven students preferred the DRO condition, seven students preferred both conditions equally, and two students had no preference. The teacher for Classroom 2 indicated a preference for the cost condition.

Figure 6. Students’ preference for experimental procedures.
In addition to answering questions about preference for procedures, both teachers completed satisfaction surveys regarding their participation in the study. Both teachers indicated that they were satisfied with the way they and their students were treated throughout the study and felt that all procedures were fair to the students. In addition, both teachers were satisfied with the duration of baseline conditions and the overall duration of the study. The teacher for Classroom 1 indicated that she would have liked to participate more in the study and that her students benefited from participating in the research study. Although both teachers indicated that they were satisfied with the study overall, neither teacher indicated that the procedures were effective in reducing the overall disruption in their classroom.
The total number of stars earned per condition is depicted in Figure 7. Vocal verbal behavior emitted by students during the conditions is depicted within Table 2.

Figure 7. Total number of stars earned.
Table 2. Verbal Behavior Emitted by Students During Response Cost and DRO Conditions.
Note. DRO = differential reinforcement of other behavior.
Discussion
Results of the current study corroborate previous research that has suggested that cost and gain procedures are equally effective in reducing disruptive behavior at the individual level (Conyers et al., 2004; Kaufman & O’Leary, 1972; McGoey & DuPaul, 2000). The current study found similar results when cost and gain procedures were implemented in the context of interdependent group contingencies. Both cost and gain procedures were moderately effective in reducing the overall amount of disruption in two typical sixth-grade classrooms. For Classroom 1, the cost procedure was slightly more effective in reducing the total number of intervals with disruption than the DRO procedure. For Classroom 2, neither procedure proved more effective than the other in reducing the overall amount of disruption. In addition, both procedures resulted in a reduction of the average number of students engaged in disruption in comparison with baseline. In Classroom 1, the cost procedure reduced the average number of students engaged in disruption slightly more than the DRO condition. However, for Classroom 2, both procedures were equally effective in reducing the average number of disrupting students.
For Classroom 2, there may be several explanations for the lack of differentiation between conditions. First, some students in Classroom 2 were unaffected by either procedure. Individual data (see Figure 3) show that for students who engaged in the highest levels of disruption, disruptive behavior did not consistently decrease during either procedure. This may speak to the need to individualize interventions and procedures for specific students who engage in the highest levels of disruption. Second, the definition for disruptive behavior may not have been sensitive enough for this classroom. For example, “chair leaning” was included in the original definition due to the observation of several students tipping their chairs backward into other students. As the experimental conditions were introduced, the researchers noted anecdotally that as TOOT and HOC decreased, chair leaning appeared to be increasing; however, the interference with other students was minimal compared with the other two types of disruptive behavior. A difference in conditions may have been observed had “chair leaning” been parsed out from previous data collection. Third, although the contingency was stated each day, students may not have clearly understood the differences between the conditions. Students within both classrooms had a previous history with classroom procedures in which rewards could be both earned and lost. Although stars were never removed in the DRO condition, it may have been unclear to the students if stars could be removed, until several sessions of contacting the contingencies associated with this condition. 
This point was exemplified during the third DRO session, when a student shouted out “Be quiet, we are going to lose a star.” The point was again made when, during a cost session, a student asked, “Can we earn those stars back?” It is possible that stating within the contingency that stars could not be removed during DRO, and could not be gained during cost, could have enhanced differentiation between the procedures. Another possible reason that differentiation did not occur between the conditions was that reinforcement might not have been salient enough. Only after star values were doubled did clear differentiation begin to occur, and then, due to time constraints, the study was ended. It is possible that had the Doubled Reward condition continued, differentiation would have been observed further, with one condition producing a greater decrease in disruptive behavior than the other.
The observed decrease in disruptive behavior during the DRO condition for Classroom 2, within the Doubled Reward phase, should be further discussed. It is possible that the loss of stars during the cost condition (when stars were worth 2 min of PAT) evoked negative behavior from the students, which then lead to the further loss of stars. For example, during the first cost session of the Doubled Reward phase, four stars were lost due to “TOOT” about the star loss. When a star was erased, one student began to threaten another student by saying, “I’m going to kick your a$$, stop talking.” This behavior met the definition for disruptive behavior, and thus resulted in the loss of more stars. It should be noted that this was the only day that negative vocalizations about the procedures were observed. All other comments were neutral or positive statements about the procedures.
Another consideration is that when rewards were doubled, students could earn 10 min of PAT if they retained all five of their stars, whereas previously it was only possible to earn 5 min of PAT for retaining all five stars; thus, when the value of the stars was doubled, this may have given students more leeway to engage in disruptive behavior in that they could still earn approximately the same amount of PAT they were previously getting even if they lost two to three stars. In contrast, the gain procedure may have been enhanced when the value of stars was doubled; in this condition, students started out with nothing but could then earn 2 min of PAT for every star earned as opposed to only 1 min; increasing the value of the stars may have increased their reinforcing efficacy. Future research should continue to explore whether variations in the value of tokens affect how effective one procedure is over the other.
In the current study, it is possible that neither procedure completely reduced disruptive behavior because students’ disruption was maintained by different functions. Certain procedures may be more effective for specific functions of disruptive behavior. For example, for students whose behavior is maintained by social positive reinforcement in the form of attention, the cost condition might have been more effective in reducing their disruptive behavior. Students who are sensitive to attention as reinforcement may be less likely to engage in disruptive behaviors if they believe that other students will be upset or angry if rewards are removed because of their behavior. In contrast, the cost condition may not have been as effective for students who engage in disruption to escape task demands such as math worksheets, in that the reward of free time along with social disapproval for losing the reward may not effectively compete with escape from the task.
Following the completion of data collection, teachers’ and students’ preferences were assessed. In McGoey and DuPaul’s (2000) study, teachers preferred the cost condition to the reward condition. This may have been because the cost condition was easier to implement in that teachers were prompted by the active behavior of students being disruptive, rather than having to remember to give tokens while the students were not engaged in disruptive behavior. The current study controlled for ease of implementation across conditions such that preference could be evaluated independent of response effort. For example, in both conditions, teachers were prompted by the researcher to ensure that opportunities for tokens to be removed or given were equal across conditions. Interestingly, both teachers in this study preferred the cost condition even though differences in response effort between conditions were controlled. Both teachers reported that they felt the visual presence of the stars made the cost procedure more effective. It is possible that the teachers preferred the cost condition because it more closely resembled the procedures that both teachers had implemented in their classrooms prior to the start of the study. While both teachers had “gain” classroom management procedures prior to the start of this study, they also removed rewards if undesirable behavior occurred. It should be noted that the teacher from Classroom 2 reported that he felt the cost procedure was most effective, although data indicate no differences between DRO and response cost. It is also important to note that social validity measures indicated that neither teacher felt that either procedure reduced the overall amount of disruptive behavior in their classroom. This may be because, although the procedures reduced the total number of students disrupting, there were still a few students who continued to engage in disruptive behavior throughout the study (see Figure 3).
Social validity measures collected from Classroom 1 indicated that the students preferred the cost procedure to the DRO procedure. This may be due to the higher number of stars earned during the cost procedure. The cost condition may also be more preferred because students saw themselves as actively participating in the loss of rewards. For example, when students engaged in the disruptive behavior, they lost a star (active participation) versus the alternative, not engaging in disruptive behavior to earn a star (passive participation). Social validity measures collected from Classroom 2 indicated no preference for one procedure over the other.
This study aimed to answer an interesting question: Should classroom procedures be set up in a way that students earn tokens when not engaging in disruption, or should classrooms set up contingencies in which preferred activities are removed contingent on undesirable behavior? Results suggest that DRO and response cost procedures were equally effective in decreasing classroom disruption; however, disruptive behavior continued to occur at some level. For example, in Classroom 2, there were at least two students (see Figure 3) whose disruptive behavior was not affected by either procedure, suggesting that individualized interventions may have been more appropriate for them.
A few limitations of this study should be noted. First, data collection was ended before data stabilized during the Doubled Reward condition. Due to school constraints (i.e., star testing) and per teacher request, the researchers had to end data collection early. Had the last phase been extended, greater differentiation between conditions might have been observed.
Second, a follow-up was not conducted. Follow-up data might have shown whether either intervention had lasting effects. A follow-up with either classroom could not be completed due to the above-mentioned school constraints. Furthermore, after procedures and data collection were stopped, typical classroom activities never resumed. Following star testing, both classes went on an end-of-the-year field trip and then returned to end-of-the-year activities that greatly disrupted the typical classroom schedule.
Third, no functional analysis was conducted to determine the function of any of the students’ disruptive behavior in the classroom. It may have been interesting to conduct a functional analysis for students who engaged in the most disruptive behavior, such that the effectiveness of each condition could then be individually compared for those students. Given that this study aimed to identify the effectiveness of DRO and response cost on a class-wide scale, a functional analysis was not considered necessary to implement the interdependent group contingencies. Future research should evaluate the effectiveness of each procedure at the individual level, after conducting a functional analysis, to determine whether either is more or less effective, given the function of the disruptive behavior.
A fourth limitation was that teachers did not determine criteria for giving or removing tokens on their own. During piloting, treatment integrity dropped to 0% when the teacher in Classroom 1 was asked to identify when students met criteria for rewards and then issue those rewards contingently. Given that this study aimed to evaluate the effectiveness of each procedure, it was imperative that treatment integrity be at or above 80%. Future studies should evaluate the feasibility of teachers implementing these procedures without trained researchers prompting their behavior.
Footnotes
Acknowledgements
This research was completed in partial fulfillment of thesis requirements by the first author. A special thanks to Kelli Hill for her help with data collection and Jonathan Fernand for his feedback on a previous version of this article.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
