Abstract
The National Commission on Writing called for a reform in writing instruction over a decade ago. However, teacher preparation programs still rarely provide sufficient training in writing instruction for teacher candidates. The purpose of this study was to improve the writing instruction of preservice teachers. Participating preservice teachers (N = 166) from three universities were randomly assigned to learn essential components of self-regulated strategy development (SRSD) and the “model it” stage via content acquisition podcast (CAP)-TVs, lecture, or a practitioner-oriented article. This randomized controlled trial found that students in the CAP-TV condition outperformed peers in the article condition on a researcher-created measure of SRSD knowledge. Additionally, participants in the CAP-TV condition outperformed peers in both comparison groups on a measure of modeling instruction. Results from a perceived cognitive load survey indicated that perceived cognitive load was significantly correlated with outcomes on the knowledge and performance measures for all participants. These results suggested that multimedia tools designed using Mayer’s (2009) cognitive theory of multimedia learning can reduce cognitive load and increase learning outcomes. Teacher educators should consider incorporating CAP-TVs into their coursework when teaching complex instructional strategies.
Keywords
Approximately 75% of 8th and 12th graders scored below the proficient level on the National Assessment of Educational Progress (NAEP) writing assessment (National Center for Education Statistics, 2012). This assessment measures students’ ability to communicate competently through their writing for academic and workplace writing tasks. Poor performance on the NAEP writing assessment is not a new finding. The three administrations before the 2011 NAEP administration (1998, 2002, and 2007) all found that approximately 70–75% of students in each grade level did not reach proficiency in writing (National Center for Education Statistics, 1999, 2003, 2008). Poor performance on these assessments is especially troubling in light of findings from Carnevale (2001) indicating that students in the lowest quartile on achievement measures are 20 times more likely than those in the highest quartile to drop out of school. Students with disabilities often struggle with writing more than their peers without disabilities (Graham & Harris, 2002). Researchers found that the writing of students with disabilities is often less polished, expansive, and coherent (Graham, Harris, MacArthur, & Schwartz, 1991). Potential employers have recognized the lack of writing proficiency among America’s students, with a vast majority (81%) of employers nationwide reporting that high school graduates are deficient in their written communication skills (Casner-Lotto & Benner, 2006). Students who struggle with writing face a significant barrier to their future academic and employment opportunities (Troia, 2014).
State of Writing Instruction
A possible cause for the lack of proficiency in writing achievement is the quantity and quality of writing instruction in schools. High-quality writing instruction is often hard to find in schools. For example, Grisham and Wolsey (2011) studied writing instruction experiences of preservice teachers and found that preservice teachers rarely observed writing instruction in their student teaching internships. Applebee and Langer (2011) found similar results when examining middle and high school English classes. In the best-case scenarios, they found that only 6.3% of instructional time in English classes was spent on writing instruction. Of this small portion of time, much of it was often devoted to very low-level writing activities.
Survey data of high school teachers across content areas corroborated these observational studies and indicated that less than half felt adequately prepared to incorporate writing activities to support instruction (Kiuhara, Graham, & Hawken, 2009). This perception might be explained by the amount of time spent on writing instruction in their preservice teacher preparation programs (Myers et al., 2016). A majority of high school teachers (Gillespie, Graham, Kiuhara, & Hebert, 2014) and middle school teachers (Graham, Capizzi, Harris, Hebert, & Morphy, 2014) reported receiving little to no preparation for supporting learning through writing from their teacher preparation program. In a survey of 63 teacher educators representing 50 institutions, Myers and colleagues (2016) found that few offered a stand-alone course in writing methods, and in most cases, the limited writing methods instruction offered was included in reading methods instruction. Further, teacher educators in this study reported low self-efficacy for teaching writing methods.
Recommendations From the National Commission on Writing
In response to the state of writing achievement and instruction in the United States, the National Commission on Writing (2003) called for a writing reform and made several recommendations for improving the writing performance of children in the United States. One recommendation was to make writing instruction the responsibility of all teachers across grade levels and content areas. Improving writing is not easy and cannot be the responsibility of one subset of teachers in a school. Another recommendation was to improve the preparation of teachers by requiring teachers in all subjects and grade levels to be exposed to quality writing theory and practice in their teacher preparation program.
Over a decade after the call for increasing writing instruction in teacher preparation programs, Myers et al. (2016) found that few teacher preparation programs include stand-alone courses in writing instruction. Because widespread implementation of writing instruction courses has proved difficult to achieve, the current study tested an efficient multimedia instructional tool (i.e., content acquisition podcasts [CAPs]) to embed an evidence-based approach to writing instruction (i.e., self-regulated strategy development [SRSD]) into a preexisting course of preservice teachers from a wide variety of content areas and grade levels. In doing so, the study has implications for teacher preparation and cognitive load (CL) theory more broadly.
SRSD
The Writing Next report recommended strategy instruction as a practice with large effects for improving student writing (Graham & Perin, 2007). SRSD is an evidence-based approach to strategy instruction and one of the most widely researched instructional practices for writing. Studies have consistently demonstrated that SRSD is an effective intervention with large effect sizes for improving the quality and structure of students’ writing (Graham & Harris, 2003; Graham, Harris, & McKeown, 2013; Troia, 2014). Perhaps most importantly, SRSD has strong research support for improving the writing of students with disabilities (Graham, Harris, MacArthur, & Schwartz, 1991; Graham & Harris, 2003; Troia, 2014). SRSD was chosen as the content for the intervention for two primary reasons. First, it is flexible in terms of application. It can be used across age, grade, ability levels (including students with disabilities), and content areas. The second reason was the level of research that has identified the essential components of the instructional model (Graham et al., 2013). The structure of the intervention lends itself to practices with clear, explicit steps for implementation.
CAPs
Kennedy and Thomas (2012) coined the term content acquisition podcasts (CAPs) to refer to brief instructional vignettes designed using Mayer’s (2008) cognitive theory of multimedia learning (CTML). CAPs are audio recordings of instructional content enhanced and illustrated with still images and minimal text, adhering to Mayer’s (2009) instructional design principles. Empirical evidence has supported the use of CAPs for K–12 students and for preservice and in-service teachers. To distinguish between the two audiences, researchers use the term CAP-S (CAP for Students) when referring to CAPs meant for K–12 students and the term CAP-T (CAP for Teachers) when referring to those meant for preservice and in-service teachers.
Research on CAP-Ts indicated they are a powerful tool for conveying declarative and applied knowledge when compared to text-based learning (e.g., Kennedy & Thomas, 2012; Kennedy, Thomas, Aronin, Newton, & Lloyd, 2014; Driver, Pullen, Kennedy, Williams, & Ely, 2014). Effect sizes ranged from moderate (e.g., Driver et al., 2014; ω² = .27) to large (e.g., Sayeski et al., 2015; d = 0.93). Although effect sizes were generally lower when learning via CAPs was compared to a live-lecture condition (e.g., Ely, Pullen, Kennedy, & Williams, 2015; d = 0.45), students who learned information via CAP-Ts consistently outperformed comparison groups (Hirsch, Kennedy, Haines, Thomas, & Alves, 2015; Kennedy, Alves et al., 2016) when learning was measured using researcher-created measures of declarative and applied knowledge.
Ely, Kennedy, Pullen, Williams, and Hirsch (2014) first explored the use of embedded video content within the CAP-T format. Citing Bandura’s (1986) social cognitive theory, the researchers argued for the use of modeling videos within CAPs for improving teacher practice. They found that the participants who watched the CAP with embedded videos, dubbed CAP-TVs for CAP-Teacher Videos, included significantly more components of evidence-based vocabulary instruction in their teaching than peers who read a peer-reviewed article describing the same content as measured by a fidelity checklist. Effect sizes for this study (d = 0.65–1.14) were similar to previous CAP-T studies. However, previous studies used less visually dependent content for the videos (e.g., characteristics of students with learning disabilities). Ely et al. (2014) argued that if the goal of instruction is simply to explain information, the CAP-T format is appropriate. However, for technical descriptions of complex teaching actions, the CAP-TVs seem to be more appropriate.
Mayer’s Cognitive Theory of Multimedia Learning
Mayer’s (2009) cognitive theory of multimedia learning aims to reduce CL of learners in multimedia learning scenarios through 12 instructional design principles. For this study, we followed this model of instructional design in three primary ways. First, we divided the CAP-TV into two separate videos, each with its own specific focus, to reduce the amount of information students processed at any one time (i.e., segmenting principle: learner-sized chunks). Second, we presented essential information and limited the complexity of language and presentation of information (i.e., coherence principle). Third, we inserted prompts for learners to know when essential information was being presented (i.e., signaling principle). In addition, all images and text were presented in accordance with the CTML. Text was presented sparingly and only when necessary to enhance learning or emphasize information. Visual text never competed with auditory narration (i.e., spatial contiguity principle).
Mayer’s CTML (2009) hypothesizes that the instructional design principles lower CL when used collectively. However, few empirical studies have explored the extent to which using Mayer’s model actually lowers CL and is a predictive covariate of learning. More research is necessary to confirm the various elements of the CTML. The current study contributes to this effort by examining the effect of instruction guided by the CTML on self-reported perceived CL and learning outcomes.
Measuring CL
CL is “generally considered a construct representing the load that performing a particular task imposes on the cognitive system” (Sweller, van Merrienboer, & Paas, 1998, p. 266). Leading scholars theorize there are three types of CL that impact all humans’ performance during learning: (a) extraneous, (b) intrinsic, and (c) germane load (Sweller, Ayres, & Kalyuga, 2011). Extraneous load is created when instructional features that are not related to the learning goal at hand are included in instruction (Kalyuga & Hanham, 2011). For example, a lesson that was excessively long without breaks and contained high volumes of technical information could increase extraneous CL.
Intrinsic load is generated by the interaction of the learner (including their inherent skills, prior knowledge, and other personal traits) with the inherent complexity of the learning task (Kalyuga, 2009). In other words, intrinsic load is determined by the complexity of the task to be learned and its interaction with the learner (Chandler & Sweller, 1991). For example, learning about SRSD instruction is likely to have a high intrinsic load for several reasons. First, the elements of SRSD instruction are likely absent from most preservice teachers’ current knowledge. Second, they likely have little experience observing others use SRSD or learning via SRSD. Finally, SRSD is a complex set of skills and teaching maneuvers that are not simple to learn. The extraneous and intrinsic loads are “added together” to create the total load that a learner must deal with in order to learn. If all available CL capacity is consumed only with handling the extraneous and intrinsic loads, no capacity remains for the active tasks needed for learning. Leppink, Paas, Van Der Vleuten, van Gog, and van Merrienboer (2013) succinctly explained the relationship among the types of CL: “Intrinsic load should be optimized in instructional design by selecting tasks that match learners’ prior knowledge (Kalyuga, 2009), whereas extraneous load should be minimized to reduce ineffective load (Kalyuga & Hanham, 2011) and to allow learners to engage in activities imposing germane load (van Merriënboer & Sweller, 2005)” (p. 1058).
CAP Research’s Contribution to CL Theory
Aside from CAP research, very few studies have examined the impact of instruction guided by the CTML in its entirety. Previous CAP research demonstrated that CAP-Ts developed from the CTML can lead to superior declarative and applied knowledge when compared to text-based learning (Driver et al., 2014; Ely et al., 2014; Kennedy et al., 2014) and traditional lectures (Hirsch et al., 2015; Kennedy et al., 2016). Additionally, research has indicated that students who learn via CAP-Ts report having lower perceived CL than their peers in more traditional learning conditions (Kennedy et al., 2016). However, in previous studies, the lecture comparison group used a PowerPoint presentation that was more traditional in nature (i.e., more text on screen and fewer images). No studies have examined the effect of using the CTML to create lecture presentations in comparison to CAP-Ts or CAP-TVs. The current study further explored CL theory by including a lecture condition that was designed using the CTML.
Purpose of Study
The National Commission on Writing (2003) called for improving the exposure to writing theory and practice for all prospective teachers. CAP-Ts are an efficient and effective way to communicate knowledge to learners; however, questions remain about the role of CL theory in preservice teacher learning, the ability of CAP-TVs to improve preservice teachers’ practice, and the social validity of this intervention. The current study was guided by the following research questions. What are the effects of (a) watching a CAP-TV, (b) participating in a live lecture, or (c) reading a peer-reviewed article on preservice teachers’ perceived CL using the National Aeronautics and Space Administration Task Load Index (NASA-TLX)? What are the effects of (a) watching a CAP-TV, (b) participating in a live lecture, or (c) reading a peer-reviewed article on preservice teachers’ knowledge of SRSD instruction using a delayed posttest? What are the effects of (a) watching a CAP-TV, (b) participating in a live lecture, or (c) reading a peer-reviewed article on preservice teachers’ implementation of modeling instruction in a role-play scenario using an observation checklist? What are preservice teachers’ perceptions of social validity for each of the three instructional conditions as measured by a survey of their learning experience?
Method
Participants and Setting
Before conducting the experiment, researchers at each institution received approval from the appropriate institutional review board and obtained informed consent from each participant. A total of 166 participants from three universities participated in the study. University 1 is a large, flagship university in the mid-Atlantic region. Sixty-nine percent of the total sample (n = 115) attended University 1. University 2 is a large, major land-grant institution in the Midwest. Sixteen percent (n = 27) of the total sample attended University 2. University 3 is a medium-sized, research-oriented university in the Western United States, and with about 12,000 students, it is the largest producer of teachers in its state. Fifteen percent (n = 24) of the total sample attended University 3. Universities 1, 2, and 3 were demographically similar to comparable institutions with a White population of 72%, 79%, and 81%, respectively.
Participants from each institution were enrolled in their university’s version of an introductory special education course intended to teach students basic law, characteristics, and practices pertaining to students with disabilities. All students attending targeted classes elected to participate in the study. See Table 1 for a summary of the participants’ self-reported demographic information, including age, gender, year, and status in the school of education.
Table 1. Demographic Information for Participating Preservice Teachers.
Note. Two participants did not report demographic data.
Procedures
Researchers used stratified random assignment to create three instructional conditions at each university. Participants were stratified based on whether they were currently enrolled in the school of education or not. Researchers stratified assignment using this designation to avoid unintentionally assigning students who may have lower motivation to learn the material or less background knowledge disproportionately to any of the treatment conditions. After taking a pretest, students participated in one of the three randomly assigned instructional activities during a regularly scheduled class period: (a) reading a peer-reviewed, practitioner-oriented article (i.e., Lane, Graham, Harris, & Weisenbach, 2006); (b) participating in a live, in-person lecture; or (c) watching a two-part CAP-TV.
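The stratified assignment described above can be sketched in a few lines. This is a minimal illustration, not the authors' actual procedure: the function name `stratified_assign`, the stratum labels, and the round-robin dealing with a random starting offset are all assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_assign(participants, conditions, seed=None):
    """Assign participants to conditions separately within each stratum.

    participants: list of (participant_id, stratum) pairs (hypothetical format).
    conditions:   list of condition labels.
    Returns a dict mapping participant_id -> condition.
    """
    rng = random.Random(seed)

    # Group participant ids by stratum (e.g., enrolled in school of education or not).
    by_stratum = defaultdict(list)
    for pid, stratum in participants:
        by_stratum[stratum].append(pid)

    assignment = {}
    for ids in by_stratum.values():
        rng.shuffle(ids)
        # Random starting offset so no condition is systematically favored
        # when a stratum does not divide evenly.
        offset = rng.randrange(len(conditions))
        # Deal the shuffled ids round-robin across the conditions.
        for i, pid in enumerate(ids):
            assignment[pid] = conditions[(i + offset) % len(conditions)]
    return assignment


# Hypothetical roster: six students in each stratum.
roster = [(f"p{i}", "education" if i < 6 else "other") for i in range(12)]
groups = stratified_assign(roster, ["article", "lecture", "CAP-TV"], seed=1)
```

Shuffling within each stratum before dealing keeps the three conditions balanced on the stratifying variable while leaving individual assignments random.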
During the class period of the experiment, participants took an 11-item pretest, participated in their assigned activity, and completed the perceived CL measure (NASA-TLX; Hart, 1988). One week after the intervention, participants completed an 11-item posttest that was identical to the pretest. Within a week of the experiment, participants submitted a <10-min self-recorded video of a sample lesson taught using the content presented in the experiment. Due to scheduling and syllabus constraints, only participants at University 1 were able to submit the teaching videos. Although missing videos from the other two universities limit the claims made in this study, we decided to include the videos from University 1 as a preliminary examination of the effect of CAP-TVs on teacher practice.
Article condition
In the article condition, participants read a hard copy of Lane, Graham, Harris, and Weisenbach (2006). This article discussed using SRSD in a tiered support system, identifying students with writing concerns, and using SRSD to write better stories. When explaining how to teach using SRSD instruction, the article used hypothetical students as case studies and explained how a teacher would progress through the various phases of SRSD instruction. We chose this article due to its introductory description of the SRSD instructional sequence and the nature of instruction in the model it stage. Also, the article was written in a way that researchers determined to be accessible for preservice teachers with little background knowledge of SRSD. Before the intervention began, the proctor informed participants that they would not be able to keep the hard copy of the article at the conclusion of the intervention. However, participants were told that they could take and keep as many notes as they liked. Participants were not allowed to keep a printed copy of the article because participants in the lecture and CAP-TV conditions did not have access to their respective materials after the conclusion of the experiment.
CAP-TV condition
In the CAP-TV condition, participants watched a two-part CAP-TV that was developed from the content in Lane et al. (2006). To develop content for the CAP, researchers made an outline of Lane et al. (2006) summarizing each paragraph. To develop the looks and sounds of the CAP-TV in adherence to Mayer’s (2009) model, researchers followed the process described in Kennedy and Thomas (2012). First, researchers made traditional PowerPoint slides corresponding to the outline. Then, researchers added a script to the notes section of each slide. Third, researchers replaced the text on each PowerPoint slide with an image illustrating the content on that slide. Next, researchers recorded and added the narration using the text in the notes section as the script. Finally, researchers added comprehension questions throughout the videos. The comprehension questions were not identical to the posttest items, but they did assess similar content. Part 1 provided an overview of SRSD and included a rationale for its use. It lasted 9 min and 30 s. Part 2 focused specifically on the model it stage of instruction and included videos of a teacher demonstrating each component of instruction in the model it stage. Part 2 lasted 20 min and 53 s. A proctor instructed participants that they could pause and rewind the CAP-TV and take notes as they desired. They were also instructed that the CAP-TV would not be available after the experiment’s conclusion.
Live-lecture condition
In the live-lecture condition, a lecturer at each institution gave a lecture using a PowerPoint identical to the one used in the CAP-TV condition (minus the narration, timed recordings on each slide, and modeling videos). This condition built on feedback received from peer reviewers of similar studies that used a traditional text-based PowerPoint presentation in the lecture condition. Building on previous CAP research, we improved the quality of the lecture comparison condition by making the PowerPoint as similar as possible to the CAP-TV condition to avoid confounding variables that could contribute to the results. The lecturer at each university had a script for each slide (identical to the one used in the CAP-TV), but we encouraged lecturers to present the material in a manner natural to their styles of instruction.
At University 1, a graduate student with experience teaching SRSD in secondary settings provided the lecture. At Universities 2 and 3, associate professors with experience teaching SRSD in K–12 settings and as a component of undergraduate teacher preparation coursework provided the lecture.
Treatment integrity
To ensure the article, lectures, and CAP-TV delivered the same content, we used the outline derived from Lane et al. (2006) as a fidelity checklist to score the lectures and CAP-TV. Two authors scored the lectures and CAP-TV independently and achieved 100% inter-rater reliability. Fidelity ratings for the CAP-TVs and the lectures were 100%.
Dependent Measures
As part of the study, participants completed five measures: (a) a researcher-created pretest and posttest of SRSD knowledge, (b) a self-recorded video of <10 min of writing instruction, (c) the NASA-TLX, a perceived CL measure, (d) a social validity survey, and (e) a self-recorded duration of time spent in the experimental condition. Participants completed the perceived CL measure immediately after the experiment and social validity survey 1 week after the experiment. They submitted their self-recorded video lesson any time up to 1 week after the experiment.
SRSD knowledge measure
To examine participant learning, before and after the experiment, we developed an 11-item knowledge measure corresponding to critical information within the article, CAP-TV, and lecture. See Figure 1 for the 11 items in the measure. All 11 items were fill-in-the-blank type questions. Two asked students to list information (e.g., the six stages of the SRSD sequence). Six questions described an SRSD stage and asked students to name the stage. For listing questions, participants could earn one point for each correct response. For all other items, students received 1 or 0 points. The measure was developed in a way so that participants could not refer back to previously answered questions. This measure had moderate internal consistency as indicated by Cronbach’s α of .63.

Figure 1. Pretest and posttest knowledge measure.
A graduate research assistant scored 20% of posttest responses for inter-scorer reliability. Inter-scorer reliability was calculated by dividing the number of agreements by the sum of agreements and disagreements and multiplying by 100 (agreements/[agreements + disagreements] × 100). Reliability was 95.3%. We discussed disagreements until we reached 100% agreement.
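The percent-agreement formula above (agreements/[agreements + disagreements] × 100) can be sketched as follows. The function name and the example scores are hypothetical and are not the study's data.

```python
def percent_agreement(scores_a, scores_b):
    """Point-by-point agreement between two scorers, as a percentage.

    scores_a, scores_b: equal-length sequences of item scores, one entry
    per scored item (hypothetical input format).
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("Both scorers must rate the same items.")
    if not scores_a:
        raise ValueError("At least one scored item is required.")

    agreements = sum(a == b for a, b in zip(scores_a, scores_b))
    disagreements = len(scores_a) - agreements
    return agreements / (agreements + disagreements) * 100


# Hypothetical example: the scorers agree on 3 of 4 items.
example = percent_agreement([1, 0, 1, 1], [1, 0, 0, 1])  # 75.0
```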
Social validity measure
Participants responded to a modified version of the social validity survey developed by Hirsch et al. (2015). The survey included six statements assessing the extent to which participants perceived that their activity (article, CAP-TV, or lecture): (a) worked well for their learning preferences, (b) was appropriate for teaching teachers about SRSD, (c) was worth recommending to other students, (d) provided them with confidence in their entry-level knowledge of SRSD, (e) provided them with confidence in their entry-level ability to apply SRSD in the classroom, and (f) was an effective way to learn new content. Participants responded to each statement using a 6-point, forced-choice Likert-style scale where 1 = strongly disagree and 6 = strongly agree. See Table 4 for the specific items on the measure.
Observation checklist
To evaluate participants’ ability to incorporate modeling into their practice, we created a role-play scenario for participants to demonstrate their implementation of the SRSD model it phase of instruction. Instructions for the activity told participants to teach a lesson focused on the model it phase of instruction using either the POW strategy or the WWW, What = 2, How = 2 strategies. The instructions explained that the video had to be less than 10 min in duration. Participants were told the emphasis of the role-play scenario was not on modeling the entire strategy in 10 min; rather, they were to focus on including as many elements of high-quality modeling as they could. The instructions also included a picture prompt for students to use in the lesson. The picture was a pig wearing aviator goggles looking over a fence. Not all students were currently participating in a field-based practicum experience. Therefore, to standardize the teaching conditions for the lessons, students were instructed to record their lesson without K–12 students present. They could record the lesson in a room of their choosing (e.g., dorm room, empty classroom, or conference room).
To evaluate student performance in the role-play scenario, researchers developed an observation checklist consisting of 10 items taken from Lane et al.’s (2006) description of the model it stage. For example, Lane et al. (2006) gave an example of a teacher using SRSD instruction with two students that included “talking out loud while planning and writing a story” (p. 62). From this sentence, we developed 2 of the 10 items on the observation checklist: (a) think aloud while planning story and (b) think aloud while writing story. See Figure 2 for the full observation checklist. Two coders scored the videos with 20% overlap between the two coders. Inter-rater reliability was 99.7% (agreements/[agreements + disagreements] × 100). This measure had high internal consistency as indicated by Cronbach’s α of .93.

Figure 2. Observation checklist scoring rubric.
Perceived CL survey
Participants self-reported their perceived CL using the NASA-TLX measure (www.nasatlx.com). When using this measure, participants respond to a single item for each of six domains: mental demand, physical demand, temporal demand, overall performance, frustration level, and effort. For each domain, participants responded on a scale from low (0) to high (100). The online system computes a score for each of the domains based on where the respondent clicked on the scale. It also provides an overall score of workload or perceived mental load on a scale of 0–100. Participants completed this measure immediately after completing their assigned learning activity (i.e., reading, lecture, or CAP-TV). This measure had moderate internal consistency as indicated by a Cronbach’s α of .67.
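For readers unfamiliar with the internal consistency statistic reported for each measure, the following is a minimal sketch of the standard Cronbach's α formula, α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The function name and the toy responses are hypothetical; the study's item-level data are not reproduced here.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha from a respondents-by-items score matrix.

    rows: list of lists, one inner list of item scores per respondent
    (hypothetical input format). Uses sample variances throughout.
    """
    if len(rows) < 2 or len(rows[0]) < 2:
        raise ValueError("Need at least two respondents and two items.")

    k = len(rows[0])
    # Variance of each item across respondents.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Variance of each respondent's total score.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Toy data: three respondents, two items.
alpha = cronbach_alpha([[1, 2], [2, 1], [3, 3]])
```

The function raises `ZeroDivisionError` if every respondent has the same total score, since α is undefined in that case.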
Duration measure
Participants in the article group and CAP-TV group recorded the time the intervention began and the time they completed the intervention. From these data, we calculated the average duration for both the article and CAP-TV groups. The live lecture at all institutions was audio or video recorded. From these recordings, we calculated the average duration of the lecture.
Experimental Design and Data Analysis
To address each research question regarding the impact of CAPs on knowledge acquisition, application, perceived CL, and perceived social validity, we conducted separate one-way analyses of variance (ANOVAs) to determine whether significant differences existed between the three groups. To examine the impact of self-reported perceived CL on knowledge acquisition and teaching performance, we conducted separate linear regressions using the total NASA-TLX scores to predict scores on the knowledge measure and the observation checklist. This study used an experimental, pretest–delayed posttest design across three universities and three conditions. The independent variable was the type of instruction.
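As an illustration of the regression step, the sketch below fits an ordinary least-squares line that predicts an outcome score from a total workload score. The data points are invented for the example and are not the study's data; the function name is likewise hypothetical.

```python
def simple_linear_regression(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x.

    x: predictor values (e.g., total NASA-TLX scores).
    y: outcome values (e.g., knowledge measure scores).
    Returns (slope, intercept).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2).")

    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squares of x and sum of cross-products of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented data: higher perceived load paired with lower posttest scores.
workload = [20, 30, 40, 50, 60]
posttest = [14, 12, 10, 8, 6]
slope, intercept = simple_linear_regression(workload, posttest)
```

A negative slope in such a model would indicate that higher perceived CL is associated with lower scores on the outcome measure.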
Results
Duration
Students in the article group reported spending an average of 16.83 min (SD = 5.4, n = 40) reading the article. The average lecture duration was 33.42 min (SD = 9.57, n = 4). The actual playtime for the CAP-TV was 30.38 min. Students in the CAP-TV group reported spending an average of 44.75 min (SD = 9.8, n = 40) watching the CAP-TVs. The CAP-TV and lecture were designed to be similar in duration.
SRSD Knowledge Measure
A one-way ANOVA on pretest scores indicated no significant differences between the three groups, F(2, 156) = .107, p = .898. Thus, our stratified random assignment successfully distributed participants across the instruction conditions. Means on the pretest for each condition were very low (.06 or less). Therefore, we did not include the pretest in subsequent analyses.
We checked for outliers on each of the dependent variables within each level of the independent variable. On the posttest, we inspected box plots and detected three outliers. All outliers were in the article group. We included these outliers when running the omnibus ANOVA. We did have some violations of normality. However, because the groups were nearly equal in number, we continued with the ANOVA because this procedure is fairly robust to violations of normality. The CAP group had the highest mean on the posttest (n = 57, M = 12.25, SD = 6.53) followed by the lecture group (n = 52, M = 9.75, SD = 6.58) and the article group (n = 57, M = 4.19, SD = 3.82). The homogeneity of variance assumption was violated as indicated by Levene’s test for equal variances (p < .001). Therefore, we used Welch’s F statistic to determine whether differences between the groups were statistically significant. Scores on the posttest were significantly different across instructional groups, Welch’s F(2, 98.905) = 37.978, p < .001.
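Welch's F can be computed directly from group sizes, means, and standard deviations, so the posttest result can be roughly checked against the summary statistics reported in this section. The sketch below uses the standard Welch one-way ANOVA formulas; the function name is hypothetical. Because the inputs are rounded to two decimals, the output lands close to, but not exactly on, the reported Welch's F(2, 98.905) = 37.978.

```python
def welch_anova(groups):
    """Welch's one-way ANOVA from summary statistics.

    groups: list of (n, mean, sd) tuples, one per group.
    Returns (F, df1, df2).
    """
    k = len(groups)
    # Each group is weighted by n / variance.
    weights = [n / sd ** 2 for n, _, sd in groups]
    total_w = sum(weights)
    # Variance-weighted grand mean.
    grand_mean = sum(w * m for w, (_, m, _) in zip(weights, groups)) / total_w

    between = sum(
        w * (m - grand_mean) ** 2 for w, (_, m, _) in zip(weights, groups)
    ) / (k - 1)
    lam = sum(
        (1 - w / total_w) ** 2 / (n - 1) for w, (n, _, _) in zip(weights, groups)
    )

    f_stat = between / (1 + 2 * (k - 2) / (k ** 2 - 1) * lam)
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * lam)
    return f_stat, df1, df2


# Posttest summary statistics (n, M, SD) as reported for each condition.
posttest = [
    (57, 12.25, 6.53),  # CAP group
    (52, 9.75, 6.58),   # lecture group
    (57, 4.19, 3.82),   # article group
]
f_stat, df1, df2 = welch_anova(posttest)
```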
Observation Checklist
Inspection of box plots indicated three outliers in the article group. These outliers positively skewed the group’s mean. We included these outliers in subsequent analysis. Data in the article group were not normally distributed. However, we continued with our analysis without altering the data because group sizes were similar. The CAP group (n = 40) had the highest mean on the observation checklist (M = 9.73) followed by the lecture group (n = 36, M = 3.75) and the article group (n = 39, M = 1.33). The homogeneity of variance assumption was violated (p < .001). The omnibus one-way ANOVA on the observation checklist indicated statistically significant differences across the three groups, Welch’s F(2, 60.788) = 38.363, p < .001.
When examining the individual items on the checklist, the CAP group had a higher mean than the lecture group and article group on every item. The differences between the CAP group’s mean and the article group and lecture group were statistically significant for every item except the first item (i.e., think aloud while planning). On this item, results indicated no significant difference between the CAP-TV group and the lecture group. See Table 3 for complete results of the observation checklist items.
Perceived CL
To determine whether participants’ reported CL could help explain observed differences in performance on the posttest and the teaching activity, all students completed the NASA-TLX Scale online following their group’s instruction. Participants in the reading group (M = 45.30, SD = 13.74) had a significantly higher overall workload score than participants in the CAP group (M = 32.01, SD = 13.5), F(1, 98) = 23.7, p < .001, d = 0.98, and the lecture group (M = 34.68, SD = 14.95), F(1, 87) = 12.2, p < .001, d = 0.74. The lecture and CAP groups did not differ significantly from one another, F(1, 91) = 0.81, p = .369. Given the significant difference in performance on the dependent measures of learning and teaching performance (see Table 2 for results), this finding suggests higher perceived CL may function to suppress performance for students in the reading group compared to those who watched the CAPs or participated in the lecture.
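The effect sizes in this paragraph can be approximated from the reported means and standard deviations. The sketch below assumes Cohen’s d with a pooled standard deviation and takes the group sizes from the note to the NASA-TLX table (CAP-TV: n = 52, article: n = 48, lecture: n = 41). The function name `cohens_d` is hypothetical, and the pooled-SD formula is an assumption, since the text does not state which variant of d was computed.

```python
from math import sqrt


def cohens_d(n1, m1, sd1, n2, m2, sd2):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (m1 - m2) / sqrt(pooled_var)


# (n, M, SD) on the NASA-TLX total score, as reported in the text and table note.
article = (48, 45.30, 13.74)
cap_tv = (52, 32.01, 13.5)
lecture = (41, 34.68, 14.95)

d_cap = cohens_d(*article, *cap_tv)
d_lecture = cohens_d(*article, *lecture)
print(f"article vs. CAP-TV:  d = {d_cap:.2f}")
print(f"article vs. lecture: d = {d_lecture:.2f}")
```

Under these assumptions, both values round to the effect sizes reported above (0.98 and 0.74).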
NASA-TLX Domain Results. (Sharek, 2009)
Note. CAP-TV: n = 52, article: n = 48, lecture: n = 41. NASA-TLX = National Aeronautics and Space Administration Task Load Index.
*p < .05. **p < .001.
Observation Checklist Results.
Note. CAP-TV: n = 40, article: n = 39, lecture: n = 36.
*p < .05. **p < .001.
To further explore this relationship between perceived CL and learning, we conducted a simple linear regression. Results suggest that the total score on the NASA-TLX predicted a significant proportion of the total variation in posttest scores: β = 13.0, 95% CI [10.1, 16.0], p < .001, adjusted R² = .054. In other words, regardless of experimental condition, a student’s score on the NASA-TLX is a statistically significant predictor of posttest score, F(1, 139) = 8.99, p = .003. Results show posttest scores are reduced by .25 points for every 1 additional point of perceived cognitive load recorded on the NASA-TLX instrument.
To further explore the relationship between perceived CL and performance on the observation checklist, we conducted another simple linear regression. Results suggest that the total score on the NASA-TLX predicted a significant proportion of the total variation in scores on the observation checklist: β = 8.49, 95% CI [5.4, 11.6], p < .001, adjusted R² = .034. In other words, regardless of experimental condition, a student’s score on the NASA-TLX is a statistically significant predictor of performance on the teaching activity, F(1, 102) = 4.6, p = .035. Results show scores on the video are reduced by .21 points for every 1 additional point of perceived CL recorded on the NASA-TLX instrument.
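For readers who want to relate the reported F statistics to the proportion of variance explained, the following sketch applies the standard identities R² = F·df₁ / (F·df₁ + df₂) and adjusted R² = 1 − (1 − R²)(n − 1) / df₂. The function name `r_squared_from_f` is hypothetical; the inputs are the F values and degrees of freedom reported in the two regressions above.

```python
def r_squared_from_f(f_stat, df_model, df_resid):
    """Recover R^2 and adjusted R^2 from a regression F test."""
    r2 = f_stat * df_model / (f_stat * df_model + df_resid)
    n = df_model + df_resid + 1  # sample size implied by the degrees of freedom
    adj_r2 = 1 - (1 - r2) * (n - 1) / df_resid
    return r2, adj_r2


# NASA-TLX total predicting the knowledge posttest: F(1, 139) = 8.99
r2_post, adj_post = r_squared_from_f(8.99, 1, 139)
# NASA-TLX total predicting the observation checklist: F(1, 102) = 4.6
r2_obs, adj_obs = r_squared_from_f(4.6, 1, 102)

print(f"posttest:  R^2 = {r2_post:.3f}, adjusted R^2 = {adj_post:.3f}")
print(f"checklist: R^2 = {r2_obs:.3f}, adjusted R^2 = {adj_obs:.3f}")
```

Both adjusted values agree with the reported .054 and .034, which indicates that perceived CL accounts for roughly 3% to 5% of the variance in each outcome.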
Social Validity
On the social validity survey, participants in the CAP-TV group reported agreeing to slightly agreeing on most statements. The 2 items where students in the CAP-TV group scored the lowest (between neutral and slightly agree) were in regard to their confidence in their knowledge of SRSD and entry-level ability to apply SRSD in the classroom. On average, survey responses on each item were lower for the lecture group followed by the article group. Differences between the lecture group and CAP-TV group were statistically significant for 3 of the 6 items, and differences between the CAP-TV and article group were statistically significant for all items. Differences between the article and lecture group were statistically significant for 5 of the 6 items. See Table 4 for complete results of the social validity survey items.
Social Validity Survey Results.
Note. CAP-TV: n = 56; lecture: n = 51; article: n = 55. SRSD = self-regulated strategy development.
*p < .05. **p < .001.
Discussion
Observational evidence of writing instruction is scarce, but the limited existing research suggested teachers did not regularly use high-level teaching practices (e.g., Applebee & Langer, 2011). The state of writing instruction and achievement indicated a writing reform was necessary in K–12 schools (National Commission on Writing, 2003). Over a decade after the National Commission on Writing (2003) made its recommendations, many preparation programs still include very little preparation in writing instruction (Myers et al., 2016). It is certainly important for preparation programs to include a methods course on writing instruction, but because systemic change is difficult to achieve, it is incumbent on teacher educators to include preparation for writing instruction in preexisting courses in effective ways. Based on results from this study and other similar studies, CAP-TVs are an effective method of instruction that can be embedded into existing courses and has the potential to efficiently improve the pedagogical content knowledge and practice of preservice teachers.
Improved Knowledge Acquisition
After watching CAP-TVs, preservice teachers scored higher on the knowledge measure than their peers who learned the same information via an article reading. This finding reinforces previous research supporting the use of CAP-TVs for improving preservice teacher knowledge outcomes. The effect size for the CAP-TV group in comparison to the article group was large (d = 1.59) and similar to previous comparisons of CAPs to text-based learning.
The CAP-TV group did not outperform the lecture group at a statistically significant level. However, the effect size between these two groups (d = 0.38) suggests a moderate positive effect for CAP-TVs. This effect size is similar to previous studies comparing CAP-TV to lectures. It is possible that this study was underpowered to detect significant differences on the knowledge measure. After dividing the sample into three experimental groups, each group had a small sample size, meaning differences between the groups would have to be very large to achieve statistical significance. It is also possible that assessments of pedagogical knowledge do not detect the depths of understanding that become evident when teachers perform the practice in a lesson. Regardless of the differences in outcomes on the posttest, or lack thereof, preservice teachers in the CAP-TV condition were able to implement the “model it” phase of SRSD instruction at a significantly higher level.
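The concern about statistical power can be made concrete with a rough calculation. The sketch below is an assumption-laden illustration, not the authors’ analysis: it treats the CAP-TV versus lecture contrast as a single two-sided, two-sample comparison at α = .05, uses a normal approximation to the t distribution, and ignores any correction for multiple comparisons. The function name `approx_power_two_sample` is hypothetical; the effect size (d = 0.38) and group sizes (n = 57 and n = 52) come from the text.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_sample(d, n1, n2, alpha=0.05):
    """Approximate power of a two-sided, two-sample test (normal approximation)."""
    z = NormalDist()
    # Expected value of the test statistic when the true effect size is d.
    noncentrality = d * sqrt(n1 * n2 / (n1 + n2))
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Probability of landing in either rejection region.
    return z.cdf(noncentrality - z_crit) + z.cdf(-noncentrality - z_crit)


power = approx_power_two_sample(d=0.38, n1=57, n2=52)
print(f"Approximate power to detect d = 0.38: {power:.2f}")
```

Under these assumptions the estimated power is close to one half, well below the conventional .80 benchmark, which is consistent with the possibility that the study was underpowered for this contrast.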
Improved Implementation of Practice
Although improved knowledge outcomes are important, the goal of improving preservice teacher preparation is to improve the teaching quality of future teachers. This study built on previous studies (e.g., Ely et al., 2014, 2015) investigating the ability of CAP-TVs to improve preservice teachers’ practice. The current study found similar results compared to past studies and did so while including more participants. This study also extended previous research by examining writing instruction as the content rather than vocabulary instruction. Results indicated participants in the CAP-TV condition included significantly more elements of modeling instruction in their teaching videos. When examining individual items on the checklist, the CAP group outperformed both comparison groups on every item with only one exception: No significant differences were found between the CAP group and the lecture group on the first checklist item (i.e., think aloud while planning). Some elements of the observation checklist were not explicitly taught in the CAP-TV. For example, the CAP-TV did not tell participants to provide an explicit cue for each checklist item. However, these behaviors were modeled in the modeling video portion of the CAP-TV. Further research needs to investigate the extent to which implicit behaviors are learned by preservice teachers watching CAP-TVs. It seems reasonable to expect that behaviors with a certain level of complexity need to be explicitly taught and modeled while other less complex behaviors can be learned more implicitly.
Although the CAP group significantly outperformed both comparison groups, the mean of the CAP group (M = 9.81) indicated students earned approximately half of the possible points on the checklist. Clearly, there is room for improvement in the participants’ implementation of modeling instruction. Several possible causes could have limited their performance. First, the role-play teaching scenario could have made the instructional routine more difficult to implement. Much of SRSD instruction requires discussion and interaction with students. Because the participants did not have access to a K–12 student population, some elements of modeling instruction might have been too difficult to reasonably expect them to complete in this role-play scenario. Second, it is possible that the modeling videos embedded in the CAP-TV did not demonstrate each element as clearly as possible. Third, it is possible that more supports are necessary for successful implementation of modeling instruction. These supports could include opportunities to repeatedly view the CAP-TV, practice opportunities with feedback, or materials (e.g., lesson plans).
Reduced Perceived CL
Participants across all three groups reported fairly low perceived CL on the NASA-TLX. Participants in the CAP-TV group reported significantly lower perceived CL than participants in the article group. Similarly, participants in the lecture group reported significantly lower perceived CL than the article group. The difference in reported perceived CL between the lecture group and the CAP-TV group was not significant. Because the lecture materials and CAP-TVs were identical and based on Mayer’s (2008) design principles, this lack of a significant difference in perceived CL is not surprising. Although previous studies found that CAP-TVs reduced perceived CL in comparison to lecture-based learning, this study’s results suggest that reduced perceived CL stems from Mayer’s (2008) instructional design principles rather than a simple change in learning format (i.e., video vs. lecture). Results from the regression analysis further support the conclusion that reducing CL leads to improved comprehension and implementation of learned material. A 1-point increase in perceived CL was associated with a .25-point decrease on the posttest and a .21-point decrease on the observation checklist.
Social Validity
On the social validity survey, participants in the CAP group on average reported agreement with CAPs working well for their learning needs, being appropriate for learning about strategy instruction, suggesting the use of CAPs to learn about strategy instruction, and the use of CAPs being an effective way to learn new content. Interestingly, the CAP group reported lower levels of agreement, although still positive, for their entry-level ability to apply SRSD instruction. The results on this item are interesting because, in fact, the CAP group far outperformed the article group and lecture group on the application measure (i.e., observation checklist). As one possible explanation for the disconnect between student perceptions of their ability to apply elements of SRSD and their actual performance, students in the CAP group may have gained a deeper understanding of SRSD and the complexities involved with its implementation and therefore felt less prepared than peers in the comparison groups who may not have understood SRSD as a complex concept. This deeper understanding may reflect some nuance of SRSD instruction that was present in the modeling videos but was not detected by the researcher-created posttest.
Implications
The results of this study have implications for both researchers and teacher educators. For researchers, the potential for CAP-TV to improve teaching practice is exciting. More research needs to be done on the extent to which CAP-TV can improve teaching practice, what practices can most benefit from the CAP-TV format, and what supports are needed in combination with CAP-TVs to improve teaching practice. As mentioned previously, there is still much room to improve performance on the observation checklist. Researchers should explore the effect of multiple CAP views. Kennedy et al. (2016) found that the number of repeated CAP views positively predicted posttest knowledge outcomes. Researchers need to investigate whether multiple CAP-TV views have a similar effect on implementation of teaching practices. Additionally, researchers could expand the library of CAP-TVs to teach other components of SRSD instruction (e.g., developing background knowledge and independent performance). Researchers could also consider providing preservice teachers with practice opportunities and feedback to improve their teaching practice. Finally, researchers should also examine the ability of CAP-TV to improve the writing instruction of in-service teachers.
For teacher educators, these results suggest CAP-T and CAP-TV should be considered for integration when creating course syllabi. CAP-T and CAP-TV have been developed for a wide range of topics and are freely available on the Internet. CAP-T and CAP-TV have experimental evidence suggesting they lead to improved learning over articles. They can be used in synchronous or asynchronous formats. For example, a teacher educator could show a CAP-T during class to activate prior knowledge before leading a discussion on a topic. Also, for some topics, CAP-T and CAP-TV could replace reading an article or other assigned reading on course syllabi as a more engaging and effective method of instruction. Teacher educators could support learning of these teaching practices by structuring course time in a way that allows preservice teachers to practice the learned instructional strategies and receive feedback from the teacher educator. Finally, teacher educators should consider the time required for activities when designing the activities. This study found that students in the article reading condition spent substantially less time engaging with the material than participants in the lecture or CAP-TV group. It is highly unlikely that preservice teachers will be able to meaningfully engage in complex topics, such as modeling instruction, with such a brief activity.
Although the CAP-TV group did not significantly outperform the lecture group on the knowledge measure, participants in the CAP-TV group clearly had a deeper understanding of SRSD instruction as reflected in their ability to implement these instructional principles. In addition to deeper understanding, CAP-TVs have several advantages over traditional lectures. First, they are more individualized than a traditional lecture. The average lecture time was 33.42 min. The duration of the CAP-TVs was approximately 29.45 min, but participants spent an average of 44.75 min watching the CAP-TV. This difference in playtime and viewing duration suggests participants paused and possibly replayed portions of the CAP-TV. CAP-TVs allow learners to process material at an individualized pace. Second, CAP-TVs allow for an enhanced instructional experience by incorporating embedded modeling videos and comprehension questions with feedback. Participants were able to confirm understanding throughout the video and identify points that they did not understand as well. Third, CAP-TVs can be replayed a number of times to allow information to be rehearsed and relearned. Finally, CAP-TVs are a permanent product that can be used by teacher educators in perpetuity. Based on these advantages, CAP-TV could replace or supplement a lecture for some content. Incorporating CAP-TV could open class time for more hands-on activities. For example, rather than listening to a lecture on SRSD, teacher educators could assign a CAP-TV before class and have students practice using the SRSD instructional sequence in class and provide real-time feedback on the students’ performance. Using CAPs in this way aligns with the current trend toward more practice-based approaches to teacher education.
Based on recommendations from the National Commission on Writing (2003), teacher preparation programs should find ways to devote more attention to developing the ability to teach writing in all teacher candidates. SRSD is a high-quality, evidence-based instructional strategy that can be used in a wide variety of content areas and grade levels. Recognizing that the teacher preparation curriculum is near capacity, CAP-TVs are one potential method of incorporating instruction on a needed topic area in an efficient manner.
Limitations
Results of this study should be viewed in light of some limitations. First, the contrived nature of the role-play scenario limits our ability to make strong claims about how these teachers would perform in actual classrooms. Teaching a lesson in your dorm room to an empty or friendly audience is decidedly different from teaching the same lesson in a classroom with students. Although we mentioned the role-play setting as a factor making implementation of certain checklist items difficult, it is also likely that the setting made other items easier to implement.
Second, methodological issues tempered the claims made from the data. This study had a small sample size, a convenience sample, and data that violated assumptions for ANOVA. Results from the observation checklist are additionally limited by the fact that these results are representative of only one of the three universities. These results need to be replicated with more participants before robust conclusions can be made about CAP-TV’s ability to improve teacher practice.
Finally, the knowledge measure and observation checklist were researcher-created measures. The knowledge measure had low to moderate internal reliability. Future studies should consider using other validated measures of writing instruction quality.
Conclusion
SRSD is an evidence-based writing intervention for a wide range of students including students with disabilities. Past research indicated CAP-TVs can be a powerful tool for improving content knowledge of preservice teachers. Based on this study, they also have potential for improving modeling instruction of preservice teachers. More research needs to be done on the extent to which CAP-TVs can influence teacher practice and the extent to which student outcomes improve in response to these changes in teacher practice. Teacher educators can consider using CAP-TVs as an evidence-based tool when delivering content knowledge to preservice teachers.
Footnotes
Authors’ Note
Cathy Newman Thomas is now affiliated to Texas State University, San Marcos, TX. Wendy J. Rodgers is now affiliated to University of Nevada - Las Vegas, Las Vegas, NV, USA.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
