Abstract
Using authentic audio-visual material in early language classes has become commonplace, but little prior empirical research has been published regarding why this pedagogical technique is effective for some students, and not others. This study examines the process of engagement and transportation of beginning foreign language learners into an authentic, Spanish language narrative-based music video. A structural equation model is developed that ties various elements such as belief consistency, cognitive difficulty, and affective disposition to the process of narrative engagement. The study then examines how a student’s engagement and transportation into the music video’s narrative impacts that student’s language learning, thematic comprehension, and entertainment – three commonly cited objectives of using authentic audio-visual material. Evidence is provided to indicate that the degree of engagement and transportation into the narrative-based video influences students’ language learning, thematic comprehension and entertainment.
Introduction
Teachers face challenges and need to be creative when teaching introductory language classes to students who are just beginning to learn a foreign language. Since a level of foreign language knowledge is required in many university, secondary and high school programmes, instructors of these courses are often faced with students who have a range of language ability within a class. Typically, some students are highly motivated, engaged, and interested in other languages and cultures, while others struggle, are uninterested, or in some cases, are even hostile to the notion of learning another language. Successful introductory language class instructors have developed a portfolio of pedagogical tools to facilitate language learning, including the use of ‘authentic’ materials. Authentic materials are, as Brinton et al. defined them, ‘written or oral texts which were created for a purpose other than language teaching’ (Brinton et al., 2003: 1). Audio-visual media in the target language, including music videos, commercials, TV segments, TED Talks, and short films are some of the more commonly used authentic classroom materials.
Several objectives are associated with using native language, or authentic audio-visual media in an introductory language learning environment. The first objective is to illustrate and reinforce the taught language elements within a natural and real speaking context. This allows students to capture the nuances of visual expressions and gestures associated with the spoken language. The second objective is that the audio-video material can augment classroom lessons and textbook material on culture by showing local scenery, cultural behaviours, customary dress, regional cuisines, and background narratives. The third objective is simply to break the monotony, and provide an element of enjoyment, excitement and motivation during the classroom experience.
Authentic music videos appear particularly well-suited to these three objectives. Songs generally involve the repetitive use of certain verbs, adjectives, nouns and other language elements thus providing a powerful tool to reinforce the same language elements covered in a class lesson. In addition, although not normally conversational in nature (such as a TV show or commercial), the photographic component of most music videos will emphasize and support the song’s lyrics, allowing students to have a strong auditory and visual association. A song with romantic lyrics, for example, will most likely be paired with a romantic scene in the video. In addition, many music videos used in introductory foreign language classrooms fall within the genre of ‘narrative music videos’ – they tell a story. For narrative oriented music videos, the narrative is often placed within the cultural and regional context of the language and musical genre; thus narrative based music videos for country western singers usually illustrate customs, dress, lifestyles, behaviours, belief systems, and cultures appropriate to the target audience; and, by definition music videos are meant to be entertaining.
Another advantage of music videos is simply their length. Music videos are relatively short, and fit nicely into a lesson plan for a single class period that often includes various learning activities, discussion and reading components around a particular topic. But can a short music video achieve these objectives in a single class session?
Literature Review
Authentic or native-language materials used in the classroom can range in character, from ordinary, daily communication such as newspaper articles to more literary or artistic presentations. Authentic classroom materials, regardless of the communication mode or media, share some common characteristics – they are produced by native speakers for native speakers, they are produced for a ‘real audience’, and there is a ‘personal level of engagement’ or interaction between the material and the student (Gilmore, 2007). The use of authentic tools of all types in foreign language classes has been well-documented. A number of comprehensive reviews of this topic have been published in the past two decades that provide both evidence of its application, and the theoretical underpinnings, in a variety of cultures (see Al-Azri and Al-Rashdi, 2014; Becker and Strum, 2017; Emerick, 2019; Gruba, 2006; Mayer, 1997; McNutty and Lazarevic, 2014; Oritz et al., 2012; Pinner, 2016; Vanderplank, 2010). Research has shown authenticity has a positive impact on classroom outcomes, such as learning tasks (e.g. Ockey and Wagner, 2018; Pinner, 2016), communication and listening skills (e.g. Gilmore, 2011; Herron and Seay, 1991; Suvorov, 2018), cultural understanding (e.g. Herron et al., 2002; Peter et al., 2016), and student motivation (e.g. Kung, 2017; Peter et al., 2016).
While the overall effectiveness of authentic materials is recognized, research on the complex processes that lead to their effectiveness in the classroom is still in its early stages. The vast majority of evidence to date has been anecdotal, or based on survey feedback at the end of the course segment. Recently, more experimentally controlled empirical studies examining the benefits of authentic audio-visual material have been published (e.g. Ahour and Rahbar, 2014; Bajrami and Ismaili, 2016; Becker and Strum, 2017; Rahbar, 2014; Sabet and Mahsefet, 2012). In general, these empirically based studies compare samples of classes using authentic instructional material over a period of class time, ranging from several weeks to a full term, with a control group not using the materials. While these types of studies certainly indicate the overall pedagogical effectiveness of using authentic audio-visual materials at the class level over the course term, they provide little control over other confounding effects that might occur during class. Moreover, they provide almost no insight regarding sample variation, that is, why some authentic audio-visual tools are more effective for some students and not others. This is particularly important since almost all of the authentic audio-visual material used in foreign language learning environments, such as native language TV shows, documentaries, interviews, music videos, and movie segments, is highly narrative in nature.
Understanding the impact of individual characteristics for any type of narrative based material like music videos, can be considered within a number of theoretical frameworks such as ‘belief-consistency’, ‘mood-congruence’, ‘media and entertainment theory’, and ‘transportation theory’. There is a large body of research, for example, examining counter-attitude impacts of artistic-oriented narratives, such as in documentaries, cinema, theater, comedy, and literary works – audiences generally ignore, or dislike, those narrative elements that are inconsistent with their personal beliefs and attitudes. Similarly, from a mood-congruence perspective, scholars examining a variety of narrative vehicles have also argued that identification, comprehension, and ultimately enjoyment of a particular narrative is correlated with its congruency to the viewer’s or reader’s emotions that arise from the viewer’s underlying attitudes and philosophical outlooks (e.g. Plantinga, 2012). These relationships can also be mediated by the audience’s association with characters, or ‘transportation’ into the narrative.
This ‘transportation-imagery’ model (Green and Brock, 2000), with the underlying proposition that the viewer or reader of a narrative piece is engaged with, or ‘transported’ into the narrative is commonly used to explain viewer responses to narratives, and provides the underlying framework for the present study of early foreign language learning. Transportation represents the viewer’s immersion into the narrative; but for this to occur there needs to be a good fit between the viewer and the material (Green and Brock, 2000; Green, 2004). Using the transportation-imagery model, recent research on film media and literary works indicates that the degree that viewers are ‘transported’ depends on a variety of issues, such as their prior emotional state, prior belief systems, prior knowledge about the narrative content, the viewers’ cognitive difficulty with the material, the affective disposition the viewer has with the characters, and the artistic quality of the piece (e.g. Green, 2004; Green et al. 2012; Owen and Riggs, 2012; Zwarun and Hall, 2012). Perhaps a key aspect of understanding the value of using authentic materials in the foreign language classroom, and particularly the individual differences in benefit, lies in the area of engagement and enjoyment, and the elements that influence this process. This is the underlying motivation for the current research.
While the transportation-imagery model is often used to investigate films or literary works, it should be noted that the length of the narration does not appear to significantly affect the transportation process. A number of studies have shown how persuasion, learning, behavioural changes, theme awareness, and knowledge retention can occur through the transportation-imagery process using relatively short narratives, such as 30-second TV commercials, on-line advertisements, public service announcements, and short print messages (e.g. Escalas, 2004; Barbour et al., 2015; Ching et al., 2013).
Model and Hypotheses
The primary objective of this research is to investigate the relationship between an introductory foreign language student’s transportation into the narrative of a Spanish language music video, and the subsequent direct, and possibly indirect, impacts on language learning, enjoyment, and the comprehension of themes presented in the video. In addition, we are interested in other student-related personal characteristics that might directly impact both language learning and thematic comprehension while watching a Spanish language music video, or indirectly impact language learning and thematic comprehension through the transportation construct. Figure 1 presents these hypothesized relationships. The following section provides a brief description of these constructs and related hypotheses.

Figure 1. Model and Hypotheses.
Transportation → Learning and Thematic Comprehension
The core concept of transportation theory is the immersion of the viewer in a narrative, whether that narrative is presented in audio-visual or print form (e.g. Green and Brock, 2000). Studies of film and literary works have also indicated that enjoyment and transportation are related in that transportation can facilitate enjoyment (Green et al., 2004). The combination of the two constructs, transportation and enjoyment, and their impact on learning and narrative understanding leads to our first four hypotheses in which transportation and enjoyment have a direct impact on learning and thematic understanding for the student taking introductory Spanish. These hypotheses relate directly to the three objectives (learning, cultural understanding, and enjoyment) language instructors typically have when using authentic audio-visual material in the classroom.
H1A: Transportation into the Spanish language music video positively impacts language learning
H1B: Transportation into the Spanish language music video positively impacts thematic comprehension
H2: Transportation into the Spanish language music video positively impacts enjoyment
H3: Enjoyment positively impacts language learning
Cognitive Challenge → Transportation
A number of researchers have argued that narrative transportation occurs when there is a match between the viewer’s cognitive abilities and the material being viewed or read (Graesser et al., 1994; Owen and Riggs, 2012; Zwaan et al., 1995). As cognitive ability relative to the material being viewed or read increases, it is likely that immersion into the presented narrative should also increase (e.g. McNamara and Magliano, 2009). While cognitive challenge in a foreign language learning environment can be a function of many things, an important contributor to cognitive challenge when viewing or reading a narrative in the target language is simply the skill level that the viewer has in the target language. This leads to two hypotheses.
H4: Cognitive challenge negatively impacts transportation into the Spanish language music video
H5: Lower levels of Spanish language skill lead to higher levels of cognitive challenge when viewing the Spanish language music video
Affective Disposition → Transportation and Enjoyment
Feelings (Affects) are an important part of viewing a movie, listening to music, or reading a novel. Affective disposition when viewing or reading a narrative can develop in a number of different ways (Appel and Richter, 2007; Busselle and Bilandzic, 2009; Zillmann and Cantor, 1976), which in turn leads to greater identification and involvement with the narrative’s characters (Murphy et al., 2011). Because of this, affective disposition has been used as a construct that leads to transportation. Personal attraction and personal association (liking) with the characters are two important dimensions of affects that lead to greater transportation and enjoyment (Owen and Riggs, 2012), and thus should have an indirect effect on learning and thematic comprehension when viewing the music video. This leads to two hypotheses involving affective disposition.
H6: Higher levels of affective disposition toward the principal characters in the Spanish language music video positively impact transportation
H7: Higher levels of affective disposition toward the principal characters in the Spanish language music video positively impact enjoyment
Belief Consistency → Transportation
Individuals have attitudes or beliefs that may be in conflict when watching or reading narratives with different points of view. While the underlying reason for these attitudes may vary, attitudes toward diversity may become an important issue when confronted with a narrative that expresses the benefits of cultural diversity related themes. With respect to this study, the Spanish language music video used in the research clearly presents a strong narrative theme of cross-cultural diversity in personal, social, and political relationships, both in the song’s lyrics and the visual components of the video. Both transportation theory and belief consistency theory suggest that immersion, and subsequent enjoyment and comprehension of a narrative, is more likely if the narrative’s message corresponds to, and re-affirms, the belief system of the viewer (e.g. Busselle and Bilandzic, 2009; Green, 2004; Green and Brock, 2000; Zwarun and Hall, 2012). This leads to the final, ‘message matching’ hypothesis.
H8: Belief Consistency (positive attitudes toward cultural diversity) positively impacts transportation into the Spanish language music video.
Method
The music video selected for this study was Ella y él [She and He], a 2011 Spanish language pop music video by Grammy Award-winning singer and songwriter Ricardo Arjona from Guatemala. Native language music videos such as Ella y él fall on the more ‘stylistic’ or ‘artistic’ side of the authentic material spectrum, but by their very nature, also have a higher potential for both student engagement and a multi-dimensional presentation of authenticity (visual, musical, cultural, and spoken language). The song featured in the music video makes extensive use of the verb ‘Ser’ [to be] and the adjectives covered in the lesson material for the introductory Spanish class. Since a primary objective of the present research is to examine students’ involvement and transportation into a music video’s narrative, and the associated impact this has on foreign language learning, a longer video was selected. Ella y él runs approximately 6 minutes, almost double the time normally used for music videos but typical for the narrative-based genre of music videos. The Ella y él video presents a narrative that involves an attractive Marxist Cuban woman and a young, conservative businessman from the United States who meet on vacation in Mexico’s Yucatán, fall in love and then move to Paris in order to enjoy life distant from their political belief systems. This narrative presents various cultural, social and political themes, particularly the benefits of cross-cultural relationships and diversity.
The study examined an introductory Spanish course (typical 15-week, semester-long US course) offered at a large public university located in North Carolina. A total of five classes over two semesters were used for the study (n=109). The sample appeared representative of the total university student population – 39% were males, and 61% were females. None of the students were majoring in Spanish, and the majority of the students sampled were at the second year (Sophomore) or third year (Junior) level of studies.
A 30-minute lesson on the verb ‘Ser’ [to be] and certain descriptive adjectives, such as nationality, was provided immediately before viewing the music video. Thus the music video is employed as a ‘reinforcement’ pedagogical tool to the lesson (e.g. Chung, 2002; Elkafaifi, 2005; Li, 2013; Mayer et al., 2001). This is generally how a music video or other supporting audio-video material would be used in a teaching situation for early foreign language learners. The lesson and music video were presented in approximately the sixth week of the course.
The pre-video questionnaire was designed to measure the student’s pre-video language knowledge of the covered material and belief consistency. The post-video questionnaire measured the student’s experience with the video, that is, narrative transportation, cognitive challenge, affective disposition, enjoyment, and perception of the music video’s ‘themes’. In addition, the post-video questionnaire included the same language knowledge questions as the pre-video questionnaire in order to obtain a ‘learning’ metric. In this way an unbiased, true measure of learning associated with the video experience is obtained. The pre-video questionnaire was given immediately before viewing the music video, and the post-video questionnaire was given immediately after the music video’s conclusion so the only experimental treatment was viewing the video. No class discussion was allowed between the pre-video and post-video questionnaire completion.
In addition, in order to control for possible differences in teaching styles and lesson plan coverage, all the classes used for the study’s sample were taught by the same instructor using exactly the same lesson plan for all the classes, with the music video shown at the same point during the semester. The research was carefully designed such that the only experimental treatment was the viewing of the Spanish language video with strict control over other possible confounding elements. The university’s institutional review board (IRB) approved the study and questionnaire design.
Measures
Table 1 provides the list of items used in the research, as well as the Cronbach’s Alpha. With the exception of language learning, all of the scales used in the research were scored on a 7-point scale. Transportation was measured by 5 questions from the 7–item transportation scale developed by Green et al. (2004), and used extensively in studies examining viewer engagement with the narratives presented in various media, such as film, prose, games, videos, and advertising. The five items were selected as being most relevant to a music video narrative. The Transportation scale was provided in the post-video questionnaire.
Table 1. Construct Measures and Cronbach’s Alpha (Full Sample).
Note: (R) = reverse coded.
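As an illustration of how the reliability figures in Table 1 are conventionally obtained, the following is a minimal sketch of reverse coding on a 7-point scale and of the standard Cronbach’s Alpha formula. The function names and the example data are hypothetical and are not taken from the study; the sketch assumes complete responses with no missing values.

```python
import statistics


def reverse_code(score, scale_min=1, scale_max=7):
    """Flip a reverse-worded item so that high values mean the same as other items."""
    return scale_max + scale_min - score


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    `items` is a list of item columns; each column holds one score per respondent.
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    n = len(items[0])
    if any(len(col) != n for col in items):
        raise ValueError("all items must have the same number of respondents")
    item_vars = [statistics.variance(col) for col in items]
    totals = [sum(col[i] for col in items) for i in range(n)]
    return k / (k - 1) * (1 - sum(item_vars) / statistics.variance(totals))


# Hypothetical responses from five students to a three-item scale;
# the third item is reverse worded and is flipped before scoring.
item_1 = [6, 5, 7, 3, 4]
item_2 = [6, 4, 7, 2, 5]
item_3_raw = [2, 3, 1, 5, 4]
item_3 = [reverse_code(s) for s in item_3_raw]
alpha = cronbach_alpha([item_1, item_2, item_3])
```

Reverse coding is applied before the alpha calculation because an unflipped item would correlate negatively with the rest of the scale and depress the coefficient.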
Since diversity is a major theme in the video, we incorporated an item regarding personal attitude towards cultural diversity in society as the belief consistency measure. This question was provided in the pre-video questionnaire.
Cognitive Challenge was measured by three items selected from the scale developed by Owen and Riggs (2012). Cognitive Challenge was measured in the post-video questionnaire.
Affective Disposition was measured by four items using similar language as the study of feature films by Owen and Riggs (2012). For this measure, respondents rated how they felt regarding each character along two dimensions, whether they liked the character and whether they found the character attractive. Affective Disposition was measured in the post-video questionnaire.
Enjoyment was measured using three items. The language of these three items included a number of dimensions commonly used in studies that examine viewer appreciation and enjoyment of various artistic and creative media, such as paintings, murals, and movies (e.g. Green et al., 2012). The Enjoyment questions were provided in the post-video questionnaire.
Learning was measured by the improvement on the short language quiz that was administered before and after showing the video. The quiz involved ten different questions (written in Spanish) related to the use of the verb ‘Ser’ and certain descriptive adjectives, such as nationality. Both of these language elements are used extensively in the music video, and this was one of the reasons this particular music video was chosen. The pre- and post-video language quizzes were the same, so learning was measured by the increase (or decrease) in the number of quiz questions answered correctly. As an example, a student who answered six language questions correctly before seeing the video, and then answered eight language questions correctly after seeing the video, would have a positive Learning measure of ‘2’.
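The Learning measure described above can be sketched in a few lines. This is only an illustration of the arithmetic; the function names and the class-level summary are assumptions added for clarity, and the example scores are hypothetical.

```python
def learning_gain(pre_correct, post_correct, n_items=10):
    """Learning measure: post-video correct answers minus pre-video correct answers."""
    for score in (pre_correct, post_correct):
        if not 0 <= score <= n_items:
            raise ValueError("score must lie between 0 and n_items")
    return post_correct - pre_correct


def summarize_gains(gains):
    """Class-level summary: mean gain and shares of students who improved, declined, or stayed level."""
    n = len(gains)
    return {
        "mean_gain": sum(gains) / n,
        "share_increased": sum(g > 0 for g in gains) / n,
        "share_decreased": sum(g < 0 for g in gains) / n,
        "share_unchanged": sum(g == 0 for g in gains) / n,
    }


# The worked example from the text: six correct before, eight correct after.
example_gain = learning_gain(6, 8)

# Hypothetical (pre, post) scores for four students.
summary = summarize_gains([learning_gain(6, 8), learning_gain(7, 7),
                           learning_gain(5, 4), learning_gain(9, 9)])
```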
To measure the student’s ‘Thematic Comprehension’ a two-step process was developed. First, two college educated Spanish Heritage (Latin American) speakers independently watched the video and provided their scores on seven themes using the 7-point Likert scale. Both raters are completely fluent in both Spanish and English, and are familiar with both Latin and US culture. Interrater agreement was measured by the ‘intra-class correlation coefficient’ (ICC) for the seven themes. The ICC (absolute agreement) was 0.93 indicating a very strong agreement between the two raters regarding how the music video communicated the seven themes. Second, the ‘Thematic Comprehension’ measure was calculated by computing the distance (absolute value) each student had from the average rater score on each of the seven themes. Thus a higher value reflects a greater difference from the two bi-lingual raters.
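A minimal sketch of the second step follows. The text specifies an absolute distance from the average rater score on each theme; how the seven per-theme distances are combined into one score is not stated, so averaging them is an assumption made here, as are the function name and the example ratings.

```python
def thematic_distance(student_scores, rater_a, rater_b):
    """Mean absolute distance between a student's theme ratings and the average rater score.

    Lower values mean closer agreement with the raters.
    Averaging across themes is an assumption; the source does not specify the aggregation.
    """
    if not (len(student_scores) == len(rater_a) == len(rater_b)):
        raise ValueError("all rating lists must cover the same themes")
    benchmark = [(a + b) / 2 for a, b in zip(rater_a, rater_b)]
    per_theme = [abs(s - m) for s, m in zip(student_scores, benchmark)]
    return sum(per_theme) / len(per_theme)


# Hypothetical 7-point ratings on seven themes.
rater_a = [7, 6, 2, 5, 6, 3, 7]
rater_b = [7, 6, 2, 5, 6, 3, 7]
close_student = [7, 6, 2, 5, 6, 3, 7]   # matches the raters exactly
distant_student = [5, 4, 4, 3, 4, 5, 5]  # two points away on every theme

close_score = thematic_distance(close_student, rater_a, rater_b)
distant_score = thematic_distance(distant_student, rater_a, rater_b)
```

Because the measure is a distance, a negative path coefficient into Thematic Comprehension corresponds to better comprehension.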
Modelling Procedures
To examine the hypotheses, a Structural Equation Modelling (SEM) approach was employed in WARP-PLS 6.0, using the default algorithm designed to reduce instances of Simpson’s paradox. Structural Equation Modelling allows statistical testing of hypothesized path models, and has been used extensively in investigating the transportation-imagery process in a variety of different types of narratives (e.g. Ching et al., 2014; Escalas, 2004; Owen and Riggs, 2012). PLS-SEM is often recommended, especially if the primary objective is to predict or explain target constructs, such as ‘learning’ and ‘narrative comprehension’ in our study (Hair et al., 2014: 14). The determination of statistical significance is based on an iterative ‘bootstrapping’ method of 500 random sub-samples.
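The general idea behind bootstrapped significance testing can be sketched as follows. This is not the WARP-PLS algorithm: as a simplifying assumption, a single path is represented by a Pearson correlation (the standardized slope for one predictor), and the standard error is taken as the standard deviation of the estimate across 500 resamples drawn with replacement. All names and data are hypothetical.

```python
import math
import random
import statistics


def pearson_r(x, y):
    """Pearson correlation; equals the standardized slope when there is one predictor."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def bootstrap_path(x, y, n_resamples=500, seed=0):
    """Return (estimate, bootstrap standard error, estimate / standard error)."""
    rng = random.Random(seed)
    n = len(x)
    estimate = pearson_r(x, y)
    draws = []
    while len(draws) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]  # resample respondents with replacement
        try:
            draws.append(pearson_r([x[i] for i in idx], [y[i] for i in idx]))
        except ValueError:
            continue  # degenerate resample (no variation); draw again
    se = statistics.stdev(draws)
    return estimate, se, estimate / se


# Hypothetical construct scores for 30 respondents with a strong positive relationship.
transportation = list(range(30))
enjoyment = [2 * v + (v * 7) % 5 for v in transportation]
estimate, se, t_ratio = bootstrap_path(transportation, enjoyment)
```

The ratio of the estimate to its bootstrap standard error is then compared against a t distribution to obtain the one-tailed probabilities reported for each path.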
Results
Overall Learning
Since the ‘advance organizers’ (e.g. in-class lesson plan) were provided before the pre-video assessment and the only experimental treatment between the pre-video and post-video language assessments was viewing the video, the differences in scores can be considered a measure of learning associated with the video. On average, viewing the video appeared to result in statistically significant increases in learning. Approximately 35.8% of students increased their score on the 10-question language test after viewing the music video, 7.4% decreased their score and 56.8% had no change in scores. Likewise, the average net learning was a positive 0.54 (on the 10-point scale) after viewing the video, while the mean percentage increase in learning was 10.1%, both indicating a statistically significant increase in learning associated with the video (prob<0.01).
Structural Equation Model and Hypotheses
While the above analysis certainly suggests that authentic audio-video material may be effective as a ‘reinforcement’ pedagogical tool at the class level, this does not explain the mechanism of learning, entertainment, and theme comprehension associated with viewing the video or the variation in the levels of learning, entertainment, and theme comprehension at the individual student level. For this we examine the overall SEM model and the hypotheses. Table 2 presents the estimated structural equation path coefficients.
Table 2. Results of Structural Equation Model: Learning, Enjoyment and Thematic Comprehension.
Notes: *prob<0.10, **prob<0.05, ***prob<0.01, one-tailed t-test, n=108.
Lower values on the Thematic Comprehension metric indicate higher levels of agreement with the expert raters.
Eight out of the nine hypotheses are statistically supported (prob<0.01). Of particular note is that a student’s level of engagement or transportation into the Spanish language music video resulted in statistically significant increases in all three commonly cited objectives for using authentic audio-visual material – higher levels of transportation resulted in statistically higher levels of student learning (H1A), thematic comprehension (H1B) and enjoyment (H2). This is an important consideration from two perspectives. First, it helps explain the variation in learning at the individual level in the classroom. Second, combined with the results from the affective disposition-transportation link (H6), it strongly suggests using authentic audio-visual material incorporating characters of strong personal appeal (Affective Disposition) to increase student transportation, and thus learning, thematic comprehension and enjoyment.
In addition, the statistically significant results from both H4 and H5 indicate that cognitive challenge and language comprehension are key determinants of how engaged or transported a student will become when viewing a native language video, even when significant advance organizers are provided.
Finally, the hypothesis of belief consistency was not supported (H8). In fact, a strong statistical relationship was found opposite to the hypothesized relationship. Authentic audio-visual materials often have certain political or social themes; however, in this case, the issue of belief consistency did not seem to significantly impact engagement and transportation in the expected manner. Strong positive beliefs about cultural diversity and immigration were, in fact, negatively related to engagement and transportation – a particularly surprising result given that cultural diversity is a clear theme throughout the video.
Model Fit
Overall, the model offers a reasonably good fit. The average full collinearity VIF was 1.42, below the 5.0 threshold, and the Simpson’s paradox ratio was 1.00. The Tenenhaus Goodness of Fit index was 0.388, indicating a high degree of fit, and the R-square contribution ratio was a high 1.00. Using the ‘gamma-exponential method’ for estimating statistical power for PLS-SEM, the power characteristics of the final estimated model are acceptable (power>0.80, Cohen, 1988).
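For readers unfamiliar with the Tenenhaus index, the sketch below shows its usual definition: the square root of the product of the average communality (average variance extracted) and the average R-square across endogenous constructs. The cutoffs of 0.10, 0.25 and 0.36 for small, medium and large fit are a commonly cited rule of thumb and are an assumption here, not something stated in this study; the input values are hypothetical because the per-construct figures are not reported in the text.

```python
import math
import statistics


def tenenhaus_gof(ave_values, r_squared_values):
    """Tenenhaus GoF: sqrt(mean AVE * mean R-squared)."""
    return math.sqrt(statistics.fmean(ave_values) * statistics.fmean(r_squared_values))


def classify_gof(gof):
    """Rule-of-thumb labels (assumed cutoffs: 0.10 small, 0.25 medium, 0.36 large)."""
    if gof >= 0.36:
        return "large"
    if gof >= 0.25:
        return "medium"
    if gof >= 0.10:
        return "small"
    return "negligible"


# Hypothetical per-construct values, chosen only to illustrate the arithmetic.
example_gof = tenenhaus_gof([0.5, 0.5], [0.32, 0.32])

# The index reported in the text.
reported_label = classify_gof(0.388)
```

Under these assumed cutoffs, the reported value of 0.388 falls in the ‘large’ band, which is consistent with the description of a high degree of fit.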
Discussion
A common pedagogical strategy of language instructors is to use authentic language audio-visual materials in early foreign language learning environments. Music videos are particularly attractive tools for achieving three objectives – learning, cultural understanding and enjoyment. In spite of this applicability to the introductory foreign language classroom, the present research appears to be the first empirical study that examines these three objectives within the context of a narrative based music video. Few, if any, earlier studies have investigated the process of student engagement with authentic material and its impact on instructional outcomes at the individual student level; the present study is an attempt to start this discussion.
Several underlying concepts were examined. We found support for the argument that overall learning is enhanced by using the authentic video as a reinforcement tool to a previously taught lesson, or in other words, there was an ‘advance organizer’ process before watching the video. Overall, there was a statistically significant increase in language learning that could be attributed to viewing the video.
The key contribution of the present research, however, is empirically investigating the process by which learning, thematic comprehension and enjoyment take place when viewing an authentic language music video. The underlying approach for the research lies in the transportation-imagery model (Green and Brock, 2000), with the proposition that foreign language learners are transported or immersed into the video narrative at differing levels, depending on a variety of factors. This transportation or engagement, in turn, influences the three pedagogical objectives – enhanced learning, thematic comprehension, and enjoyment.
The analysis offers several findings of note. First, while there is a relatively large literature that examines transportation and engagement when viewing or reading various types of narrative and creative material, very little prior research has investigated this process using classroom material. The final model in the present research supports many of the prior findings from the broader transportation-imagery and entertainment literature but within the foreign language learning environment. In particular, affective disposition toward the characters and the degree of cognitive difficulty both appeared to significantly influence the foreign language learner’s transportation into the Spanish language music video.
Second, the research suggests that language learning, thematic comprehension, and enjoyment can all occur within a relatively short media experience through the transportation process. For example, the music video used in this research was only six minutes long, but did result in significant student transportation and engagement for many students, which in turn, led to achieving the three targeted objectives of using authentic audio visual material in the classroom. This result supports prior research findings with advertisements, messaging, and public service announcements that narrative engagement can occur very quickly. This finding also supports prior arguments about what makes for an effective authentic teaching tool for foreign language learners, such as visual support, the simplicity of the textual themes and events, repetitive content, and the use of ‘parataxis’ or the stringing together of relatively simple clauses which is often seen in poetry and song lyrics (e.g. Brown and Yule, 1983; Bygate, 1987; Gilmore, 2007).
Third, while statistically significant, only a relatively small amount of the variance in learning is explained by the model. This indicates that other factors not captured in the tested causal model are important, particularly since the overall learning associated with viewing the video seemed to be significant. Unlike films and shows, it could be that music videos present a slightly different problem – while some enjoyment may be important to maintain attention, if the music video is highly engaging and enjoyable, then some listeners may become overly engaged in the musical components, such as the beat and harmony, and the visual components, such as the dancing and colours, and actually lose contact with the underlying narrative – particularly given the language difficulties inherent in an introductory foreign language teaching environment.
Fourth, since authentic audio-visual materials are commonly used by language instructors to facilitate discussions of different themes, such as culture and behaviours, the findings suggest that this may be an effective strategy, but only if the audio-visual material is carefully selected. Instructors need to consider choosing a video that maximizes the foreign language student’s narrative transportation, such as using videos with likable characters, in order to ultimately achieve better thematic comprehension.
While this study provides strong empirical support for the transportation-imagery model as applied to using authentic language music videos within a foreign language learning environment, much more empirical research is needed. Most importantly, however, the present research provides empirical support and a potential underlying model to fine-tune what many language instructors already use in the classroom. In addition, while this study focussed on the introductory foreign language student experience, many of the results, particularly the relationship between narrative transportation and thematic comprehension, may be applicable to other disciplines, such as management, sociology, and political science, which also commonly use audio-visual material in the classroom to illustrate teaching points.
Footnotes
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
