Examining the Effects of a Pedagogical Agent With Dual-Channel Emotional Cues on Learner Emotions, Cognitive Load, and Knowledge Transfer Performance

Abstract

Despite the continuous emphasis on emotion in multimedia learning, it was still unclear how pedagogical agent emotional cues might affect learning. In the present study, a between-subjects experiment was performed to examine the effects of a pedagogical agent with dual-channel emotional cues on learners' emotions, cognitive load, and knowledge transfer performance. Participants from a central Chinese university (age mean = 21.26, N = 66) were randomly divided into three groups. These groups received instructions from an affective pedagogical agent, a neutral pedagogical agent, or a neutral voice narration without pedagogical agent embodiment. Results showed that learners assigned the affective pedagogical agent reported a significantly higher emotional level than learners assigned the neutral pedagogical agent. Learners’ perceived task difficulty was not significantly different among groups while instructional efficiency was significantly higher for learners with the affective pedagogical agent. Moreover, learners assigned to the affective pedagogical agent performed significantly better on the knowledge transfer test than those assigned the neutral pedagogical agent or the neutral voice.

Keywords

pedagogical agent, emotion, dual-channel assumption, multimedia learning

The term pedagogical agent refers to a computer agent that performs anthropopathic behaviors in the digital learning environment (Heidig & Clarebout, 2011). Over 25 years ago, Bates (1994) proposed that computer agents expressing emotional cues are more engaging and believable than those without any emotional cues. According to Han (2013) and Sauter et al. (2013), emotional cues include verbal and non-verbal signals such as facial expression, body movement, text, and voice that express liking or disliking. In multimedia learning, Moreno and Mayer (2007) suggested that emotional cues may influence the selection, organization, and integration of information, and lead to different learning outcomes. Specifically, emotional cues from a pedagogical agent may be contagious and trigger changes in learners’ emotions (Hatfield et al., 1993; Liew et al., 2017). Then, learners with enhanced positive emotions may be more actively engaged in learning and achieve better knowledge transfer performance (Um et al., 2012; Plass et al., 2014). However, emotional cues as extra information may also lead to cognitive overload and hinder learning (Fraser et al., 2012). Therefore, emotional cues should be designed with caution.

Regarding empirical evidence, many previous studies on pedagogical agents designed emotional cues solely in the visual channel (Guo et al., 2015; Guo & Goh, 2016; Kim et al., 2017; Krämer et al., 2016; Liew et al., 2016; Romero-Hall et al., 2014). Their results were inconsistent regarding the effects of pedagogical agent emotional cues on learner emotions and performance. On the other hand, Liew et al. (2017) examined a pedagogical agent with a dual-channel emotional design. They found the pedagogical agent with dual-channel emotional cues to be more effective than the one without emotional cues in promoting learner emotions, motivation, and test performance. Compared with the single-channel design, this advantage might be explained by the fact that the emotional experience of humans is jointly created from visual and audio messages (Gerdes et al., 2014). The interaction between visual and audio channels could enhance a human's perception of emotions. Also, allocating emotional cues via two channels could help avoid cognitive overload in any single channel (Miller & Buschman, 2015). Nevertheless, more evidence is necessary to strengthen our understanding of the pedagogical agent with dual-channel emotional cues.

The present study aimed to further explore the effects of a pedagogical agent with dual-channel emotional cues on learner emotions, cognitive load, and knowledge transfer performance. Several adjustments differentiate this study from previous ones. First, visual and auditory emotional cues were fully generated by machines. Second, participants were entirely recruited from a Chinese university. Third, a control group without a pedagogical agent was added to test if the agent-based design is indeed more effective in our context (Heidig & Clarebout, 2011). Research questions that guided this study are as follows:

RQ1: To what extent will learners who are assigned the pedagogical agent with dual-channel emotional cues, the pedagogical agent without emotional cues, or the voice narration without emotional cues differ in their perceived emotions?

RQ2: To what extent will learners who are assigned the pedagogical agent with dual-channel emotional cues, the pedagogical agent without emotional cues, or the voice narration without emotional cues differ in their perceived cognitive load?

RQ3: To what extent will learners who are assigned the pedagogical agent with dual-channel emotional cues, the pedagogical agent without emotional cues, or the voice narration without emotional cues differ in their knowledge transfer performance?

Seven sections structure this paper. The first section discusses the theoretical framework. Section two presents the relevant studies. Section three describes the research design and research questions, and the fourth section describes the pedagogical agent design, subject matter, instruments, participants, and procedures. The last three sections present results, discussion, and conclusions.

Theoretical Framework

Several theories concerning the multimedia learning environment have different perspectives towards the design, processing, and effects of pedagogical agent emotional cues (Mayer, 2005; Mayer & Moreno, 1999; Sweller, 2011). Among these theories, the cognitive theory of multimedia learning and its extended version, the cognitive-affective theory of learning with media, have provided the most comprehensive consideration regarding learning with interactive multimedia elements. This section discusses the assumptions and principles underlying these two theories to generate implications for pedagogical agent emotional cues.

The cognitive theory of multimedia learning (CTML) described how learners select, process, and integrate information in the multimedia learning environment (Mayer, 2005). CTML holds three fundamental assumptions. The first assumption, dual-channel, claimed that learners process information through visual and audio channels (Cuevas & Dawson, 2018). The second assumption, limited capacity, illustrated that each channel has limited capacity and can only handle a small amount of information at a time (Miller & Buschman, 2015; Sweller, 2011). The third assumption, active processing, described how learners construct meanings by actively selecting, organizing, and integrating relevant information (Brame, 2016).

To accommodate advanced learning media that supports interactive elements such as the pedagogical agent, Moreno and Mayer (2007) expanded the initial CTML framework with additional assumptions forming the cognitive-affective theory of learning with media (CATLM). One of the key assumptions indicated that motivational and affective factors may influence learner’s cognitive process. To be more specific, a learner in the multimedia learning environment may be emotionally affected by affective elements according to a conception called emotional contagion (Hatfield et al., 1993). This conception was supported by studies that examined both positive and negative emotions (Becker, 2014; Frenzel et al., 2018; Meyer & Turner, 2002). Then, learners with enhanced positive emotions may be more actively engaged in the selection, organization, and integration of information, which are the central processes for learning in the multimedia environment. Um et al. (2012) and Plass et al. (2014) provided evidence that emotional design such as warm colors and face-like shapes can promote learner positive emotions and lead to better performance on comprehension and transfer tests.

An additional assumption that might follow is that a pedagogical agent's emotional cues can also enhance learner engagement and improve learning performance. However, recent empirical findings revealed that pedagogical agent emotional cues in some cases do not improve learning performance (Guo et al., 2015; Guo & Goh, 2016; Kim et al., 2017; Liew et al., 2016; Romero-Hall et al., 2014). Some researchers suggested that a major reason for these results could be that emotional cues were presented in only one channel, so that learners did not perceive those emotional cues or perceived them incorrectly (Guo et al., 2015; Liew et al., 2016). Therefore, it might be necessary to consider the dual-channel assumption when designing pedagogical agent emotional cues.

Mehrabian (1971) determined that emotions transmitted in communication are carried by word (7%), voice (38%), and facial expression (55%). Although there were doubts about each element's weights, these emotional cues' importance has been well-recognized (Hale et al., 2017). Mehrabian's equation suggested that both visual and audio channels may be significant in transmitting emotions, supporting the idea that information is processed in two channels.

Another perspective regarding pedagogical agent emotional cues is that emotional cues are irrelevant to the learning content. Emotional cues might therefore take up the limited cognitive capacity and hinder learning (Rey, 2012). Nevertheless, based on the dual-channel assumption, researchers have suggested that allocating emotional cues to visual and audio channels could potentially avoid cognitive overload in a single channel (Park, Flowerday, et al., 2015; Park, Knörzer, et al., 2015).

Summary

Pedagogical agent emotional cues might have an important role in a multimedia learning environment. However, to achieve the optimum effects, emotional cues need to be designed cautiously, considering assumptions underlying the cognitive-affective theory of learning with media. On the one hand, the dual-channel assumption indicated that pedagogical agent emotional cues should involve visual and audio channels to enhance learners' perception. On the other hand, the limited capacity assumption implied allocating emotional cues to visual and audio channels could help avoid cognitive overload in either channel. Therefore, different assumptions and perspectives may indicate an integrated conception that a pedagogical agent with dual-channel emotional cues is more effective in supporting learning.

Review of Relevant Literature

To further understand the effects of pedagogical agent emotional cues, this section examines related empirical studies. The review was organized into two sub-sections depending on the emotional design of pedagogical agents. The first sub-section looked into studies that designed pedagogical agent emotional cues in a single channel. The second sub-section examined a study that designed a pedagogical agent with dual-channel emotional cues. Both sub-sections discussed how these pedagogical agents affect learner emotions, cognitive load, and learning performance.

Pedagogical Agents with Single-Channel Emotional Cues

Pedagogical agents can deliver emotional cues via visual and audio channels through mediums such as facial expression, on-screen text, and voice. This section reviews studies that considered only one channel. Kim et al. (2017) examined the effects of on-screen encouraging remarks in an agent-based tutorial on middle school students' mathematics learning and anxiety. Two groups received learning materials in which a static female agent image was inserted. The agent provided instructions through the on-screen textbox. One of the two groups also received encouraging texts from the agent. However, results showed no significant difference between the two groups in either mathematics learning or anxiety.

Instead of using on-screen text, three studies presented evidence on delivering emotional cues via facial expressions (Romero-Hall et al., 2014; Liew et al., 2016; Krämer et al., 2016). First, Romero-Hall and colleagues studied three groups of college students learning African history. The three groups were assigned an emotionally expressive pedagogical agent, a non-expressive pedagogical agent, or no agent. The pedagogical agent was a male figure that spoke with an informal human voice. The expressive pedagogical agent also displayed pre-set facial expressions while speaking. Analysis of learning achievement and emotions indicated that the group receiving emotional cues had significantly lower achievement as well as more negative emotions than the other two groups. Liew et al. (2016) found similar results when comparing the effects of a male pedagogical agent with a smiling facial expression against one with a neutral facial expression. Both agents communicated through the same machine-generated voices. University students participated in this study and received a tutorial from one of the pedagogical agents on the C language. Not only did the post-test on knowledge comprehension reveal no significant difference, but learners assigned the smiling agent also reported significantly lower positive emotions. Krämer et al. (2016) investigated a pedagogical agent with a smiling expression and nodding behavior and its effects on adult learners in solving math problems. The pedagogical agents in this study did not speak or present texts. The authors reported that emotional cues from the pedagogical agent significantly enhanced learner performance compared to the pedagogical agent without emotional cues. However, perceived positive emotion did not differ between the two groups.

Guo et al. (2015) and Guo and Goh (2016) combined facial expressions and on-screen messages to deliver emotional cues in the visual channel. Pedagogical agents in both studies were cartoon-like characters that communicated via on-screen texts. The main difference was that Guo et al. (2015) designed a classroom scenario while Guo and Goh (2016) used a game-based learning approach. University students in the two studies were assigned to three conditions to learn about information literacy: an affective pedagogical agent, a neutral pedagogical agent, or no agent. Both studies indicated that the three groups did not differ in knowledge retention. Learners in the affective agent group reported higher motivation and enjoyment. Guo et al. (2015) noted that one of their study's limitations was the lack of audio cues, which could improve the learning outcome.

Liew et al. (2020) recently examined the effects of enthusiastic versus calm voice narration on undergraduates' transfer performance and cognitive load in learning computer algorithms. They reported that the enthusiastic voice narration led to significantly higher transfer performance than the calm voice narration. Moreover, the calm voice in their study led to a higher germane cognitive load for non-native speakers.

The previous studies suggest that evidence on whether emotional cues improve learning remains inconclusive. Although Liew et al. (2020) provided some initial evidence of the positive effects of enthusiastic audio cues on transfer performance, follow-up studies in different contexts and backgrounds are necessary. As for learner emotions, learners in Guo et al. (2015) and Guo and Goh (2016) reported higher positive emotions after interacting with affective pedagogical agents, whereas Liew et al. (2016) and Romero-Hall et al. (2014) found that learners assigned affective pedagogical agents experienced more negative emotions. Liew et al. (2016) suggested that an emotional cue could carry multiple meanings; thus, providing emotional cues from more than one channel could potentially help learners perceive emotional cues more precisely. Furthermore, evidence suggested that emotional cues in voice narration may only influence participants who natively speak a different language from the pedagogical agent (Liew et al., 2020).

Pedagogical Agents with Dual-Channel Emotional Cues

Liew et al. (2017) designed a pedagogical agent delivering emotional cues in both visual and audio channels with facial expressions, hand gestures, voice, and on-screen texts. They paired university freshmen with either the affective pedagogical agent or a neutral pedagogical agent to learn a basic programming language. Findings indicated that the affective pedagogical agent significantly enhanced learners’ positive emotions, motivation, and performance. Also, no significant difference in cognitive load was found between the two groups.

Although Liew et al. (2017) provided evidence that dual-channel emotional cues may be more effective, several questions remained. First, Davis (2018) suggested that hand gestures alone may carry significant social and cognitive functions. Thus, there has been a need to consider the effects of emotional cues, including facial expressions, voice, and words, separately from hand gestures. Second, Liew et al. (2017) used recorded human voice for the pedagogical agent based on the assumption that the human voice was better than the machine-generated voice, defined as the voice effect (Atkinson et al., 2005). However, with newer voice-generating technology, recent studies have reported that modern machine-generated voice was comparable to the human voice in terms of their effects on learning outcomes, cognitive load, and learners’ perceptions (Craig & Schroeder, 2017, 2019; Davis et al., 2019). Last, Liew et al. (2017) did not include a control group without the pedagogical agent. Thus, it is uncertain if having a pedagogical agent with dual-channel emotional cues produced better learning outcomes than tutorials without a pedagogical agent (Heidig & Clarebout, 2011).

Research Design

After reviewing the literature, we made three adjustments to the research design. First, we operationalized emotional cues as facial expressions, voice, and auditory remarks. Second, machine-generated voices were employed instead of recorded human voices. Last, we incorporated a control group.

The present study implemented a between-subjects experimental design with two treatment groups and one control group (Charness et al., 2012). The independent variable was the type of pedagogical agent. Dependent variables were learner emotions, cognitive load, and knowledge transfer performance. The researchers assigned the first group to a pedagogical agent with an embodiment, voice, and emotional cues (i.e., Affective Pedagogical Agent, APA group). The second group was assigned a pedagogical agent with an image and voice but without emotional cues (i.e., Neutral Pedagogical Agent, NPA group). The control group was assigned voice without embodiment or emotional cues (i.e., Neutral Voice, NV group).

Method

Pedagogical Agent and Emotional Cues

The pedagogical agent was developed with the Media Semantics Character Builder (Version 5.4.8, 2017). It was portrayed as a 35-year-old male teacher dressed in a suit. The pedagogical agent was located at the lower right corner of the screen and shown only from the shoulders up to avoid potential distractions. In both APA and NPA groups, the pedagogical agent had natural eye blinks and head movements to mimic humans. Also, the pedagogical agent had synchronous lip movements while talking. The emotional cues of the pedagogical agent were displayed through facial expressions, voice, and auditory remarks.

For facial expressions, the affective pedagogical agent in the APA group employed smile, wink, and a suggestive facial expression shown in Figure 1. The neutral pedagogical agent in the NPA group only had a neutral facial expression.

Figure 1.

Screenshot of the tutorial and pedagogical agent facial expressions.

For voice, the present study used a speech synthesizing engine from the Baidu AI Open Platform (2019). Baidu has one of the top speech synthesizing engines for Mandarin in the market, and it provided various types of voices. The APA group adopted an affective male voice, while the NPA and NV groups adopted a standard male voice.¹

For auditory remarks, participants received responses after answering two multiple-choice questions inserted in the tutorial. Participants in the APA group received more confirming and encouraging responses such as “Well done! This is a very useful way to find information.” On the other hand, learners in the NPA and NV groups received neutral responses such as “Ok” and “I see.”

Subject Matter

Participants in this study received a tutorial about a six-stage problem-solving model known as the Big 6. The six stages included task definition, information seeking strategies, location and access, use of information, synthesis, and evaluation. The tutorial was structured as shown in Figure 2. All contents in the tutorial were presented in Mandarin.

Figure 2.

The organization of the tutorial.

Instruments

Instruments used in this study included a questionnaire and a knowledge transfer test. The questionnaire consisted of demographic, emotion, and cognitive load items. The demographic items asked participants about their student ID, age, and gender for descriptive purposes.

The emotion items were acquired from the control-value theory of achievement emotions, which identified emotions most related to learning activities and performance (Pekrun, 2006). Enjoyment, pride, anxiety, and hopelessness were selected from the achievement emotions for several considerations. First, according to emotional contagion, the smiling expression might induce learner enjoyment. Two empirical studies also reported that the affective agent increased learners' enjoyment (Guo et al., 2015; Guo & Goh, 2016). Second, pride has been linked to intrinsic motivation for all students (Pekrun et al., 2002), and the encouraging remarks included among the emotional cues in the present study might influence learner pride. Third, anxiety was included because it was the most commonly studied emotion in learning (Pekrun et al., 2011). However, since anxiety might either activate or hinder learning depending on its level, a deactivating negative emotion, hopelessness, was measured as a supplement to anxiety (Pekrun et al., 2002). Participants rated each emotion on a nine-point Likert scale (e.g., Please rate your current level of enjoyment.), with one being the weakest and nine being the strongest.

Researchers have posited that it might not be possible to measure the intrinsic, extraneous, and germane cognitive load individually (Kirschner et al., 2011; Schroeder, 2017). Thus, cognitive load has usually been measured as perceived task difficulty and invested mental effort (Paas, 1992). However, Van Gog and Paas (2008) have suggested that these two items are related and differ significantly only in extreme conditions. Paas and van Merriënboer (1993) have also argued that perceived mental effort can only be interpreted meaningfully together with learner performance. Therefore, we operationalized cognitive load as perceived task difficulty and instructional efficiency. Perceived task difficulty was directly measured on a nine-point Likert scale. Instructional efficiency was calculated using equation (1) proposed by Paas and van Merriënboer (1993), where invested mental effort was measured on a nine-point Likert scale and the measurement of transfer performance is described in the paragraph following the equation. $Z_{Transfer Performance}$ and $Z_{Invested Mental Effort}$ are the standardized Z scores for knowledge transfer performance and invested mental effort. This method has been employed in two studies and established to be reliable and valid (Schroeder, 2017; Yung & Paas, 2015).

$$Instructional Efficiency = \frac{Z_{Transfer Performance} - Z_{Invested Mental Effort}}{\sqrt{2}} \qquad (1)$$
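As a minimal sketch of how equation (1) can be applied, the Python snippet below standardizes transfer scores and mental effort ratings across all participants and then combines them. The data are hypothetical, the function names are illustrative, and the use of the sample standard deviation for the Z scores is an assumption, since the paper does not state which variant was used.

```python
from math import sqrt
from statistics import mean, stdev


def z_scores(values):
    """Standardize values across all participants (sample SD is an assumption)."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


def instructional_efficiency(transfer_scores, mental_effort):
    """Apply equation (1): (Z_performance - Z_effort) / sqrt(2) per participant."""
    if len(transfer_scores) != len(mental_effort):
        raise ValueError("Each participant needs one score and one effort rating.")
    z_perf = z_scores(transfer_scores)
    z_effort = z_scores(mental_effort)
    return [(p - e) / sqrt(2) for p, e in zip(z_perf, z_effort)]


# Hypothetical data: transfer scores (0-16) and effort ratings (1-9).
transfer = [12, 9, 10, 7, 11, 8]
effort = [4, 6, 5, 7, 3, 6]

efficiency = instructional_efficiency(transfer, effort)
for score, rating, e in zip(transfer, effort, efficiency):
    print(f"score={score:2d} effort={rating} efficiency={e:+.2f}")
```

Because both variables are standardized over the pooled sample, the efficiency values average to zero across participants. A positive value means performance was high relative to the effort invested, and a negative value means the opposite.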

In the knowledge transfer test, participants were asked to find errors in a Big 6 case report. The case report described procedures for retrieving and organizing information to build a digital album. In this case report, researchers have intentionally created 16 errors, all related to the Big 6 model. For each mistake participants found, they received one point. The possible total score for the knowledge transfer test ranged from zero to 16.

Participants

Participants in this study initially included 66 juniors registered in the Modern Educational Technology course from a major university in central China. Participants were from nine different majors, such as chemistry, mathematics, and art. All participants possessed the necessary skills for learning online because each program in this university required a certain amount of online or blended courses. After participants signed up for the experiment, they were each assigned a random number using the Excel “=RAND()” function. The researcher then sorted participants based on their numbers in ascending order and put the first 22 people into the APA group, the next 22 people into the NPA group, and the remaining 22 people into the NV group. However, four participants later failed to complete all required procedures due to personal reasons. The researcher discarded those incomplete data. The final sample consisted of 41 (66%) females and 21 (34%) males with a mean age of 21.26 years (SD = 0.571).
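The assignment procedure described above (draw a random number per participant, sort in ascending order, split into consecutive blocks of 22) can be sketched in Python as follows. This is an illustration only: the study used Excel, and the participant labels, the function name, and the fixed seed are hypothetical choices made for reproducibility.

```python
import random


def assign_groups(participant_ids, group_names=("APA", "NPA", "NV"),
                  group_size=22, seed=None):
    """Give each participant a random key, sort by it, and split into blocks."""
    if len(participant_ids) != group_size * len(group_names):
        raise ValueError("Participant count must equal group_size x number of groups.")
    rng = random.Random(seed)
    # One uniform random number per participant, analogous to a spreadsheet RAND column.
    keyed = [(rng.random(), pid) for pid in participant_ids]
    keyed.sort()  # ascending order of the random numbers
    ordered = [pid for _, pid in keyed]
    return {
        name: ordered[i * group_size:(i + 1) * group_size]
        for i, name in enumerate(group_names)
    }


# Hypothetical participant labels P01..P66.
participants = [f"P{i:02d}" for i in range(1, 67)]
groups = assign_groups(participants, seed=2019)

for name, members in groups.items():
    print(name, len(members), members[:3], "...")
```

Sorting by an independent random key is equivalent to shuffling the list, so every participant has the same chance of landing in each group.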

Procedures

The experiment took place in one of the university computer labs. The researcher introduced the purpose and procedures of the investigation to the participant. After the participant granted his/her oral consent, a tutorial was presented on a pre-loaded web page. Upon finishing the tutorial, each participant completed a transfer test. Participants received a nominal cash reward after the session.

Results

To answer the research questions, one-way ANOVAs were performed on learner emotions, cognitive load, and knowledge transfer score using SPSS 22. Table 1 presents the means, standard deviations, and F ratios of the dependent variables for the APA, NPA, and NV groups.

Table 1.

Means and standard deviations of dependent variables for three groups.

Dependent variables	APA group (n = 20)		NPA group (n = 20)		NV group (n = 22)		F
Dependent variables	M	SD	M	SD	M	SD	F
Emo-Enjoyment	6.60	1.14	5.75	1.21	5.77	1.27	3.25*
Emo-Pride	6.50	1.24	5.55	1.28	5.73	1.12	3.52*
Emo-Anxiety	4.40	1.70	5.45	1.61	4.82	2.00	1.77
Emo-Hopelessness	3.70	1.30	4.55	1.70	3.95	1.76	1.48
CL-Task difficulty	4.15	1.73	5.10	1.77	4.86	1.89	1.52
CL-Instructional efficiency	0.41	1.12	−0.49	0.88	0.07	0.81	4.67*
Knowledge transfer score	10.45	2.03	8.90	2.02	9.00	1.66	4.19*

Note. n indicated group size; Emo indicated emotion; CL indicated cognitive load.

*p < .05, two-tailed.

For learner emotions, the F-test showed significant differences among the three groups in enjoyment (F(2, 59) = 3.25, p = 0.046, $ω^{2} = 0.068$) and pride (F(2, 59) = 3.52, p = 0.036, $ω^{2} = 0.075$). Omega squared was used as the effect size because it is less biased than eta squared when group sample sizes are smaller than 30 (Lomax, 2007). There was no significant difference in anxiety (F(2, 59) = 1.77, p = 0.180) or hopelessness (F(2, 59) = 1.48, p = 0.236). Tukey’s post hoc tests were performed on enjoyment and pride, as shown in Table 2. Results revealed that the APA group had a significantly higher sense of pride than the NPA group (MD = 0.95, p = 0.041). However, the post hoc test did not find any significant pairwise difference in enjoyment.
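For readers who wish to check the omnibus statistics, a one-way ANOVA F ratio and omega squared can be recovered from the group sizes, means, and standard deviations in Table 1. The sketch below does this for enjoyment. It is not the authors' SPSS procedure: the class and function names are illustrative, and because the inputs are rounded to two decimals, the output should only approximate the reported values.

```python
from dataclasses import dataclass


@dataclass
class GroupSummary:
    n: int
    mean: float
    sd: float


def anova_from_summary(groups):
    """One-way ANOVA F ratio and omega squared from per-group n, M, and SD."""
    k = len(groups)
    n_total = sum(g.n for g in groups)
    grand_mean = sum(g.n * g.mean for g in groups) / n_total

    ss_between = sum(g.n * (g.mean - grand_mean) ** 2 for g in groups)
    ss_within = sum((g.n - 1) * g.sd ** 2 for g in groups)

    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within

    f_ratio = ms_between / ms_within
    # Omega squared: (SS_between - df_between * MS_within) / (SS_total + MS_within)
    omega_sq = (ss_between - df_between * ms_within) / (
        ss_between + ss_within + ms_within
    )
    return {
        "df_between": df_between,
        "df_within": df_within,
        "f": f_ratio,
        "omega_sq": omega_sq,
    }


# Enjoyment row of Table 1: APA, NPA, NV.
enjoyment = [
    GroupSummary(n=20, mean=6.60, sd=1.14),
    GroupSummary(n=20, mean=5.75, sd=1.21),
    GroupSummary(n=22, mean=5.77, sd=1.27),
]

result = anova_from_summary(enjoyment)
print(f"F({result['df_between']}, {result['df_within']}) = {result['f']:.2f}, "
      f"omega^2 = {result['omega_sq']:.3f}")
```

The same function can be applied to any other row of Table 1 by swapping in that row's means and standard deviations.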

Table 2.

Mean differences and significance values for post-hoc comparisons.

Dependent variables	APA – NPA		APA – NV		NPA – NV
Dependent variables	MD	p	MD	p	MD	p
Emo-Enjoyment	0.85	0.076	0.83	0.077	−0.02	0.998
Emo-Pride	0.95*	0.041	0.77	0.105	−0.18	0.884
CL-Instructional efficiency	0.90*	0.010	0.34	0.482	−0.56	0.138
Knowledge transfer score	1.55*	0.034	1.45*	0.044	−0.10	0.984

Note. MD indicated mean difference; Emo indicated emotion; CL indicated cognitive load.

*p < .05, two-tailed.

For cognitive load, the F-test showed significant differences among the three groups in instructional efficiency (F(2, 59) = 4.67, p = 0.013, $ω^{2} = 0.106$). There was no significant difference in perceived task difficulty (F(2, 59) = 1.52, p = 0.227). Tukey’s post hoc test was performed on instructional efficiency, as shown in Table 2. Results revealed that the APA group had a significantly higher instructional efficiency score than the NPA group (MD = 0.90, p = 0.010).

For knowledge transfer performance, the F-test showed a significant difference among the three groups (F(2, 59) = 4.19, p = 0.020, $ω^{2} = 0.093$). Tukey’s post hoc test (see Table 2) indicated that the APA group scored significantly higher than the NPA group (MD = 1.55, p = 0.034). The APA group also scored significantly higher than the NV group (MD = 1.45, p = 0.044). There was no significant difference between the NPA and NV groups (MD = −0.10, p = 0.984).
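The pairwise comparisons for knowledge transfer can be approximated from the summary statistics in Table 1 with the Tukey-Kramer form of the studentized range statistic, which allows unequal group sizes. The sketch below is a rough check, not the SPSS output: the critical value of about 3.40 (three groups, roughly 60 error degrees of freedom, alpha = .05) is taken from standard tables and treated here as an assumption, and the function names are illustrative.

```python
from itertools import combinations
from math import sqrt

# (n, mean, SD) for the knowledge transfer score, copied from Table 1.
TRANSFER = {
    "APA": (20, 10.45, 2.03),
    "NPA": (20, 8.90, 2.02),
    "NV": (22, 9.00, 1.66),
}

# Assumption: approximate tabled critical value of the studentized range
# for 3 groups, about 60 error degrees of freedom, and alpha = .05.
Q_CRITICAL = 3.40


def ms_within(groups):
    """Pooled within-group mean square from per-group n and SD."""
    ss = sum((n - 1) * sd ** 2 for n, _, sd in groups.values())
    df = sum(n for n, _, _ in groups.values()) - len(groups)
    return ss / df


def tukey_kramer(groups):
    """Return {(a, b): (mean difference, q statistic)} for every pair of groups."""
    mse = ms_within(groups)
    results = {}
    for a, b in combinations(groups, 2):
        n_a, mean_a, _ = groups[a]
        n_b, mean_b, _ = groups[b]
        md = mean_a - mean_b
        se = sqrt(mse / 2 * (1 / n_a + 1 / n_b))
        results[(a, b)] = (md, abs(md) / se)
    return results


results = tukey_kramer(TRANSFER)
for (a, b), (md, q) in results.items():
    verdict = "exceeds" if q > Q_CRITICAL else "does not exceed"
    print(f"{a} - {b}: MD = {md:.2f}, q = {q:.2f} ({verdict} {Q_CRITICAL})")
```

Under these assumptions, the two comparisons involving the APA group exceed the critical value while the NPA versus NV comparison does not, which agrees with the pattern of significance reported in Table 2.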

Discussion

The present study examined the effects of the pedagogical agent with dual-channel emotional cues on learner emotions, cognitive load, and knowledge transfer performance. Overall, the present study produced several meaningful findings demonstrating that the affective pedagogical agent can benefit learning. First, learners assigned the affective pedagogical agent reported a higher sense of pride than learners assigned the neutral pedagogical agent. Second, the perceived task difficulty did not differ across the three groups while the APA group reported significantly higher instructional efficiency than the NPA group. Third, learners assigned the affective pedagogical agent performed better on the knowledge transfer test than those given the neutral pedagogical agent or the neutral voice narration.

Statistical findings supported that learners assigned the affective pedagogical agent reported a higher sense of pride than those given the neutral agent. According to Pekrun et al. (2002), pride is an emotion related to a learner’s intrinsic motivation. Learners interacting with the affective pedagogical agent experienced a higher sense of pride, indicating more confidence in their ability to master content than learners interacting with the neutral pedagogical agent. This result is consistent with findings from Guo et al. (2015), where learners who interacted with an affective embodied agent reported a higher level of confidence than learners with a neutral embodied agent. Similarly, Liew et al. (2017) also concluded that an enthusiastic agent had induced higher intrinsic motivation than a neutral agent.

However, no significant difference was found in pride between APA and NV groups or between NPA and NV groups. Thus, it might indicate that the difference in pride is more related to how a pedagogical agent's emotional cues are designed than whether a pedagogical agent is present. Nevertheless, pride as an emotion needs to be further studied to understand its role in agent-based multimedia learning.

As for enjoyment, although the omnibus analysis indicated a significant variation, the post hoc test showed no pairwise difference among groups. Statistically, this might be because the Tukey follow-up test is more conservative than the F-test in order to control the overall alpha level (Howell, 2010). Educationally, there could be two reasons. First, as found by Krämer et al. (2016), the emotional cues of the pedagogical agent might not have been strong enough to induce a significant difference in learners’ perceptions. Nevertheless, several studies, including Krämer et al. (2016), have concluded that emotional cues could promote a behavioral difference even though they did not induce a perceptual difference among learners (Huang et al., 2011; Von der Pütten et al., 2010). In the present study, likewise, the APA group had better knowledge transfer performance than the other groups. That is to say, emotional cues may influence learning without being explicitly reported. Second, Muir et al. (2017) suggested that learner emotions fade over time. Thus, using a questionnaire might not have captured the change in enjoyment during learning.

Results indicated that learners did not differ in negative emotions. This finding contradicts Liew et al. (2016) and Romero-Hall et al. (2014), who found that students who interacted with an affective pedagogical agent experienced more negative emotions. Liew et al. (2016) noted that a smile has at least four different meanings. Thus, learners might interpret facial expressions differently when no other reference is available. Considering that these two studies used only the visual channel, results from the present study suggest that presenting emotional cues via two channels may help learners perceive the emotions more precisely.

The APA group did not differ from the other two groups in perceived task difficulty. This finding is consistent with Liew et al. (2017), in which the enthusiastic pedagogical agent did not impose significantly more cognitive load than the neutral pedagogical agent. As for instructional efficiency, instruction was significantly more efficient for the APA group than for the NPA group. From the learners’ perspective, this result supports the assumption regarding motivational and affective factors in CATLM: learners who received emotional cues became more engaged in information selection, processing, and integration, which contributed to better knowledge transfer performance (Mayer, 2005). Another possible explanation is related to cueing. Cueing in multimedia learning environments can focus learners’ attention on the more relevant information so that they allocate their limited cognitive resources more effectively (Yung & Paas, 2015). Bayliss et al. (2010) also found that the process of guiding someone’s attention was influenced by the guider’s facial expressions. Emotional cues from the pedagogical agent might have played a similar role in directing learners’ attention. However, additional studies are required to investigate how pedagogical agent emotional cues affect learners' attention in learning-related contexts.
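
For readers unfamiliar with the measure, instructional efficiency is commonly computed, following Paas and van Merriënboer (1993), as E = (z_performance − z_effort) / √2, where both scores are standardized across all learners. The Python sketch below illustrates that calculation only. It assumes this standard formulation applies, treats perceived task difficulty as the effort measure, standardizes with the sample standard deviation, and uses made-up scores and hypothetical function names rather than data or code from this study.

```python
from math import sqrt
from statistics import mean, stdev


def z_scores(values):
    """Standardize scores against the mean and sample SD of all learners."""
    m = mean(values)
    sd = stdev(values)  # assumption: sample SD; some authors use population SD
    if sd == 0:
        raise ValueError("z-scores are undefined when all values are equal")
    return [(v - m) / sd for v in values]


def instructional_efficiency(performance, effort):
    """Return E = (z_performance - z_effort) / sqrt(2) for each learner."""
    if len(performance) != len(effort):
        raise ValueError("performance and effort must have the same length")
    z_perf = z_scores(performance)
    z_eff = z_scores(effort)
    return [(p - e) / sqrt(2) for p, e in zip(z_perf, z_eff)]


# Hypothetical scores for four learners (not data from the study).
transfer = [8, 6, 4, 2]    # knowledge transfer test scores
difficulty = [3, 5, 5, 7]  # perceived task difficulty ratings

efficiency = instructional_efficiency(transfer, difficulty)
for score in efficiency:
    print(round(score, 2))
```

A positive value means a learner performed better than their reported difficulty would predict, and a negative value means the opposite. Group-level efficiency can then be compared by averaging these per-learner values within each condition.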

Overall, taking the results on perceived task difficulty and instructional efficiency together, the present study supports the findings of Park, Flowerday, et al. (2015) that presenting emotional cues through visual and audio channels activates the motivating and encouraging functions of emotional cues without increasing the cognitive load perceived by learners.

The APA group performed significantly better on the knowledge transfer test than the other two groups. This finding is consistent with Liew et al. (2017), in which learners who interacted with the enthusiastic pedagogical agent significantly outperformed those who interacted with a neutral pedagogical agent. More importantly, since the present study used machine-generated voices for the pedagogical agents, it also lends support, in a Mandarin context, to the conclusion of Craig and Schroeder (2017, 2019) and Davis et al. (2019) that machine-generated voices are comparable to human voices. Moreover, this study generally supports the view that emotional presence is an essential component of a worthwhile learning experience (Cleveland-Innes & Campbell, 2012; Majeski et al., 2018).

Limitations and Future Directions

Although the present study reported meaningful findings, it has two limitations. The first is ecological validity. A learner's emotion is subject to the influence of their surroundings, so it is uncertain whether the findings can be reproduced outside a controlled setting. The second is generalizability. This study was conducted in a relatively short learning session with only juniors from one university. More studies are needed to examine the affective pedagogical agent in a longer learning session or across multiple sessions to investigate how the effects of emotional cues may change. It would also be worthwhile to repeat the experiment with different types of learners, such as K-12 students.

Based on the findings from this study, future studies may proceed in two directions. The first is to look deeper into the two information processing channels and investigate how emotional cues are processed in each. The second concerns the mechanism of emotional cues in learning, that is, the paths through which emotional cues are connected to learning performance.

Conclusions

The present study investigated the effects of pedagogical agent dual-channel emotional cues on learning. Results suggest that dual-channel emotional design is effective in inducing learners’ positive emotions and enhancing learning performance. Also, allocating emotional cues into two channels can help avoid cognitive overload. This study provides evidence on how emotional cues should be employed and designed for agent-based multimedia learning.

Statements on Open Data, Ethics, and Conflict of Interest

In this research, all participants took part in the experiment voluntarily. Data were collected and used after participants granted their consent. Personal or personally identifiable information was not included in the data. There is no conflict of interest regarding this work.

Footnotes

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported by the following grants: National Natural Science Foundation of China, Grant Number: 61772012, National Natural Science Foundation of China, Grant Number: 61977035, and Ministry of Culture and Tourism of China, Grant Number: 20201194075.

Author Biographies

Shen Ba is a doctoral student in the School of Educational Information Technology at Central China Normal University (CCNU), Wuhan, China. His research interests include the design of pedagogical agents in online learning environments.

David Stein is an associate professor in the College of Education and Human Ecology at OSU. His research interests include pedagogical practices for teaching online.

Qingtang Liu is a professor in the School of Educational Information Technology at CCNU. His research interests include learning analytics and virtual reality technologies.

Taotao Long is a lecturer in the School of Educational Information Technology at CCNU. Her research interests include flipped classroom and scientific argumentation.

Kui Xie is a professor in the College of Education and Human Ecology at OSU. His research interests include motivation, engagement, and self-regulation in digital learning.

Linjing Wu is an associate professor in the School of Educational Information Technology at CCNU. Her research interests include educational data mining and learning analytics.

References

Atkinson, R. K., Mayer, R. E., & Merrill, M. M. (2005). Fostering social agency in multimedia learning: Examining the impact of an animated agent’s voice. Contemporary Educational Psychology, 30(1), 117–139. https://doi.org/10.1016/j.cedpsych.2004.07.001

Baidu AI Open Platform [Computer software]. (2019). https://ai.baidu.com

Bayliss, A. P., Schuch, & Tipper, S. P. (2010). Gaze cueing elicited by emotional faces is influenced by affective context. Visual Cognition, 18(8), 1214–1232. https://doi.org/10.1080/13506285.2010.484657

Bates (1994). The role of emotion in believable agents. Communications of the ACM, 37(7), 122–125. https://doi.org/10.1145/176789.176803

Becker, E. S., Goetz, Morger, & Ranellucci (2014). The importance of teachers’ emotions and instructional behavior for their students’ emotions–an experience sampling analysis. Teaching and Teacher Education, 43, 15–26. https://doi.org/10.1016/j.tate.2014.05.002

Brame, C. J. (2016). Effective educational videos: Principles and guidelines for maximizing student learning from video content. CBE—Life Sciences Education, 15(4), es6. https://doi.org/10.1187/cbe.16-03-0125

Cleveland-Innes & Campbell (2012). Emotional presence, learning, and the online learning environment. The International Review of Research in Open and Distributed Learning, 13(4), 269–292. https://doi.org/10.19173/irrodl.v13i4.1234

Craig, S. D., & Schroeder, N. L. (2017). Reconsidering the voice effect when learning from a virtual human. Computers & Education, 114, 193–205. https://doi.org/10.1016/j.compedu.2017.07.003

Craig, S. D., & Schroeder, N. L. (2019). Text-to-Speech software and learning: Investigating the relevancy of the voice effect. Journal of Educational Computing Research, 57(6), 1534–1548. https://doi.org/10.1177/0735633118802877

Cuevas & Dawson, B. L. (2018). A test of two alternative cognitive processing models: Learning styles and dual coding. Theory and Research in Education, 16(1), 40–64. https://doi.org/10.1177/1477878517731450

Davis, R. O. (2018). The impact of pedagogical agent gesturing in multimedia learning environments: A meta-analysis. Educational Research Review, 24, 193–209. https://doi.org/10.1016/j.edurev.2018.05.002

Davis, R. O., Vincent, & Park, T. J. (2019). Reconsidering the voice principle with non-native language speakers. Computers & Education, 140, 103605. https://doi.org/10.1016/j.compedu.2019.103605

Frenzel, A. C., Becker-Kurz, Pekrun, Goetz, & Lüdtke (2018). Emotion transmission in the classroom revisited: A reciprocal effects model of teacher and student enjoyment. Journal of Educational Psychology, 110(5), 628–639. https://doi.org/10.1037/edu0000228

Gerdes, A. B. M., Wieser, M. J., & Alpers, G. W. (2014). Emotional pictures and sounds: A review of multimodal interactions of emotional cues in multiple domains. Frontiers in Psychology, 5, 1351. https://doi.org/10.3389/fpsyg.2014.01351

Charness, Gneezy, & Kuhn, M. A. (2012). Experimental methods: Between-subject and within-subject design. Journal of Economic Behavior & Organization, 81(1), 1–8. https://doi.org/10.1016/j.jebo.2011.08.009

Fraser, Teteris, Baxter, Wright, & McLaughlin (2012). Emotion, cognitive load and learning outcomes during simulation training. Medical Education, 46(11), 1055–1062. https://doi.org/10.1111/j.1365-2923.2012.04355.x

Guo, Y. R., Goh, D. H. L., Luyt, Sin, S. C. J., & Ang, R. P. (2015). The effectiveness and acceptance of an affective information literacy tutorial. Computers & Education, 87, 368–384. https://doi.org/10.1016/j.compedu.2015.07.015

Guo, Y. R., & Goh, D. H. L. (2016). Evaluation of affective embodied agents in an information literacy game. Computers & Education, 103, 59–75. https://doi.org/10.1016/j.compedu.2016.09.013

Hale, A. J., Freed, Ricotta, Farris, & Smith, C. C. (2017). Twelve tips for effective body language for medical educators. Medical Teacher, 39(9), 914–919. https://doi.org/10.1080/0142159X.2017.1324140

Han (2013). Do nonverbal emotional cues matter? Effects of video casting in synchronous virtual classrooms. American Journal of Distance Education, 27(4), 253–264. https://doi.org/10.1080/08923647.2013.837718

Hatfield, Cacioppo, J. T., & Rapson, R. L. (1993). Emotional contagion. Current Directions in Psychological Science, 2(3), 96–100. https://doi.org/10.1111/1467-8721.ep10770953

Heidig & Clarebout (2011). Do pedagogical agents make a difference to student motivation and learning? Educational Research Review, 6(1), 27–54. https://doi.org/10.1016/j.edurev.2010.07.004

Howell, D. C. (2010). Statistical methods for psychology (8th ed.). Cengage Wadsworth.

Huang, Morency, L. P., & Gratch (2011). Virtual rapport 2.0. In J. Allbeck, N. Badler, T. Bickmore, C. Pelachaud, & A. Safonova (Eds.), Lecture notes in computer science (Vol. 6895). Intelligent virtual agents (pp. 68–79). Springer. https://doi.org/10.1007/978-3-642-23974-8_8

Kim, Thayne, & Wei (2017). An embodied agent helps anxious students in mathematics learning. Educational Technology Research and Development, 65(1), 219–235. https://doi.org/10.1007/s11423-016-9476-z

Krämer, N. C., Karacora, Lucas, Dehghani, Rüther, & Gratch (2016). Closing the gender gap in STEM with friendly male instructors? On the effects of rapport behavior and gender of a virtual agent in an instructional interaction. Computers & Education, 99, 1–13. https://doi.org/10.1016/j.compedu.2016.04.002

Kirschner, P. A., Ayres, & Chandler (2011). Contemporary cognitive load theory research: The good, the bad and the ugly. Computers in Human Behavior, 27(1), 99–105. https://doi.org/10.1016/j.chb.2010.06.025

Liew, T. W., Zin, N. A. M., Sahari, & Tan, S. M. (2016). The effects of a pedagogical agent’s smiling expression on the learner’s emotions and motivation in a virtual learning environment. The International Review of Research in Open and Distributed Learning, 17(5). https://doi.org/10.19173/irrodl.v17i5.2350

Liew, T. W., Zin, N. A. M., & Sahari (2017). Exploring the affective, motivational and cognitive effects of pedagogical agent enthusiasm in a multimedia learning environment. Human-Centric Computing and Information Sciences, 7(1), 9. https://doi.org/10.1186/s13673-017-0089-2

Liew, T. W., Tan, S. M., Tan, T. M., & Kew, S. N. (2020). Does speaker’s voice enthusiasm affect social cue, cognitive load and transfer in multimedia learning? Information and Learning Sciences, 121(3/4), 117–135. https://doi.org/10.1108/ILS-11-2019-0124

Lomax, R. G. (2007). Statistical concepts: A second course (3rd ed.). Lawrence Erlbaum Associates Publishers.

Majeski, R. A., Stover, & Valais (2018). The community of inquiry and emotional presence. Adult Learning, 29(2), 53–61. https://doi.org/10.1177/1045159518758696

Mayer, R. (Ed.). (2005). The Cambridge handbook of multimedia learning. Cambridge University Press.

Mayer, R. E., & Moreno (1999). A cognitive theory of multimedia learning: Implications for design principles. Journal of Educational Psychology, 91(2), 358–368. https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.105.5077&rep=rep1&type=pdf

Media Semantics Character Builder [Computer software]. (2017). https://www.mediasemantics.com/index.html

Mehrabian (1971). Silent messages. Wadsworth.

Meyer, D. K., & Turner, J. C. (2002). Discovering emotion in classroom motivation research. Educational Psychologist, 37(2), 107–114. https://doi.org/10.1207/S15326985EP3702_5

Miller, E. K., & Buschman, T. J. (2015). Working memory capacity: Limits on the bandwidth of cognition. Daedalus, 144(1), 112–122. https://doi.org/10.1162/DAED_a_00320

Moreno & Mayer (2007). Interactive multimodal learning environments. Educational Psychology Review, 19(3), 309–326. https://doi.org/10.1007/s10648-007-9047-2

Muir, Madill, & Brown (2017). Individual differences in emotional processing and autobiographical memory: Interoceptive awareness and alexithymia in the fading affect bias. Cognition & Emotion, 31(7), 1392–1404. https://doi.org/10.1080/02699931.2016.1225005

Paas, F. G. (1992). Training strategies for attaining transfer of problem-solving skill in statistics: A cognitive-load approach. Journal of Educational Psychology, 84(4), 429–434. https://doi.org/10.1037/0022-0663.84.4.429

Paas & van Merriënboer, J. J. (1993). The efficiency of instructional conditions: An approach to combine mental effort and performance measures. Human Factors: The Journal of the Human Factors and Ergonomics Society, 35(4), 737–743. https://doi.org/10.1177/001872089303500412

Park, Flowerday, & Brünken (2015). Cognitive and affective effects of seductive details in multimedia learning. Computers in Human Behavior, 44, 267–278. https://doi.org/10.1016/j.chb.2014.10.061

Park, Knörzer, Plass, J. L., & Brünken (2015). Emotional design and positive emotions in multimedia learning: An eye tracking study on the use of anthropomorphisms. Computers & Education, 86, 30–42. https://doi.org/10.1016/j.compedu.2015.02.016

Pekrun (2006). The control-value theory of achievement emotions: Assumptions, corollaries, and implications for educational research and practice. Educational Psychology Review, 18(4), 315–341. https://doi.org/10.1007/s10648-006-9029-9

Pekrun, Goetz, Frenzel, A. C., Barchfeld, & Perry, R. P. (2011). Measuring emotions in students’ learning and performance: The achievement emotions questionnaire (AEQ). Contemporary Educational Psychology, 36(1), 36–48. https://doi.org/10.1016/j.cedpsych.2010.10.002

Pekrun, Goetz, Titz, & Perry, R. P. (2002). Academic emotions in students' self-regulated learning and achievement: A program of qualitative and quantitative research. Educational Psychologist, 37(2), 91–105. https://doi.org/10.1207/S15326985EP3702_4

Plass, J. L., Heidig, Hayward, E. O., & Homer, B. D. (2014). Emotional design in multimedia learning: Effects of shape and color on affect and learning. Learning and Instruction, 29, 128–140. https://doi.org/10.1016/j.learninstruc.2013.02.006

Rey, G. D. (2012). A review of research and a meta-analysis of the seductive detail effect. Educational Research Review, 7(3), 216–237. https://doi.org/10.1016/j.edurev.2012.05.003

Romero-Hall, Watson, & Papelis (2014). Using physiological measures to assess the effects of animated pedagogical agents on multimedia instruction. Journal of Educational Multimedia and Hypermedia, 23(4), 359–384. https://www.learntechlib.org/primary/p/114603/

Sauter, D. A., Panattoni, & Happé (2013). Children’s recognition of emotions from vocal cues. British Journal of Developmental Psychology, 31(1), 97–113. https://doi.org/10.1111/j.2044-835X.2012.02081.x

Schroeder, N. L. (2017). The influence of a pedagogical agent on learners’ cognitive load. Journal of Educational Technology & Society, 20(4), 138–147. https://www.jstor.org/stable/26229212

Sweller (2011). Cognitive load theory. In J. P. Mestre & B. H. Ross (Eds.), Psychology of learning and motivation (Vol. 55, pp. 37–76). Academic Press. https://doi.org/10.1016/B978-0-12-387691-1.00002-8

Um, Plass, J. L., Hayward, E. O., & Homer, B. D. (2012). Emotional design in multimedia learning. Journal of Educational Psychology, 104(2), 485–498. https://doi.org/10.1037/a0026609

Van Gog & Paas (2008). Instructional efficiency: Revisiting the original construct in educational research. Educational Psychologist, 43(1), 16–26. https://doi.org/10.1080/00461520701756248

von der Pütten, A. M., Krämer, N. C., Gratch, & Kang, S.-H. (2010). “It doesn’t matter what you are!” Explaining social effects of agents and avatars. Computers in Human Behavior, 26(6), 1641–1650. https://doi.org/10.1016/j.chb.2010.06.012

Yung, H. I., & Paas, F. (2015). Effects of cueing by a pedagogical agent in an instructional animation: A cognitive load approach. Journal of Educational Technology & Society, 18(3), 153. https://www.jstor.org/stable/jeductechsoci.18.3.153