Abstract
Despite the continuous emphasis on emotion in multimedia learning, it was still unclear how pedagogical agent emotional cues might affect learning. In the present study, a between-subjects experiment was performed to examine the effects of a pedagogical agent with dual-channel emotional cues on learners' emotions, cognitive load, and knowledge transfer performance. Participants from a central Chinese university (age mean = 21.26, N = 66) were randomly divided into three groups. These groups received instructions from an affective pedagogical agent, a neutral pedagogical agent, or a neutral voice narration without pedagogical agent embodiment. Results showed that learners assigned the affective pedagogical agent reported a significantly higher emotional level than learners assigned the neutral pedagogical agent. Learners’ perceived task difficulty was not significantly different among groups while instructional efficiency was significantly higher for learners with the affective pedagogical agent. Moreover, learners assigned to the affective pedagogical agent performed significantly better on the knowledge transfer test than those assigned the neutral pedagogical agent or the neutral voice.
The term pedagogical agent refers to a computer agent that performs anthropopathic behaviors in the digital learning environment (Heidig & Clarebout, 2011). Over 25 years ago, Bates (1994) proposed that computer agents expressing emotional cues are more engaging and believable than those without any emotional cues. According to Han (2013) and Sauter et al. (2013), emotional cues include verbal and non-verbal signals such as facial expression, body movement, text, and voice that expresses liking or disliking. In multimedia learning, Moreno and Mayer (2007) suggested that emotional cues may influence the selection, organization, and integration of information, and lead to different learning outcomes. Specifically, emotional cues from a pedagogical agent may be contagious and trigger changes in learner’s emotions (Hatfield et al., 1993; Liew et al., 2017). Then, learners with enhanced positive emotions may be more actively engaged in learning and achieve better knowledge transfer performance (Um et al., 2012; Plass et al., 2014). However, emotional cues as extra information may also lead to cognitive overload and hinder learning (Fraser et al., 2012). Therefore, caution should be given when designing emotional cues.
Regarding empirical evidence, many previous studies on pedagogical agents designed emotional cues solely in the visual channel (Guo et al., 2015; Guo & Goh, 2016; Kim et al., 2017; Krämer et al., 2016; Liew et al., 2016; Romero-Hall et al., 2014). Their results were inconsistent regarding the effects of pedagogical agent emotional cues on learner emotions and performance. On the other hand, Liew et al. (2017) examined a pedagogical agent with a dual-channel emotional design. They found the pedagogical agent with dual-channel emotional cues to be more effective than the one without emotional cues in promoting learner emotions, motivation, and test performance. Compared with the single-channel design, this advantage might arise because the emotional experience of humans is jointly created from visual and audio messages (Gerdes et al., 2014). The interaction between visual and audio channels could enhance a human's perception of emotions. Also, allocating emotional cues via two channels could help avoid cognitive overload in any single channel (Miller & Buschman, 2015). Nevertheless, more evidence is necessary to strengthen our understanding of the pedagogical agent with dual-channel emotional cues.
The present study aimed to further explore the effects of a pedagogical agent with dual-channel emotional cues on learner emotions, cognitive load, and knowledge transfer performance. Several adjustments distinguish this study from previous ones. First, visual and auditory emotional cues were fully generated by machines. Second, participants were recruited entirely from a Chinese university. Third, a control group without a pedagogical agent was added to test if the agent-based design is indeed more effective in our context (Heidig & Clarebout, 2011). The research questions that guided this study are as follows:
RQ1: To what extent will learners who are assigned the pedagogical agent with dual-channel emotional cues, the pedagogical agent without emotional cues, or the voice narration without emotional cues differ in their perceived emotions?
RQ2: To what extent will learners who are assigned the pedagogical agent with dual-channel emotional cues, the pedagogical agent without emotional cues, or the voice narration without emotional cues differ in their perceived cognitive load?
RQ3: To what extent will learners who are assigned the pedagogical agent with dual-channel emotional cues, the pedagogical agent without emotional cues, or the voice narration without emotional cues differ in their knowledge transfer performance?
Seven sections structure this paper. The first section discusses the theoretical framework. Section two presents the relevant studies. Section three describes the research design and research questions, and the fourth section describes the pedagogical agent design, subject matter, instruments, participants, and procedures. The last three sections present results, discussion, and conclusions.
Theoretical Framework
Several theories concerning the multimedia learning environment have different perspectives towards the design, processing, and effects of pedagogical agent emotional cues (Mayer, 2005; Mayer & Moreno, 1999; Sweller, 2011). Among these theories, the cognitive theory of multimedia learning and its extended version, the cognitive-affective theory of learning with media, have provided the most comprehensive consideration regarding learning with interactive multimedia elements. This section discusses the assumptions and principles underlying these two theories to generate implications for pedagogical agent emotional cues.
The cognitive theory of multimedia learning (CTML) described how learners select, process, and integrate information in the multimedia learning environment (Mayer, 2005). CTML holds three fundamental assumptions. The first assumption, dual-channel, claimed that learners process information through visual and audio channels (Cuevas & Dawson, 2018). The second assumption, limited capacity, illustrated that each channel has limited capacity and can only handle a small amount of information at a time (Miller & Buschman, 2015; Sweller, 2011). The third assumption, active processing, described how learners construct meanings by actively selecting, organizing, and integrating relevant information (Brame, 2016).
To accommodate advanced learning media that supports interactive elements such as the pedagogical agent, Moreno and Mayer (2007) expanded the initial CTML framework with additional assumptions forming the cognitive-affective theory of learning with media (CATLM). One of the key assumptions indicated that motivational and affective factors may influence learner’s cognitive process. To be more specific, a learner in the multimedia learning environment may be emotionally affected by affective elements according to a conception called emotional contagion (Hatfield et al., 1993). This conception was supported by studies that examined both positive and negative emotions (Becker, 2014; Frenzel et al., 2018; Meyer & Turner, 2002). Then, learners with enhanced positive emotions may be more actively engaged in the selection, organization, and integration of information, which are the central processes for learning in the multimedia environment. Um et al. (2012) and Plass et al. (2014) provided evidence that emotional design such as warm colors and face-like shapes can promote learner positive emotions and lead to better performance on comprehension and transfer tests.
An additional assumption that might follow is that a pedagogical agent's emotional cues can also enhance learner engagement and improve learning performance. However, recent empirical findings revealed that pedagogical agent emotional cues in some cases do not improve learning performance (Guo et al., 2015; Guo & Goh, 2016; Kim et al., 2017; Liew et al., 2016; Romero-Hall et al., 2014). Some researchers suggested that a major reason for these results could be that emotional cues were presented in only one channel, so that learners did not perceive those emotional cues or perceived them incorrectly (Guo et al., 2015; Liew et al., 2016). Therefore, it might be necessary to consider the dual-channel assumption when designing pedagogical agent emotional cues.
Mehrabian (1971) determined that emotions transmitted in communication are carried by word (7%), voice (38%), and facial expression (55%). Although there were doubts about each element's weights, these emotional cues' importance has been well-recognized (Hale et al., 2017). Mehrabian's equation suggested that both visual and audio channels may be significant in transmitting emotions, supporting the idea that information is processed in two channels.
Another perspective regarding pedagogical agent emotional cues is that emotional cues are irrelevant to the learning content. Emotional cues might therefore take up limited cognitive capacity and hinder learning (Rey, 2012). Nevertheless, based on the dual-channel assumption, researchers have suggested that allocating emotional cues to visual and audio channels could potentially avoid cognitive overload in a single channel (Park, Flowerday, et al., 2015; Park, Knörzer, et al., 2015).
Summary
Pedagogical agent emotional cues might have an important role in a multimedia learning environment. However, to achieve the optimum effects, emotional cues need to be designed cautiously, considering assumptions underlying the cognitive-affective theory of learning with media. On the one hand, the dual-channel assumption indicated that pedagogical agent emotional cues should involve visual and audio channels to enhance learners' perception. On the other hand, the limited capacity assumption implied allocating emotional cues to visual and audio channels could help avoid cognitive overload in either channel. Therefore, different assumptions and perspectives may indicate an integrated conception that a pedagogical agent with dual-channel emotional cues is more effective in supporting learning.
Review of Relevant Literature
To further understand the effects of pedagogical agent emotional cues, this section examines related empirical studies. The review was organized into two sub-sections depending on the emotional design of pedagogical agents. The first sub-section looked into studies that designed pedagogical agent emotional cues in a single channel. The second sub-section examined a study that designed a pedagogical agent with dual-channel emotional cues. Both sub-sections discussed how these pedagogical agents affect learner emotions, cognitive load, and learning performance.
Pedagogical Agents with Single-Channel Emotional Cues
Pedagogical agents can deliver emotional cues via visual and audio channels through mediums such as facial expression, on-screen text, and voice. This section reviews studies that considered only one channel. Kim et al. (2017) examined the effects of on-screen encouraging remarks in an agent-based tutorial on middle school students' mathematics learning and anxiety. Two groups received learning materials in which a static female agent image was inserted. The agent provided instructions through the on-screen textbox. One of the two groups also received encouraging texts from the agent. However, results showed no significant difference between the two groups in either mathematics learning or anxiety.
Instead of using on-screen text, three studies presented evidence on delivering emotional cues via facial expressions (Romero-Hall et al., 2014; Liew et al., 2016; Krämer et al., 2016). First, Romero-Hall and colleagues studied three groups of college students learning African history. The three groups were assigned an emotionally expressive pedagogical agent, a non-expressive pedagogical agent, or no agent. The pedagogical agent was a male figure that spoke with an informal human voice. The expressive pedagogical agent also displayed pre-set facial expressions while speaking. Analysis of learning achievement and emotions indicated that the group receiving emotional cues had significantly lower achievement as well as more negative emotions than the other two groups. Liew et al. (2016) found similar results when comparing the effects of a male pedagogical agent with a smiling facial expression against one with a neutral facial expression. Both agents communicated through the same machine-generated voice. University students participated in this study and received a tutorial on the C language from one of the pedagogical agents. Not only did the post-test on knowledge comprehension reveal no significant difference, but learners assigned the smiling agent also reported significantly lower positive emotions. Krämer et al. (2016) investigated a pedagogical agent with a smiling expression and nodding behavior and its effects on adult learners solving math problems. The pedagogical agents in this study did not speak or present text. The authors reported that emotional cues from the pedagogical agent significantly enhanced learner performance compared to the pedagogical agent without emotional cues. However, perceived positive emotion did not differ between the two groups.
Guo et al. (2015) and Guo and Goh (2016) combined facial expressions and on-screen messages to deliver emotional cues in the visual channel. Pedagogical agents in both studies were cartoon-like characters that communicated via on-screen text. The main difference was that Guo et al. (2015) designed a classroom scenario while Guo and Goh (2016) used a game-based learning approach. University students in the two studies were assigned to one of three conditions to learn about information literacy: an affective pedagogical agent, a neutral pedagogical agent, or no agent. Both studies indicated that the three groups did not differ in knowledge retention, although learners in the affective agent group reported higher motivation and enjoyment. Guo et al. (2015) noted that one of their study's limitations was the lack of audio cues, which could improve the learning outcome.
Liew et al. (2020) recently examined the effects of enthusiastic and calm voice narrations on undergraduates' transfer performance and cognitive load in learning computer algorithms. They reported that the enthusiastic voice narration led to significantly higher transfer performance than the calm voice narration. Moreover, the calm voice in their study led to a higher germane cognitive load for non-native speakers.
The previous studies suggest that the evidence on whether emotional cues improve learning is inconclusive. Although Liew et al. (2020) provided some initial evidence of the positive effects of enthusiastic audio cues on transfer performance, follow-up studies in different contexts and backgrounds are necessary. As for learner emotions, while learners in Guo et al. (2015) and Guo and Goh (2016) reported higher positive emotions after interacting with affective pedagogical agents, Liew et al. (2016) and Romero-Hall et al. (2014) found that learners assigned affective pedagogical agents experienced more negative emotions. Liew et al. (2016) suggested that an emotional cue could carry multiple meanings; thus, providing emotional cues through more than one channel could potentially help learners perceive emotional cues more precisely. Furthermore, evidence suggested that emotional cues in voice narration may influence only participants whose native language differs from the one spoken by the pedagogical agent (Liew et al., 2020).
Pedagogical Agents with Dual-Channel Emotional Cues
Liew et al. (2017) designed a pedagogical agent delivering emotional cues in both visual and audio channels with facial expressions, hand gestures, voice, and on-screen texts. Liew paired university freshmen with either the affective pedagogical agent or a neutral pedagogical agent to learn a basic programming language. Findings indicated that the affective pedagogical agent significantly enhanced learners’ positive emotions, motivation, and performance. Also, no significant difference in cognitive load was found between the two groups.
Although Liew et al. (2017) provided evidence that dual-channel emotional cues may be more effective, several questions remained. First, Davis (2018) suggested that hand gestures alone may carry significant social and cognitive functions. Thus, there has been a need to consider the effects of emotional cues, including facial expressions, voice, and words, separately from hand gestures. Second, Liew et al. (2017) used recorded human voice for the pedagogical agent based on the assumption that the human voice was better than the machine-generated voice, defined as the voice effect (Atkinson et al., 2005). However, with newer voice-generating technology, recent studies have reported that modern machine-generated voice was comparable to the human voice in terms of their effects on learning outcomes, cognitive load, and learners’ perceptions (Craig & Schroeder, 2017, 2019; Davis et al., 2019). Last, Liew et al. (2017) did not include a control group without the pedagogical agent. Thus, it is uncertain if having a pedagogical agent with dual-channel emotional cues produced better learning outcomes than tutorials without a pedagogical agent (Heidig & Clarebout, 2011).
Research Design
After reviewing the literature, we made three adjustments to the research design. First, we operationalized emotional cues as facial expressions, voice, and auditory remarks. Second, machine-generated voices were employed instead of recorded human voices. Last, we incorporated a control group.
The present study implemented a between-subjects experimental design with two treatment groups and one control group (Charness et al., 2012). The independent variable was the type of pedagogical agent. Dependent variables were learner emotions, cognitive load, and knowledge transfer performance. The researchers assigned the first group to a pedagogical agent with an embodiment, voice, and emotional cues (i.e., Affective Pedagogical Agent, APA group). The second group was assigned a pedagogical agent with an image and voice but without emotional cues (i.e., Neutral Pedagogical Agent, NPA group). The control group was assigned voice without embodiment or emotional cues (i.e., Neutral Voice, NV group).
Method
Pedagogical Agent and Emotional Cues
The pedagogical agent was developed with the Media Semantics Character Builder (Version 5.4.8, 2017). It was portrayed as a 35-year-old male teacher dressed in a suit. The pedagogical agent was located at the lower right corner of the screen, showing only the body above the shoulders to avoid potential distractions. In both the APA and NPA groups, the pedagogical agent had natural eye blinks and head movements to mimic humans. Also, the pedagogical agent had synchronous lip movements while talking. The emotional cues of the pedagogical agent were displayed through facial expressions, voice, and auditory remarks.
For facial expressions, the affective pedagogical agent in the APA group employed a smile, a wink, and a suggestive facial expression, as shown in Figure 1. The neutral pedagogical agent in the NPA group only had a neutral facial expression.

Figure 1. Screenshot of the tutorial and pedagogical agent facial expressions.
The present study used a speech synthesizing engine from the Baidu AI Open Platform (2019) for voice. Baidu has one of the top speech synthesizing engines for Mandarin on the market, and it provides various types of voices. The APA group adopted an affective male voice, while the NPA and NV groups adopted a standard male voice.
For auditory remarks, participants received responses after answering two multiple-choice questions inserted in the tutorial. Participants in the APA group received confirming and encouraging responses such as “Well done! This is a very useful way to find information.” On the other hand, learners in the NPA and NV groups received neutral responses such as “Ok” and “I see.”
Subject Matter
Participants in this study received a tutorial about a six-stage problem-solving model known as the Big 6. The six stages were task definition, information seeking strategies, location and access, use of information, synthesis, and evaluation. The tutorial was structured as shown in Figure 2. All contents in the tutorial were presented in Mandarin.

Figure 2. The organization of the tutorial.
Instruments
Instruments used in this study included a questionnaire and a knowledge transfer test. The questionnaire consisted of demographic, emotion, and cognitive load items. The demographic items asked participants about their student ID, age, and gender for descriptive purposes.
The emotion items were drawn from the control-value theory of achievement emotions, which identified the emotions most related to learning activities and performance (Pekrun, 2006). Enjoyment, pride, anxiety, and hopelessness were selected from the achievement emotions for several reasons. First, according to emotional contagion, the smiling expression might induce learner enjoyment. Two empirical studies also reported that the affective agent increased learners' enjoyment (Guo et al., 2015; Guo & Goh, 2016). Second, pride has been linked to intrinsic motivation for all students (Pekrun et al., 2002), and the emotional cues in the present study included encouraging remarks, which might influence learner pride. Third, anxiety was included because it was the most commonly studied emotion in learning (Pekrun et al., 2011). However, since anxiety might either activate or hinder learning depending on its level, a deactivating negative emotion, hopelessness, was measured as a supplement to anxiety (Pekrun et al., 2002). Participants rated each emotion on a nine-point Likert scale (e.g., Please rate your current level of enjoyment.), with one being the weakest and nine being the strongest.
Researchers have posited that it might not be possible to measure the intrinsic, extraneous, and germane cognitive load individually (Kirschner et al., 2011; Schroeder, 2017). Thus, cognitive load has usually been measured as perceived task difficulty and invested mental effort (Paas, 1992). However, Van Gog and Paas (2008) have suggested that these two items are related and differ significantly only in extreme conditions. Paas and van Merriënboer (1993) have also argued that perceived mental effort can only be interpreted meaningfully together with learner performance. Therefore, we operationalized cognitive load as perceived task difficulty and instructional efficiency. Perceived task difficulty was directly measured on a nine-point Likert scale. Instructional efficiency was calculated using equation (1) proposed by Paas and van Merriënboer (1993), where invested mental effort was measured on a nine-point Likert scale and the measurement of transfer performance is described in the following paragraph.
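As an illustration of the efficiency computation, the measure of Paas and van Merriënboer (1993) is commonly written as E = (z_performance - z_effort) / sqrt(2), where both variables are standardized across all participants. The following minimal sketch computes it in Python. The data values and function names are hypothetical and are not taken from the study, and the use of the sample standard deviation for standardization is an assumption.

```python
import math
import statistics


def z_scores(values):
    """Standardize values against their own mean and standard deviation."""
    mean = statistics.mean(values)
    # Assumption: sample standard deviation; the paper does not specify this.
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def instructional_efficiency(performance, effort):
    """Return E = (z_performance - z_effort) / sqrt(2) for each participant."""
    z_perf = z_scores(performance)
    z_eff = z_scores(effort)
    return [(p - e) / math.sqrt(2) for p, e in zip(z_perf, z_eff)]


# Hypothetical data for illustration only (not study data):
# transfer scores on the 0-16 test and mental effort on the 1-9 scale.
transfer_scores = [12, 9, 7, 10, 5, 8]
mental_effort = [4, 6, 7, 5, 8, 6]

efficiency = instructional_efficiency(transfer_scores, mental_effort)
for score, effort, e in zip(transfer_scores, mental_effort, efficiency):
    print(f"score={score:2d} effort={effort} efficiency={e:+.2f}")
```

Under this formulation, a positive value indicates relatively high performance for relatively low invested effort, and a negative value indicates the opposite.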
In the knowledge transfer test, participants were asked to find errors in a Big 6 case report. The case report described procedures for retrieving and organizing information to build a digital album. The researchers intentionally inserted 16 errors into this case report, all related to the Big 6 model. Participants received one point for each error they found. The possible total score for the knowledge transfer test ranged from zero to 16.
Participants
Participants in this study initially included 66 juniors registered in the Modern Educational Technology course at a major university in central China. Participants were from nine different majors, such as chemistry, mathematics, and art. All participants possessed the necessary skills for learning online because each program in this university required a certain amount of online or blended courses. After participants signed up for the experiment, they were each assigned a random number using the Excel RAND function. The researcher then sorted participants by their numbers in ascending order and put the first 22 people into the APA group, the next 22 people into the NPA group, and the remaining 22 people into the NV group. However, four participants later failed to complete all required procedures due to personal reasons. The researcher discarded those incomplete data. The final sample consisted of 41 (66%) females and 21 (34%) males with a mean age (standard deviation) of 21.26 (0.571).
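The assignment procedure described above (draw a random number per participant, sort in ascending order, and split into consecutive blocks of 22) can be sketched as follows. This is an illustrative Python analogue of the spreadsheet procedure, not the authors' actual tool. The participant labels, the function name, and the seed are hypothetical.

```python
import random


def assign_groups(participant_ids, group_names, seed=None):
    """Assign participants to equal-sized groups by sorting on random keys."""
    rng = random.Random(seed)
    # Give every participant a random number, as with a spreadsheet RAND column.
    keyed = [(rng.random(), pid) for pid in participant_ids]
    # Sort participants by their random number in ascending order.
    keyed.sort()
    group_size = len(participant_ids) // len(group_names)
    assignment = {}
    for index, (_, pid) in enumerate(keyed):
        # Consecutive blocks of group_size go to consecutive groups.
        group_index = min(index // group_size, len(group_names) - 1)
        assignment[pid] = group_names[group_index]
    return assignment


# Hypothetical participant labels P01..P66 (illustration only).
participants = [f"P{n:02d}" for n in range(1, 67)]
groups = assign_groups(participants, ["APA", "NPA", "NV"], seed=2020)

counts = {name: list(groups.values()).count(name) for name in ["APA", "NPA", "NV"]}
print(counts)
```

With 66 participants and three groups, each group receives 22 participants regardless of the seed; only the membership of each group changes.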
Procedures
The experiment took place in one of the university computer labs. The researcher introduced the purpose and procedures of the investigation to the participant. After the participant granted his/her oral consent, a tutorial was presented on a pre-loaded web page. Upon finishing the tutorial, each participant completed a transfer test. Participants received a nominal cash reward after the session.
Results
To answer the research questions, one-way ANOVAs were performed on learner emotions, cognitive load, and knowledge transfer scores using SPSS 22. Table 1 presents the means, standard deviations, and F ratios of the dependent variables for the APA, NPA, and NV groups.
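For readers unfamiliar with the omnibus test, the following minimal Python sketch shows how a one-way ANOVA F ratio and its degrees of freedom are obtained. The study itself used SPSS 22; the function name and the toy scores below are hypothetical and serve only to illustrate the computation. With three groups and the final sample of 62 participants, the degrees of freedom are 3 - 1 = 2 and 62 - 3 = 59.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of lists of scores."""
    all_scores = [x for g in groups for x in g]
    n_total = len(all_scores)
    grand_mean = sum(all_scores) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-groups sum of squares: group size times squared deviation
    # of each group mean from the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-groups sum of squares: squared deviation of each score
    # from its own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Toy scores for three hypothetical groups (not study data).
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_ratio, df_between, df_within = one_way_anova(toy_groups)
print(f"F({df_between}, {df_within}) = {f_ratio:.2f}")
```

A significant F ratio indicates only that at least one group mean differs; a post hoc procedure is still needed to locate the difference between specific groups.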
Table 1. Means and standard deviations of dependent variables for three groups.
Note. n indicated group size; Emo indicated emotion; CL indicated cognitive load.
*p < .05, two-tailed.
For learner emotions, the F-test showed significant differences among three groups in enjoyment (F (2, 59) = 3.25, p = 0.046,
Mean differences and significance values for post-hoc comparisons.
Note. MD indicated mean difference; Emo indicated emotion; CL indicated cognitive load.
*p < .05, two-tailed.
For cognitive load, the F-test showed significant differences among three groups in instructional efficiency (F (2, 59) = 4.67, p = 0.013,
For knowledge transfer performance, the F-test showed a significant difference among the three groups (F (2, 59) = 4.19, p = 0.020,
Discussion
The present study examined the effects of the pedagogical agent with dual-channel emotional cues on learner emotions, cognitive load, and knowledge transfer performance. Overall, the present study produced several meaningful findings demonstrating that the affective pedagogical agent can benefit learning. First, learners assigned the affective pedagogical agent reported a higher sense of pride than learners assigned the neutral pedagogical agent. Second, the perceived task difficulty did not differ across the three groups while the APA group reported significantly higher instructional efficiency than the NPA group. Third, learners assigned the affective pedagogical agent performed better on the knowledge transfer test than those given the neutral pedagogical agent or the neutral voice narration.
RQ1: To what extent will learners who are assigned the pedagogical agent with dual-channel emotional cues, the pedagogical agent without emotional cues, or the voice narration without emotional cues differ in their perceived emotions?
Statistical findings supported that learners assigned the affective pedagogical agent reported a higher sense of pride than those given the neutral agent. According to Pekrun et al. (2002), pride is an emotion related to a learner’s intrinsic motivation. Learners interacting with the affective pedagogical agent experienced a higher sense of pride, indicating more confidence in their ability to master content than learners interacting with the neutral pedagogical agent. This result is consistent with findings from Guo et al. (2015), where learners who interacted with an affective embodied agent reported a higher level of confidence than learners with a neutral embodied agent. Similarly, Liew et al. (2017) also concluded that an enthusiastic agent had induced higher intrinsic motivation than a neutral agent.
However, no significant difference was found in pride between APA and NV groups or between NPA and NV groups. Thus, it might indicate that the difference in pride is more related to how a pedagogical agent's emotional cues are designed than whether a pedagogical agent is present. Nevertheless, pride as an emotion needs to be further studied to understand its role in agent-based multimedia learning.
As for enjoyment, though the omnibus analysis did indicate a significant variation, the post hoc test showed no difference among groups. Statistically, this might be because the Tukey follow-up test is more conservative than the F-test in order to control the overall alpha level (Howell, 2010). Educationally, there could be two reasons. First, as Krämer et al. (2016) found in their study, the emotional cues of the pedagogical agent may not have been strong enough to induce a significant difference in learners’ perceptions. Nevertheless, several studies, including Krämer et al. (2016), have concluded that emotional cues could promote a behavioral difference even though they did not induce a perceptual difference among learners (Huang et al., 2011; Von der Pütten et al., 2010). Likewise, in the present study, the APA group had better knowledge transfer performance than the other groups. That is to say, emotional cues may influence learning without being explicitly reported. Second, Muir et al. (2017) suggested that learner emotions fade over time. Thus, using a questionnaire might not have captured the change in enjoyment during learning.
Results indicated that learners did not differ in negative emotions. Our findings contradict Liew et al. (2016) and Romero-Hall et al. (2014), who found that students who interacted with the affective pedagogical agent experienced more negative emotions. Liew et al. (2016) discussed that a smile has at least four different meanings. Thus, learners might perceive facial expressions differently without other references. Considering that these two studies only used the visual channel, results from the present study support the view that presenting emotional cues via two channels may help learners perceive the emotions more precisely.
RQ2: To what extent will learners who are assigned the pedagogical agent with dual-channel emotional cues, the pedagogical agent without emotional cues, or the voice narration without emotional cues differ in their perceived cognitive load?
The APA group did not differ from the other two groups in perceived task difficulty. Our finding is consistent with Liew et al. (2017), where the enthusiastic pedagogical agent did not impose significantly more cognitive load than the neutral pedagogical agent. As for instructional efficiency, we found that the APA group's instruction was significantly more efficient than the NPA group's. From the learners’ perspective, this result supports the assumption regarding motivational and affective factors in CATLM. Learners who received emotional cues became more engaged in information selection, processing, and integration, contributing to better knowledge transfer performance (Mayer, 2005). Another possible explanation is related to cueing. Cueing in multimedia learning environments can focus learners’ attention on the more relevant information so that they allocate their limited cognitive resources more effectively (Yung & Paas, 2015). Bayliss et al. (2010) also found that the process of guiding someone’s attention was influenced by the guider’s facial expressions. Emotional cues from the pedagogical agent might have played a similar role in directing learners’ attention. However, additional studies are required to investigate how pedagogical agent emotional cues may affect learners' attention in learning-related contexts.
Overall, taking the results on perceived task difficulty and instructional efficiency into consideration, the present study supports the findings of Park, Flowerday, et al. (2015) that presenting emotional cues through visual and audio channels activates the motivating and encouraging functions of emotional cues while avoiding an increase in the cognitive load perceived by learners.
RQ3: To what extent will learners who are assigned the pedagogical agent with dual-channel emotional cues, the pedagogical agent without emotional cues, or the voice narration without emotional cues differ in their knowledge transfer performance?
The APA group performed significantly better on the knowledge transfer test than the other two groups. Our findings are consistent with Liew et al. (2017), who found that learners who interacted with the enthusiastic pedagogical agent significantly outperformed those who interacted with a neutral pedagogical agent. More importantly, since the present study used machine-generated voices for pedagogical agents, it also provides supporting evidence, in a Mandarin context, for the claim of Craig and Schroeder (2017, 2019) and Davis et al. (2019) that machine-generated voices are comparable to human voices. Moreover, this study generally supports the view that emotional presence is an essential component contributing to a worthwhile learning experience (Cleveland-Innes & Campbell, 2012; Majeski et al., 2018).
Limitations and Future Directions
Although meaningful findings were reported in the present study, there were two limitations. The first limitation is related to ecological validity. A learner's emotion is subject to the influence of his/her surroundings. It is uncertain if the study findings can be reproduced outside of a controlled setting. The second limitation is generalizability. This study was conducted in a relatively short learning session with only juniors from a university. More studies are needed to examine the affective pedagogical agent in a more extended learning session or even multiple sessions to investigate how emotional cues may change. It will also be worthwhile to repeat the experiment with different types of learners, such as K-12 students.
Based on the findings from this study, future studies may be conducted in two directions. The first direction involves looking deeper into the two information processing channels to investigate how emotional cues are processed in each. The second direction concerns the mechanism of emotional cues in learning, that is, the paths through which emotional cues are connected to learning performance.
Conclusions
The present study investigated the effects of pedagogical agent dual-channel emotional cues on learning. Results suggest that dual-channel emotional design is effective in inducing learners’ positive emotions and enhancing learning performance. Also, allocating emotional cues into two channels can help avoid cognitive overload. This study provides evidence on how emotional cues should be employed and designed for agent-based multimedia learning.
Statements on Open Data, Ethics, and Conflict of Interest
In this research, all participants took part in the experiment voluntarily. Data were collected and used after participants granted their consent. Personal or personally identifiable information was not included in the data. There is no conflict of interest regarding this work.
Footnotes
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported by the following grants: National Natural Science Foundation of China, Grant Number: 61772012, National Natural Science Foundation of China, Grant Number: 61977035, and Ministry of Culture and Tourism of China, Grant Number: 20201194075.
