Abstract
With the rising impact of social robots, research on user perceptions of and preferences for these robots is gaining importance, especially regarding emotional expressions. However, research on facial expressions and their effects on human-robot interaction has produced mixed results. To unpack this issue further, we investigated users’ emotion recognition accuracy and perceptions when interacting with a social robot that either did or did not display emotional facial expressions in a storytelling setting. In our experiment, twenty-eight participants received verbal feedback from the robot both with and without facial expressions. Recognition accuracy differed significantly across emotions, and participants rated clarity significantly higher for disgust, happiness, and surprise in the facial expression condition than in the no facial expression condition. In addition, participants rated Milo with facial expressions significantly higher than Milo without facial expressions in warmth and attractiveness. No significant differences were found in the naturalness ratings. The results of the present study indicate the importance of facial expressions when considering design choices for social robots.
Introduction
With the advancement of technologies in social robots, user expectations for human-robot interaction have increased. One such advancement is more accurate portrayal of emotional expressions in physical social agents. Gockley, Forlizzi, and Simmons (2006) indicated that users were more likely to interact with an emotional robot, while also being able to accurately detect basic emotions such as happiness, sadness, and a neutral emotion in the robot. Additional research has shown that people are more likely to feel comfortable and positive towards humanlike agents than towards robotic counterparts (Broadbent et al., 2013). Studies have further examined the impact of robot type, voice, and emotion on perception of the robot (Ko et al., 2020). These previous studies indicate that different modalities influence how users perceive a robotic agent in different ways.
Despite the extensive research that has been conducted on the effects of various modalities on the user perception of a social robotic agent, there is a lack of substantive research on robots’ emotional facial expressions. To bridge this research gap, we examined the effects of emotional facial expressions in a social robot on the users’ emotion recognition accuracy as well as their perceptions of the robot.
Anthropomorphism
Research has examined factors that affect the perceived anthropomorphism of social agents, including physical presence and facial features. Studies have shown that physical embodiment improves task performance, trust, and cooperation (Leyzberg et al., 2011; Wilma et al., 2011). Broadbent et al. (2013) showed that the interaction between participants and the physically present robotic agent was rated as more natural than the interaction with the agent without physical presence. These studies indicate a significant increase in anthropomorphization from the physical presence of social robots.
Regarding facial features, when presented with a robotic agent that looked at the participant, looked away from the participant, or did not move its head at all, participants rated both movement options as more natural than the condition where the robotic agent did not move its head (Wang et al., 2006). Similarly, a robot with a humanlike face was considered to have higher anthropomorphic ratings compared to a silver face and no face while completing tasks that require human capabilities, such as storytelling (Broadbent et al., 2013). These studies indicate the importance of congruent facial features that emulate human facial features with respect to anthropomorphization.
Taken together, these previous studies highlight the importance of physical presence and humanlike facial features. The present study applies these factors by using a physical robot, non-repeating voice lines, and facial expressions created through the robot’s application programming interface (API).
Robot Emotion
People use emotions to express their feelings toward an event or a relationship without much thought. These emotions can be expressed through voice, facial expressions, and gestures. Emotions are contagious not only among humans but also in human-robot interaction: participants feel empathy towards the robot, and their valence ratings, arousal ratings, and task performance are influenced by the robot’s emotions (McColl et al., 2016). Based on this finding, the present study implemented emotions on the robot through facial expressions and investigated the impact of emotive facial expressions on participants’ perceptions during storytelling with the robot.
A study (Borutta et al., 2009) showed that although the participants rated speech intelligibility lower in the emotional robot conditions, they preferred the robot with emotions as opposed to a neutral robot. The emotion recognition time for the emotional robot condition was also faster. In the present study, we observed a higher clarity rating of emotions from the robot with facial expressions.
Robot Facial Expression
Facial expressions are a common form of non-verbal communication that humans use to express emotions. Displaying facial expressions appropriate to the situation is also necessary in human-robot interaction. When a robot can change its emotions based on the environment and express them through facial and whole-body motions, it is perceived as more friendly, human-like, favorable, and available (Ko et al., 2020). In addition, adding facial expressions to a robot improves participants’ emotion recognition ability, meaning that people identify emotions faster and more accurately from a robot with facial expressions (Harwood, Hall, & Shinkfield, 1999).
However, there are contradictory results on whether emotional synchronization using facial expressions benefits human-robot interaction. In the study by Gonsior et al. (2011), participants rated the social motivation model condition (where the robot displayed facial expressions according to its own internal model) higher than the mirroring condition (where the robot mirrored the participant’s emotion) and the no emotion condition in the scales of trust, perceived sociability, perceived enjoyment, and intention to use. Conflictingly, in Li and Hashimoto’s study (2011), the robot with the mirroring condition was rated higher than the robot without the mirroring condition in terms of positive characteristics, such as pleasant and friendly. Furthermore, a study showed that when a robot has facial expressions similar to (but not exactly the same as) humans, people feel uncomfortable interacting with them, perhaps due to the Uncanny Valley (Seyama and Nagayama, 2007).
In sum, there have been mixed results about the effects of adding facial expressions to robots on people’s perceptions. We believe that these different effects may come from different robot designs, user characteristics, tasks, and contexts. To explore this problem further, the present study included facial expressions as an experimental variable for a humanoid robot, Milo, to determine the effects of facial expressions on people’s perception in a storytelling environment. We hypothesized that Milo with facial expressions would improve people’s emotion recognition accuracy and receive higher scores on likeability, suitability, clarity, and other positive characteristics.
Methods
Participants
In the experiment, 28 college students were recruited as participants (16 female and 12 male) with a mean age of 21.11 years (SD = 1.95). Participants tended to report little previous interaction with social robots on a seven-point Likert-scale question (1: Never, 7: Always; M = 1.96, SD = 1.50).
Experimental Design
A 2 (Expression Types) × 7 (Emotions) within-subjects design was applied. As such, participants experienced both the facial expression and no facial expression conditions across all seven emotions (anticipation, anger, disgust, fear, happiness, sadness, and surprise). Participants experienced two stories, one in the facial expression condition and the other in the no facial expression condition. The order of the facial expression conditions was counterbalanced to minimize order effects. The pairing of story and facial expression condition was also alternated. The order of the emotions was fixed because it followed the storylines.
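The counterbalancing described above can be sketched as follows. This is a minimal illustration, not the study's actual assignment procedure: the function name `assign_conditions`, the condition labels, and the rule of cycling through the four combinations by participant index are all assumptions.

```python
from itertools import product

STORIES = ("The three little pigs", "The boy who cried wolf")
EXPRESSIONS = ("with_face", "without_face")  # hypothetical condition labels


def assign_conditions(participant_index):
    """Return the two (story, expression) sessions for one participant.

    Assumption: participants cycle through the four combinations of
    first story x first expression condition, so each combination is
    used equally often when the sample size is a multiple of four.
    """
    combos = list(product((0, 1), (0, 1)))  # (first story, first expression)
    first_story, first_expr = combos[participant_index % len(combos)]
    return [
        (STORIES[first_story], EXPRESSIONS[first_expr]),
        (STORIES[1 - first_story], EXPRESSIONS[1 - first_expr]),
    ]


# Hypothetical schedule for a sample of 28 participants.
schedule = [assign_conditions(i) for i in range(28)]
```

Under this scheme every participant sees both stories and both expression conditions, and the story paired with facial expressions alternates across participants.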
Equipment and Stimuli
A humanoid robot, Milo (Figure 1, left), was used in the experiment because of its distinct facial expression capabilities. During the experiment, the robot played a verbal response for each emotion. In one condition, the robot also used a facial expression to portray the emotion of the verbal response. This was done across two stories, each with its own verbal responses and facial expressions. All verbal responses used the Milo robot’s built-in text-to-speech (TTS) engine. The male TTS voice made use of distinct punctuation to ensure that the inflection matched a proper emotional utterance. Also, the TTS voice moved Milo’s mouth, regardless of the content of the speech. Facial expressions were created using five face servos to match the seven emotions used. These servos controlled the robot’s eyelids, eyebrows, eye placement, mouth, and smile. Each servo was adjusted on a normalized scale from 0 to 200. For anger and sadness, the premade command “expression frown” from Robokind’s API was run first, and further adjustments were then made to produce more appropriate emotional expressions.

Figure 1. Milo robot with different emotive facial expressions (from left: anger, disgust, fear, anticipation, happiness, sadness, and surprise) and experimental setup.
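One way an expression could be stored as five servo targets on the 0 to 200 scale is sketched below. The field names mirror the five servos listed above, but the `FacePose` class, the `clamp` helper, and every numeric value are hypothetical placeholders. None of this is Robokind's API, and the values are not those used in the study.

```python
from dataclasses import dataclass, asdict

SERVO_MIN, SERVO_MAX = 0, 200  # normalized range described in the text


def clamp(value):
    """Keep a servo target inside the normalized 0-200 range."""
    return max(SERVO_MIN, min(SERVO_MAX, value))


@dataclass(frozen=True)
class FacePose:
    """One target per face servo (hypothetical container, not a robot API)."""
    eyelids: int
    eyebrows: int
    eye_placement: int
    mouth: int
    smile: int

    def adjusted(self, **offsets):
        """Return a new pose with offsets added, e.g. to tweak a preset."""
        current = asdict(self)
        for name, offset in offsets.items():
            if name not in current:
                raise KeyError(f"unknown servo: {name}")
            current[name] = clamp(current[name] + offset)
        return FacePose(**current)


# Placeholder values only; a mid-range neutral pose is an assumption.
NEUTRAL = FacePose(100, 100, 100, 100, 100)
# Hypothetical adjustment layered on top of a starting pose.
example = NEUTRAL.adjusted(eyebrows=-60, smile=150)
```

Clamping keeps any adjustment within the servo range, so an offset that would exceed 200 simply saturates at the limit.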
Seven emotions were presented throughout each story, six of which were derived from Ekman’s basic emotions (anger, disgust, fear, happiness, sadness, and surprise). These emotions were selected due to their importance in psychological studies.
The seventh emotion, anticipation, was selected from Plutchik’s basic emotions (1980). Its inclusion allowed us to add one more positive emotion in addition to happiness. The seven emotions fit into both stories (“The three little pigs” and “The boy who cried wolf”).
Procedure
The Milo robot was placed to the right of the participant in the experiment room (Figure 1, right). After completing the COVID-related research protocol and the IRB consent procedure, participants were instructed to sit down, read a story until they came across a picture, show the picture to Milo, and fill out a questionnaire.
Each participant went through both stories, experiencing all seven emotions for each story and the facial expressions for one story. The story and facial expression conditions were counterbalanced. After concluding the story, participants finished questionnaires, and repeated the procedure for the second story, with an additional questionnaire being filled out after both stories were completed.
Participants filled out questionnaires after each emotional reaction, after completing the story, and after experiencing both stories. In particular, the questionnaires measured emotion recognition accuracy as well as general qualities of the robot, based on perceptions (likeability, attractiveness, warmth, naturalness, and uncomfortableness), reliability (honesty, trustworthiness), and human likeness (humanlike, robotic).
Results
The results of the present study were analyzed with a 2 (Facial expression) × 7 (Emotion) repeated-measures ANOVA. The Greenhouse-Geisser correction was applied when the sphericity assumption was violated.
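The Greenhouse-Geisser correction rescales both degrees of freedom of the F test by an estimated factor ε, which lies between 1/(k − 1) and 1 for a factor with k levels. A minimal sketch, assuming a single within-subjects factor; the function name `corrected_df` is illustrative and is not taken from the analysis software used in the study.

```python
def corrected_df(n_subjects, n_levels, epsilon=1.0):
    """Degrees of freedom for a within-subjects F test with k levels.

    epsilon is the Greenhouse-Geisser estimate; 1.0 means no correction.
    """
    lower_bound = 1.0 / (n_levels - 1)
    if not lower_bound <= epsilon <= 1.0:
        raise ValueError("epsilon must lie between 1/(k-1) and 1")
    df_effect = epsilon * (n_levels - 1)
    df_error = epsilon * (n_levels - 1) * (n_subjects - 1)
    return df_effect, df_error


# Uncorrected case for 28 participants and 7 emotion levels.
uncorrected = corrected_df(28, 7)
```

With 28 participants and 7 emotion levels, the uncorrected call returns (6.0, 162.0). Any ε below 1 shrinks both values proportionally, which is why corrected tests report fractional degrees of freedom.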
Emotion Perception: Accuracy, Clarity, Suitability, and Robot Features
The accuracy of emotion recognition was calculated as the number of correctly recognized emotions divided by the total number of answers, expressed as a percentage. To compute it, each response was compared with the corresponding correct emotion and scored (1: correct, 0: wrong). Figure 2 shows the accuracy of emotion perception by expression type and by emotion.
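The scoring rule above amounts to a few lines of code. The sketch below assumes responses are stored as two parallel lists of emotion labels; the function name and the sample responses are made up for illustration and are not study data.

```python
def recognition_accuracy(presented, perceived):
    """Percentage of trials where the perceived emotion matches the presented one."""
    if len(presented) != len(perceived):
        raise ValueError("presented and perceived must have the same length")
    if not presented:
        raise ValueError("no responses to score")
    # Score each response: 1 if correct, 0 if wrong.
    scores = [1 if p == q else 0 for p, q in zip(presented, perceived)]
    return 100.0 * sum(scores) / len(scores)


# Made-up responses for illustration only.
presented = ["anger", "fear", "happiness", "surprise"]
perceived = ["disgust", "fear", "happiness", "surprise"]
accuracy = recognition_accuracy(presented, perceived)
```

Here three of the four toy responses match, so `accuracy` is 75.0.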

Figure 2. Emotion recognition accuracy over expression types (left) and emotion recognition accuracy over emotions (right) (*: p < .05).
The data analysis revealed that participants in the facial expression condition showed numerically (but not significantly) higher emotion accuracy than those in the no facial expression condition (F(1, 27) = 2.60, p = .12; Figure 2 left). The overall emotion recognition accuracy over facial expression conditions was 68.88% for Milo with facial expressions and 60.71% for Milo without facial expressions. The result reveals that there are statistically significant differences among emotions (F(6, 162) = 12.95, p < .01, ηp2 = .32; Figure 2 right). There was no interaction between facial expression and emotion (F(6, 162) = 1.17, p = .33). The accuracy of anticipation (M = 89.29%, SD = .31) was significantly higher than all negative emotions (anger (M = 37.50%, SD = .31), disgust (M = 60.71%, SD = .49), fear (M = 57.14%, SD = .50), and sadness (M = 50.00%, SD = .50); p < .01).
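The effect sizes reported here are partial eta squared values. Assuming the standard relation between partial eta squared and the F statistic, a reported value can be recovered from F and its degrees of freedom; the function name below is illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both df positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Emotion main effect on accuracy as reported above: F(6, 162) = 12.95
eta_emotion = partial_eta_squared(12.95, 6, 162)
```

For the reported F(6, 162) = 12.95 this gives roughly .32, in line with the value stated above.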
The perceived clarity and suitability of emotions over expression types were computed based on the correct responses of recognized emotions only. Participants’ answers followed a 1 to 7 Likert scale (1: Lowest, 7: Highest). Results showed that the expression types (F(1, 27) = 13.82, p < .01, ηp2 = .34) and emotions (F(4.32, 116.58) = 4.09, p < .01, ηp2 = .23) had main effects on the clarity ratings, without an interaction effect. No significant difference was found in the suitability rating. Milo with facial expressions (M = 5.59, SD = 1.46) was rated significantly higher in overall clarity than Milo without facial expressions (M = 4.76, SD = 1.77; Figure 3 left). In the Least Significant Difference (LSD) post-hoc test, the clarity ratings differed significantly between expression types for disgust (p = .03), happiness (p < .01), and surprise (p = .03) specifically (Figure 3 right).

Figure 3. The rating scores of clarity and suitability over expression types (left) and the rating scores of clarity over emotions by expression types (right) (*: p < .05).

Figure 4. Warmth and attractiveness over expression types (*: p < .05).
Participants’ responses to the question “What characteristics of the robot brought to mind that emotion?” were categorized into five groups: facial expression, context, speech content, speech tone, and others (comments that were not specified). The words in participants’ comments that fell into each category were counted. Only responses for correctly recognized emotions were considered. Results showed that words related to speech content and facial expression occurred most often, at 38.5% and 31.1%, respectively, in participants’ responses as to how they perceived the emotions from the robot. For each emotion, speech content impacted the emotion perceptions for anger by 47.8%, anticipation by 36.2%, fear by 39.0%, and sadness by 36.4%; facial expressions impacted the emotion perceptions for disgust by 45.2%, happiness by 40.7%, and surprise by 41.5%.
Characteristics: Warmth, Honesty, and Trustworthiness
In the following perception results, emotion was not included in the analysis because emotions are short-term “states” rather than “traits”. Therefore, the perception measures were analyzed with a one-way repeated-measures ANOVA with two levels of expression type. The expression types of the robot were found to have a main effect on the warmth ratings (F(1, 25) = 13.37, p = .03, ηp2 = .17). Participants rated Milo with facial expressions (M = 4.63, SD = 1.64) significantly higher than Milo without facial expressions (M = 3.68, SD = 1.70) in warmth (Figure 4). No significant difference was found in the honesty and trustworthiness ratings.
Preferences: Likeability, Attractiveness, and Comfortableness
The expression types of the robot were found to have a main effect on the attractiveness ratings (F(1, 25) = 7.69, p < .01, ηp2 = .24). Participants rated Milo with facial expressions (M = 5.04, SD = 1.48) significantly higher than Milo without facial expressions (M = 4.07, SD = 1.51) in attractiveness (Figure 4). No significant difference was found for the likeability and comfortableness ratings.
Naturalness: Naturalness, Human-likeness, and Robotlikeness
No significant results were found for naturalness, human-likeness, and robot-likeness.
Discussion
Twenty-eight participants interacted with the robot, Milo, by reading scripts of two different fairy tales to the robot in two conditions, in which Milo did or did not show facial expressions when responding. The results showed that 1) emotion recognition accuracy was numerically higher with facial expressions than without; 2) clarity ratings were significantly higher with facial expressions than without; and 3) Milo with facial expressions was considered significantly warmer and more attractive than Milo without facial expressions.
Emotion Perception: Accuracy, Clarity, Suitability, and Robot Features
The results showed that the accuracy of recognizing emotions depended significantly on the emotion, whereas how clearly participants perceived the emotions depended significantly on both emotion and facial expression. The emotion recognition accuracy percentages for anger (37.50%) and sadness (50.00%) were the lowest among the seven emotions. This result was similar to that of the previous study (Ko et al., 2020), in which negative emotions showed lower recognition accuracy. In the confusion matrix between presented and perceived emotions, anger was mostly misclassified as disgust (34.0%) and sadness was mostly misclassified as anticipation (21.0%). The most accurately recognized emotions, anticipation (89.29%), happiness (82.14%), and surprise (76.69%), were each correctly classified more than 75% of the time. Overall, five of the seven emotions were recognized numerically more accurately with facial expressions, which bodes well for adding facial expressions to social robots.
Interestingly, clarity ratings differed significantly between expression types for disgust, happiness, and surprise. Calder et al. (2003) found that adults recognize disgust from facial expressions better as age increases, implying the importance of facial expression for that emotion. Happiness and surprise can both be considered emotions with positive valence among the seven emotions, and they were the two most accurately recognized emotions after anticipation, so it is plausible that adding facial expressions significantly affected how clearly participants perceived them. Given that speech content and facial expression can each contribute to the recognition of different emotions, future research should disentangle their contributions further, which could provide practical design guidelines (e.g., for higher recognition of anger, anticipation, fear, and sadness, focus more on speech content, whereas for higher recognition of disgust, happiness, and surprise, focus more on facial expressions).
The absence of a significant difference in suitability might imply that participants perceived the facial expressions of a humanoid robot as acceptable and appropriate.
Characteristics, Naturalness, and Preferences
The results showed that Milo with facial expressions was considered significantly warmer and more attractive than Milo without facial expressions. No significant difference was observed in the naturalness ratings (naturalness, human-likeness, and robot-likeness). These results partially supported our hypotheses. They might imply that implementing facial expressions on humanoid robots improves human perception in human-robot interaction without triggering the Uncanny Valley effect (Mori, 1970), which aligns with the finding of Koschate et al. (2016) that displaying emotions in a highly human-like robot could reduce uncanniness. Likewise, no significant difference was found on the comfortableness scale, again indicating that participants did not feel more uncomfortable towards the robot with facial expressions than towards the robot without them, unlike in Seyama and Nagayama’s study (2007).
Other Anecdotal Findings
During the experiment, participants’ reactions towards the robot with facial expressions fell clearly into two groups: some participants interacted with the robot more, and others thought the robot was “scary” and “weird”. Participants in the first group usually started to make side conversations with the robot and described its facial expressions as “impressive”, “cute”, “vivid”, “honest”, and “friendly”. Participants in the second group thought the robot was “perfunctory”, “creepy”, and “unnatural” while presenting facial expressions. Although participants had different attitudes towards the robot, they all tended to attribute a personality to the robot and thought that it had its own thinking once it started showing facial expressions. Broadbent et al. (2013) reported a similar result: people attribute mind and positive personality characteristics to a robot with a human face. In addition, nine of the 28 participants commented after the experiment that the robot sounded “robotic”, “flat”, and “unnatural”, likely due to the use of Milo’s text-to-speech (TTS) engine for voice lines.
Design Implications
Although more research and experiments are needed to support our findings, the results suggest that facial expressions are appropriate to implement on a humanoid social robot and may increase the accuracy of emotion recognition and the clarity of emotion in human-robot interaction. With facial expressions, people’s positive perceptions of a humanoid robot, such as warmth and attractiveness, can be improved without increasing uncomfortableness. Since not all seven emotions were considered “clearly expressed” through the facial expressions, the effectiveness of adding facial expressions for particular emotions should be investigated further.
Limitations
The design of facial expressions in Milo was reviewed by four researchers and checked in a pilot study with one additional participant. However, other participants may perceive emotions differently from those facial expressions. Furthermore, the servos that allowed for the creation of the facial expressions were limited to five degrees of freedom, making very complex emotions less feasible. The voice of the robot was created through its own program, which may explain why participants found the robot voice to be “robotic.” Finally, due to the outbreak of COVID-19, only a limited number of participants could be recruited, in coordination with laboratory and university protocols.
Conclusion and Future Work
This study investigated the impact of adding facial expressions to a social robot on human perceptions of the robot’s emotions. Because the difference in emotion recognition accuracy between the conditions with and without facial expressions was numerical but not significant, more participants should be recruited to determine whether an overall significant difference exists. A follow-up study with the robot voice used in the previous study (Ko et al., 2020) should be conducted to avoid unnecessary robot features that affect participants’ perceptions of the emotions. In future studies, gender and age could be considered as factors, particularly because Lawrence, Campbell, and Skuse (2015) stated that gender and age differences may lead to differing emotive facial expression recognition results. Such studies could determine whether social robots with facial expressions are effective and useful for special purposes, such as different age groups, developmental stages, or other specific audiences.
