Abstract
Socially assistive robots (SARs) are a recent development in the field of artificial intelligence designed to assist human users through social interactions in vulnerable settings. SAR design has led to significant emotion recognition and empathy research in order to create sufficient and effective user-robot bonds. However, much of this research has focused on the successful emulation of emotion and empathy in robots, rather than the effects of prolonged exposure to such artificial emotion and empathy on subsequent human-to-human empathetic interaction and emotion recognition. In this paper, we present a review of interpersonal empathy in the context of human-robot relationships and discuss the potential consequences of these relationships on human-human social interactions. We also present recommendations for research into this field, drawing from real-life examples of humans bonding to inanimate artificial beings.
Introduction
Social robotics research has become more popular in the past 20 years, shifting toward the domain of socially assistive robots (SARs) (Tapus and Mataric 2007). SARs are social robots specifically designed for intimate personal interactions that will involve emotional bonding (Scheutz 2011; Berdasco et al. 2019); SAR development focuses on designing intelligent robots to assist human users through social interaction. The primary settings in which researchers suggest the use of SARs include hospitals, therapeutic clinics, eldercare facilities, homes, and schools (Berdasco et al. 2019). To successfully assist users in these settings, SARs need to gain users’ trust and form emotional connections (Feil-Seifer and Mataric 2005). This need has spurred research into emotion and empathy recognition and anthropomorphism of robots. Anthropomorphism is defined as the “tendency to imbue the real or imagined behavior of nonhuman agents with human-like characteristics, motivations, intentions, or emotions” (Epley, Waytz, and Cacioppo 2007). This includes the attribution of human-like mental states, emotional states, conscious experience, behavioral characteristics, and metacognition (Epley, Waytz, and Cacioppo 2007), and has been a focal point for human-robot interaction research to enhance the relationship between the two agents. Humans frequently anthropomorphize animals, objects, and other artifacts (Airenti 2018), a tendency demonstrated in experimental work since Heider and Simmel (1944). This tendency to anthropomorphize serves as a means to establish a relationship with the object, leading to the automatic attribution of mental states (Airenti 2018). It has been well documented that humans even treat technology, such as computers, as social actors (the Media Equation) (Nass, Steuer, and Tauber 1994; Nass and Moon 2000; Niculescu et al. 2013).
Therefore, with designers’ intent to create SARs with human-like behavior and expression, we expect to find a heightened sense of anthropomorphism toward SARs. Results similar to the Media Equation have recently been reproduced with human-robot interaction (Hoffmann et al. 2009; Niculescu et al. 2013; Scheutz 2012); eerily, the social effects observed towards robots and virtual characters prove stronger than behavior towards typical computers due to the heightened attribution of anthropomorphism from the robots’ embodiment (Niculescu et al. 2013). The increasing design of human-like and emotive behavior of SARs can create a response in the user that results in an overtly uni-dimensional bond that may harm the ability to empathize with other humans. Human-machine interaction research suggests that the more human-like the robot’s appearance and behavior, the more people will be led to assume that the robot has human capacities (Scheutz 2012; Zhao, K Phillips, and F Malle 2019). Our concerns are multifaceted: There is a need to highlight the lack of studies surrounding the effects of prolonged interaction with increasingly anthropomorphic SARs on humans’ abilities to subsequently recognize emotion and empathize with other humans.
Background
Empathy Defined / Emotional Recognition Defined
Emotions are mental states experienced by humans associated with degrees of pleasure or displeasure that result in psychological changes that influence behavior and action (Cabanac 2002; Lim, Mountstephens, and Teo 2020; Steinmetz 2022); emotions are complex combinations of subjective experiences, psycho-physiological responses, behavioral responses, and cognitive processes (Iris 2009; Lim, Mountstephens, and Teo 2020; Steinmetz 2022). Paul Ekman classified six basic emotions that can be universally recognized: anger, disgust, fear, happiness, sadness, and surprise (Wiley et al. 2001; Lim, Mountstephens, and Teo 2020). Others recognize and align with these emotions during human interaction (Hatfield, Cacioppo, and Rapson 1994). Early psychological researchers studying emotion and emotion recognition emphasized how people mimic the facial, behavioral, postural, and vocal expressions of others, and converge emotionally as a consequence (Hatfield, Cacioppo, and Rapson 1993, 1994; Herrando and Constantinides 2021; Steinmetz 2022). This convergence allows people to successfully empathize, understand, and communicate with one another, assisting in the synchronization of social interactions (Hatfield, Cacioppo, and Rapson 1994). Facial emotion recognition extends to the recognition of emotion through bodily expressions, suggesting that facial mimicry extends to a larger sensorimotor program associated with emotion. Further studies have shown links between mimicry and the reward center of the brain, as well as facial mimicry and individual levels of empathy (Hsu, Sims, and Chakrabarti 2018; Borgomaneri et al. 2020). Mimicry is also linked to liking and the affective empathy responses of another person’s emotional state (Hsu, Sims, and Chakrabarti 2018).
Empathy is often referred to as the ability to understand and recognize the feelings, experiences, and thoughts of another person (Drigas and Papoutsi 2023). Davis (1983) found two main experiences for empathy— cognitive and affective. Cognitive empathy refers to the awareness, understanding, and recognition of another’s situation or emotional state while affective, or emotional, empathy refers to feeling the same emotion as another person, feelings of distress in response to another’s situation (not necessarily the same exact feeling), and feeling compassion for another (Davis 1983; Tapus and Mataric 2007; Besel and Yuille 2010). However, some researchers note that there is little known about cognitive empathy’s connection to the recognition of facial expressions (Besel and Yuille 2010). More research on this topic is needed, as much of SAR design relies on design features that promote an illusion of cognitive empathy to initiate bonding with the user.
Emotion Recognition / Empathy and Technology
Our concern about a diminished ability to empathize with others following prolonged interaction with SARs is motivated by various publications and longitudinal studies of technology usage in children and young adults. In 2010, researcher Sara Konrath analyzed empathy-measuring surveys provided to fourteen thousand American college students across multiple decades. Konrath found that as time went on, empathy among young people decreased a substantial amount; in 2010, college students had 48% less empathy than people their age in 1979 (O’Brien, Hsing, and Konrath 2010). The suggested culprit was the rapid advent of social technology between 2000 and 2010. Further, exposure to digital media has been found to decrease children’s abilities to read other people’s emotions (Uhls et al. 2014). Human facial expressions are so integral to emotion recognition that facial paralysis impairs our ability to detect and mimic facial expressions and emotions (Korb et al. 2016; Nellis et al. 2018), suggesting that facial paralysis results in an inability to socially connect, empathize, and recognize emotion. This is important because artificial intelligence robots are not designed to be fully affect-sensitive (i.e., capable of recognizing human affect and responding in a manner that requires the embodiment of affective processes), and may instead express no affective display, which communicates indifference or lack of empathy to the user (Scheutz 2012). Additionally, many SARs are not designed with human-like faces; instead, current SAR prototypes are quite rudimentary, simply pieces of machinery attached together to form a body-like figure with dark circles for eyes and a mouth. Consequently, users do not receive the real-time, human facial feedback that wires our brains for emotional and physiological mimicry and reaction. We are concerned that we may be atrophying our ability to detect emotion and provide empathy for others during face-to-face interaction.
Recent research found that exposing participants to robot stimuli during emotion recognition impaired the accuracy of emotion recognition in subsequent stimuli of human facial expressions (Finkbeiner and Helton 2022). There are very few studies exploring the negative effects of exposure to robots on emotion recognition. As the field of SARs continues to grow, with plans to integrate SARs into educational, health, and home settings, the need to determine potential pitfalls in emotion recognition and empathy grows more significant.
SARs and the Current State of Emotional Recognition/ Empathy
As the field of SARs becomes popular, researchers are increasingly tasked with assessing the formation of human and robot bonds. Researchers found that people have stronger empathetic emotions and interactions with others they have developed a social relationship with or share a common background with (Tapus and Mataric 2007). To promote the same bonds with robots, similar interactions need to be cultivated between human users and SARs; the robot’s perceived empathy needs to be believable, but not so realistic or human-like as to provoke impossible expectations or the uncanny valley phenomenon: the dilemma of the relation between the human likeness of an object and the perceiver’s inclination towards it that would shift the perceiver’s empathy to repulsion as the object approached lifelike appearance (Mori 1970). The robot should also mimic facial, behavioral, and postural expressions similar to humans, as research has shown that mimicry creates liking between humans who imitate each other (Kuhne and Peter 2022; Hsu, Sims, and Chakrabarti 2018) and promotes bonding and prosocial behavior (Van Baaren et al. 2004). Tapus and Mataric (2007) proposed that certain features are required to produce an “empathetic” robot: the capability of recognizing and interpreting the user’s emotional state; the capability of processing and expressing emotions through vocal, facial, postural, and bodily expressions; the ability to communicate with users; and the ability to provide perspective. Additionally, verbal and non-verbal communication help mimic social cues that make robots appear intuitive and natural (Tapus and Mataric 2007). In reference to the two types of empathy, robots should also promote empathy through cognitive recognition, such as appearing as if the robot understands the human’s emotion or is affected by it, and should portray the illusion of emotion through facial, vocal, and bodily expressions to fit the social context.
Tapus and Mataric (2007) developed a facial expression detection system to identify Ekman’s six basic emotions on human facial expressions and mimic these emotions through expression to convey the impression of cognitive and affective empathy with the user.
Unlike typical industrial robots, SARs have the ability to initiate motion and behavior, interacting with users at higher levels of sophistication and making (limited) decisions about what actions or behaviors to take (Scheutz 2011). This prompts users to attribute intentions and states to SARs in order to make sense of the machine (Scheutz 2011). In situations with robots that did not embody the human-like facial expressions and human-like body parts proposed for socially assistive robots, humans still attributed care to the robots as if the machines were human or pet-like. For example, an autonomous robot at the Yuma Test Grounds in Arizona was entrusted with the task of detonating mines by stepping on them, causing destruction to the robot’s legs whenever a mine was detonated. As the robot dragged itself by its last leg to blow up the final mine, the Army Colonel demanded that the task be halted because it was “inhumane” to the robot (Scheutz 2011). Further, robo-dogs and Roomba vacuum cleaners elicited the same agency concerns in the majority of newsletter postings online (Kahn Jr, Friedman, and Hagman 2002). In a variety of studies, Roomba users developed gratitude for the Roomba and even cleaned the house for the Roomba, rather than allowing the Roomba to do its job, so that it could rest (Scheutz 2011). Turkle (2005) utilized robots such as Tiger Electronics’ Furby, Sony’s AIBO, and Hasbro’s My Real Baby to observe the reactions of senior citizens and children to simple encounters with technology that simulate social interactions. When engaged in play, seniors in nursing homes were more comfortable re-enacting family scenes with the robot dolls provided rather than traditional dolls, with some even treating the robot dolls as if they were sentient (Turkle et al. 2006). These robot device users developed a tendency to form illusions about the non-existent mental states of the robot and/or form unidirectional relationships.
The bond between a robot and a user develops not only through the user’s tendency to anthropomorphize inanimate objects, but also through the increasingly human characteristics that designers give robots to facilitate human bond formation (Massa, Bisconti, and Nardi 2022). The bond is strengthened through the illusion of a humanistic nature, yet it is present only in a one-sided or hallucinatory form; the robot cannot form organic thoughts, emotions, and facial expressions like humans. It simply mimics and generates the phrases, actions, and bodily expressions encoded in its system. The system engages in affective behavior that suggests these are realized states of the robot, resulting in unidirectional emotional bonds with a robot that is not truly capable of these states or of emotional bonds (Scheutz 2012). This confirmatory dynamic and lack of partialization on the part of the robot fuel the infantilization debate in companion robot literature. The concern we are most interested in is the decreased ability to tolerate relational frustration with humans due to the conciliatory and confirmatory actions of robots. Essentially, users can appease themselves by interacting with a human-like robot that will not disagree or differ from the user’s thoughts, actions, and desires, whereas other humans will not simply affirm each thought, action, and desire during an interaction. This infantilization is described in Donald Winnicott’s “Use of An Object,” in which he describes the infant recognition and development process of interpersonal relationships (Winnicott et al. 2018). Winnicott characterizes the infant or subject’s indistinction between an internal object, or fantasy, and an external object, or independent thing, and the frustration produced when the external object does not conform to the child/subject’s internal desires. When this occurs, the child will attack the object that no longer fits the fantasy (Massa, Bisconti, and Nardi 2022).
If the external object resists the aggression, then the subject or infant will have no effect on the object and the dominion of fantasy will be destroyed; however, if the object does not resist, then the autonomy of the external object is destroyed and the subject’s dominion of fantasy will be restored. We believe the polarization of our current society, coupled with the decreasing levels of empathy found in the technological generation, may produce a similar infantile relational frustration toward independent, autonomous humans who do not conform to the expectations of those accustomed to fulfilling a fantasy of dominion over their social robots.
Recommendations
Recommendation 1: Human-AI Chatbot Relationships
While understanding human-robot relationships is the priority for this line of research, there are few viable use cases in which human beings have had prolonged emotional relationships with robots that are capable of expressing emotional states. Alternatively, we may look at the emergence of online chatbots as an analog for naturalistic human-robot unidirectional emotional relationships to examine changes in user behaviors in response to increasingly intimate human-agent relationships and their effects on the person’s ability to empathize with other human beings.
AI chatbots such as Replika AI and Xiaoice utilize advanced machine learning models to provide users with prolonged empathetic communication that is meant to be analogous to a friend or lover (Zhao, K Phillips, and F Malle 2019). AI chatbots rely on human-like design to evoke anthropomorphic emotional cues (Brandtzaeg, Skjuve, and Følstad 2022), which could be examined to draw predictions on their effect on humans when interacting with these same design cues in robotics. Developers of AI chatbots that are designed for empathetic relationships with humans face similar issues as those with social robotics, primarily dealing with human-robot relationships that are fundamentally different than human-human relationships (Froding and Peterson 2021). Observational research on the users of Replika AI would give us insight into the motivations of users involving themselves with artificial agents to provide a foundational understanding of the larger world of human relationships with artificial agents that could be translated to robotics.
Recommendation 2: Eye Tracking
As robots become increasingly human-like, it is important to understand how the intersection of human-like physical and emotional characteristics affects changes in behavior in human beings that interact with these robots. To properly capture these changes, it is important to understand how human beings gauge the physical manifestations of emotional responses and their influence on human social cues in response.
We recommend that researchers begin incorporating eye-tracking measures as a key objective examination of longitudinal changes in human emotion recognition when interacting with robotics. Several researchers have acknowledged eye tracking as a viable examination tool for human emotional recognition (Tarnowski et al. 2020) with particular interest among the scientific community in pupil dilation (Partala and Surakka 2003; Snowden et al. 2016), fixation duration (Chi and Lin 1997) as well as gaze paths and heat maps (Tarnowski et al. 2020). Using these measures, we propose that researchers can examine what parts of the robot face human beings gravitate to when responding to emotional cues and how these patterns change with extended interactions with robots. Through this method, we may begin to track behavioral changes in humans as they have prolonged empathetic communications with robots.
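As an illustration of how such measures might be operationalized, the following minimal Python sketch totals gaze dwell time within rectangular areas of interest (AOIs) on a robot face. It is a sketch under stated assumptions rather than a validated analysis pipeline: the sample format (timestamp in milliseconds, x, y), the AOI names and coordinates, and the function names are all hypothetical, and simple dwell time is used here as a stand-in for fixation duration, which would normally require a dedicated fixation-detection step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AOI:
    """Rectangular area of interest in screen coordinates (hypothetical layout)."""
    name: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, x: float, y: float) -> bool:
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def dwell_time_by_aoi(samples, aois):
    """Sum gaze dwell time (ms) per AOI.

    samples: list of (t_ms, x, y) tuples sorted by time.
    Each sample is credited with the time until the next sample;
    the final sample has no successor and contributes nothing.
    """
    totals = {aoi.name: 0.0 for aoi in aois}
    for (t0, x, y), (t1, _, _) in zip(samples, samples[1:]):
        for aoi in aois:
            if aoi.contains(x, y):
                totals[aoi.name] += t1 - t0
                break  # assume AOIs do not overlap
    return totals


def dwell_shares(totals):
    """Convert dwell totals to proportions of all time spent inside any AOI."""
    overall = sum(totals.values())
    if overall == 0:
        return {name: 0.0 for name in totals}
    return {name: value / overall for name, value in totals.items()}


# Hypothetical AOIs and gaze samples for one viewing of a robot face.
aois = [AOI("eyes", 0, 0, 100, 50), AOI("mouth", 0, 60, 100, 100)]
samples = [(0, 50, 25), (10, 50, 25), (20, 50, 80), (30, 200, 200), (40, 50, 25)]

totals = dwell_time_by_aoi(samples, aois)
print(totals)
print(dwell_shares(totals))
```

Computing these shares separately for each session of a repeated-interaction study would give one simple way to compare where participants look early versus late in their exposure to a robot.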
Recommendation 3: Longitudinal Research
It is imperative to understand what the long-term ramifications are for human beings interacting with SARs that are capable of emotional expression. We propose the development of experiments that focus on repeated human-robot interactions in which a human is conversing with robots capable of emotional responses. This recommendation follows an increasing demand in the HRI community for developing longitudinal experiments. Within the HRI community, there is a strong desire to understand the longitudinal effects of human-robot interactions (Hart et al. 2022), which highlights the shortcomings of repeated-measures human-robot experiments in generalizing to long-term and dynamic relationships between humans and robots. By emphasizing longitudinal experiments, we hope to advance beyond measuring human perceptions of robots at the beginning of a relationship and extrapolating from them, and instead examine the relationship as it actually unfolds in experimental settings.
Conclusion
As socially assistive robots become increasingly incorporated into human life, researchers must begin to understand how human social dynamics could shift as a result of their increasing role as companions rather than tools. Changes in human relationships stemming from transcendent technology like the smartphone and social media should serve as cautionary tales for how unregulated growth of technology can cause irreparable changes in human social dynamics. To better prepare for the potentially seismic changes caused by SARs, it is imperative that research is devised which examines the changes in human behaviors when interacting with SARs.
