Abstract
Research studies in Human Robot Interaction (HRI) with social robots usually gather observational data in order to explore the dynamics of short and long-term interactions. The most common approach to analysing observational data is to propose a small number of behavioural units whose frequency and/or duration is recorded. Because each study defines its own units, comparing results between studies is difficult. The present manuscript proposes a procedure to assess the complete human-robot interactive activity. Experiences with two different robots were analysed using the novel instrumentation, leading to further considerations. Finally, general guidelines extracted from experimentation are proposed to assess the interaction quality between social robots and users. Further studies can benefit from the proposed instruments, which are expected to be validated in different HRI contexts (e.g., school, hospitals, home) with different users (e.g., children, elderly, hospitalised people).
Introduction
Nowadays robots are increasingly part of everyday life, helping people in diverse tasks, from housework [20] to emergency situations and catastrophes [40], and serving as support tools in applied professional fields such as Psychology or Medicine [17,46]. In particular, social robots, defined as robotic platforms designed to interact with people in a human-like manner [6,19], have already been proposed as supplementary tools for rehabilitation [37,47], autism therapy [11,29], and treatment adherence and compliance, as well as to provide entertainment, enjoyment, and comfort [34,41,43].
In contrast with service or industrial robots, the efficacy and efficiency of social robots, i.e., the quality of their commitment, are difficult to assess, quantify, or measure. Some well-known metrics are proposed in [35] for quantifying task effectiveness (TE), a concept defined as some measure of how well a task is actually performed. However, TE metrics do not provide any insight into how to improve the human-robot interface or how such an interface might be modified to increase effectiveness. Moreover, social robots are usually not task oriented, hence other quantitative metrics, such as neglect tolerance (NT) or robot attention demand (RAD), are less relevant. In contrast, measurable criteria about the ability of robots to facilitate engagement with users and to be perceived as social actors would be more suitable for assessing quality in social Human Robot Interaction (HRI) [39]. It is suggested in [45] that, among other technical metrics (e.g., navigation, management, manipulation), psychological and sociological metrics would be important for evaluating performance. Unfortunately, it is also highlighted that these metrics would differ from one field to another.
According to [4], there are five main evaluation methods for HRI: self-assessment, behavioural observation, psychophysiological measures, interviews, and task performance metrics. The most widely used techniques for exploring the extent to which social robots elicit a desired behaviour are questionnaires and observational methodology. Observational methodology is considered a useful tool for obtaining information about the interaction process, since it allows an objective, non-intrusive exploration of interactive behaviour. Several studies videotape the interaction situation, although there is no common instrument for defining interactive behaviour. For instance, in [27] children-robot interactive behaviour was evaluated through the number of children interacting with a robot and the average time of interaction per week. In [31], interaction between preschoolers and Aibo robots was videotaped, encoding the number of instances of treating the robot as an artefact or as a machine (poking, shaking), instances of affection (hugging, petting, kissing, stroking), attempts at reciprocal interaction (offering a ball, talking to, motioning to), and instances of apprehension (because children were interacting with unfamiliar objects). It is explained in [18] how the Pleo robot failed to sustain the long-term interest of children, who interact with such a robot as a toy. That study proposes some guidelines for building long-term interactions that could maintain interest in the robotic pet for longer periods.
All the analysed studies showed a clear interest in observational data, although each of them used a different scheme to code interactive behaviour, directly tied to its own hypotheses. An exception is found in [10], where an effort was made to obtain a simple two-category system for comparing differences in the eye gaze of autistic children playing with both a robot and a toy (a truck). Their system consisted of 14 criteria divided into two general categories, one related to movement (eye gaze, eye contact, operate, handling, touch, approach, move away, and attention) and the other related to verbal activities (vocalisation, speech, verbal stereotype, repetition, blank, and other).
The above-mentioned ad-hoc procedures are helpful for exploring specific topics in social HRI, but they do not allow a comprehensive comparison and generalisation across studies. The aim of the present work is to propose a set of instruments to facilitate the description and assessment of social HRI within a general framework for the study of the interaction with social robots.
The rest of the paper is organised as follows: HRI concepts are introduced in Section 2, as well as some psychological constructs required to describe the presented approach. The proposed instruments are presented in Section 3. Videotaped experiences are analysed in Section 4 using the proposed set of instruments. These videotaped sessions helped to complement and refine the proposal, as discussed in Section 5. Finally, Section 6 gives the concluding remarks and future work.
Methods for human robot interaction
Human Computer Interaction (HCI) methodologies offer important insights for understanding and evaluating HRI. However, social robots and computers are very different agents, so new knowledge and perspectives are required to address the social dimension of interaction between people and such artificial partners [7]. Whereas the main research focus of HRI is on the development of specific robotic systems and applications, some methods for the evaluation of interactive behaviour between people and robots have been adopted and/or modified from fields such as human-computer interaction, user experience, and Psychology [4,14,49,50]. These complementary approaches face the central question of whether the manner in which people interact with a robot is similar either to interactions between humans and computers or to interaction among humans. In this section, these contributions are reviewed since they are the theoretical basis for the novel assessment proposal.
Recently, an evaluation framework has been proposed in [53] by considering some of the existing HRI and HCI evaluation methods [7,13,33]. Specifically, [33] introduced a theoretical three-level design model for defining the impact that a product can make on the customer: visceral, behavioural, and reflective. The visceral impact is based on appearance, the behavioural one is based on effectiveness of use, and the reflective level involves the meaning of the product and the feelings that remain after the experience. The proposal in [13] is focused on the awareness that people and robots have of their social structures and activities within a group. Finally, [7] focuses on the classification of based on its social characteristics. Based on the key elements of these three approaches, the work in [53] evaluated the holistic interaction situation using a set of perspectives: the first perspective (P1) targets the evaluation of visceral factors during interaction (the initial impression, acceptance, emotions); the second perspective (P2) focuses on social mechanics (communication, movements, facial expression, body gestures); the third perspective (P3) concerns social structures, that is, the influence of the human environment and social structures on the interaction process (how the robot fits into the human environment). That study concludes that these perspectives could contribute to assessing the whole interaction experience and that they can be used as a powerful and simple notation and vocabulary for communicating findings. Furthermore, its authors highlighted that it is necessary to develop more precise methodologies and tools to facilitate the use of these perspectives among evaluators.
Taxonomy for classification
This taxonomy could be useful as a general framework for classifying the interaction setting. However, in social robotics studies, other criteria that recognise the features of natural communication must be included for assessing and comparing interaction settings.
Regarding psychological and social competences, interaction characteristics, persuasiveness, trust, engagement, and compliance should be assessed [45]. Specifically, persuasiveness is related to the robot’s capacity for changing behaviours, feelings, or attitudes (e.g., therapy). These authors propose that personality, dialogue, and emotion for capturing attention (acquisition time) and holding interest (duration) should be considered for engagement, defined by [44] as the process by which two (or more) participants establish, maintain, and end a connection during interactions. Furthermore, four events involving speech and gesture are codified in [38] that contribute to the perceived connection between humans and maintain engagement: directed gaze, mutual face gaze, conversational adjacency pairs, and back channels. An exchange starts with a gaze or a statement made in a particular tone of voice; when the other person responds, the interaction moves into the engagement phase [22]. Therefore, strictly speaking, there is a directed gaze or a comment from one participant that elicits an answer from the other.
Evaluation of context conditions, where H, HT, MH, R, RT, and MR correspond to human, human team, multiple humans, robot, robot team, and multiple robots, respectively. References taken as a basis for the proposed categories are indicated alongside, and new subcategories are shown in bold.
Proxemics, i.e., the regulation of physical and psychological distance between actors, is a key aspect of interaction. Humans adjust the distance to their partners in relation to action, that is, the adjustment is dynamic and changes over time [23]. This author described four types of distances: intimate, personal, social, and public. These distances are influenced by external factors such as gender, status, or culture, among others [21]. Proxemics research in the social robotics field has found some evidence of an effect of robot gaze behaviour and likeability on humans’ distance regulation [32]. In a laboratory setting, people who disliked the robot compensated for an increase in the robot’s direct gaze by maintaining a greater physical distance from the robot and disclosing less personal information to it (taken as a measure of psychological distance). This finding suggests that, in a laboratory context, controlling for the like/dislike of robots is important for explaining some interaction patterns.
They also proposed a category scheme in order to code and study children’s spontaneous behaviour. Categories related to interaction with objects and interpersonal relationships are of interest for the present work. They will be considered and defined in the next section for the study of interaction between robots and users.
Finally, attachment is also an important concept in social HRI studies, especially with children. Attachment Theory describes the dynamics of long-term relationships between humans, mainly in families and with friends [5]. Infants become attached to adults who are sensitive and responsive in social interactions with them and who remain consistent caregivers for some months during the period from about six months to two years of age. This relationship with principal caregivers helps in future social and emotional development. The infant behaviour associated with attachment is the seeking of proximity to an attachment figure. Several methods for assessing attachment in children and adults have been proposed for classifying people into one of the proposed styles: secure, anxious-avoidant, or anxious-ambivalent for children [1] and secure, preoccupied, dismissive, or fearful for adults [3]. Evaluation of attachment should be carried out over time and could be an indicator of quality in long-term relationships.
Considering the HRI and concepts of Psychology presented above, a global assessment tool is proposed in order to describe and analyse interactive situations involving social robots and human users.
The assessment tool presented in this Section covers specific features of social HRI in order to evaluate the quality of interaction and its experience. The evaluation system focuses on three key points: classifying scenario features, classifying (current/potential) social robot features, and assessing behavioural units during (and after) interaction.
Assessing the setting: How was the scenario?
Considering the theoretical basis presented in [4,18,27,31], a table with the basic traits of the setting is proposed. This taxonomy allows the evaluation of the initial conditions of the interaction context (see Table 2). It also allows an easy assessment of the resemblance of diverse interaction settings.
The first row indicates the level of shared interaction, that is, the number of actors that interact in the situation, ranging from the minimum, 1 human – 1 robot, to the maximum level, multiple humans – robot team. The second row shows the interaction roles proposed by [42]. However, there are two new roles for social robots that are being introduced here, i.e., companionship and coaching.
The companion role has been included since pet-like robots are expected to reproduce the social-emotional benefits associated with the interaction and the emotional bond between children and companion animals, such as entertainment, relief, support, and enjoyment [11]. This social bond is supposed to provide therapy-relevant effects to hospitalised children in the way real pets do. However, animal-assisted activities, which have been proven to be effective for paediatric purposes [37], are not possible in a hospital environment. The relationship between master and pet is based on hierarchy and attachment. The social situation defined by the master/pet interdependence could naturally produce engaging activities (i.e., teaching new skills, learning to understand, care giving, playing together) and expressions of affection and concern. In this role, the robot must be able to deploy (or acquire) social skills for effective communication (i.e., orientation, attention, responsiveness), for hierarchy submission (i.e., recognition, obedience), and to express and generate attachment (i.e., affective expressiveness).
The coaching role is based on the social bond (affective involvement), tasks, and goals. Obtaining the collaboration of the pupil is an essential issue and requires an agreement about the relevance and usefulness of tasks and goals. The coach must provide ongoing supervision, encouragement, feedback, counselling, and support for goal fulfilment. Furthermore, in order to enhance agreement and compliance to treatment, it is necessary to create an affective bond. For instance, rehabilitation is usually hard and motivation must come from an affective bond of trust and intimacy (alliance) between pupil and coach. The coach must be responsive to pupil needs and emotions in an empathic way and find an acceptable balance between goals, commitment, and concern for the pupil’s welfare. Engaging communication and contingent feedback are required for task monitoring. For empathic rapport, affective communication and awareness of child’s psychological and physical state are needed [28].
The third row in Table 2 shows five models of physical proximity between humans and robots, defined by interpersonal distances when their interactions are collocated [25]. The “none” category was defined by [52] to cover non-collocated interactions. The models are ordered by increasing physical proximity. It has been suggested that, when multiple types of physical interaction are applicable, the one that involves the highest physical proximity should be chosen. The fourth row shows the time-space category. It is based on whether humans and robots are using systems at the same time (synchronous) or at different times (asynchronous) and in the same place (collocated) or in different places (non-collocated) [16]. Finally, the fifth row categorises the autonomy level as a continuous value ranging from tele-operation to fully autonomous. It is quantified by means of the percentage of time that the robot accomplishes a task on its own or, complementarily, the percentage of time that intervention is required. An example of the lowest degree of robotic autonomy is a robot that can only be tele-operated.
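The autonomy measure described above reduces to a simple ratio. The following is a minimal sketch, assuming that session time is logged in seconds; the function names and the example figures are hypothetical and are not taken from the study.

```python
def autonomy_percentage(autonomous_seconds: float, total_seconds: float) -> float:
    """Return the share of session time (0-100) the robot acts on its own."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    if not 0 <= autonomous_seconds <= total_seconds:
        raise ValueError("autonomous_seconds must lie between 0 and total_seconds")
    return 100.0 * autonomous_seconds / total_seconds


def intervention_percentage(autonomous_seconds: float, total_seconds: float) -> float:
    """Complementary measure: share of session time requiring intervention."""
    return 100.0 - autonomy_percentage(autonomous_seconds, total_seconds)


# Hypothetical 10-minute session in which an operator intervened for 60 s.
print(autonomy_percentage(540, 600))      # 90.0
print(intervention_percentage(540, 600))  # 10.0
```

Under this reading, a robot that can only be tele-operated scores 0 on the first measure and 100 on the second.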
Assessing social robot features: How social is the robot?
Figure 1 shows a description grid generated according to a taxonomy for assessing the social features of robots for communication, that is, a specific list of characteristics that social robots should have at the time of use or that should be implemented in future development. This data sheet allows the social features of several robots to be compared, allows a quick assessment of a new platform in terms of its social requirements, and could help in planning future research guidelines and development. The classification is divided into three major groups of characteristics: natural cues, non-natural cues, and computer-mediated devices. Natural cues are those human-like traits that allow fluent interaction and facilitate mutual awareness, such as verbal abilities and non-verbal traits. Non-natural cues are traits that are unusual in human interaction but could be present in an HRI context, such as light, colours, sound, or even a virtual agent. Finally, the third category gathers computer-mediated devices that could serve as tools for specific aspects of interaction, such as writing or reading (screen, keyboard, and mouse).

The description grid for the features of social robots.
As shown in Fig. 1, natural cues are divided into two categories, verbal and non-verbal. The first one deals with communicative abilities. Three major ways of achieving this social function are proposed: conversational, bidirectional non-conversational, and unidirectional. The conversational mode involves a joint dialogue in which each sentence follows coherently and logically from the previous one, as usually happens among humans. The bidirectional non-conversational pattern takes place when the robot is programmed to recognise and reproduce a limited set of phrases in order to accomplish a specific, verbally mediated task, such as guiding customers in a supermarket. Bidirectionality is simulated by steering the conversation along a topic, so that the dialogue may seem highly natural even though the robot was designed only to maintain this specific pseudo-conversation. Finally, unidirectional verbal communication has also been considered, since some platforms are able to process some incoming verbalisations and emit others, but these are not aligned and no real dialogue takes place. Vocalisation has been included under the verbal category in order to describe sounds that do not constitute a conversation (e.g., growling).
Non-conversational communicative aspects are included in the non-verbal category. Such aspects are usually present in interaction contexts and allow a message to be transmitted with emotional intensity (e.g., face, gaze, gesture, body stance, movement). Non-natural cues include light, colours, sound, and even virtual agents that the robot sometimes produces. Finally, computer-mediated HCI devices also deserve special attention, since in some cases they guide the whole interaction process and their evaluation therefore becomes fundamental to the interaction assessment.
As an instance of the applicability of this assessment tool to different platforms, Figure 2 shows the proposed grid describing the resources for interactive behaviour of the humanoid robot REEM, in addition to further descriptions to be provided in Section 4.
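To illustrate how such a description grid could be handled programmatically, the sketch below encodes the three groups of cues named in the text as a plain dictionary and fills it in for one robot. The group keys, the function name, and the example robot are assumptions made for illustration; the actual grid in Fig. 1 may contain further entries.

```python
# Feature groups as named in the text; the key names are an assumed encoding.
FEATURE_GRID = {
    "natural/verbal": {"conversational", "bidirectional non-conversational",
                       "unidirectional", "vocalisation"},
    "natural/non-verbal": {"face", "gaze", "gesture", "body stance", "movement"},
    "non-natural": {"light", "colours", "sound", "virtual agent"},
    "computer-mediated": {"screen", "keyboard", "mouse"},
}


def describe_robot(features):
    """Sort a robot's features into the grid groups; reject unknown features."""
    features = set(features)
    known = set().union(*FEATURE_GRID.values())
    unknown = features - known
    if unknown:
        raise ValueError(f"features not in the grid: {sorted(unknown)}")
    return {group: sorted(options & features)
            for group, options in FEATURE_GRID.items()}


# Hypothetical pet-like platform that growls, moves, and emits sounds.
for group, present in describe_robot({"vocalisation", "movement", "sound"}).items():
    print(group, present)
```

Keeping the grid as data rather than as free text makes it straightforward to compare two platforms group by group.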

Humanoid REEM social features description.
Interactive behaviour codification system
A specific research methodology is proposed in the first column in order to obtain results of the interaction situation. Hence, observation would be employed to gather data during the interaction session and tests, interviews, and focus groups should be employed to obtain information about thoughts after the interaction as well as to capture long term indicators (e.g., persuasiveness, attachment).
Considering the levels of design proposed in [33], two points in time have been selected in order to assess and analyse interactive behaviour: during and after the interaction experience. The fundamental basis of interactive behaviour is that a first contact happens between the two actors, the interaction develops from that point throughout the session, and some feelings and thoughts remain after it. Thus, during interaction, systematic observation is the most appropriate methodology to explore the response of the user when interacting with a social platform, whereas the after-interaction experience requires indirect techniques such as focus groups, personal interviews, and questionnaires. In the following lines, some categories are proposed to provide a scheme for analysing some components of the interaction between a social robot and a user.
During the interaction experience some psychological constructs can be studied. In particular, the user’s initial emotions are the intuitive reaction to the presence of the social robot. Roughly, emotions can be characterised as positive or negative; however, a theory proposed in [15] discriminates a set of basic emotions: joy, sadness, surprise, disgust, anger, fear, and a neutral state. This is an interesting set of features for evaluating the impact of the robot on the user and for considering changes throughout the session. As pointed out by [15], the primary function of emotions is to deal with interpersonal encounters and, therefore, they are of interest for the assessment of interpersonal exchange in an interactive situation. Moreover, changes in proxemic behaviour are a fundamental indicator of perceiving the robot as a social actor [51]. Proxemic behaviour can complement the user’s emotional response during the session or even help the researcher to interpret it. Social robots should transmit trust, and the user should feel safe and comfortable interacting with the robot. Four social spaces are defined in [23]: intimate space, the closest “bubble” of space surrounding a person, which only the closest friends and intimates may acceptably enter (15–45 cm); personal and social spaces, in which people feel comfortable conducting routine social interactions with acquaintances as well as strangers (46–120 cm and 121–350 cm, respectively); and public space, the area beyond which people will perceive interactions as impersonal and relatively anonymous (>350 cm). Changes in emotions and proxemics should be coded throughout the session; they can be a particularly powerful source of information at the beginning of the session for assessing the user’s initial reaction.
Besides emotions and proxemic behaviour, analysis of the exchange that takes place between a user and a social robot is fundamental for researchers and professionals who work with this kind of robot. Engagement is assumed to start when the person gives feedback in response to certain robot behaviour, such as directed gaze, mutual facial gaze, suddenly approaching, suddenly moving away, or greeting. Once the robot has attracted the attention of the user, a behavioural exchange starts, defined by communication, changes in social distance, body gestures, and/or changes in gaze. Some of these behaviours indicate the maintenance of engagement (e.g., directed gaze, communication) and others indicate loss of interest (e.g., maintained or repeated loss of visual contact). If physical interaction with the robot is allowed, four items can be analysed: the exploration of robot characteristics, simple manipulation, non-conceptual (thematic) use, and conceptual (thematic) use. The exploration of robot characteristics concerns all the behaviours that users execute in order to discover the robot’s affordances. Simple manipulation covers non-interactive (non-intentional) behaviours (e.g., transportation). Conceptual use includes user behaviours that correspond to those expected for the user’s role. Non-conceptual (thematic) use corresponds to any other behaviour that does not match the user’s role. Examples for each category should be proposed using data gathered from observational studies; some of them are shown for the case of Ugobe’s Pleo robotic dinosaur. If interaction with other people is allowed, the kind of relationship that people establish among themselves can be analysed by coding the general pattern that takes place: closeness and/or physical contact, attention/observation/maintained gaze, directed comments or verbal expressions, shared activity, or mixed, that is, a combination of some of them.
Interaction can be considered finished when the user breaks engagement by means of a long period of inactivity, saying goodbye, or moving into public space, among others. These indicators become fundamental when the aim of a study is to explore the loss or lack of interest in the social interaction, especially in long-term HRI studies. However, a complete assessment of the human-robot relation should include an after-interaction phase: once users and robots have been separated, after-interaction thoughts and long-term indicators should be assessed. At this point, interviews, focus groups, and questionnaires are more appropriate for exploring the user’s experience with the robot. Further studies of the after-interaction experience can explore in depth the emergence of an attachment bond and other goals of the platform, such as the modification of attitudes, behaviours, or feelings. In these cases, more than one interaction session or assessment point is necessary.
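The four social spaces can be turned into a simple coding rule for measured distances. The sketch below is one possible reading: it treats the reported ranges as contiguous upper bounds (45, 120, and 350 cm) and assigns anything closer than 15 cm to the intimate space; both choices are assumptions, as are the function names and the sample distances.

```python
def proxemic_zone(distance_cm: float) -> str:
    """Map a user-robot distance in centimetres to one of the four spaces."""
    if distance_cm < 0:
        raise ValueError("distance_cm must be non-negative")
    if distance_cm <= 45:
        return "intimate"
    if distance_cm <= 120:
        return "personal"
    if distance_cm <= 350:
        return "social"
    return "public"


def zone_changes(distances_cm):
    """Return (sample index, zone) each time the coded zone changes."""
    changes = []
    previous = None
    for index, distance in enumerate(distances_cm):
        zone = proxemic_zone(distance)
        if zone != previous:
            changes.append((index, zone))
            previous = zone
    return changes


# Hypothetical sequence of distances sampled while a user approaches the robot.
print(zone_changes([400, 300, 100, 110, 40]))
# [(0, 'public'), (1, 'social'), (2, 'personal'), (4, 'intimate')]
```

Coding only the changes of zone, rather than every sample, matches the idea that shifts in proxemic behaviour are the informative events.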
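Once behaviours have been coded from video as timed events, the share of observation time per category follows directly. The sketch below assumes each coded event is a (category, start, end) tuple in seconds; the function name and the sample events are hypothetical and do not reproduce data from the studies reported here.

```python
def duration_percentages(events, session_seconds):
    """Percentage of the session spent in each coded category.

    `events` is an iterable of (category, start_s, end_s) tuples.
    """
    if session_seconds <= 0:
        raise ValueError("session_seconds must be positive")
    totals = {}
    for category, start, end in events:
        if end < start:
            raise ValueError(f"event for {category!r} ends before it starts")
        totals[category] = totals.get(category, 0.0) + (end - start)
    return {category: 100.0 * seconds / session_seconds
            for category, seconds in totals.items()}


# Hypothetical coding of the first minute of a 2-minute session.
coded = [
    ("directed gaze", 0, 30),
    ("communication", 30, 45),
    ("directed gaze", 45, 60),
]
print(duration_percentages(coded, 120))
# {'directed gaze': 37.5, 'communication': 12.5}
```

Categories may overlap in time (e.g., gaze and communication), so the percentages are not required to sum to 100.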
The proposed assessing approach has been applied in two different contexts in order to evaluate the interaction between robots and end users. Some analyses have been carried out to provide quantitative indicators of each interactive activity.
Lab experience with Pleo
The aim of this study was to explore the dynamics of social bond emergence between children and social robots (for further details see [12]). A field study with 49 sixth-grade students (aged 11–12 years) and 4 different robots (Sony’s Aibo, Ugobe’s Pleo, Aldebaran’s Nao, and Meccano’s Spykee) was carried out at an elementary school. At the school, researchers focused on exploring children’s attribution of competences and skills to robots based on appearance. Two months after the school experience, a series of play sessions with Pleo were conducted in the lab to explore the interdependence dynamics and interactive behaviour in a second meeting. Pleo is a commercial entertainment robot developed by Ugobe, equipped with different tactile sensors under its skin, touch sensors in the feet, speakers, and microphones. Among its features, it presents a set of creature-like personalities and develops internal drives like hunger or sleep, and several mood modes: happy, extremely scared, curious. The main objective was to observe children’s behaviour when they met Pleo for a second time and to explore how the previous contact with the robot in the school affects subsequent interaction. The play session took place in an observation laboratory and the instruction given by the conductor was: “You can stay here with Pleo as long as you want. When you want to give up, just tell me”.
Assessing the setting: How was the scenario?
The laboratory experience can now be described according to the proposed taxonomy in Table 2. The context of this study is defined by an H-R level of interaction; this HRI follows a companion role, which involves a high level of physical proximity (approaching, touching) in a synchronous-collocated situation with a highly autonomous social robot (Pleo). Highlighting the main traits of the present context in Table 2 allows the interaction setting to be assessed by simple visual inspection.
Describing social robot features: How social is the robot?
As shown in Fig. 3, Pleo could be defined as a pet-baby-like robot. Regarding natural cues, non-verbal features are more developed than verbal ones, specifically the use of certain parts of the body such as mouth, eyes, or head. User identification or face tracking would improve Pleo’s social skills.

Pleo social features description.
Percentage (observation duration) of time for each category and participant considering activity during interaction for the “Lab experience with Pleo”
A workshop in robotics was carried out in a public school in order to explore the interaction between non-expert users and social robots. Students from 9 to 11 years old agreed to participate in some specific activities with the Nao robot, which were videotaped. A signed consent form was requested from their parents beforehand. Nao is a full-body humanoid robot of medium size (58 cm in height). Its body has 25 joints and weighs 4.3 kg. It also features a powerful multimedia system, which includes 2 speakers, 2 cameras, and a microphone. With these features, text-to-speech synthesis, sound localisation, and facial and shape recognition are feasible.
One of the activities was testing the implementation of a classic game into Nao platform, specifically, the 20 questions game (20Q) [36]. In this game, the player thinks of something and the 20Q system asks a series of questions for guessing what the player is thinking. Originally, players can answer these questions with: Yes, No, Unknown, Irrelevant, Sometimes, Probably or Doubtful. Currently, this game has been implemented in an automatic system called 20Q (

Nao’s system architecture: computer, server, and robot interconnection.
Following the taxonomy summarised in Fig. 1, this setting can be described as follows: Several users were in the interactive situation and one Nao robot was with them. The interaction role was companion and physical proximity was not allowed. There was a face-to-face interaction and no intervention was required once the game had started.
Robot social features
As shown in Fig. 5, Nao’s social features for the 20Q game consisted of verbal interaction by means of speech recognition and voice synthesis. Regarding non-verbal features, the robot can detect, recognise, and follow faces and carry out some movements (e.g., dancing, sitting down, walking). Nao has some non-natural cues such as light, colours, and sound.

Nao social features description.
Ten children were seated in a semicircle around the robot to play the 20Q game, and a volunteer was chosen to carry out the question-answer activity. The game always started with the same prompt, asking the user to think about something (an animal, a vegetable, or a thing). The system then started the question/answer iteration, registering the user’s responses and processing the possible solution to the game. After 20 questions, the system made its guess about the user’s initial thought. The facilitator stood near the volunteer and the robot in order to help if necessary. Thus, while the volunteer carried out the question/answer activity, the rest of the children could watch the game being played and try to guess (mentally) the thing, the animal, or the vegetable the volunteer was thinking about. In these conditions only a few categories could be registered and, therefore, emotions, facial expression, and gaze behaviour were registered following a time sampling with 30-second intervals. The total duration of the session was approximately 10 minutes.
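With time sampling, each 30-second interval receives one code per dimension, so a session of roughly 10 minutes yields about 20 coded intervals. The sketch below shows how the percentage of intervals per category could be obtained; the function name and the sample codes are hypothetical and do not reproduce the study data.

```python
from collections import Counter


def interval_percentages(codes):
    """Percentage of sampled intervals assigned to each category."""
    codes = list(codes)
    if not codes:
        raise ValueError("at least one coded interval is required")
    counts = Counter(codes)
    return {category: 100.0 * n / len(codes) for category, n in counts.items()}


# Hypothetical emotion codes for 20 intervals of 30 s (a 10-minute session).
emotion_codes = ["neutral"] * 12 + ["joy"] * 8
print(interval_percentages(emotion_codes))
# {'neutral': 60.0, 'joy': 40.0}
```

The same function can be applied separately to the facial expression and gaze codes of the session.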
Setting assessment using the grid in Table 2 for Pleo
Setting assessment using the grid in Table 2 for Nao
After analysing the Pleo and Nao cases with the proposed instrument, results can be compared in terms of scenario, social robot features, and the interactive behaviour.
Scenario
Tables 5 and 6 highlight the traits of each context. Both contexts are similar in terms of interaction role, time-space trait, and autonomy level. They differ in the level of interaction and in physical proximity.
Robot social features
The information in Figs 3 and 5 shows that Nao has higher verbal skills than Pleo (both current and potential), since it allows bidirectional non-conversational interaction. Regarding non-verbal cues, Nao has the advantage of face recognition. Both robots are comparable in terms of movement for the considered experiments. Neither of them is computer mediated.
The HRI session
As a result of the above-mentioned analysis of scenario and social features, the comparable features of the interactive behaviour are reduced to the following categories: emotions, gaze, and facial expression. Both robots elicit joy and neutral emotions, with a higher percentage of the latter. Regarding gaze behaviour, Pleo achieves a higher percentage of directed gaze, although this value may be influenced by the physical proximity limitation in the Nao school situation. Finally, both robots elicit smiles and laughs or inexpressive facial expressions, although Pleo reaches a slightly higher percentage of the former. Overall, it seems that both platforms can elicit positive emotions and facial expressions, but Pleo gets more gaze attention from participants.
Discussion
The aim of this manuscript is to propose a scheme for describing the social features of robots. It also discusses the interaction settings and the observational instrumentation used to code behaviours during HRI situations. Once an observational instrument has been defined, several analyses can be carried out in order to explore possible behaviour patterns during interaction. For instance, temporal patterns [30] can be analysed and hidden structures can be described in order to predict users’ loss of interest. The use of a common instrument for the study of HRI (specifically, using social robots) will facilitate the dissemination of results and the comparison between studies. However, the method presented in this manuscript is only a first step towards the definition of a global tool for interaction assessment.
Regarding theoretical implications, the different moments of the interaction that takes place between humans and social robots have been described. Moreover, for each of the proposed moments of the HRI situation (during and post-interaction), different methodologies have been recommended (observational methodology, questionnaires, focus groups, or interviews). Assessing some of the proposed traits requires more than one observation session, since they are related to long-term relationships (e.g., attachment). These specific types of relations between humans and social robots could be monitored by means of the proposed observational scheme. Thus, longitudinal studies become particularly relevant for exploring long-term constructs such as attachment or persuasiveness. Further theoretical development must specify criteria for interaction achievement, that is, both desired and undesired interaction patterns in different scenarios. This definition, together with instruments for defining behaviour, will help in evaluating the quality of social robots’ commitment. Further research is necessary to refine the categories proposed above in order to eliminate, modify or increase the number of described behavioural units. Applying these instruments in other contexts and with other platforms will help to improve them and to test their usefulness, since this work is the first step towards the definition of a complete assessment system.
Regarding the examples presented in this work, some considerations should be made. In the study performed with the Nao robot, results suggest that the 20Q game implemented on this platform achieves a medium level of engagement in its current state of development. When analysing the results, it is important to highlight that the time between the volunteer answering a given question and the robot asking the next one ranged from 6 to 30 seconds. Prior research suggests that the matching of speech rhythms is related to the perceived quality of interaction and to interpersonal attunement. Partners tend to match pause duration both within and between turns in adult conversation [26], and pauses during speech that exceed 1 second are perceived as disruptive [8]. Interpersonal timing is a crucial aspect of human communication and, therefore, the present results could be related to the platform response time. Further development should improve the platform response time in order to regulate conversational turn taking, timing, and contingency. Once the platform can perform with a more natural conversational timing, HRI should be analysed again in order to explore possible changes in users’ engagement. Therefore, technological changes in social platforms call for continuous evaluation in order to test whether they influence the interactive behaviour.
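The comparison between the observed response times and the 1-second pause threshold cited from [8] can be sketched as follows. The latency values are hypothetical and were chosen only to fall within the 6 to 30 second range reported above; the function name and the summary fields are likewise assumptions made for illustration.

```python
DISRUPTIVE_PAUSE_S = 1.0  # pause length reported as disruptive in [8]


def summarise_latencies(latencies_s, threshold_s=DISRUPTIVE_PAUSE_S):
    """Summarise robot response latencies against a pause threshold."""
    if not latencies_s:
        raise ValueError("at least one latency is required")
    over = [t for t in latencies_s if t > threshold_s]
    return {
        "min": min(latencies_s),
        "max": max(latencies_s),
        "mean": sum(latencies_s) / len(latencies_s),
        "share_over_threshold": len(over) / len(latencies_s),
    }


# Hypothetical latencies in seconds between the child's answer and the
# robot's next question (not measured data).
latencies = [6.0, 12.5, 30.0, 9.0]
summary = summarise_latencies(latencies)
```

Under these assumptions every turn exceeds the threshold, which is consistent with the suggestion that response time may have limited engagement. Recomputing the same summary after a platform update would give a simple way to check whether conversational timing has improved.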
Regarding the results obtained in the study of the interaction with the Pleo robot at the lab, some aspects must be discussed. The most frequently expressed emotional states were joy and neutral, and all the participants kept the robot on their lap for nearly the whole session. The participants looked at Pleo during the vast majority of the session. However, it should be pointed out that they stopped looking at the platform as the session came to its end, that is, they tended to look somewhere else when the end was close. This is interesting since it may be an indicator of fatigue or boredom. Although inexpressive faces were predominant, there were parts of the session where participants showed smiles and even laughs. Body gestures were mainly non-existent, since participants kept the robot on their lap and stayed quiet and seated. Finally, physical contact was allowed and interaction with the platform was based on exploration of robot characteristics and conceptual use. Exploration of robot characteristics included behaviours such as grabbing the tail, lifting and shaking the robot to test whether it could be awakened, or passing a hand in front of the robot’s face to test whether it could see. Conceptual (thematic) use included cuddling, feeding, or giving affection. The low percentage of time without interactive behaviours (approximately between 0.5% and 10%) seems to be a good indicator of the robot’s competence in keeping children engaged.
Conclusions
A specific procedure has been introduced to describe and assess HRI when dealing with social robots. The proposed instruments help to describe the setting characteristics and the robot features, and allow interactive behaviour to be studied. This proposal provides a common vocabulary for communicating findings and instruments for studying social interaction between users and social robots. Based on the patterns studied in [53], the tools proposed in the present manuscript contribute to assessing the whole interaction experience and serve as a powerful and simple notation. Furthermore, monitoring could help to properly describe long-term relationships and to analyse in detail the changes in frequencies (or durations) over time, which will help to predict or avoid lack of interest and detachment.
Some considerations should be made about the categories proposed to assess the interactive behaviour between end users and social robots. Firstly, regarding emotions, researchers could expect that any of the proposed basic emotions may occur at the beginning of the interaction and that the emotional state may change during the session. However, social robots should not elicit negative emotions as the final result of the session. Furthermore, the time elapsed until a positive or, at least, a neutral emotion appears should be taken into account. Considering both indicators together, that is, the final emotion and the time needed to reach a positive or neutral emotional state, the session could be positively assessed if the user’s final emotion is positive (such as joy or surprise) or neutral, or if the time needed to reach a positive emotional response is considerably short.
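The two indicators above (final emotion and time needed to reach a positive or neutral state) can be combined in a simple decision rule, sketched below. The text does not quantify "considerably short", so the 60-second limit is purely an assumption, as are the function names and the negative emotion labels used in the example.

```python
POSITIVE = {"joy", "surprise"}   # positive emotions named in the text
NEUTRAL = {"neutral"}
ACCEPTABLE = POSITIVE | NEUTRAL


def time_to_first_acceptable(samples, interval_s=30):
    """Seconds until the first positive or neutral sample, or None."""
    for i, emotion in enumerate(samples):
        if emotion in ACCEPTABLE:
            return i * interval_s
    return None


def session_positively_assessed(samples, interval_s=30, max_wait_s=60):
    """Apply the two-indicator rule to one session.

    `max_wait_s` is an assumed cut-off for a "considerably short" wait;
    the source gives no numeric value.
    """
    if not samples:
        raise ValueError("at least one emotion sample is required")
    final_ok = samples[-1] in ACCEPTABLE
    wait = time_to_first_acceptable(samples, interval_s)
    quick = wait is not None and wait <= max_wait_s
    return final_ok or quick


# Hypothetical time-sampled emotion codes for one participant.
example = ["fear", "neutral", "joy"]
result = session_positively_assessed(example)
```

Because the rule is a disjunction, a session that ends on a negative emotion can still be assessed positively when a positive or neutral state was reached quickly. A stricter reading that requires both conditions would replace `or` with `and` in the return statement.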
Secondly, as mentioned before, social robots should generate trust, and the user should feel safe and comfortable interacting with the robot. Moving-away behaviour may occur at first, since robots can elicit fear in users who do not know much about the platform. However, it is unacceptable for these behaviours to last the whole session or the vast majority of the time. Results in [51] show that conforming to human social distances [23] could be considered an indicator of perceiving the robot as a social actor. Thus, sharing personal-intimate space can be considered an indicator of psychological closeness among these actors, generating an enjoyable climate. This becomes crucial when social robots are intended to play a companionship role with, for instance, elderly or hospitalised people.
Thirdly, the behavioural units considered for coding and analysing interactive behaviour (gaze, facial expression, body gestures, interaction with the robot and interpersonal relationships) are a first attempt to create a set of categories for studying human-robot interactive behaviour. The present proposal has both a theoretical and a practical basis, that is, it has been built considering the available frameworks in HRI and Psychology while also taking into account the narrative analyses of videotaped sessions. Future studies should consider modifying the present set of categories in order to capture other aspects not included in the current proposal.
Finally, the rationale behind the assessment of interaction between social robots and humans deserves in-depth consideration. The available theoretical frameworks in the research areas of HRI and Psychology assume that HRI should be highly similar to human-human interaction. This implies taking a human-centred perspective to evaluate the interaction process. In this case, there are expectations for a given type of interaction. When this specific kind of interaction is not found, the end users’ expectations about performance may easily fail to be met. In fact, [9] has previously pointed out that when robots look like humans this may elicit strong expectations about the social and cognitive competence of robots. If such expectations are not met, the user is likely to experience confusion, frustration and disappointment.
Some questions should be considered by both engineers and developers of the evaluation methods. For instance, should the interaction between children and robots be similar to the one developed with a family member or another child? Should implementing major theory-of-mind-like abilities in robots be a priority for software developers? Why should HRI be highly similar to human-human interaction (HHI)? A combination of characteristics that are necessary for building a sociable robot is proposed in [6]: being there (embodied and socially situated), life-like quality, human-aware (human social perception, that is, identification of a person, recognition of a task and emotive expression), being understood (human to robot and robot to human, but robot to robot too), and socially situated learning (to learn throughout their lifetime). However, these components can be implemented in an innovative way, configuring an original interaction style. A comparable case is adult-child interaction, where a specific kind of interaction occurs in a context of differential (cognitive) abilities and has its own patterns (such as baby talk, consisting of a high-pitched voice, short sentences, exaggerated intonation, etc.). This specific interaction style is far from the (normal) adult interaction style but is the most frequent way to interact with children. Hence, it is possible that HRI could be slightly different from the prototypical human-human interaction situations.
Similarly, humans engage with and become attached to inanimate objects (such as personal belongings) or animals (such as pets) that do not have human interactive abilities. These are some of the deep questions that should be addressed in future studies, and they may guide the rationale for implementing behaviour patterns or building instruments for coding and quantifying interaction in human-robot settings. Consequently, as pointed out in [48], there are some ethical aspects that must be considered when designing behaviour patterns if they could affect attachment ties. Analysing interaction between humans and social robots will help in understanding the emergence and maintenance of social bonds among human and robotic actors, avoiding the undesirable effects of discouragement, boredom, or loss of interest in the social relationship.
In summary, the present paper reviews the theoretical contributions in the HRI and Psychology fields in order to propose a methodology for evaluating the quality of social robots’ commitment, specifically, the scenario, the features, and the interactive behaviour between social robots and humans. For this purpose, a general taxonomy based on three tables has been proposed to combine the theoretical contributions of different frameworks and videotaped experiences in two different contexts. Further applications of these instruments will improve the proposed method and allow the description and understanding of social dynamics between humans and robots, with proper analysis allowing theoretical development in a new area that might be called human-machine psychology [48].
