Abstract
Ubiquitous robots are robots integrated with smart environments, where they cooperate with networks of heterogeneous sensors to achieve complex tasks. Their successful application opens important research questions for both their engineering and their interaction with human users, especially in public space scenarios, where they need to interact in a socially acceptable manner with multiple and previously unknown people. This article presents a testbed, named PRIveT, which is purposefully designed to support effective human-robot interaction (HRI) in public spaces. The PRIveT testbed consists of a ubiquitous robotic system that is able to autonomously engage with and adapt to its users. This article describes the design rationale of the PRIveT testbed and its technical features, and presents four research studies that have used the testbed extensively. Reflections on the testbed features that were important in these studies and lessons learnt are also discussed. An assessment of the PRIveT testbed in terms of utility of the design for human-robot interaction experiments concludes the paper.
Introduction
Recent progress in all areas of service robotics, including work that seeks to integrate speech, sensing, acting, and networking, has resulted in increasingly versatile and reliable service robots. One of the most promising application domains for such robots is their deployment in public spaces, where they act, for example, as reception and information desk attendants, museum and city guides, servants in bars and restaurants, healthcare, rehabilitation and therapy assistants in hospitals, and whimsical educators and learning companions in educational institutions.
In public space scenarios, robots must deal with situations that require them to engage humans in a socially appropriate manner. As in human-human communication, a robot that does not adjust its communication style to the interlocutor or situation at hand can cause confusion and misunderstanding, as well as annoyance, displeasure, dissatisfaction with the service and, ultimately, disengagement. Adaptive human behavior reflects an individual’s social and practical competence of daily skills to meet the demands of the society. Thus, a socially competent robot needs to behave according to the contemporary conventional norms normally accepted within the society, social class, or user group [17].
However, robots in public spaces usually do not know the users they are interacting with. In addition, those interactions are typically short and dynamic and they are not limited to two parties. Rather, in many instances, those robots have to deal with multiple people simultaneously participating in the interaction (multi-party interaction), and often with changing numbers of participants [21].
For these reasons, research on human-robot interaction (HRI) in public spaces is often fundamentally different from traditional work in social robotics and HRI, which tends to focus on long-term robot companions that interact with humans one-to-one, such as household servants and caregivers. People tend to take cognizance of the age and gender of the interlocutor in choosing their wording, such as forms of address and pronouns. They also tailor the content they communicate together with their behaviors, in terms of non-verbal social cues such as body language, gestures, and gaze. Such feats are more difficult to replicate in robots that have to engage many different and usually previously unknown users.
In order to facilitate this type of HRI research, this paper presents a robotic testbed, named PRIveT, based on a ubiquitous robotic design: its system knowledge and cognition are not confined to the individual robot. Rather, ubiquitous robotics augments the capabilities of robots by leveraging ubiquitous computational and/or sensorial resources. Augmentation complements and/or enhances the capabilities of one or more robots while such robots can simultaneously serve as intermediaries or social interfaces to ubiquitous services.
The testbed described herein has been used in a number of interactive demonstrations and HRI field studies, particularly to study issues of adaptability for robots employed in public spaces. Specifically, the PRIveT testbed is able to adapt to its users by characterizing them using some of the same visual cues utilized by people in social interaction. In the work described in this paper, gender and age estimations serve as the basis for designing the robot’s social adaptation to suit the preferences of people.
The following points summarize the contributions:
Provision of a portable, autonomous, adaptive ubiquitous robotic (PRIveT) testbed for HRI in public spaces. The system is able to autonomously initiate the interaction, present information, and demonstrate itself to human users. It is designed to be easily transportable and re-configurable in order to adapt to different settings and requirements.
Demonstration of how the PRIveT system is used in a number of interactive demonstrations and HRI field studies, which are multi-party public events. While the details of these studies have been documented in previously published work, here the focus is on a comprehensive analysis of the testbed and on the assessment of how it was used in a variety of situations of increasing difficulty.
Source code of the PRIveT testbed release: goo.gl/efD3Sa.
This research has been broken down into a set of research questions. The following list enumerates the central research questions that have underpinned the present work and that are addressed in this paper.
How, and what, should a dynamically adaptive system perceive in its users? What methodology produces the best results? (Section 3)
What are the main challenges that have to be tackled when designing an effective HRI testbed for public spaces? (Sections 3 and 4)
To what extent do people accept, and how do they evaluate, adaptive robot behavior generated with the implemented framework? More specifically, how does the adapted robot’s verbal content impact and shape the interaction experience and assessment of the robotic system? (Section 5)
The remainder of this paper is organized as follows: Section 2 reviews related work on robots/testbeds designed for field studies in public environments. Sections 3 and 4 provide a comprehensive introduction to the design of the PRIveT testbed and each component. Thereafter, Section 5 details a number of studies in order to demonstrate and discuss PRIveT’s performance in a wide range of settings. The main experiment and its results are discussed in Section 6, and Section 7 concludes this paper.
Related work
Research on robots that are specifically designed for HRI has focused on a wide range of public environments. However, each public space sets different HRI requirements and challenges. This section reviews the related work in the domain of public space HRI and examines each work’s requirements, challenges and contributions.
The “network robot system” framework [12] is a result of several years of research in the domain of public environments and has been utilized in a number of field studies in train stations [31], science museums [30], and shopping malls [14] in Japan. Four mobile robots (two Robovie humanoid robots and two cart robots) and sensors embedded in the environment are integrated to provide robot services, such as guiding and carrying shopping bags, to people in social contexts. The framework features include recognition and anticipation of people’s behavior, identification of individuals, coordination of services and navigation paths between robots, and support for human supervision. The framework performed successfully in each experiment, and participants responded positively, indicating that they would like to use these services in the future. Similar challenges presented to mobile robots in populated environments have been addressed within research efforts in city guide robots, such as FROG, the Fun Robotic Outdoor Guide [10], and Autonomous City Explorer (ACE), as well as the airport guide robot SPENCER [33], a fully autonomous mobile robot for passenger flow management.
Robots deployed at information or reception desks are often stationary and avoid any need to deal with navigation and socially-aware mapping. However, such robots need to behave according to the accepted social norms. The Hala robot is utilized as a receptionist. It consists of a human-like stationary torso with an LCD mounted on a pan-tilt unit. The LCD screen serves as the robot’s head and allows the rendering of character faces, appearance cues, and verbal and non-verbal behaviors of ethnicity, which were controlled for ethnic similarity. The experiment was conducted with 30 participants: adult native speakers of Arabic (fluent in English) and native speakers of American English. The results show that the Hala robot, despite its relatively low human likeness, could evoke associations between the robot’s verbal and non-verbal behaviors and its attributed ethnicity. However, the results of this experiment did not find evidence of ethnic homophily [19], i.e. association and bonding with similar others. Another study by Salem [25] exploring culture-specific variations of HRI between Arabic and English native speakers highlights the importance of addressing and exploiting cultural differences when designing multilingual and cross-cultural service robots.
Similar challenges to this paper’s work have been addressed by the robot bartender JAMES [11], which is designed to work in dynamic, multi-party social situations. The robot system incorporates state-of-the-art components for computer vision, linguistic processing, state management, high-level reasoning, and robot control. JAMES’s hardware consists of two manipulator arms with humanoid hands mounted in a position to resemble human arms, along with a Microsoft Kinect and an iCat robot used as an animatronic talking head. During the study conducted in laboratory settings, the authors report that the system performed “generally successfully” with 31 university participants [11]. In contrast, this paper’s work was evaluated in a number of field studies in real-world environments.
A number of testbeds for child-robot interaction research have been utilized in educational institutions and hospitals to investigate how long-term human-robot interaction can be used to provide support and education. During the European project LIREC [18] and the US-based research with the DragonBot robot [32], a number of studies were conducted in schools to investigate the long-term effect of interactions with social robots. During these studies, the researchers focused on the challenges of maintaining children’s interest in social robots and on how to sustain the robot’s social presence, children’s engagement and self-validation, with the aim of improving the child’s learning during long-term interactions in school settings [18,32]. Additionally, the research efforts in the development of HRI testbeds have produced impactful outcomes in a number of HRI studies for the therapy of children with autism spectrum disorders. Notable examples are the minimally expressive robot KASPAR [7] and the huggable robot Probo [35] as a tele-interface for entertainment, communication and assistance. ALIZ-E [2] used a humanoid NAO robot to help children with diabetes, by offering training and entertainment in real hospital settings. Similar to this paper’s work and its emphasis on multi-sensory data fusion, the DREAM project [8] aims to convert the therapy room into a smart space environment in order to address the challenges of data analysis, modelling, and interpretation for diagnostic support of children affected by autism spectrum disorders. In contrast to these systems developed for public environments, this work aims to support rather short-term dynamically adaptive interactions.
A number of works have focused on public exhibitions. For example, [16] describe a gesture-centric android system that is able to adjust gestures and facial expressions based on a speaker’s location or situation for multi-party communication. The speaker location is identified by face recognition and microphone position. The experiment was conducted with 1662 subjects interacting with the Actroid-SIT android in a shopping mall in Japan. Another field experiment was conducted during a 6-day Fleet Week in New York with 202 subjects [20]. Groups of three people first trained the Octavia robot to memorize their soft biometrics information (complexion, height and clothes); then, as the people tried to trick Octavia by changing their location, she could successfully identify people with 90% accuracy. Both systems successfully address multi-party HRI relying on multi-modal recognition of people: sound and vision. However, this paper’s testbed is able to perceive visitors’ features to estimate age and gender and then adapt its interaction according to the particular individuals.
Compared to existing systems of robots in intelligent environments and robotic testbeds for public environments, the PRIveT testbed is designed to leverage the heterogeneous perception abilities available in such robotic systems, and also to be easily tailored to different requirements and settings. In addition, it has been equipped with a methodology for age and gender estimation to facilitate dynamic adaptation of its interaction style. The PRIveT testbed with its concept and design has been effective in the conduct of HRI studies with a large number of participants in a relatively short period of time.
System design
This section details both the hardware and the software components of the PRIveT testbed.
Middleware
The PRIveT testbed is built out of highly heterogeneous software and hardware components, including multiple robots and wireless sensor nodes. In order to support zero network configuration and interoperability among all these components, the system utilizes the PEIS kernel [5], a software suite previously developed as part of the Ecologies of Physically Embedded Intelligent Systems project [24]. PEIS includes a decentralized mechanism for collaboration between separate processes running on separate devices. It offers a shared tuplespace blackboard that allows for automatic discovery of new components/devices and for their high-level collaboration over subscription-based connections. Specifically, PEIS components can indirectly communicate through the exchange and storage of tuples, i.e. key-value pairs that associate any piece of data (and related meta-information, such as timestamps and MIME types) with a logical key. The PEIS kernel is written in pure C (with bindings for Java and other languages) and with as few library and RAM/processing dependencies as possible to maximize compatibility with heterogeneous devices. The current ubiquitous robotic system is an extension of the previous work detailed in [26,28]. The middleware permits testbed scalability since any component can be easily added or removed to suit particular study settings or application purposes.
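The tuple-based coordination described above can be illustrated with a minimal, in-process sketch. It is not the PEIS kernel API: the class `TupleSpace`, its methods `subscribe`, `set_tuple` and `get_tuple`, and the key `kitchen.drawer1.light` are hypothetical names chosen for illustration, and a real deployment is distributed across devices rather than held in a single dictionary.

```python
import time
from collections import defaultdict
from typing import Any, Callable, Dict, List, Optional, Tuple

# Hypothetical, in-process stand-in for a shared tuplespace.
# This is NOT the PEIS kernel API; all names are illustrative.
Callback = Callable[[str, Any, dict], None]


class TupleSpace:
    def __init__(self) -> None:
        # key -> (value, meta-information)
        self._tuples: Dict[str, Tuple[Any, dict]] = {}
        self._subscribers: Dict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, key: str, callback: Callback) -> None:
        """Register a callback invoked on every later update of `key`."""
        self._subscribers[key].append(callback)

    def set_tuple(self, key: str, value: Any, mime: str = "text/plain") -> None:
        """Store a value with its meta-information and notify subscribers."""
        meta = {"timestamp": time.time(), "mime": mime}
        self._tuples[key] = (value, meta)
        for callback in self._subscribers[key]:
            callback(key, value, meta)

    def get_tuple(self, key: str) -> Optional[Any]:
        """Return the latest value stored under `key`, or None if absent."""
        entry = self._tuples.get(key)
        return entry[0] if entry is not None else None


# A consumer subscribes to a sensor key; a producer later posts a reading.
space = TupleSpace()
received = []
space.subscribe("kitchen.drawer1.light", lambda k, v, m: received.append((k, v)))
space.set_tuple("kitchen.drawer1.light", 412)
```

In this sketch the producer and the consumer never reference each other directly; they only agree on a key, which is the property that lets components be added or removed without reconfiguring the rest of the system.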

Testbed Hardware Components.
Figure 1 illustrates all the devices currently integrated within the PRIveT testbed, namely:
a toy mini-kitchen, equipped with an IEEE 802.15.4 [13] compliant Wireless Sensor Network (WSN);
a Mini-PC (a Raspberry Pi [22]), equipped with a monitor and a wireless network card;
a web camera;
a humanoid NAO robot [1].
In the following sections each element is discussed together with the software components subtending their operations and integration.
Sensorized mini-kitchen
The mini-kitchen is a toy kitchen with an oven, a cupboard, a microwave, a sink and a cooker with simulated hobs, all in
In order to gather knowledge on the status of the mini-kitchen, the kitchen is augmented with a number of WSN motes, i.e. devices equipped with a micro-controller, a radio, and a sensor board. Specifically, IEEE 802.15.4 compliant motes are employed based on the original open-source “TelosB” platform [6]. Each mote can be equipped with a range of sensor boards. Some boards have sensors which yield real-time information on the air temperature and humidity, as well as lighting conditions. Other boards are interfaced with magnetic switches, microphones and passive infrared sensors, which are used, respectively, to sense the opening/closing of drawers, the level of noise and the presence of humans moving in their proximity. Each mote runs a copy of a WSN communication software [3], based on the nesC/TinyOS 2.x platform, which can be exploited to (i) discover the capabilities of the sensor nodes, (ii) activate and configure the sampling of sensor data, and (iii) transmit it to a sink node.
Mini-PC & sensor hub
The mini-kitchen is equipped with a Raspberry Pi, a single-board computer running an optimized version of the Debian OS.
The Mini-PC is USB-connected to the sink node, and runs a Java-based server-side version of the WSN communication software [3], which is used to access the data read by the sensors installed on the mini-kitchen, and to publish it to the PEIS tuplespace. The software publishes an index of sensors available in the WSN. In addition, it handles subscriptions to sensor data, which are fulfilled by (i) activating and configuring (through messages routed by the sink node) the sampling of sensor data, (ii) parsing the resulting data updates once they are received by the WSN sink node, and (iii) posting them as tuples in the PEIS tuplespace.
Monitor
The Mini-PC is furnished with a monitor to show presentation slides and other multimedia content, such as still pictures, HTML pages and videos. A simple Web application was developed, consisting of a web page with JavaScript functionalities and a Java applet. The Java applet uses a Java PEIS client to create a tuple, with the key “URL”, whose value can be set by any component that wants to broadcast specific content on the monitor. Rather than actual media content, the value of the tuple represents the URL where the content is published, including locations in the local file system of the Mini-PC, or any resource available over the Internet. The Java applet subscribes to the URL tuple, and every new URL request is relayed (using LiveConnect technology) to a JavaScript function that takes care of retrieving the content and displaying it on the web page.
Turtlebot
The Turtlebot is a personal robot kit with open-source software based on ROS. For the purpose of the demonstration, the Turtlebot is controlled by a netbook running ROS and the OpenNI body tracking framework. This is used to process the 3D data gathered from the Microsoft Kinect sensor mounted on the robot, in order to detect the presence of humans in front of the demonstration stand and recognize their body posture. The C PEIS client is employed to notify other components about the presence of users in front of the stand, supplemented with an estimation of the orientation of their gaze. Furthermore, the Turtlebot includes a number of pre-programmed behaviors and speech-synthesis functionalities, which can be used, respectively, to turn the robot to face the users, and to give them some basic information about the robot and its role in the intelligent environment.
Microsoft Kinect
When the Turtlebot is not present in a specific setting, a standalone motion capture device, such as Microsoft Kinect, is utilized for human presence detection, human tracking and for retrieving 3D body metrics that are particularly indicative of various demographic groups, i.e. age and gender. One of the core capabilities of the Kinect is the ability to capture a depth image. The Kinect for Windows SDK 1.8 includes a number of useful functionalities, which can be used to sense human users including skeletal and facial tracking, and voice and gesture recognition. The current version of the skeletal tracking function allows it to recognize people and follow their actions. Using the infrared (IR) camera, the Kinect can detect up to six people in the field of view of the sensor. Of these, up to two people can be tracked in detail. The Kinect application can locate the joints of the tracked users in space and track their movements over time.
Humanoid NAO robot
The NAO is a programmable, 58 cm tall humanoid robot which acts as the main user interface of the ubiquitous robotic system. It is equipped with a vast array of sensors, including cameras, microphones, sonars and tactile sensors, and with 25 degrees of freedom. In order to interface the NAO with the system, an interface combining the NAOqi C++ SDK with the C PEIS client was developed. The interface allows us to (i) publish information about objects and the user’s speech recognized by the NAO, and (ii) activate, configure and deactivate the NAO’s behaviours by simply setting the value of control tuples. To this end, the interface subscribes to changes to those tuples and forwards their value to the proxy classes, such as ALMotion and ALTextToSpeech, that are included in the NAOqi SDK to provide direct access to the behavioural capabilities of the NAO.
Testbed engine
The last component of the system is a testbed engine that is able to execute a simple finite state machine (Fig. 2) where each state represents a possible situation or phase of the self-demonstration. Sensor and/or event updates originating from the other components in the system are processed to build a picture of the situation and of the user, and are matched against a set of conditions describing the possible phase transitions in the presentation. Each time an event triggers a state transition and a new state is reached, the testbed engine sets the PEIS tuples required to control (i) the content that must be displayed on the monitor, (ii) the specific behaviors that must be activated in the robot(s), and (iii) the message that must be uttered to the users in front of the stand.
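The event-driven state machine described above might be sketched as follows. The state names, event names, tuple keys and file names are all hypothetical placeholders (the actual states are those of Fig. 2, which are not reproduced here), and a plain dictionary stands in for the PEIS tuplespace.

```python
from typing import Callable, Dict, Tuple

# Hypothetical states and events; the real ones are those shown in Fig. 2.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("IDLE", "user_detected"): "GREETING",
    ("GREETING", "greeting_done"): "PRESENTING",
    ("GREETING", "user_left"): "FAREWELL",
    ("PRESENTING", "user_left"): "FAREWELL",
    ("FAREWELL", "farewell_done"): "IDLE",
}

# Control tuples set on entering each state (keys and values are assumptions):
# monitor content, robot behavior, and the message to utter.
STATE_ACTIONS: Dict[str, Dict[str, str]] = {
    "IDLE": {"monitor.url": "video.html", "robot.behavior": "rest", "robot.say": ""},
    "GREETING": {"monitor.url": "welcome.html", "robot.behavior": "wave", "robot.say": "Hello!"},
    "PRESENTING": {"monitor.url": "slides.html", "robot.behavior": "point", "robot.say": "This is the kitchen."},
    "FAREWELL": {"monitor.url": "video.html", "robot.behavior": "wave", "robot.say": "Goodbye!"},
}


class DemoEngine:
    def __init__(self, publish: Callable[[str, str], None], start: str = "IDLE") -> None:
        self.state = start
        self._publish = publish  # e.g. a function that sets a control tuple

    def handle_event(self, event: str) -> bool:
        """Apply `event`; return True if it triggered a state transition."""
        next_state = TRANSITIONS.get((self.state, event))
        if next_state is None:
            return False  # event is not relevant in the current state
        self.state = next_state
        for key, value in STATE_ACTIONS[next_state].items():
            self._publish(key, value)
        return True


# A dictionary plays the role of the shared tuplespace in this sketch.
control_tuples: Dict[str, str] = {}
engine = DemoEngine(control_tuples.__setitem__)
engine.handle_event("user_detected")
```

Keeping the transition table and the per-state actions as data, rather than as branching code, is one way to let a presentation be reconfigured for a new venue without touching the engine itself.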
The testbed engine builds its confidence of certain events by integrating multiple sensor readings over time. For example, in order to detect the opening and closing of a drawer of the mini-kitchen, the engine monitors the readings of both a magnetic switch and a light sensor installed on the drawer.
The magnetic switch can only tell the engine when the drawer is properly closed but not when it is being handled by the user or when the user only partially opens/closes it. For this reason, the engine uses the light sensors installed inside each drawer and performs a simple online clustering algorithm to distinguish between situations when the light sensors perceive low levels of light (corresponding to the closed/semi-closed drawer) and the situations when they perceive high levels of light (corresponding to the open/semi-open drawer). In this manner, there is no need to specifically define high and low level thresholds for the level of light, but the system automatically adapts to the actual lighting conditions it finds when it is used.
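The paper does not specify which online clustering algorithm is used. As one plausible illustration, the sketch below applies sequential (online) k-means with two clusters to a one-dimensional stream of light readings; the class name `OnlineTwoMeans`, the labels, and the sample readings are assumptions made for this example.

```python
from typing import List


class OnlineTwoMeans:
    """Sequential k-means with k=2 on scalar light readings.

    Assumption: the darker cluster corresponds to a closed/semi-closed
    drawer and the brighter cluster to an open/semi-open drawer.
    """

    def __init__(self) -> None:
        self.centroids: List[float] = []
        self.counts: List[int] = []

    def _nearest(self, reading: float) -> int:
        return min(range(2), key=lambda i: abs(reading - self.centroids[i]))

    def label(self, reading: float) -> str:
        brighter = max(range(2), key=lambda i: self.centroids[i])
        return "open" if self._nearest(reading) == brighter else "closed"

    def update(self, reading: float) -> str:
        reading = float(reading)
        if len(self.centroids) < 2:
            # Bootstrap: the first two distinct readings seed the centroids.
            if reading not in self.centroids:
                self.centroids.append(reading)
                self.counts.append(1)
            if len(self.centroids) < 2:
                return "unknown"
        else:
            # Move the nearest centroid towards the reading (running mean).
            idx = self._nearest(reading)
            self.counts[idx] += 1
            self.centroids[idx] += (reading - self.centroids[idx]) / self.counts[idx]
        return self.label(reading)


# Illustrative readings in arbitrary sensor units (not measured data).
model = OnlineTwoMeans()
labels = [model.update(r) for r in [10, 400, 12, 390, 11, 405]]
```

No fixed threshold appears anywhere in the sketch: the boundary between "closed" and "open" is implied by the two centroids and shifts with the lighting conditions actually observed. A known limitation of this simple seeding is that the first few labels can be wrong if the initial readings all come from the same drawer state.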

Testbed Engine as Finite State Machine.
The testbed engine integrates multi-sensory, multi-modal data from human body and face detection, recognition and tracking, body metrics perception, and sound source localization; this integration constitutes PRIveT’s ubiquitous perception.

Ubiquitous Perception.
The previous section focused on the substantial technical challenges of enabling sensors and robots to seamlessly interact and operate as an intelligent environment. This section focuses on one of the main advantages of adopting a ubiquitous robotics stance: multi-sensory, multi-modal perception.
Figure 3 illustrates how different perception modalities are fused together and used in the PRIveT testbed. Ubiquitous perception here consists of readings from the wireless light and motion sensor nodes deployed in the mini-kitchen, and of human perception information such as face, 3D body and speech from networked cameras and robots. This information is used to trigger perception events such as human presence, light condition, activity recognition, emotion analysis, speech recognition, age/gender estimation, social cues and speech detection, which collaboratively provide information to the testbed engine to generate an adaptive HRI experience.
Smart sensing
The readings from the light sensors are used to compute the trustworthiness of the vision-based sensing. This means that in a dark environment, the system does not rely on the vision modalities. Then, the readings from the motion sensors are used to deduce human presence when the vision sensing is not taken into consideration by the ubiquitous perception module.
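A minimal sketch of this fallback logic is given below. The linear mapping from light level to trust, the two light levels, and the trust cut-off are assumptions introduced for illustration; the paper states only that vision is not relied upon in dark conditions and that motion sensors are used instead.

```python
def vision_trust(light_level: float,
                 dark_level: float = 50.0,
                 bright_level: float = 300.0) -> float:
    """Map a light reading to a trust value in [0, 1].

    Assumption: trust grows linearly between two hypothetical light levels
    (arbitrary sensor units); below `dark_level` vision is not trusted at all.
    """
    if bright_level <= dark_level:
        raise ValueError("bright_level must exceed dark_level")
    trust = (light_level - dark_level) / (bright_level - dark_level)
    return max(0.0, min(1.0, trust))


def human_present(light_level: float,
                  face_detected: bool,
                  motion_detected: bool,
                  min_trust: float = 0.5) -> bool:
    """Decide human presence from vision when trusted, else from motion."""
    if vision_trust(light_level) >= min_trust:
        return face_detected
    # Vision is disregarded in poor light; fall back to the motion sensors.
    return motion_detected
```

For example, `human_present(10, face_detected=False, motion_detected=True)` returns `True` because the low light reading causes the vision result to be ignored in favor of the motion sensors.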
3D body metrics
Previous work [27] presents a method to estimate the age and gender of the person in the field of view of the camera based on 3D body metrics. Useful body dimensions are extracted from the depth data to identify relevant 3D body metrics. The 3D coordinates of 20 skeleton joints are used to compute indicative body metrics such as height, shoulder breadth, hip breadth, and head length. The proportion ratios are then calculated and used for training on the depth dataset of 428 volunteers. During a study conducted at a large public exhibition, the system determined gender correctly in 73% of cases on average and, when estimating the age of children, achieved a mean absolute error of 0.94 years with a standard deviation of 1.27 years. This method outperforms the state-of-the-art face-based solution, namely the Sophisticated High-speed Object Recognition Engine (SHORE) [9]. SHORE was evaluated at the same public exhibition: gender was correctly estimated 58.49% of the time, and age estimation had a mean error of 3.31 years and a standard deviation of 6.2 years for children. The results presented in [27] confirm that the modality of 3D body metrics provides more satisfactory results in interaction with children.
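To make the notion of body metrics and proportion ratios concrete, the following sketch derives them from 3D joint positions. The joint names follow the Kinect for Windows SDK 1.x skeleton, but the way height and head length are approximated here (summing segment lengths, and using the head-to-shoulder-center distance) and the particular ratios are assumptions of this example; the exact definitions are those given in [27].

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in meters


def body_metrics(joints: Dict[str, Point]) -> Dict[str, float]:
    """Compute illustrative body metrics and ratios from skeleton joints."""
    d = math.dist
    # Assumption: head length approximated by head to shoulder-center distance.
    head = d(joints["Head"], joints["ShoulderCenter"])
    torso = (head
             + d(joints["ShoulderCenter"], joints["Spine"])
             + d(joints["Spine"], joints["HipCenter"]))
    # Assumption: leg length is the mean of the two hip-knee-ankle chains.
    legs = [
        d(joints["Hip" + s], joints["Knee" + s])
        + d(joints["Knee" + s], joints["Ankle" + s])
        for s in ("Left", "Right")
    ]
    height = torso + sum(legs) / 2.0
    shoulder = d(joints["ShoulderLeft"], joints["ShoulderRight"])
    hip = d(joints["HipLeft"], joints["HipRight"])
    return {
        "height": height,
        "shoulder_breadth": shoulder,
        "hip_breadth": hip,
        "head_length": head,
        "shoulder_to_height": shoulder / height,
        "hip_to_shoulder": hip / shoulder,
        "head_to_height": head / height,
    }


# Synthetic skeleton for illustration only (not data from the study).
skeleton: Dict[str, Point] = {
    "Head": (0.0, 1.6, 2.0), "ShoulderCenter": (0.0, 1.4, 2.0),
    "Spine": (0.0, 1.1, 2.0), "HipCenter": (0.0, 0.9, 2.0),
    "ShoulderLeft": (-0.2, 1.4, 2.0), "ShoulderRight": (0.2, 1.4, 2.0),
    "HipLeft": (-0.15, 0.9, 2.0), "HipRight": (0.15, 0.9, 2.0),
    "KneeLeft": (-0.15, 0.5, 2.0), "KneeRight": (0.15, 0.5, 2.0),
    "AnkleLeft": (-0.15, 0.1, 2.0), "AnkleRight": (0.15, 0.1, 2.0),
}
metrics = body_metrics(skeleton)
```

Ratios such as head length to height are independent of the person's distance from the sensor, which is one reason proportion-based features are attractive for this kind of estimation.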
Face detection
Three networked cameras are utilized to capture the extended field of view in front of the system. Consequently, when the user is not facing a particular camera or is outside its field of view, the face can still be captured by an alternative camera.
The first source of face capturing is a standard web camera, which primarily uses the SHORE software package for real-time emotion analysis. It is able to detect and track the position and orientation of faces of multiple users with low CPU consumption and is robust to illumination changes [9]. SHORE is able to estimate gender and age, but it mainly provides the intensity values of the following emotional states: happiness, sadness, surprise and anger. The intensity scores of the expressed emotions are recorded, and the final average score for each emotion is then calculated for each user. SHORE’s face-based age and gender estimates are insufficient in child-centered environments. Thus, the testbed uses only the emotion analysis provided by the SHORE software. Real-time emotion analysis is used by the adaptive interaction generator module for real-time adaptation and evaluation. During HRI experiments with the testbed, face-based emotion data is recorded and analyzed offline as a measure of the interaction experience with the testbed.
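The per-user averaging of emotion intensities can be sketched as follows. The input format (a stream of `(user_id, scores)` pairs) and the function name `average_emotions` are assumptions of this example and do not reflect the SHORE interface; the sample values are illustrative.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

EMOTIONS = ("happiness", "sadness", "surprise", "anger")


def average_emotions(
    samples: Iterable[Tuple[str, Dict[str, float]]]
) -> Dict[str, Dict[str, float]]:
    """Average the recorded intensity of each emotion for each user.

    `samples` is a hypothetical stream of (user_id, {emotion: intensity});
    an emotion missing from a sample is counted as intensity 0.
    """
    sums: Dict[str, Dict[str, float]] = defaultdict(
        lambda: dict.fromkeys(EMOTIONS, 0.0)
    )
    counts: Dict[str, int] = defaultdict(int)
    for user_id, scores in samples:
        counts[user_id] += 1
        for emotion in EMOTIONS:
            sums[user_id][emotion] += scores.get(emotion, 0.0)
    return {
        user: {e: sums[user][e] / counts[user] for e in EMOTIONS}
        for user in sums
    }


# Illustrative samples only (not recorded data).
log = [
    ("user1", {"happiness": 60.0, "sadness": 0.0, "surprise": 10.0, "anger": 0.0}),
    ("user1", {"happiness": 80.0, "sadness": 0.0, "surprise": 30.0, "anger": 0.0}),
    ("user2", {"happiness": 20.0, "anger": 40.0}),
]
averages = average_emotions(log)
```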
The second source of face capturing is that of the NAO’s vision module, ALFaceDetection, which the robot can use to detect human faces. This module also provides an estimate for the position of each face detected in the frame grabbed by the NAO’s camera, as well as a list of angular coordinates for a set of important face features. The human perception module uses the NAO’s vision only during interaction with the users, for example during non-verbal social behaviors such as eye contact, eye gaze, and others.
The third source of face capturing is that of the Microsoft Face Tracking SDK engine which analyzes input from a Kinect camera to detect and track human faces in real-time. This camera is used as an alternative source for face capturing when the web camera is not used.
Speech detection and recognition
Real-time streaming of audio from the Microsoft Kinect microphone array is used for speech detection purposes. The Kinect Speech API provides communication with the microphone array. Once the audio signal from the microphone is strong enough to be considered speech, a speech detection event is raised.
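The paper does not describe how the signal is judged "strong enough". One simple possibility, shown here purely as an assumption, is an energy-based detector that raises an event once the root-mean-square (RMS) amplitude stays above a threshold for several consecutive frames. The function names, threshold, and frame count are hypothetical, and the sketch does not use the Kinect Speech API.

```python
import math
from typing import Iterable, List, Sequence


def rms(frame: Sequence[float]) -> float:
    """Root-mean-square amplitude of one audio frame (samples in [-1, 1])."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def detect_speech(frames: Iterable[Sequence[float]],
                  threshold: float = 0.1,
                  min_frames: int = 3) -> List[int]:
    """Return the frame indices at which a speech detection event is raised.

    An event is raised once per loud run, at the frame where the run of
    consecutive frames with RMS >= threshold first reaches `min_frames`.
    """
    events: List[int] = []
    run = 0
    for index, frame in enumerate(frames):
        if rms(frame) >= threshold:
            run += 1
            if run == min_frames:
                events.append(index)
        else:
            run = 0  # silence resets the run
    return events


# Synthetic frames: constant-amplitude stand-ins for quiet and loud audio.
quiet = [0.01] * 160
loud = [0.5] * 160
stream = [quiet, loud, loud, quiet, loud, loud, loud, loud, quiet]
events = detect_speech(stream)
```

Requiring several consecutive loud frames is a common way to avoid raising events on short noise bursts, such as a drawer being closed.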
An alternative source of voice capturing is that of NAO’s microphones for speech recognition, ALSpeechRecognition, which is used during interactions with the NAO robot.
Studies and demonstration
The system described in the previous sections, complemented with content such as videos and presentation slides (converted to HTML format in order to be used by the presentation applet), was exercised as an HRI testbed in situations of increasing complexity. On each occasion, both the robot(s) and the mini-kitchen were placed on adjacent tables in order for them to easily engage with the public.
Laboratory study
Before deploying the testbed in a fully autonomous setting, a pilot study was conducted in the laboratory to configure the behavior and test the robustness of the system. The experiment was designed to observe whether the NAO’s verbal and non-verbal behaviors were able to intuitively engage the user in a human-robot collaboration task. A simple collaborative cooking scenario was created, in which the NAO acted as a cooking assistant to a number of people who volunteered to participate in the study.

Experiment Hardware Configuration.
Each participant was invited to the laboratory and instructed to go to the experimental area (Fig. 4). The system could autonomously detect the human presence and launch the interaction session. The NAO welcomed the user and explained what food was to be prepared. All participants were separated into two user groups: (i) a “Verbal Only” user group, with which the NAO used only verbal communication, and (ii) a “Social Cues” user group, with which the NAO used gaze and gestures to emphasize its instructions and facilitate the participant’s understanding of them (for instance, by pointing to the drawer containing the salt). Each trial lasted an average of five minutes and was fully autonomous and timed. The participants were asked to complete the Godspeed [4] questionnaire (with items measuring likeability, intelligence and safety) immediately after the experiment. As expected, the second condition increased both the level of user engagement and the users’ perceived intelligence of the robotic system.
Sixteen people (age:

Execution time values in seconds for 16 participants.
Figure 5 illustrates the significant results. A series of one-way ANOVA tests was conducted to compare people’s interaction with the testbed and their evaluations of likeability, intelligence and safety in “Verbal Only” and “Social Cues” conditions. There was a significant difference in the execution time in seconds for “Verbal Only” (
A series of Fisher’s exact tests was conducted to compare responses on participants’ favorite system component and on whether they had the impression that the robotic system acknowledged their presence. There were no significant differences for either of these questions. However, the fact that 62.5% of participants reported that their favorite system component was the “Smart Kitchen” is an interesting and desirable finding, since it was important to facilitate a natural, engaging and enjoyable interaction with the whole intelligent environment rather than with a single robot.
This study’s goal was to address whether people would perceive non-verbal social cues within an intelligent environment as intuitive, natural and acceptable. Gaze and gestures of the humanoid robot significantly improved the time people spent completing the cooking task, which suggests that such behaviors are important for intuitive and natural interaction with the testbed. As expected, users’ perceived acceptance of the system was significantly improved with the NAO’s social behaviors. In addition, people indicated that they would use this kind of service at home significantly more when non-verbal social cues enhanced the interaction experience within the intelligent environment.
In summary, PRIveT behaved as specified by the testbed engine with no technical problems during any of the sessions. The NAO’s gestures and gaze behaviors were adopted in all the subsequent tests.
University exhibition for research projects
The first outing of the PRIveT testbed outside the laboratory was a half-day exhibition organized by the Department of Computer Science and Informatics at University College Dublin to showcase a number of research projects. In contrast to the more controlled laboratory settings described in the previous section, visitors and potential study participants on this occasion were wandering around an area cluttered with many posters and other demonstration stands. This opportunity was used to test the system’s ability to attract attention, maintain it autonomously, and engage with an uninterrupted flow of users over a prolonged period of time. This gave a realistic opportunity to evaluate whether the integrated verbal and non-verbal behaviors of the intelligent environment were successful.
Method
As in the first pilot study, the testbed was autonomous: it displayed a video on the screen in order to attract the attention of the visitors. As somebody approached the stand, the NAO greeted them and invited them to participate in an interactive demonstration. First, the NAO talked over a few presentation slides to illustrate some introductory concepts about intelligent environments. Second, it illustrated its own capabilities before presenting the other members of the system, explaining the purpose of the sensorized kitchen and inviting visitors to engage with the Turtlebot. At each stage, visual content related to the topic illustrated by the NAO was projected on the monitor. Next, the visitors were invited to interact with the mini-kitchen to open, close or use kitchen appliances. The testbed acknowledged each of these actions through the NAO and the monitor, in order to illustrate its ability to recognize users’ activities. If the visitors were detected leaving the stand, the NAO thanked them and waved goodbye while the monitor restarted the video. Finally, after their interaction with PRIveT, users were approached and asked to complete a questionnaire evaluating their experience.
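The demonstration flow described above can be read as a small state machine. The following is a minimal sketch of that reading, not the testbed engine's actual implementation; the stage names and event labels (`visitor_approached`, `visitor_left`, and so on) are hypothetical.

```python
from enum import Enum, auto


class Stage(Enum):
    ATTRACT = auto()   # monitor loops the attention-grabbing video
    GREET = auto()     # NAO greets the visitor and invites them to take part
    PRESENT = auto()   # NAO talks over slides and introduces the components
    INTERACT = auto()  # visitor operates the mini-kitchen appliances
    FAREWELL = auto()  # NAO thanks the visitor and waves goodbye


# (current stage, event) -> next stage. Event names are hypothetical labels.
TRANSITIONS = {
    (Stage.ATTRACT, "visitor_approached"): Stage.GREET,
    (Stage.GREET, "greeting_done"): Stage.PRESENT,
    (Stage.PRESENT, "presentation_done"): Stage.INTERACT,
    (Stage.GREET, "visitor_left"): Stage.FAREWELL,
    (Stage.PRESENT, "visitor_left"): Stage.FAREWELL,
    (Stage.INTERACT, "visitor_left"): Stage.FAREWELL,
    (Stage.FAREWELL, "farewell_done"): Stage.ATTRACT,
}


def next_stage(stage, event):
    """Return the stage that follows `event`; unknown events keep the stage."""
    return TRANSITIONS.get((stage, event), stage)


def run(events, start=Stage.ATTRACT):
    """Replay a list of events and return the sequence of visited stages."""
    visited = [start]
    for event in events:
        visited.append(next_stage(visited[-1], event))
    return visited
```

Events that have no entry in the table, such as an appliance door being opened during the interaction stage, leave the stage unchanged, so acknowledging them can be handled separately from the overall flow.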
Two interaction modalities were designed to examine differences in how participants evaluated and interacted with the testbed, namely:
“Asocial”: This interaction type was used to identify whether a humanoid robot embodiment was enough for humans to ascribe intelligence to the system. Within the system, the NAO is regarded as an embodied communication interface between the intelligent environment and the user.
“Social”: Non-verbal cues represent fundamental communication cues that influence the quality of an interaction [23]. This interaction type was used to determine whether the use of social cues, such as acknowledging human presence with gaze (eye contact) and (deictic) gestures in addition to the previous study’s social behaviors (gaze and pointing at the kitchen), improved the user’s engagement and acceptance of the intelligent system.
Participants
Thirty people (age:
Results

Mean values for “Useful”, “Intelligent” and “Engaging”.
A series of one-way ANOVA tests was conducted comparing people’s interaction with the testbed and their evaluations of likeability, intelligence and safety in “Asocial” and “Social” conditions. When asked to rate how interesting the interaction with the system was on a 5-point Likert scale, there was a significant difference in participants’ responses for “Asocial” (
A series of two-way ANOVA tests was conducted to examine the effect of gender and education level on people’s evaluation of the likeability, intelligence and safety of the testbed. There was a statistically significant interaction between the effects of gender and education level on how “Interesting” the system was rated on a 5-point Likert scale,
Figure 6 illustrates the results of the conducted independent-sample t-tests showing statistical differences in how women and men rated the system being “Useful” (female (
This pilot study aimed to evaluate participants’ acceptance and overall experience of the interaction with PRIveT at a public event. The method used provides realistic conditions for assessing the effectiveness of verbal and non-verbal cues within ubiquitous robotics. This pilot study indicates that some of the engagement techniques for standalone robots are transferable to ubiquitous robotic systems. These results, detailed in [29], contrasted with past experiences of using the NAO on its own, which had the undesired effect of concentrating too much of the users’ attention on the robot and away from the actual content presented. As in the first series of tests, there were no technical problems with the operation of the testbed in a public environment. Moreover, since the findings suggest that participants of different education and gender groups assessed the interaction experience differently, robots deployed in public environments need to support adapting their behaviors to suit these differences.
Autonomous exhibition
In order to further test the flexibility and the reliability of the PRIveT testbed, it was used in conjunction with a showcase of research projects for an entire week (Fig. 7). In order to meet this requirement, the middleware was used to prepare two configurations, respectively: (i) a configuration similar to the one used at the University Open Day (but with different content), which was supervised on the first day of the showcase, and (ii) a fully-autonomous but partially simulated setup. For the latter, the actual NAO was replaced with its simulated counterpart. The simulated NAO was visualized on the monitor connected to the Mini-PC. This meant that, while in the original setup the presentation engine was running inside the NAO’s computer, in the autonomous setup it was installed on the Mini-PC, together with the NAO simulation software. Furthermore, since the simulated NAO does not have speech synthesis capabilities, the visual content displayed on the screen was modified to include subtitles.
Preparing and switching between the two configurations was straightforward: because all components communicated through the PEIS tuplespace, it did not matter on which computer the presentation engine was running (the NAO or the Mini-PC) or which component was used to communicate with the user (the NAO’s speech interface or the text included in the visual presentation).
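The decoupling described here can be illustrated with a toy shared tuplespace. This is a sketch under stated assumptions: the `TupleSpace` class, the `presentation.utterance` key and the two output components are hypothetical stand-ins and do not reproduce the PEIS middleware API.

```python
from collections import defaultdict


class TupleSpace:
    """Tiny in-process stand-in for a shared tuplespace (not the PEIS API)."""

    def __init__(self):
        self._tuples = {}
        self._subscribers = defaultdict(list)

    def subscribe(self, key, callback):
        """Register a callback invoked whenever `key` is written."""
        self._subscribers[key].append(callback)

    def write(self, key, value):
        """Store the latest value for `key` and notify its subscribers."""
        self._tuples[key] = value
        for callback in self._subscribers[key]:
            callback(value)

    def read(self, key, default=None):
        """Return the latest value written for `key`, if any."""
        return self._tuples.get(key, default)


class SpeechOutput:
    """Stands in for the NAO's speech interface (supervised setup)."""

    def __init__(self, space):
        self.spoken = []
        space.subscribe("presentation.utterance", self.spoken.append)


class SubtitleOutput:
    """Stands in for subtitles on the monitor (simulated setup)."""

    def __init__(self, space):
        self.subtitles = []
        space.subscribe("presentation.utterance", self.subtitles.append)


def presentation_engine(space, lines):
    """Publish utterances without knowing which component renders them."""
    for line in lines:
        space.write("presentation.utterance", line)
```

In this sketch, switching from speech to subtitles means attaching a different subscriber; `presentation_engine` itself is unchanged, which mirrors the kind of loose coupling the text attributes to the tuplespace.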

Autonomous Exhibition Demonstration Stand.
The autonomous but partially simulated setup performed satisfactorily throughout the week. However, two problems had to be tackled. The first problem was due to a fault in the USB serial port of the Mini-PC. It was solved by adding an ad-hoc connection recovery step, which listed the available USB ports and programmatically re-initialized the connection to the sink mote upon detection of the first communication error. The second problem was that the Turtlebot’s ROS component in charge of the skeleton tracking functionality would regularly become unresponsive if it was left operating for more than a single day-long shift. For this reason, the staff working at the exhibition center were instructed on how to launch and shut down the Turtlebot and the testbed engine at the beginning of each day.
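A recovery step of the kind described for the USB fault might look like the sketch below. It is an illustration only, not the code used in the testbed: the function names are hypothetical, and the port lister and port opener are passed in as callables so that the logic does not depend on any particular serial library.

```python
def reconnect(list_ports, open_port):
    """Try each currently available port until one of them opens."""
    for port in list_ports():
        try:
            return open_port(port)
        except OSError:
            continue  # this port is unusable, try the next one
    raise ConnectionError("no usable USB serial port found")


def read_with_recovery(connection, list_ports, open_port, max_attempts=3):
    """Read one message from the sink mote, recovering from port faults.

    On a communication error the available ports are listed again and the
    connection is re-initialized. Returns (message, connection) so that the
    caller keeps using the fresh connection afterwards.
    """
    for _ in range(max_attempts):
        try:
            return connection.read(), connection
        except OSError:
            connection = reconnect(list_ports, open_port)
    raise ConnectionError("sink mote unreachable after recovery attempts")
```

With pyserial, `serial.tools.list_ports.comports()` and `serial.Serial` could plausibly fill the roles of `list_ports` and `open_port`, although the original system's serial stack is not specified in the text.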
Experiences with PRIveT are concluded with a study conducted to evaluate how effective it would be to adapt the testbed’s interaction style to suit each particular user. Specifically, the aim was to investigate the effects of the robot’s adapted speech content on levels of engagement and acceptance of the robot by children and adults.
Method
The study was conducted at the Discover Research Dublin exhibition with children and adults who interacted with the self-presenting testbed. The autonomous experiment was set up in an exhibition hall. Participants interacted with the robotic testbed for two to three minutes and were then prompted by the robot to answer a few questions.
The independent variable was the NAO’s speech content, which was either adaptive (child or adult content) or non-adaptive. Participants were randomly assigned to interact with the robot in one of the two conditions, with counterbalancing of the participant’s perceived age against the robot’s speech content. In the adaptive condition, the robot changed its speech content depending on whether the participant was perceived to be a child or an adult. Otherwise, the interactive demonstrations were the same.
Scenario
Each session is structured as an interactive game. Child content is presented below:
NAO: – Hi! (waving) I am a robot and my name is NAO. What is your name?
Child: (child says the name)
NAO: – It is very nice to meet you. It is my first time outside of the factory where I was created. My job is to help people be safe by reminding them to turn things off in the kitchen. Would you, please, help me practice for my new job? It’s very easy to help (pause). You can open a microwave, an oven or under the sink press one at a time. You are welcome to start (looking and pointing at the kitchen).
Child: (child opens either a microwave, an oven or a cupboard).
NAO: – The sensor inside the microwave/oven/cupboard (looking and pointing at a particular appliance) detected that you opened it. Could you, please, close it? If you want, try to open something else!
Child: (child plays with the kitchen for one minute)
NAO: – Thank you so much for your help! Hope to see you soon. Bye-bye!
A very similar presentation was delivered to the perceived adults, with a slight change in the verbal content:
NAO: – Hello! (waving) I am a humanoid robot and my name is NAO. What is your name?
Adult: (adult says the name)
NAO: – I am pleased to meet you. I would like to present a project that aims to combine various technologies to work together in order to achieve complex tasks, for example to help elderly with chores. I am one of the components of this project and I am aware of people’s activities in the smart environment. Would you like to try our demonstration? You can open a microwave, an oven or under the sink cupboard one at a time. You are welcome to start (looking and pointing at the kitchen).
Adult: (adult opens either a microwave, an oven or a cupboard).
NAO: – The sensor inside the microwave/oven/cupboard (looking and pointing at a particular appliance) detected that you opened it. Could you, please, close it? If you want, try to open something else!
Adult: (adult plays with the kitchen for one minute)
NAO: – Thank you so much! Hope to see you soon. Bye-bye!
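The two scripts above differ only in a few utterances, while the acknowledgement of each appliance event is shared. A minimal sketch of how such content could be organized follows; the function names and the dictionary layout are hypothetical, the strings are taken from the scripts above, and how content was assigned in the non-adaptive condition is not modeled here.

```python
# Opening lines from the two scripts, keyed by perceived age group.
GREETINGS = {
    "child": "Hi! I am a robot and my name is NAO. What is your name?",
    "adult": "Hello! I am a humanoid robot and my name is NAO. What is your name?",
}

# Shared acknowledgement, filled in with the appliance that was opened.
ACK_TEMPLATE = (
    "The sensor inside the {appliance} detected that you opened it. "
    "Could you, please, close it? If you want, try to open something else!"
)

APPLIANCES = ("microwave", "oven", "cupboard")


def greeting_for(perceived_group):
    """Return the opening line for a perceived 'child' or 'adult'."""
    if perceived_group not in GREETINGS:
        raise ValueError(f"unknown perceived group: {perceived_group!r}")
    return GREETINGS[perceived_group]


def acknowledge(appliance):
    """Return the NAO's acknowledgement for an opened appliance."""
    if appliance not in APPLIANCES:
        raise ValueError(f"unknown appliance: {appliance!r}")
    return ACK_TEMPLATE.format(appliance=appliance)
```

Keeping the group-specific lines in one table and the shared lines in a template makes it easier to check that the two presentations stay identical apart from the intended wording changes.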

Results of the final experiment. Participant’s behaviors (Vote, Stayed, Opened Doors) in non-adaptive and adaptive conditions. * indicates significance at 0.05 level. ** indicates significance at the 0.001 level.
During this scientific exhibition, people voted for their favorite demonstration by leaving the individual sticker from their exhibition badge on a particular table. This was the first measurement of the interaction: Vote (yes/no). The other measurements were whether people replied with their name when asked by the robot, Participant Name (yes/no), whether people stayed until the end of the demonstration, Stayed (yes/no), and whether people opened the doors of the toy kitchen, Doors (yes/no). People who stayed until the end of the demonstration were asked to fill in demographic information such as age and gender and to answer a few questions: whether they would like a robot to adapt to their preferences, Adaptation (Yes/No/Not sure), how interesting they thought the presentation was, Interesting (a 7-point Likert scale), and whether they remembered the name of the robot, RobotName (yes/no). These questions were only answered by the people who stayed until the end of the demonstration. Emotion analysis was recorded during the interactions with the testbed as an additional measurement of the interaction experience. The expressed valence, Happiness (numerical value), was considered as the last measurement.
Participants
A total of 131 participants interacted with the self-presenting testbed. Of these, 73 filled in the questionnaires; their ages ranged from 12 to 59 years. Other demographic information, such as age group (child vs. adult) and gender (male vs. female), was estimated by the experimenter. Table 1 presents the breakdown of all participants in each condition.
Breakdown of Participants by Age and Gender
In order to compare people’s behaviors between conditions a series of Chi-Square analyses was conducted on categorical data. There was a significant difference in the Vote measurement:
When examining whether people responded with their name when asked by the robot, there was no significant difference between conditions. This result is not surprising, since the question was asked at the very beginning of the self-presenting demonstration.
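For readers less familiar with the Chi-Square comparisons used for the yes/no measurements, the following sketch computes Pearson's statistic for a 2×2 table of counts (condition by outcome). It is illustrative only: the function name is hypothetical, no continuity correction is applied, and it is not claimed to be the exact procedure used for the reported analyses.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts.

    `table` is ((a, b), (c, d)), e.g. rows = conditions, columns = yes/no.
    Returns (chi2, p_value) with one degree of freedom and no continuity
    correction. All row and column totals must be non-zero.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # For one degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

A call such as `chi_square_2x2(((10, 20), (20, 10)))` uses placeholder counts that are not data from this study; a statistics package would normally be preferred for the analysis itself.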
There was also a significant difference between the adaptive and non-adaptive speech content conditions in the duration of people’s stay at the robotic demonstration (
The following results are from the self-reported responses that people gave in the questionnaires. Note that the respondents represented 55.7% of the total number of people who were at the testbed stand. The questions on Adaptation and on whether participants could correctly recall the robot’s name did not indicate statistical significance between the adaptive and non-adaptive conditions. Thus, the questionnaires were analyzed for potential significance between age or gender groups. Nonparametric Chi-Square analysis showed no significant difference between age or gender groups in people’s preference for an adaptive robot; however, people significantly preferred the robot to adapt to their preferences:
Additionally, a one-way ANOVA was conducted on the measurement of how interesting people found the demonstration. There was a statistically significant difference in how interesting children and adults perceived it to be:
Discussion
The final study at the public exhibition aimed to investigate whether adaptive verbal content delivered by the robot would affect the way children and adults interact and engage with an autonomous system. In order to supplement the use of questionnaires, alternative means of assessment were utilized and produced usable data showing significant differences in people’s interaction with the testbed between the adaptive and non-adaptive conditions. In contrast to questionnaires, which do not work particularly well for children, these alternative and reliable measurements are particularly important for evaluations of child-robot interactions.
Conclusion
This paper describes the design of PRIveT, a portable and self-presenting robotic testbed, and illustrates the experiences in using it in a number of project presentations and HRI studies. This HRI testbed is able to autonomously engage and dynamically adapt to its users, explain and demonstrate the main concepts behind robotic systems by showing people a working example consisting of robot(s), sensors and cameras working together.
The system architecture of PRIveT illustrated in Section 3 provided considerable flexibility in the design of each self-demonstration and user study, and in the deployment of the testbed in different situations. The use of a centralized testbed engine simplifies the creation of synchronized behavior among the various components in the system, while the peer-to-peer PEIS tuplespace allows components to engage in loosely coupled, indirect collaboration.
Over a number of experiments in public environments, the PRIveT testbed has performed well and has proven to be efficient for HRI field studies. The flexibility and scalability of its components allow it to be deployed in a wide range of settings and scenarios. In particular, self-presentation and self-demonstration at research exhibitions attract a lot of attention to the concept of ubiquitous and smart environments, and thus serve a two-fold purpose: introducing the intelligent environment concept to the public and investigating natural human interactions with an intelligent environment. In order to supplement or replace the use of questionnaires, quantitative measures were designed as alternative means of evaluation and have worked very well for the HRI experiments.
Acknowledgements
This work has been supported by the EU FP7 RUBICON project (contract n. 269914), in conjunction with funding from the Irish Research Council “Embark Initiative” and Science Foundation Ireland (SFI) under grant 07/CE/l1147. We would like to thank the anonymous reviewers for their thorough review and highly appreciate the comments and suggestions, which significantly contributed to improving the quality of this paper.
