Abstract
As the world’s population continues to grow in size and in age, it remains vital to develop streamlined and affordable technologies to address the needs of those who require companionship and care. The elderly and visually impaired remain a significant and ever-increasing segment of society, yet due to the cost of at-home health care, many of these individuals are unable to afford full-time caretakers to assist with everyday tasks. Although the luxury of at-home care remains elusive to many due to the associated cost, social robots are an important and meaningful way to overcome this issue. This research explores how combining the principles of biomimicry and social robotics with the Internet of Things (IoT) and artificial intelligence (AI) can enhance assistive technology. This paper proposes a novel assistive device called the Jet-I-U, which aims to support visually impaired and elderly individuals with day-to-day tasks. The Jet-I-U mimics the behavioral and physical characteristics of a pet scarlet macaw and provides both information and companionship to this demographic. The device emulates a scarlet macaw’s morphology and movement through laser-cut features and various biomimetic components. The intelligence and behavioral characteristics of the macaw are modeled by the application of IoT Edge platforms, AI, and Machine Learning (ML). By utilizing powerful off-the-shelf microcontrollers like the NVIDIA Jetson Nano and Raspberry Pi, this device is both affordable and practical. Experimental analysis was performed to evaluate Jet-I-U’s performance based on several criteria. The results demonstrated the versatility and applicability of the proposed solution as a companion to the visually impaired and elderly.
Keywords
Introduction
The world’s aging population is increasing at an alarming rate. In 2017, the number of people over the age of 60 was 962 million, making up 12.7% of the world population; by 2050, this number is expected to double to approximately 2.1 billion, making up 21.4% of the anticipated world population [1]. An increase in the population of older adults (above 65) results in an increased need for assistance in healthcare and social activities. Normal aging demands attention to maintain proper autonomy and requires extended support to meet physical and social needs. Recent advancements in robotics have provided opportunities to develop automatic and smart devices to help the elderly, including by serving as personal companions [2]. Social interaction is just as crucial as physical assistance for the elderly and visually impaired, as it keeps them active and adaptive. Social bonding encourages the elderly to engage in interactions that increase attachment and, as a result, enhance their lives [3]. Several studies suggest that adopting pets such as dogs, cats, birds, and fish can help stabilize older people’s health conditions. These studies also recommend animal-assisted therapy to provide psychological comfort to cognitively unimpaired older adults. Pets positively influence elderly adults’ behavior by drastically reducing stress levels and, in turn, reducing verbal, and sometimes physical, aggression [4]. In recent years, robots have been used as companions, guides, helpers, and pets. There is ongoing research on using robotic pets in therapeutic rehabilitation at several hospitals and geriatric nursing homes. In many areas, robotic pets possess significant advantages over living animals: they avoid the risk of infection and are fully customizable and controllable to fit the user’s needs [5].
Robotic pets are a promising way to increase emotional interaction for elderly persons, helping to stabilize their health condition and providing psychological comfort that reduces feelings of loneliness, depression, and social isolation [6].
By 2030, the number of intelligent devices with machine-to-machine communication capabilities is expected to reach 50 billion [7]. Current advancements in technologies such as the Internet of Things (IoT), and their application to the robotics domain, have a substantial role to play in the near future. Such technologies are widely expected to play essential roles in healthcare and social robotics. One common application is an IoT-based system for monitoring the vitals and other data points of older people. Safety is a significant challenge for older adults because of frailty and forgetfulness. Hence, it is essential to deploy a real-time monitoring system to assist older people, and IoT-based approaches tackle these issues effectively [8]. Such machines and devices possess autonomous functions and, as a result, require fewer human interventions. Artificial Intelligence (AI) and Machine Learning (ML) based applications and intelligent systems have gained much attention from researchers. Enabling computer systems to model our world is a central aim of AI and has been a main focus of research for more than half a century [9]. The integration of robotics and AI can effectively emulate a pet-to-human relationship. AI and ML encompass various techniques, such as Artificial Neural Networks (ANN), Fully Convolutional Networks (FCN), Recurrent Neural Networks (RNN), Particle Swarm Optimization (PSO), and fuzzy logic approaches, that enhance robotic communication. These techniques enable robots to optimize their behavior by adapting to a continually changing environment and effectively coordinating with humans and other smart devices. The intelligence added to robotic communication enhances the robot’s capabilities and allows it to perform tasks more effectively and efficiently [10].
Prior to the rise of AI, ML, and IoT, the tasks that assistive robots could perform were limited in scope. The field of assistive technology stands to benefit from AI, IoT, and ML integration, as these technologies can substantially expand a device’s features and capabilities.
Technologies such as AI, IoT, and ML have therefore come into prominence due to their multitude of potential applications, including advanced healthcare technologies that could act as companion devices for the elderly and visually impaired and become an integral part of modern society [11]. The presented research focuses on developing a technologically advanced robotic companion to assist visually impaired and elderly persons in their day-to-day lives.
Literature review
Through advancements in ML, AI, IoT, and related fields, robots can provide companionship as well as physical and social assistance to humans, especially elderly persons. Such robots are known as social robots. Social robots are employed to perform various social activities, such as entertaining and interacting with humans. However, the field of social robotics also has broader applications. Looking into the progression and development of social robots, one encounters a fascinating area of study, biomimicry, that deserves further consideration to understand how mechanical companions could address the social and physical needs of the elderly and the visually impaired.
Pyo et al. [12] presented a service robot system named ROS-TMS (Robot Operating System – Town Management System). ROS-TMS is a well-structured system that assists the elderly in day-to-day activities. The service robot performs essential daily tasks based on real-time information stored in the TMS database, along with distributed sensors and actuators installed in appliances (such as a refrigerator) to perform various functions (such as opening and closing its door). It also performs detection and fetch-and-give tasks, which are frequent in the daily activities of elderly people.
Bennett et al. [13] went beyond Pyo et al. by presenting a Socially Assistive Robot (SAR) called “Paro” for reducing clinical depression in older adults. The study was conducted with eight elderly persons. Different sources of information were aggregated, including robotic sensor data from a collar worn by the robot and daily activity levels from a wristband worn by the elderly subjects. Experimental analysis suggests that using Paro robots in older people’s homes decreased the symptoms of depression in most of the adults. Variations in depression levels could be evaluated using robotic sensor data, i.e., by analyzing the basic activity levels of elderly people and their interactions with the robot.
Libin et al. [14] developed an IoT-based smart Robotic Assistant (RA) to assist humans, especially elderly people, in multiple ways using voice commands and human gestures. The robot is controlled using IoT-enabled devices such as smartphones. The RA can perform multiple operations, such as starting, stopping, and picking and placing objects. Furthermore, the voice commands and gestures are processed on an online cloud server, where they are converted into text and communicated over a Wi-Fi network.
Yang et al. [15] developed a hybrid system that applies homeostatic drive theory to robots to address the requirements of elderly people in modern society. The study employed a Reinforcement Learning (RL) mechanism, and the robot is enabled to make proper decisions based on human input by environmental stimuli. The developed robot serves and interacts with older adults in different scenarios. Experimental analysis suggests that the proposed robot achieves 94% human satisfaction and can maintain a stable condition for long-term service.
A recent study [16], conducted by faculty at the MIT Media Lab, Northeastern University, and Boston Children’s Hospital, examined the effects of social robots in inpatient pediatric settings. The team of researchers developed and evaluated a social robot teddy bear named “Huggable” that was capable of providing social and emotional support to children in a hospital. Fifty-four children participated in this study, each given one of three emotional support givers: Huggable (the social robot), an online version of Huggable, or a plush teddy bear with human presence. Children who interacted with Huggable showed greater levels of cheerfulness and happiness than those who interacted with the other emotional support givers. The researchers concluded that rigorous development and validation of social robots in pediatric settings requires further consideration and has the potential to improve patient experience in hospital environments.
Barra et al. [17], more specifically, developed a robot that went beyond performing preset tasks and attempted to build a stronger social bond with its user. This research presented a robotic mechanism with an advanced voice mail system, encrypted using a biometric key. The study proposed a humanoid robot named Pepper, which is capable of recognizing different people approaching the robot. The Pepper robot records the message from the authorized user and can forward it to another user or store it in its memory. Important information, such as messages and pictures, is encrypted and can be stored on its server. Experimental analysis was performed with five different users aged between twenty-three and thirty. Based on the interaction between the users and the Pepper robot, the results showed 100% effectiveness: the robot accurately identified people and relayed information to the user.
Overall, these advancements in science and technology have increased the significance of robots, not only as assistive devices providing physical services but also as social agents providing services such as entertainment, communication, and companionship, along with capabilities such as measuring depression levels from sensor data. Socially Assistive Robots built on technologies such as ROS-TMS, combined with ML and IoT-based technologies, have been shown to provide daily assistance for elderly individuals using voice commands and human gestures. A comparison of the existing work discussed in this section is given in Table 1.
Literature review summary
The utilization of robots as social companions with respect to addressing basic needs and pointed tasks was discussed in previous literary works; however, there is an opportunity to introduce an intelligent companion in the form of a robotic, pet-like assistant that can accompany and assist the elderly and visually impaired. While there has been research conducted on the benefits of biomimetic assistive technologies, little has been done to explore the application of AI, ML, and IoT in this space. There is a need to understand human and robot pet interaction by not only focusing on essential tasks but also by studying how one can make the service robot more like its living physical counterpart. Such measures would make the pet-like robot assistant more acceptable and provide meaningful interactions for a more enriching experience.
Materials and software tools
The proposed research focuses on developing a prototype of an intelligent physical companion named “Jet-I-U” that will assist the visually impaired and elderly in their day to day challenges. There are plenty of devices for the visually impaired and elderly that can perform common obstacle detection, recognition, and avoidance tasks; however, this solution is novel because, in addition to performing a multitude of environmental-analysis tasks, the Jet-I-U is a social robot that simulates the relationship between a real pet and its owner. The Jet-I-U imitates the morphological and behavioral traits of a scarlet macaw and can communicate with its user through biomimetic behaviors such as flapping its wings, tweeting, and moving its head in the direction of points of interest. Scarlet macaws are known to be highly intelligent and compassionate creatures, and those that have been hand-raised or hand-trained can be very affectionate and make great companions [18]. The Jet-I-U emulates the physical characteristics of a scarlet macaw through laser-cut features and various biomimetic components. The intelligence of the macaw is modeled by the use of AI, IoT, and ML components. This research explores the integration of biomimicry and assistive technology with IoT and AI, through an experimental analysis approach that utilizes both qualitative and quantitative data, with the goal being to present an all-inclusive social biomimetic assistive robot for the visually impaired and elderly.
This section discusses the criteria, constraints, materials, software tools, subsystems, and assembly of the Jet-I-U in depth.
Criteria and constraints
It is essential to understand the limitations of visually impaired and elderly users prior to developing a companion device. Many visually impaired and elderly users rely on canes to navigate their immediate surroundings. Therefore, the Jet-I-U must be portable, hands-free, and independent. It is also crucial that the Jet-I-U incorporate pet-like physical and behavioral characteristics in order to form a strong bond with the user. In addition, the Jet-I-U’s internal information must be communicated to the user through haptic and auditory feedback. When incorporating machine learning algorithms for object detection, obstacle avoidance, and object following, one must consider the microcontroller’s limitations to decide which models would work best. Lastly, when building robots, it is necessary to keep durability in mind, as many off-the-shelf components are fragile. Other constraints of the device are determined by the various ways in which the user can interact with the Jet-I-U.
Materials and software tools
Table 2 tabulates the platform, processor, materials, and software tools/services that were used for each subsystem in the Jet-I-U.
Subsystems
The proposed robot consists of the following subsystems: object identification (the right eye), object following and obstacle avoidance (the base: feet and tail), proximity sensing (the beak and the wings), and face tracking (the left eye and neck). The biomimetic features of these four subsystems are tabulated in Table 2. This section further discusses the various subsystems, their functions, and the interactions between them. Figure 1 provides an overview by illustrating the various components that are present in each subsystem.
Jet-I-U subsystems diagram.
Object detection to voice feedback flow chart.
An important feature of the Jet-I-U is object detection and recognition, which allows the Jet-I-U to detect common objects in the user’s environment and relay that information to the user [20]. This system runs on a Raspberry Pi 3B+ with a Logitech webcam. In terms of software, the system uses TensorFlow Lite, OpenCV, and eSpeak. The Raspberry Pi 3B+, a small and affordable single-board computer, is connected to the Logitech webcam, which sits on a pan-and-tilt mechanism and scans the user’s surroundings for common objects. By leveraging TensorFlow Lite, a set of tools for running machine learning models on resource-constrained and IoT devices, and OpenCV, an open-source library of computer vision functions, the Jet-I-U is able to identify common objects in the environment. Finally, using an open-source text-to-speech library called eSpeak, the identified objects are converted into audio and output to the user via an embedded speaker. This section goes into more depth regarding the methods used for object detection, recognition, and output in the Jet-I-U.
Object detection relies on image attributes, and a robust object detection system requires a well-constructed representation of image features. The Jet-I-U’s main objective is to provide detailed information so that users can make appropriate decisions for themselves based on their present environment. In the proposed study, TensorFlow Lite is used for object detection. Object detection is used instead of image classification because object detection allows multiple objects to be relayed to the user, while image classification assigns only a single label to an image. In the proposed research, You Only Look Once (YOLO) is used to detect objects in real time. YOLO was designed as a one-step procedure for detecting and classifying objects. Unlike conventional object detection algorithms, YOLO predicts bounding boxes and class probabilities simultaneously. The input image is split into a D × D grid, and N bounding boxes are predicted in each grid cell, each with a confidence score. The confidence score reflects the probability that the bounding box contains a target object.
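The grid assignment and confidence score described above can be illustrated with a short Python sketch. It follows the original YOLO formulation, in which a box’s confidence is the probability that the box contains an object multiplied by its intersection-over-union (IoU) with the ground-truth box. The function names, the box format, and the sample values are illustrative assumptions and are not taken from the Jet-I-U’s code.

```python
from typing import Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def grid_cell(cx: float, cy: float, width: int, height: int, d: int) -> Tuple[int, int]:
    """Return the (row, col) of the D x D grid cell that contains point (cx, cy)."""
    col = min(int(cx / width * d), d - 1)
    row = min(int(cy / height * d), d - 1)
    return row, col


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def box_confidence(p_object: float, predicted: Box, truth: Box) -> float:
    """YOLO-style confidence: Pr(object) multiplied by IoU(predicted, truth)."""
    return p_object * iou(predicted, truth)
```

For a 640 × 480 image and an assumed D of 7, an object centred at (320, 240) falls in grid cell (3, 3), so that cell is responsible for predicting it.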
The Jet-I-U mainly uses auditory feedback to communicate stored information to the user. Feedback from the video streams is converted into speech via a text-to-speech API. Figure 2 illustrates the high-level flow from vision to voice feedback. Two cameras are mounted on the head of the Jet-I-U to act as its “eyes”. One camera focuses on face tracking (discussed in a later section), while the other performs object detection. The object detection subsystem uses a Logitech webcam connected to the Raspberry Pi via a standard USB cable. The open-source text-to-speech API eSpeak converts internal information into audio output. The detected object label is converted into English speech using eSpeak; however, eSpeak supports thirty other languages that can be used to interact with diverse populations.
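As a minimal sketch of the label-to-speech step, the snippet below turns a list of detected labels into a short phrase and passes it to the eSpeak command-line tool. The helper names and the phrase wording are assumptions made for illustration; only the `espeak` command and its `-v` voice option come from eSpeak itself, and the tool must be installed for `speak` to work.

```python
import subprocess
from typing import List


def describe(labels: List[str]) -> str:
    """Build one short phrase from detected labels (hypothetical wording)."""
    unique = list(dict.fromkeys(labels))  # drop repeats, keep first-seen order
    if not unique:
        return "No known objects detected."
    return "Detected: " + ", ".join(unique) + "."


def speak(sentence: str, voice: str = "en") -> None:
    """Speak the sentence with the eSpeak CLI; requires espeak on the PATH."""
    subprocess.run(["espeak", "-v", voice, sentence], check=True)
```

Calling `speak(describe(["cup", "person", "cup"]))` would announce each distinct label once; changing `voice` selects one of the other languages that eSpeak supports.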
Face tracking and pan-and-tilt
The Jet-I-U’s head is controlled by a custom face-tracking system, which consists of a Raspberry Pi, an AI expansion board, a Pi camera, OpenCV, a pan-and-tilt mechanism, and a PID (Proportional, Integral, Derivative) library. These components come together to form an effective face-tracking mechanism, which acts as the “head” of the Jet-I-U [20]. Integrating a pan-and-tilt mechanism with face tracking simulates how a scarlet macaw’s head moves. Scarlet macaws don’t have muscles attached to their eyeballs and can’t move their eyes when walking, so they reflexively move their whole heads, via the vestibulo-ocular reflex, to keep their vision steady. Similarly, the eyes of the Jet-I-U stay in place, but the head moves in the direction of nearby people and points of interest. The movement of the head indicates the direction of nearby people. Combined with the object detection system, this lets the visually impaired user know the location of nearby people and familiar objects in their surroundings. Figure 3 depicts the structure of the head of the Jet-I-U. The following section further discusses how the various hardware and software components combine to form the head of the Jet-I-U.
Jet-I-U pan and tilt head diagram.
In the proposed research, a Raspberry Pi camera is used for face tracking. The camera is suitable for AI-based face detection applications and is connected to the Raspberry Pi board. The board is fitted with a multifunction expansion board and drives high-quality metal servos. To locate faces in the video stream produced by the Raspberry Pi camera, the Jet-I-U uses OpenCV, a cross-platform, open-source computer vision library. OpenCV provides many general-purpose image processing and computer vision algorithms, and it is one of the most powerful research tools in computer vision. In the Jet-I-U’s case, its face detection functionality is used. Face detection is achieved with a Haar Cascade classifier, as illustrated in Fig. 4. OpenCV offers pre-trained Haar Cascade classifiers, organized into categories depending on the images they were trained on (faces, in this case). Haar Cascade is a machine learning object detection algorithm proposed by [19]. The Haar Cascade approach trains models using positive and negative images (what to detect and what not to detect).
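The following Python sketch shows one way the pre-trained frontal-face Haar Cascade shipped with OpenCV could be applied to a frame, and how the largest detected face could be turned into a normalised offset from the image centre for the pan-and-tilt stage. The helper names, the `scaleFactor` and `minNeighbors` values, and the choice of the largest face are assumptions for illustration, not the Jet-I-U’s actual implementation.

```python
from typing import List, Optional, Tuple

Face = Tuple[int, int, int, int]  # (x, y, w, h), as returned by detectMultiScale


def detect_faces(frame) -> List[Face]:
    """Run OpenCV's pre-trained frontal-face Haar cascade on a BGR frame."""
    import cv2  # imported lazily so face_offset() works without OpenCV installed

    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    return [tuple(int(v) for v in face) for face in faces]


def face_offset(
    faces: List[Face], frame_w: int, frame_h: int
) -> Optional[Tuple[float, float]]:
    """Offset of the largest face centre from the frame centre, each axis in [-1, 1]."""
    if not faces:
        return None
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])
    dx = (x + w / 2 - frame_w / 2) / (frame_w / 2)
    dy = (y + h / 2 - frame_h / 2) / (frame_h / 2)
    return dx, dy
```

A positive `dx` means the face is to the right of the image centre and a positive `dy` means it is below the centre; these two values are the errors a pan servo and a tilt servo would try to drive to zero.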
The camera is mounted on a platform and is equipped with two metal digital steering gears, a high definition, wide-angle camera, a Raspberry Pi AI expansion board, and other external parts. The AI expansion board adds additional processing power to the Raspberry Pi and allows the sub-system to perform face tracking and pan-and-tilt in real-time. In addition, the expansion board can be directly inserted into the Raspberry Pi without cumbersome wiring and can also externally connect speakers, motors, servos, and other parts.
Face-tracking mechanism flow chart.
In response to the data output by the face-tracking system, the Jet-I-U’s head turns toward the person in close proximity to the user. During face tracking, the pan-and-tilt library uses a PID controller loop to send data to the servos for tracking and correction. The PID settings can be adjusted in the code to tune face tracking: P (proportional) responds to the present error and makes large corrections, I (integral) accounts for past, accumulated error, and D (derivative) dampens the response by anticipating future error. Figure 4 illustrates the initialization process of the vision mechanism components. Multiple servos allow the Jet-I-U to turn its head based on output from the face-tracking system (as shown in Fig. 4, for the head’s vertical and horizontal rotation). The assembly of the physical components is discussed later in the section.
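The role of the three PID terms can be made concrete with a small, generic controller. This is a textbook PID loop written in Python under stated assumptions: the class name, the gain values, and the 0 to 180 degree servo range are hypothetical and do not reproduce the PID library used on the Jet-I-U.

```python
class PID:
    """Textbook PID controller: output = Kp*e + Ki*integral(e) + Kd*de/dt."""

    def __init__(self, kp: float, ki: float, kd: float) -> None:
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0      # accumulated past error (I term)
        self.prev_error = None   # last error, used for the D term

    def update(self, error: float, dt: float) -> float:
        """Return the correction for the current error after dt seconds."""
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0     # no history yet on the first call
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def clamp_angle(angle: float, low: float = 0.0, high: float = 180.0) -> float:
    """Keep a servo command inside its assumed mechanical range."""
    return max(low, min(high, angle))
```

In a tracking loop, one controller per axis would take the horizontal or vertical offset of the face from the image centre as `error`, and the returned correction would be added to the current pan or tilt angle before clamping, for example `pan = clamp_angle(pan + pan_pid.update(dx, dt))`.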
In addition to the face-tracking and object detection/recognition systems, the Jet-I-U can follow its user wherever they go while simultaneously avoiding obstacles. The entire Jet-I-U robot is mounted on the DC motor-driven chassis of a Jetbot, a vehicle controlled by the Jetson Nano. The Jetbot is an open-source AI-based platform for smart AI applications. It is powered by the NVIDIA Jetson Nano AI computer, which asynchronously supports multiple sensors and neural networks for object recognition, collision avoidance, and more. It consists of DC motors, motor drivers, Wi-Fi antennas, a camera, and a rechargeable battery.
For object following, this study uses a pre-trained neural network that was trained on the COCO dataset to detect 90 different common objects, including person (index 0), cup (index 47), and bicycle (index 2). The model is sourced from the TensorFlow object detection API, which also provides utilities for training object detectors for custom tasks. The trained model is then optimized using NVIDIA TensorRT, which makes the network fast enough for real-time execution on the Jetson Nano. To compute object detections from a single camera image, a program internally uses the TensorRT Python API (which allows developers to parse through models easily) to execute the engine. It also takes care of pre-processing the input to the neural network and parsing the detected objects. At this time, it only works for engines created using a particular jetbot package, which has the utilities for converting a model from the TensorFlow object detection API to an optimized TensorRT engine. In addition to object following, obstacle avoidance is performed concurrently to keep the Jet-I-U safe at all times. For this purpose, a model trained on a limited dataset, captured with the Raspberry Pi V2 camera with a wide-angle attachment, detects whether the robot is free or blocked, enabling collision-avoidance behavior on the Jet-I-U.
The Jetbot is a valuable addition to the Jet-I-U because of its powerful processor. Its ability to perform obstacle avoidance and object following concurrently is directly applicable in this scenario. As a result, the Jet-I-U is able to follow its owner wherever they go while avoiding obstacles, making it an autonomous and self-sufficient assistant.
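The way object following and obstacle avoidance can be combined into a single motor command is sketched below in Python. The detection format, the gains, the 0.5 blocked threshold, and the decision to turn left in place when blocked are all assumptions for illustration; the label index 0 for a person follows the indexing quoted above, and the sketch does not call the jetbot package itself.

```python
from typing import List, Optional, Tuple

# Assumed detection format: (label, (x_min, y_min, x_max, y_max)),
# with box coordinates normalised to [0, 1].
Detection = Tuple[int, Tuple[float, float, float, float]]


def horizontal_offset(det: Detection) -> float:
    """Signed distance of the box centre from the image centre, in [-0.5, 0.5]."""
    x_min, _, x_max, _ = det[1]
    return (x_min + x_max) / 2 - 0.5


def pick_target(detections: List[Detection], label: int) -> Optional[Detection]:
    """Choose the detection with the wanted label that is closest to the centre."""
    matches = [d for d in detections if d[0] == label]
    if not matches:
        return None
    return min(matches, key=lambda d: abs(horizontal_offset(d)))


def motor_speeds(
    detections: List[Detection],
    blocked_prob: float,
    label: int = 0,            # person, per the indexing quoted in the text
    speed: float = 0.3,
    turn_gain: float = 0.8,
    blocked_threshold: float = 0.5,
) -> Tuple[float, float]:
    """Return (left, right) motor speeds for one control step."""
    if blocked_prob > blocked_threshold:
        return (-speed, speed)          # blocked: turn left in place
    target = pick_target(detections, label)
    if target is None:
        return (0.0, 0.0)               # nothing to follow: stop
    steer = turn_gain * horizontal_offset(target)
    return (speed + steer, speed - steer)
```

With these assumed gains, a person detected to the right of centre makes the left wheel run faster than the right, turning the robot toward the person, while a high blocked probability overrides following entirely.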
Proximity sensing
While the Jet-I-U can concurrently execute object detection and recognition, face tracking, object following, and obstacle avoidance and report results as auditory output, it also leverages a visually impaired user’s senses of touch and hearing to communicate additional information about the surrounding environment. The Jet-I-U is equipped with a front-facing ultrasonic sensor that can detect when people or other objects are nearby. The ultrasonic sensor is connected to an Arduino Mega, an affordable off-the-shelf microcontroller, that is attached to various motors (limbs) of the Jet-I-U. The device is programmed so that when the ultrasonic sensor is triggered, the wings and the beak of the Jet-I-U move up and down while a tweet-like sound notifies the user of nearby movement. In an ideal use case, when the Jet-I-U is sitting next to its owner, the flapping of the wings provides potential haptic feedback between the user and the robot, while the tweet-like sound serves as a catch-all that notifies the user regardless of where the robot is placed relative to the user. Figure 5 outlines the process by which the components of the Jet-I-U move when the ultrasonic sensor is triggered.
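The trigger logic described above can be summarised in a few lines. The sketch is written in Python for consistency with the other examples, although the actual subsystem runs on an Arduino Mega; the 100 cm threshold, the action names, and the function names are hypothetical. The distance formula is the standard one for ultrasonic ranging: the echo time covers the round trip, so the one-way distance is half of the time multiplied by the speed of sound.

```python
from typing import List

SPEED_OF_SOUND_CM_PER_US = 0.0343  # about 343 m/s in air at room temperature


def echo_to_distance_cm(echo_us: float) -> float:
    """Convert a round-trip echo time in microseconds to a one-way distance in cm."""
    return echo_us * SPEED_OF_SOUND_CM_PER_US / 2


def proximity_actions(distance_cm: float, threshold_cm: float = 100.0) -> List[str]:
    """Return the behaviours to run when something is within the assumed threshold."""
    if distance_cm <= threshold_cm:
        return ["flap_wings", "move_beak", "tweet"]
    return []
```

An echo of 1000 microseconds corresponds to roughly 17 cm, which is inside the assumed threshold and would therefore trigger the wing, beak, and tweet behaviours together.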
Proximity-sensing mechanism flow chart.
In summary, the Jet-I-U consists of four subsystems that each perform individual tasks with the common goal of providing the visually impaired user with more information about their surroundings. These four subsystems are: object detection and recognition (the right eye), object following and obstacle avoidance (the base: feet and tail), proximity sensing (the beak and the wings), and face tracking (the left eye and neck). Figure 6 illustrates the components within each subsystem, and Table 2 outlines the platforms, processors, components, and tools/services that were used to build the Jet-I-U.
Systems architecture diagram.
Component mapping on the Jet-I-U.
Inkscape design.
The body of the Jet-I-U was constructed from a combination of off-the-shelf blue and green aluminum alloy chassis parts and custom red, green, and orange laser-cut acrylic components. These materials shape the aesthetics and morphology of the Jet-I-U and were chosen to evoke emotions and feelings similar to those evoked by a real pet bird. Starting with the base chassis, the relevant components, such as sensors, servos, cameras, and microcontrollers, were mapped to plan how each would be mounted. With all the components and physical features assembled, the Jet-I-U weighs 3 kilograms. This section discusses the physical design of the Jet-I-U and the various tools and components that went into assembling the robot.
Hardware and component mapping
Figure 7 maps the various materials to their locations on the Jet-I-U. The base of the device is supported on the aluminum chassis of a Jetbot. The inside of the robot holds a series of levels on which the microcontrollers, cameras, and limbs are mounted. The bottom-most level holds the Jetson Nano and the camera for object following and obstacle avoidance. The second level holds the Raspberry Pi for the face-tracking subsystem and a front-facing ultrasonic sensor for the proximity-sensing subsystem. The third level holds the Arduino Mega that controls all of the limbs and motors on the Jet-I-U. Lastly, the top-most level holds another Raspberry Pi with a speaker to identify objects and produce auditory output. Multiple microcontrollers are used because a single off-the-shelf microcontroller would be strained if required to control all four subsystems at once. The Jet-I-U was designed to have realistic features analogous to those of a scarlet macaw. Analog servos are used to simulate wings flapping, and a buzzer is used for a tweet-like sound. In addition, the device is equipped with claw mechanisms that act as the beak and the feet of a scarlet macaw. Figure 7 illustrates the complete assembled robot mounted on a custom-built perch, with claws clamped onto the perch (Fig. 9) to resemble the stance of an actual scarlet macaw.
Laser cut parts
The aesthetic components that make up the Jet-I-U’s cover were cut from acrylic on an Epilog laser cutter. The appearance was further enhanced by adding soft feathers to the upper exterior to represent the crest of the bird. The wings, eyes, and body cover were designed using Inkscape; the designs are depicted in Fig. 8.
Custom perch and claws
The Jet-I-U comes with a custom perch that stores the device in a realistic and safe manner, and it is equipped with claws that grip the base of the perch. In an ideal use-case scenario, the Jet-I-U’s claws open at the push of a button. Once the claws are open, the “feet” of the Jet-I-U automatically fold up via a magnetic mechanism, which allows the device to be lifted and placed on the ground. The transition from being stationary on the perch to moving about the environment is easy, as the Jet-I-U weighs only three kilograms. Figure 9 depicts the Jet-I-U on its perch in a secured (claws down) and an unsecured (claws up) position. Once the Jet-I-U is unmounted from its perch, it is free to move about the environment, find and follow its owner via the object-following and obstacle-avoidance subsystems, and perform its functions.
Jet-I-U gripped onto perch (close-up).
End-to-end user testing methodology
Individual component testing was conducted for each of the above subsystems to ensure the adequate functioning of the related sensors, components, and cameras (for face tracking, object identification, and user tracking). For end-to-end testing, blindfolded users were asked to respond to questions about their environment using only the information provided by the Jet-I-U. To test each individual component of the Jet-I-U, the following testing criteria were used to determine the success of the solution. Due to the multitude of services that the Jet-I-U has to offer, the testing was broken down into four phases:
Face tracking: One person will walk from left to right (in the perspective of Jet-I-U), and the blindfolded tester should be able to say if the user is on the left, center, or right. Object classification: 5 Common objects will be placed in front of the Jet-I-U, and the blindfolded user should be able to make out all five of them in a matter of 30 seconds. Proximity sensor: One person will stand in close range of the Jet-I-U, and the blindfolded user should be able to tell when someone is in proximity (indicated by flapping wings). Owner tracking: A blind person will walk from point A to point B on a flat and open surface, and Jet-I-U should be able to follow them from point A to B.
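The left/center/right report used in the face tracking phase can be illustrated with a minimal sketch. It assumes that a face detector returns the horizontal extent of a bounding box in pixels, that the camera image is not mirrored, and that the frame is split into three equal zones. The function name `horizontal_zone` and the equal-thirds split are hypothetical and are not taken from the Jet-I-U implementation.

```python
def horizontal_zone(box_left: float, box_right: float, frame_width: float) -> str:
    """Classify a face bounding box as 'left', 'center', or 'right'.

    Assumption: pixel coordinates grow from the left edge of an
    unmirrored camera frame, and the frame is split into equal thirds.
    """
    if frame_width <= 0:
        raise ValueError("frame_width must be positive")
    center_x = (box_left + box_right) / 2.0  # horizontal midpoint of the face
    third = frame_width / 3.0
    if center_x < third:
        return "left"
    if center_x < 2 * third:
        return "center"
    return "right"


if __name__ == "__main__":
    # Hypothetical 640-pixel-wide frame with a face near the middle.
    print(horizontal_zone(270, 370, 640))
```

A real deployment would likely add hysteresis around the zone boundaries so that the spoken position does not flicker while a person stands near a boundary.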
Robot assistant perched on park bench for user testing.
Schematic of testing environment.
Human-robot interaction: Apart from component testing, each subject was asked whether or not they felt that the biomimetic characteristics of the assistive robot had a positive impact on their well-being.


Figure 10 depicts the use case in which the Jet-I-U was tested. The testing environment consists of a user sitting on a park bench with the companion Jet-I-U sitting on a perch next to the user. The Jet-I-U is powered on, and the non-blind participant is made to walk from point A to point E along the path illustrated in Fig. 11.
Once the testing environment has been established, all four phases of the testing process begin, and the blindfolded user is asked to answer a series of questions regarding their environment using only the feedback from the Jet-I-U. Each of the four phases is tested five times to account for variability between test results. Table 3 displays a sample data collection table that was used for a participant in this study.
Sample data table with 15-year old, male participant
Quantitative analysis
Jet-I-U was tested with ten blindfolded human participants without any knowledge of the environment beforehand. An extensive analysis of the collected data was conducted to understand the benefits of the Jet-I-U.
Field testing results.
Figure 12 maps the summary of results obtained from the study onto the testing environment. The lines and circles on the diagram mark where in the environment the testing occurred, and the colors represent the average success rate at each of those points. The box labeled “Jet-I-U” marks the position of the device in the use case outlined in the previous section (Fig. 11). The lines between the circle markers represent whether or not the Jet-I-U was able to turn its head in real time when a person moved in front of it from point A to B, B to C, C to D, and D to E. Averaged over the ten participants, the Jet-I-U was accurately able to turn its head and detect individuals from four and six feet away. However, as people moved farther away from the Jet-I-U, it was unable to detect their presence. Similarly, at points A through E, object detection and recognition were more accurate on average when objects were closer to the Jet-I-U. One can argue that this observation is beneficial, as the Jet-I-U only relays information about people and objects that are no more than 8 feet from the user rather than relaying all environmental information. Lastly, proximity sensing worked when objects or people were six feet or less from the device. This observation matches the detection range that the Jet-I-U was programmed with. For the complete field analysis, refer to Fig. 12.
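The six-foot proximity range described above amounts to a simple threshold check. The sketch below is illustrative only: it assumes the distance sensor reports meters, and the names `PROXIMITY_LIMIT_FT` and `should_flap_wings` are hypothetical rather than part of the Jet-I-U code.

```python
FEET_PER_METER = 3.28084
PROXIMITY_LIMIT_FT = 6.0  # detection range reported in the text


def should_flap_wings(distance_m: float, limit_ft: float = PROXIMITY_LIMIT_FT) -> bool:
    """Return True when a measured distance falls inside the proximity range.

    Assumption: the sensor reports meters; negative readings are treated
    as invalid and never trigger the wings.
    """
    if distance_m < 0:
        return False
    distance_ft = distance_m * FEET_PER_METER
    return distance_ft <= limit_ft


if __name__ == "__main__":
    # Hypothetical readings: one meter (about 3.3 ft) and two meters (about 6.6 ft).
    for reading in (1.0, 2.0):
        print(reading, should_flap_wings(reading))
```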
Sample data table with 15-year old, male
Beyond the relationship between success rate and testing location, it was also observed that the Jet-I-U has a short learning curve. After approximately three tries with the device, the majority of users had no problem reaching an overall success rate of 90% or above. Figure 13 shows the trial number vs. average success rate for three of the participants in the study.
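One way to quantify the learning curve is to find, for each participant, the first trial at which the overall success rate reaches the 90% mark. The sketch below assumes success rates are stored as fractions between 0 and 1, one per trial in trial order. The function name and the sample values are hypothetical placeholders and are not data from the study.

```python
from typing import Optional, Sequence


def first_trial_at_or_above(rates: Sequence[float], target: float = 0.90) -> Optional[int]:
    """Return the 1-based trial number at which the rate first meets the target.

    Returns None if the target is never reached.
    """
    for trial_number, rate in enumerate(rates, start=1):
        if rate >= target:
            return trial_number
    return None


if __name__ == "__main__":
    # Placeholder values for one participant over five trials (not study data).
    placeholder_rates = [0.50, 0.75, 1.00, 1.00, 1.00]
    print(first_trial_at_or_above(placeholder_rates))
```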
Success rate vs. trial #.
Averaged over the five trials attempted by each of the ten participants, the Jet-I-U obtained an accuracy of 85% for face tracking, 83% for object detection, 95% for proximity sensing, and 88% for object following and obstacle avoidance. Table 4 lists the success rates for the four individual subsystems of the Jet-I-U.
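The per-subsystem averages can be computed by pooling every trial score for a subsystem across participants and taking the mean. The sketch below assumes each trial is recorded as a score between 0 and 1 (for example, the fraction of objects identified); how a single trial was actually scored is not specified here. The function name `subsystem_success_rates` and the sample records are hypothetical and do not reproduce the study data.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def subsystem_success_rates(records: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Average trial scores per subsystem and express them as percentages.

    Each record is (subsystem_name, score) with score in [0, 1].
    """
    totals: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for subsystem, score in records:
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score out of range for {subsystem}: {score}")
        totals[subsystem] += score
        counts[subsystem] += 1
    return {name: 100.0 * totals[name] / counts[name] for name in counts}


if __name__ == "__main__":
    # Placeholder records (not study data): two trials of one phase, one of another.
    sample = [
        ("face tracking", 1.0),
        ("face tracking", 0.5),
        ("proximity sensing", 1.0),
    ]
    print(subsystem_success_rates(sample))
```

With ten participants and five trials per phase, each subsystem would contribute fifty records to such a computation.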
Qualitative observations were made during the end-to-end user testing of this device concerning the four physical components and the biomimetic aspects.
Alternative biomimetic options for hardware mapping.
Towards the end of the study, each participant was also asked whether or not they thought the biomimetic aspects and characteristics of the Jet-I-U improved their well-being. All of the participants felt that the animalistic characteristics of the Jet-I-U resulted in a stronger bond, allowed them to connect with the device in a more meaningful way, and generated a more positive emotional response than a general assistive robot would. The participants quickly warmed up to the idea of a robotic pet and were able to easily interpret the environmental information that the Jet-I-U conveyed through audio and haptic feedback. Many participants expressed the desire for intelligent robotic companions for the visually impaired and found the research worthy of future exploration.
In terms of limitations, it was observed that the Jet-I-U struggles to react quickly to people who walk by at a faster pace. For object detection and recognition, low-light settings could degrade the Jet-I-U’s performance, and in some cases the model confused common objects (for example, a white door was labeled a refrigerator). At times, the text-to-speech output from the speakers was hoarse and unclear. The asynchronous object following and obstacle avoidance feature was somewhat ineffective and prone to error on sharp turns. In addition, due to the limitations of the pre-trained neural network, the Jet-I-U would at times confuse its owner with another object or person and start following that instead. Of all the systems, the proximity sensing system was by far the most robust. The wings and beak of the Jet-I-U responded quickly when a person came into proximity of the device.
From the observations, it was evident that the idea of a bio-inspired assistive technology is beneficial to the user and allows them to create a social bond that far surpasses what is possible with conventional solutions.
This paper explores the combination of biomimicry and assistive technology with IoT, ML, and AI by presenting an all-in-one robot assistant in the form of a pet scarlet macaw that provides information and companionship to visually impaired and elderly individuals. The Jet-I-U is able to perform tasks such as object detection and recognition, facial recognition, and obstacle detection and avoidance, and to communicate internal information to the user through auditory and haptic feedback. Despite having a few limitations in certain use cases, this project combines AI, ML, and IoT edge platforms to generate a realistic and unique experience between the user and the device.
From quantitative and qualitative observations, the Jet-I-U has proven to be successful for the use case it was designed for. The Jet-I-U is a proof of concept, and numerous use cases can be incorporated by applying different IoT platforms and training customized models. More powerful microcontrollers and high-end cameras can also be considered for a better experience, at a higher cost. Various sensors such as LiDAR and PIR motion sensors can be added to provide more information to users. The Jet-I-U can also integrate existing intelligent APIs such as Google Voice, Amazon Alexa, or Microsoft Cortana within its services to interact with users in a Q&A format. As depicted in Fig. 14, another area of research would be to explore how changing the biomimetic behavior of the Jet-I-U to represent that of another animal (such as a dog or cat) could affect its performance and usability.
Acknowledgments
The authors would like to thank all the researchers who have dedicated themselves to improving the lives of the visually impaired community. The first author would like to thank his teachers and parents, who guided him throughout the research and development of the Jet-I-U. Lastly, the first author would like to express sincere gratitude to the 49ers STEM Leadership Institute, Chevron, Silicon Valley Education Foundation (SVEF), and the Fab Foundation for their commitment to supporting STEM education and fostering design thinking, fabrication, and communication skills among youth.
