Abstract
BACKGROUND:
People with severe neuromuscular disorders caused by an accident or congenital disease cannot interact normally with the physical environment. Intelligent robot technology offers a possible solution to this problem. However, a robot that relies on speech or gestures can hardly carry out a task when it cannot understand the subject’s intention. A brain-computer interface (BCI), a communication system that operates external devices by directly converting brain activity into digital signals, provides a solution.
OBJECTIVE:
In this study, a noninvasive BCI-based humanoid robotic system was designed and implemented for home service.
METHODS:
A humanoid robot equipped with multiple sensors navigates to the object placement area under the guidance of a specific symbol, the “Naomark”, which has a unique ID, and then sends information about the scanned objects back to the user interface. Based on this information, the subject commands the robot to grab the wanted object and bring it back. To identify the subject’s intention, the channel projection-based canonical correlation analysis (CP-CCA) method was used in the steady-state visual evoked potential-based BCI system.
RESULTS:
The offline results showed that the average classification accuracy of all subjects reached 90%, and the online task completion rate was over 95%.
CONCLUSION:
Users can complete the grab task with a minimum number of commands, avoiding the control burden caused by complex commands. The system could provide a useful means of assistance for people with severe motor impairment in their daily life.
Introduction
Spinal cord injury caused by motor neuron disease (MND), amyotrophic lateral sclerosis (ALS), or accidents reduces the quality of life of many people. It is difficult for these people to perform basic daily tasks such as grabbing and lifting objects or walking. Many studies have attempted to create high-tech assistive devices to enhance their quality of life [1, 2, 3]. In the past few decades, numerous attempts have been made to design and manufacture full-bodied humanoid robots. Progress in mechanics, electronics and computer science has promoted the development of humanoid robots such as ASIMO, HUBO and HOAP-2 [4, 5, 6]. Advances in robotics allow individuals with disabilities to use robots to perform daily tasks more independently [7, 8]. Brain–computer interface (BCI) technology can help these people perform certain daily tasks, such as reaching for and grabbing objects, with the help of a robot.
Vidal proposed the concept of BCI in 1973 [9]. A BCI is an advanced communication and control system that operates external devices by directly converting brain activity into digital signals. A BCI can therefore enable disabled individuals to communicate with other people or control their surroundings without any muscle activity [10, 11]. The system acquires neural responses from the human brain invasively or non-invasively and interprets human intentions by classifying the neural responses into several mental states [12]. The decoded intentions are then conveyed to the machine as commands. Several studies have shown that invasive BCI techniques can be used to control a robotic arm to perform a series of actions [13, 14]. Invasive methods obtain signals with a high signal-to-noise ratio (SNR), which allows better control of peripheral devices. However, they require electrodes to be surgically implanted, and users face the risk of post-operative complications and infections that may cause serious harm. The long-term stability of the recorded signals is a further issue that should be addressed. Non-invasive BCI methods are therefore more suitable for humans: they avoid the health risks and related ethical issues, are easy to apply and are harmless to patients [15]. Different paradigms such as sensorimotor rhythm, motor imagery (MI), P300 potential and steady-state visual evoked potential (SSVEP) are used in BCI technology [16, 17]. Compared with other BCI methods, SSVEP has the advantages of high SNR, higher accuracy, higher information transfer rate (ITR) and shorter training time [18, 19, 20], so it is better suited to controlling peripheral devices efficiently.
SSVEP is the brain’s periodic response to periodic visual stimuli modulated at frequencies above 6 Hz [21]. When the subject focuses on a visual stimulus, the stimulation propagates along the visual pathways and evokes a brain signal at the stimulation frequency or its harmonics. Several kinds of light source can provide the visual stimulation in SSVEP applications, such as light-emitting diodes (LEDs), cathode ray tube (CRT) monitors, or liquid crystal display (LCD) monitors. Most studies have verified that the strongest SSVEP response is observed in the visual cortex [22, 23].
With the development of robotics and neural engineering, EEG-based BCI control systems for robots have been proposed, so that elderly or disabled people can control a robot naturally and intuitively merely by thinking. The ultimate goal of such BCI-based robotic control systems is to generate stable, sophisticated, or even emotional intentions, transmit them to robots, and let the robots perform various complex tasks accordingly. BCI-based robotic control systems using EEG have been applied to mobile robots [24], manipulators [25], wheelchairs [26, 27], and humanoid robots [28]. These previous studies have demonstrated the feasibility of EEG-based BCI systems for robot control.
For practical human–robot interaction applications, the proposed brain-controlled robot systems using EEG-based BCI have employed different types of electrophysiological brain signals, such as SSVEP and sensorimotor rhythms. According to the properties of the brain signals, a system can be categorized as either a reactive BCI or an active BCI [29]. A reactive BCI enables users to control equipment by detecting indirectly modulated brain signals related to specific external stimuli. SSVEP is a reactive signal and is commonly used in BCI-based robotics applications. Such signals are generated when the target object visually stimulates the brain, for example by a sudden flash of light [30].
Schematic of system.
One of the main goals of EEG-based BCIs for human-robot interaction is to control a robot directly through low-frequency visual stimulation, without active mental effort. Our study therefore adopts a reactive, training-free BCI approach to control a new brain-actuated humanoid robot system for home service. The robot automatically navigates to the designated placement area, grabs the required object and brings it back to the user. With the help of multi-sensor fusion, it avoids collisions with obstacles on the ground during navigation. A channel projection-based canonical correlation analysis (CP-CCA) target recognition method is adopted for signal analysis to send commands [31]. Through visual feedback in the user interface, subjects can cancel a wrong command and resend the required one when the system recognizes the wrong target. Furthermore, to increase the practicality of the proposed system, a portable and low-cost wireless EEG device is used to measure SSVEP signals.
The contribution of this work is to develop an efficient and feasible brain-robot system to help users grab the needed object remotely in an indoor complex environment without moving their body, thereby improving the independence and quality of their daily life.
Subjects
Ten healthy subjects (7 males and 3 females) participated in both the offline and online experiments. The subjects ranged in age from 21 to 26 (average age 24). All were BCI-naive and had normal or corrected-to-normal vision. All subjects provided written informed consent and were clearly instructed about the purpose of the experiment and possible results. Subjects received a small monetary compensation for their participation.
System description
Figure 1 shows the overall control architecture of the proposed SSVEP BCI robot control smart home system. Raw EEG data are recorded by the headset with dry electrodes and transmitted wirelessly to the PC, where they are preprocessed to increase the SNR. For target recognition, the EEG analysis algorithm performs feature extraction and classification on the signals. Finally, the computer generates control commands according to the classification results. The humanoid executes the control commands to move and grab the target object. To provide real-time feedback on the status of the robot, data (i.e., the visual images from the humanoid’s monocular camera) are exchanged between the humanoid and the other systems over a wireless TCP/IP connection.
After receiving the start command (teeth clenching), the robot Nao automatically navigates to the object placement area with the help of Naomarks. During navigation, the sensors detect obstacles and prevent the robot from colliding with objects. A Naomark is a special landmark that can identify a location and is described in more detail in Section 2.4. When the robot reaches the object placement area, it scans the surrounding objects from left to right. Each object is placed in a specific box that the humanoid robot can grab, and a Naomark is pasted on the outside of the box for rapid recognition by the robot. The different Naomark IDs represent the different objects. The robot sends the scanned ID numbers to the user interface, which converts each ID into the corresponding object; for example, ID “84” stands for an apple. As shown in Fig. 2a, the subject can then choose the object of interest. When an object is selected, the user interface switches to the second layer (see Fig. 2b). The selected object appears in the center of the interface and stays there for 4 seconds, with a countdown shown in the upper right corner of the screen. If the selected object is wrong, the subject can send the “teeth clenching” signal to cancel the command and return to the first-layer interface.
The user interface of the BCI system. (a) First layer of the interface, (b) Second layer of the interface.
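The two-layer selection logic described above can be illustrated with a short sketch. It is not the authors' implementation: only the pair 84 → apple and the 4-second confirmation window come from the text, while the function names, the "unknown" fallback and the representation of teeth-clench events as timestamps are assumptions made for illustration.

```python
# Only the mapping 84 -> "apple" is stated in the text; any further entries
# would have to come from the actual experimental setup.
NAOMARK_OBJECTS = {84: "apple"}
CONFIRM_WINDOW_S = 4.0  # the selected object stays on screen for 4 seconds


def objects_for_ids(scanned_ids, table=NAOMARK_OBJECTS):
    """Convert scanned Naomark IDs into object names for the first layer."""
    return [table.get(mark_id, "unknown") for mark_id in scanned_ids]


def confirm_selection(selected, clench_times_s, window_s=CONFIRM_WINDOW_S):
    """Second layer: return the selection unless it is cancelled.

    clench_times_s holds the times (seconds after the selection) at which a
    teeth-clench signal was detected. A clench inside the window cancels the
    command, which is modelled here by returning None.
    """
    cancelled = any(0.0 <= t < window_s for t in clench_times_s)
    return None if cancelled else selected
```

For example, `confirm_selection("apple", [2.5])` models a subject who clenches 2.5 s after a wrong selection, so the command is cancelled and the interface would return to the first layer.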
The stimulus frequencies normally used in the SSVEP can be divided into three frequency bands: low (1–12 Hz), medium (12–30 Hz), and high (30–60 Hz). A study shows that the peak of SSVEP amplitude appears near 15 Hz in the 5–25 Hz range and has a high signal-to-noise ratio [21]. Therefore, in this study, four frequencies (i.e., 7.5, 8.57, 10, and 12 Hz) in the lower range are selected as stimulus frequencies, thereby covering the alpha frequency band.
A 21.5-inch liquid-crystal display (LCD) with a resolution of 1920
The chest position of the robot is equipped with two ultrasonic sensors (sonar) which can estimate the distance of obstacles in the surrounding environment. The detection range of the sonar is 0.20 m–0.80 m. Two contact sensors (bumper) are located at the tip of each foot. The bumper sensor is used to detect some obstacles about 100 mm high from the ground, and the robot will automatically step backward to avoid the obstacle on the ground if one of the bumper sensors is triggered. As for some obstacles over 100 mm high from the ground, once the sonar value reaches the set threshold (0.4 m), the robot will immediately stop walking.
During navigation, if a sensor detects an obstacle, the robot uses the sonar sensors to measure the free distance on its left and right sides. The robot then bypasses the obstacle on the side with the greater distance and continues to move forward. If a bumper is triggered, the robot returns to its previous position and bypasses the obstacle on the other side. Figure 3 shows how the robot avoids an obstacle on the ground. Together, the two kinds of sensors keep Nao safe by avoiding collisions with obstacles while improving the execution efficiency of the system.
Schematic of how the robot avoids the obstacle.
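The avoidance rule described above can be written as a small decision function. This is a minimal sketch rather than the robot's actual control code: the 0.4 m stop threshold and the bypass rules follow the text, whereas the function name, the return format, the use of the smaller of the two sonar readings as the trigger, and the tie-breaking rule are assumptions.

```python
def plan_step(left_m, right_m, bumper_hit, last_side=None, stop_threshold_m=0.4):
    """Return (step_back, move), where move is 'forward', 'left' or 'right'.

    left_m / right_m: sonar distances in metres on each side of the robot.
    bumper_hit: True if one of the foot bumpers was triggered.
    last_side: side used for the previous bypass attempt, if any.
    """
    # Side with more free space; ties go to the right (an arbitrary assumption).
    wider = "left" if left_m > right_m else "right"

    if bumper_hit:
        # Low obstacle: step back, then try the side opposite to the last attempt.
        if last_side == "left":
            return True, "right"
        if last_side == "right":
            return True, "left"
        return True, wider

    if min(left_m, right_m) <= stop_threshold_m:
        # Tall obstacle within the stop threshold: bypass on the wider side.
        return False, wider

    return False, "forward"
```

With readings of 0.3 m on the left and 0.7 m on the right, `plan_step(0.3, 0.7, False)` selects a bypass on the right, the side with the greater distance.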
To grab an object, the robot needs to calculate the walking distance to the object. A special landmark called a Naomark [33], which the robot can detect and locate, was used as a range finder in the experiment. An example of a Naomark is shown in Fig. 4. The sectors are arranged around the center of the circle, and the angles formed by the sectors distinguish Naomarks with different IDs. A detected Naomark provides several pieces of information to the robot, of which SizeX and the ID are needed in our study. SizeX is the size of the landmark image in the Nao robot’s field of view, expressed in radians (rad), and MarkId is the unique number of the landmark.
Naomark.
The Nao camera defaults to a resolution of 640 × 480, with 640 pixels in the horizontal direction and 480 pixels in the vertical direction. The equation for sizeX is:
where pixel is the diameter size of the landmark imaging pixel, HOV represents the camera’s horizontal angle of view, HOV
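The equation itself did not survive in this copy of the text. One form consistent with the definitions above, assuming a linear mapping from pixels to viewing angle, is sizeX = (pixel / 640) × HOV. The sketch below encodes that assumption; the function name and the example HOV value are illustrative only and are not taken from the paper.

```python
def size_x_rad(pixel_diameter, hov_rad, image_width_px=640):
    """Angular size of the landmark in radians.

    Assumes a linear pixel-to-angle mapping across the horizontal field of
    view, i.e. sizeX = pixel / image_width * HOV. This is a reconstruction,
    not necessarily the exact equation used by the authors.
    """
    return pixel_diameter / image_width_px * hov_rad


# Illustrative numbers only: a 64-pixel landmark and a placeholder HOV of 1.0 rad.
example_size_x = size_x_rad(64, 1.0)
```

Under this assumption a landmark that spans one tenth of the image width subtends one tenth of the horizontal angle of view.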
Through machine learning, the relationship between sizeX and distance was modeled. For distances within 1 m the model is described by Eq. (2), and for distances between 1 m and 2 m it is described by Eq. (3):
where
In this paper, the double-arm grabbing operation was implemented on the robot Nao. Since the left and right arms of Nao are symmetrical, the kinematic analysis is performed using the left arm as an example. As shown in Fig. 5, the left arm of the robot consists of three parts (upper arm, lower arm and three fingers), which are connected by five joints.
Since the fingers have no relationship with the joint movement of the left arm of the robot, we ignored the degrees of freedom on the finger part and only constructed the model with five degrees of freedom for one arm. According to the Denavit-Hartenberg (D-H) method [35], a kinematics model for the left arm of the Nao robot is built. The D-H parameter table is obtained and shown in Table 1. In this Table 1,
To obtain the forward kinematics of the Nao manipulator, the D-H method is used to determine the link parameters, which are listed for the left arm in Table 1. Equation (4) is the transformation matrix between the coordinate systems of two adjacent links.
where
D-H parameters of the left arm
The structure of the left arm [33].
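The forward kinematics step can be illustrated by chaining link transforms. The sketch below assumes the standard (distal) D-H convention, in which each link transform is Rot_z(θ)·Trans_z(d)·Trans_x(a)·Rot_x(α); the paper may use a different variant. The two-link chain at the end is a hypothetical example and does not reproduce the Nao parameters of Table 1.

```python
import math


def dh_transform(theta, d, a, alpha):
    """4x4 homogeneous transform between adjacent links (standard D-H)."""
    ct, st = math.cos(theta), math.sin(theta)
    ca, sa = math.cos(alpha), math.sin(alpha)
    return [
        [ct, -st * ca, st * sa, a * ct],
        [st, ct * ca, -ct * sa, a * st],
        [0.0, sa, ca, d],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul4(A, B):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def forward_kinematics(dh_rows):
    """Multiply the link transforms in order, from base to end-effector."""
    T = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for theta, d, a, alpha in dh_rows:
        T = matmul4(T, dh_transform(theta, d, a, alpha))
    return T


# Hypothetical planar two-link chain (theta, d, a, alpha), NOT the Nao arm.
demo_chain = [(math.pi / 2, 0.0, 1.0, 0.0), (0.0, 0.0, 1.0, 0.0)]
pose = forward_kinematics(demo_chain)
end_position = (pose[0][3], pose[1][3], pose[2][3])
```

The last column of `pose` holds the end-effector position and the upper-left 3 × 3 block its orientation, which together form the pose that the inverse kinematics step starts from.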
The inverse kinematics solution of the robot is the inverse solution of forward kinematics. Inverse kinematics solution means that the required pose of the robot end-effector on the reference coordinate system is known, and then the joint motion parameters of the robot need to be found. Defining the desired pose of the robot Nao’s end-effector as:
When Eqs (5) and (6) are equal, the five joints’ variable angles of robot Nao’s left arm are obtained [36].
where
A cost-effective and portable Emotiv EPOC headset was used to collect EEG signals, as shown in Fig. 6. Compared with wet sensors, dry sensors have several advantages: no conductive gel or glue needs to be applied during operation, they are easy to attach to the scalp through the hair, and they can be reused many times. For brain activity recording, 14 channels are placed at standard positions of the international 10–20 system. CMS/DRL reference electrodes, located behind the subject’s ears, are also employed. According to our previous research [31], the O1, O2, P7 and P8 channels over the posterior (occipital and parietal) region were used. In each EEG channel, the signal is down-sampled from 2048 Hz to 128 Hz. The subject’s EEG signal is filtered with a fourth-order Butterworth band-pass filter with cut-off frequencies fL = 7 Hz and fH = 49 Hz.
EEG acquisition device. (a) Emotiv EPOC, (b) electrode position according to 10-20 EEG placement.
In the SSVEP-based BCI experiment, the subject sat comfortably on a chair, and the display was placed 60 cm in front of the subject. For the offline experiment, the subjects performed a simulated online experiment to record EEG data for offline analysis while the humanoid robot remained stationary. Subjects stared at one of the four stimulation targets, which the computer prompted in random order. Each subject completed 10 runs, and each run consisted of eight trials. After five runs, subjects were asked to rest for two minutes to reduce eye fatigue. Each trial lasted 5 s and consisted of two parts: a 1 s cue phase and a 4 s stimulation phase. Figure 7 shows the timing scheme of the entire procedure. The subject was required to avoid blinking and eye movements during stimulation to reduce ocular artifacts. The first six runs were used as training data to optimize the parameters for each subject, and six-fold cross-validation was used to evaluate the SSVEP recognition accuracy for each subject. The last four runs were used to obtain the average classification accuracy and ITR of the offline test.
The timing of the entire procedure.
(a) The real experimental environment, (b) the plan of the experimental environment.
An experimental area of 800 cm
Classification accuracy, completion time, and ITR were calculated for the online experiment to reflect the overall performance of the system. The ITR is a well-known parameter for BCI system evaluation [37]. For a trial with
In this study, the number of targets
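The ITR formula is incomplete in this copy of the text. The sketch below uses the definition that is most common in the BCI literature, B = log2 N + P·log2 P + (1 − P)·log2[(1 − P)/(N − 1)] bits per selection, scaled by 60/T to bits per minute; it is assumed, not confirmed, that the paper uses this form. The function name is illustrative. N = 4 targets and T = 4 s per command are taken from the text, although the authors may have counted the selection time differently.

```python
import math


def itr_bits_per_min(n_targets, accuracy, seconds_per_selection):
    """Information transfer rate in bits/min (common Wolpaw-style definition).

    n_targets: number of selectable targets N.
    accuracy: classification accuracy P in [0, 1].
    seconds_per_selection: time T needed for one selection.
    """
    if n_targets < 2:
        raise ValueError("at least two targets are required")
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    if seconds_per_selection <= 0:
        raise ValueError("selection time must be positive")

    p = accuracy
    bits = math.log2(n_targets)
    # The terms with p = 0 or p = 1 vanish in the limit, so they are skipped.
    if p > 0.0:
        bits += p * math.log2(p)
    if p < 1.0:
        bits += (1.0 - p) * math.log2((1.0 - p) / (n_targets - 1))
    return bits * 60.0 / seconds_per_selection
```

With four targets, perfect accuracy gives 2 bits per selection, and chance-level accuracy (25%) gives zero, which is a quick sanity check of the formula.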
Canonical correlation analysis (CCA) is a statistical method used to measure the underlying correlation between two multidimensional variables. CCA thus extends ordinary correlation to two sets of random variables, and it has been widely used for SSVEP recognition. However, because the power spectrum of spontaneous electroencephalogram (EEG) activity follows a power-law distribution, the background EEG affects the detectability of SSVEP differently at different frequencies. CCA may therefore not give the best accuracy for SSVEP classification, even though many researchers have shown that it performs well [38]. To alleviate this problem, normalized canonical correlation coefficients are needed to enhance SSVEP frequency detection. In addition, a signal with a higher characteristic representation is used as the reference signal instead of the sine and cosine signals to improve the recognition accuracy.
Figure 9 shows the flowchart of the CP-CCA [30]. We used the CCA method to determine the best data to represent multiple trials of EEG data recorded on a single channel when subjects gazed at the same frequency of visual stimulation. Suppose that recorded EEG data of multi-trials in the specific stimulus frequency are
The reference signal
In this work, the number of target stimulus frequencies is n = 4. For the reference signal, the fundamental frequency and its second harmonic are considered.
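For orientation, the following sketch shows plain CCA-based frequency recognition with sine-cosine reference signals. It is deliberately not CP-CCA: the channel projection step, which replaces the sine-cosine reference with an EEG-derived one, is omitted because its equations are incomplete here. The four stimulus frequencies, the 128 Hz sampling rate, the 4 s window, the four channels and the two harmonics follow the text; the function names and the synthetic test signal are assumptions.

```python
import numpy as np


def sincos_reference(freq, fs, n_samples, n_harmonics=2):
    """Sine-cosine reference matrix (samples x 2*n_harmonics) for one frequency."""
    t = np.arange(n_samples) / fs
    columns = []
    for h in range(1, n_harmonics + 1):
        columns.append(np.sin(2 * np.pi * h * freq * t))
        columns.append(np.cos(2 * np.pi * h * freq * t))
    return np.column_stack(columns)


def max_canonical_corr(X, Y):
    """Largest canonical correlation between the columns of X and Y.

    Uses the QR-based formulation: after centering, the canonical
    correlations are the singular values of Qx^T Qy. Assumes both
    matrices have full column rank.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    qx, _ = np.linalg.qr(X)
    qy, _ = np.linalg.qr(Y)
    singular_values = np.linalg.svd(qx.T @ qy, compute_uv=False)
    return float(min(singular_values[0], 1.0))


def classify_ssvep(eeg, fs, freqs):
    """Return the candidate frequency with the highest correlation, plus all scores."""
    scores = [max_canonical_corr(eeg, sincos_reference(f, fs, eeg.shape[0]))
              for f in freqs]
    return freqs[int(np.argmax(scores))], scores


# Synthetic demonstration: 4 s of 4-channel "EEG" at 128 Hz with a 10 Hz component.
fs, freqs = 128, [7.5, 8.57, 10.0, 12.0]
rng = np.random.default_rng(0)
t = np.arange(4 * fs) / fs
eeg = np.sin(2 * np.pi * 10.0 * t)[:, None] + rng.standard_normal((t.size, 4))
detected, scores = classify_ssvep(eeg, fs, freqs)
```

In this synthetic case the 10 Hz reference should yield the largest correlation. A CP-CCA variant would substitute a projected EEG template for the output of `sincos_reference`.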
Flowchart of the CP-CCA method for frequency recognition in an SSVEP-based BCI. For the same target frequency 
Table 2 lists the classification accuracy and ITR of all subjects in the target recognition task, where the system sent commands at a rate of 4 s per command. A total of 32 trials were required for each subject. The BCI accuracy was evaluated as the ratio of correct commands to total commands. The average classification accuracy in the object recognition task was 91.88
Classification accuracy and ITR in the offline experiment
Snapshots of the proposed brain-controlled robot system performing the move-grab-back task.
Results of the move-grab-back robot control task
Table 3 shows the results of the real-time robot service task. The subjects were asked to perform a move-grab-back task four times using both the manual interface and the BCI. The subjects retrieved the objects in whatever sequence they liked, including Object 1 (O1), Object 2 (O2), Object 3 (O3) and Object 4 (O4). After some practice trials (
In general, most subjects felt that, although the BCI-controlled robot takes longer than manual control to complete the grab task, the user experience is better. Subjects are in a relaxed state when using the system and only need to make decisions instead of controlling the system all the time. Except for subject S5, who failed to grab O4 with BCI control, all subjects eventually grabbed the objects successfully. Figure 10 shows snapshots of subject S3 performing the move-grab-back robot service task for the first time with the proposed brain-controlled robotic system. Obstacles were placed randomly for each subject’s task, and the obstacle positions were the same for the two control interfaces.
In this study, an efficient BCI system was established with limited ITR. We showed how healthy subjects operate a non-invasive SSVEP BCI to control an intelligent humanoid robot to perform move-grab-back domestic service tasks. The result of the offline experiment without the robot movement session showed that the average accuracy of all subjects was 91.88
The main aim of this study was to design a new type of service robot that helps elderly and disabled users grab remote objects with minimal mental load and improves their capacity for self-care. The intelligent humanoid robot can navigate, scan, grab and avoid obstacles, so it can reach any position in the house and grab objects in a complex indoor environment. SSVEP is a periodic response evoked by an external visual stimulus at a constant frequency, and subjects do not need to do anything other than focus on a specific target. Compared with traditional brain-controlled robot systems, this system changes BCI communication from one-command-to-one-motion to one-command-to-multi-motions. For target recognition, we used the special landmark (Naomark) of the robot Nao so that the robot can quickly identify objects and locate them against a complex background, which reduces the image-processing time and improves the overall efficiency of the system. Users only need to make decisions instead of controlling the robot to walk, scan objects, grab objects and avoid obstacles. In other words, the improved interaction system lets them complete complex tasks with a minimum number of commands. Subjects generally agreed that the user interface enables users to feel relaxed because fewer and simpler commands are needed. In this study, a portable, lightweight and wireless EEG device was used to measure EEG signals to further increase the practicality of the proposed system in daily life. Without wires, the subjects could move their bodies freely instead of staying still. User comfort is also improved, because the dry electrodes do not cause discomfort on the scalp.
The high accuracy of the proposed system indicates that the portable, lightweight and wireless EEG device used here can be applied efficiently in SSVEP-based BCI applications.
It should also be noted that subject S5 failed once while grabbing objects, so the system needs to be improved in the following directions. First, the coding and decoding method of the BCI system should be improved to speed up system response and enhance the user experience. Second, the robot needs further development: the program should be optimized and the execution precision improved to achieve a higher success rate. Third, better use should be made of the fusion of additional sensors and actuators, such as an accelerometer and a gyroscope. Provided that the robot keeps its balance and does not fall, the walking speed during navigation and the execution efficiency of the system need to be improved.
Acknowledgments
This work was supported by the Young and Middle-Aged Innovation Talents Cultivation Plan of Higher Institutions in Tianjin (Grant no. 20130830) and the National Natural Science Foundation of Tianjin (Grant no. 18JCYBJC87700).
Conflict of interest
The authors declare that there is no conflict of interest regarding the publication of this manuscript.
