Deaf and hard of hearing in the United Arab Emirates interacting with Alexa, an intelligent personal assistant

Abstract

BACKGROUND:

Intelligent Personal Assistants have been booming around the world since 2014, allowing millions of users to interact with different cloud-based software via speech. Unfortunately, Deaf and Hard of Hearing individuals have been left without meaningful access to such technologies, although these technologies could make their daily routines easier.

OBJECTIVE:

In this research, the researcher studies the interaction and perception of Amazon’s Alexa among the Deaf and Hard of Hearing in the United Arab Emirates in its current setup (the Tap-to-Alexa accessibility option), in addition to Sign Language as an input method. The researcher expands on the Technology Acceptance Model to study the acceptance of Alexa as an assistive technology for the Deaf and Hard of Hearing. Additionally, the researcher discusses more suitable input methods and solutions to allow Alexa, and other Intelligent Personal Assistants, to be more accessible for the Deaf and Hard of Hearing.

METHODS:

A mixed-methods approach is used in this research: primary data were collected through hands-on experiments, surveys, and interviews with Deaf and Hard of Hearing participants.

RESULTS:

The researcher found that the Deaf and Hard of Hearing in the United Arab Emirates perceive that Sign Language combined with a Live interpreter is better than the accessibility option “Tap-to-Alexa”, which is a solution provided by Amazon. The researcher also found that Sign Language combined with a Live interpreter is the most suitable input method to make the device accessible for the Deaf and Hard of Hearing, in addition to translating the “Tap-to-Alexa” to different languages. Finally, the researcher proposes a modification to the Technology Acceptance Model to suit the research study of the Deaf and Hard of Hearing perception of Alexa.

CONCLUSIONS:

The researcher concludes that the ideal scenario for the Deaf and Hard of Hearing to interact and benefit the most from Amazon’s Alexa, and IPAs in general, is to include Sign Language as an embedded input method in the device and provide live interpreters; this sheds light on the importance of the interpreters’ jobs around the world. Additionally, “Tap-to-Alexa” must be translated into different languages for a better perception of the input method.

Keywords

Deaf, Hard of Hearing, intelligent personal assistants, Alexa, interaction

1. Introduction

Not everyone who suffers from hearing loss is identified as part of the Deaf community; people who lose their hearing at a later stage due to different reasons and the ones who were born deaf (prelingual deaf) may never be part of the Deaf community, especially if their parents have normal hearing capabilities and if they attended schools for hearing people (mainstream schools). At the same time, some hearing people may be part of the Deaf community; these individuals are either educators, counselors, or have friends and/or family members who are Deaf [1]. In this research, our focus is on the Deaf (with capital ‘D’) and Hard of Hearing who are part of the Deaf community.

According to Michelle Jay [2], in 1980, Charlotte Baker and Dennis Cokely identified four characteristics that a member of the Deaf community is associated with. If a person fulfills at least half of these characteristics, they are considered to be a member of this community. The first characteristic is Interaction, where a person should interact with the Deaf regularly through attending a residential school (school for the Deaf), attending Deaf events, or having Deaf friends and/or family members. The second is Deafness: a person with any degree of hearing loss fulfills this characteristic, whether they are Deaf or Hard of Hearing. The third is Sign Language, where a person must know sign language and be aware of the signing etiquette. The last is Advocacy, which is fulfilled when a person is passionate about Deaf issues.

Although the terms deaf (with a lowercase ‘d’), Deaf (with a capital ‘D’) and Hard of Hearing are similar to each other, each of these terms has its own definition; the explanations below follow Michelle Jay’s book [2].

“deaf: deaf (with a lowercase “d”) refers to physical deafness – the condition of lacking the sense of hearing to the extent that speech cannot be understood for communication purposes.”

“Deaf: Deaf (with a capital “D”) refers to embracing the cultural norms, beliefs, traditions, attitudes, and values of the Deaf Community.”

“Hard of Hearing (HoH/HH): The term “hard of hearing” refers to people with a hearing loss who can generally use the phone with amplification and can understand most speech.”

1.1 Assistive technology (AT) availability for the Deaf and Hard of Hearing (DHH)

Assistive technology (AT) is not a new concept for people with special needs, nor is it for the Deaf and Hard of Hearing (DHH) specifically. Assistive technology includes a wide range of devices that help and “assist” people with special needs to have a better and easier daily life and to work around some of the challenges they face, although it may have a negative effect on the person’s self-image [3]. These devices can be either software-based, hardware-based, or implants [4]. For the DHH, a huge effort has been put into developing such devices to help them communicate better with the hearing world; therefore, the ATs for the DHH fall under the Information and Communication Technology (ICT) category.

Although the telephone was invented to help people communicate with each other and reduce isolation, the DHH were unable to benefit from it, and they were left out and isolated. However, this barrier started to collapse once the Teletypewriter (TTY) came into the picture, where text is sent and received between devices, with the ability to embed speech recognition in the device to convert speech into text. A few decades later, when the internet started booming, the number of remote video calls started increasing, and the DHH started communicating with each other using sign language [5]. They also started communicating with other DHH and hearing people through text, whether via Short Message Service (SMS), Instant Messaging (IM) or e-mail.

In addition, Closed Captioning (CC), which is the “process of displaying text on visual screen to provide additional interpretative information”, has been evolving in the last few years, and different machine learning applications have been studied and developed to help automate closed captions in different media content [6].

1.2 The current input of Information and Communication Technology (ICT) among the Deaf and Hard of Hearing (DHH)

In the last decade, there have been many attempts to develop ICT-based solutions and e-services to help people with special needs [7], including software that detects signs and translates them to text or speech without the need for gloves in different sign languages, including American Sign Language (ASL), Indian Sign Language (ISL) and Pakistani Sign Language (PSL) [8, 9]. However, despite the efforts of many researchers, the accuracy of these products is not very high; therefore, such software packages are not commonly used in the daily lives of the DHH.

One of the challenges in implementing such software programs in the DHH’s daily lives is that it is very expensive to train them on the vocabulary and grammar of sign languages, which is usually done using machine learning [9]. Also, for the software to detect signs accurately, specific types of cameras that can detect the depth of body joints must be available, such as the Microsoft Kinect camera or the RealSense depth camera [8]. The reason is that Sign Language has five parameters that work together to create a sign, and if one parameter is misinterpreted, the whole meaning of the sign might change [2]. The five parameters are: the handshape used to create the sign, the movement of the handshape, the palm orientation while making the sign, the location of the sign in front of the body, and, finally, the non-manual marker, that is, the facial expression or body language used to create the sign [2]. Therefore, in the absence of highly accurate depth cameras while using Sign-to-Text/Speech software, the sign parameters will not be detected correctly.

Also, nowadays, some educational mobile applications are being developed to help the DHH communicate better with the hearing through learning the basics of very few languages around the world [10, 11]. Despite the fact that technology is still emerging and attempting to remove communication barriers, DHH are still unable to fully communicate with hearing people who do not understand sign language except through writing or texting [5]. The only current input methods that can be used by the DHH while interacting with ICTs are the Teletypewriter (TTY) and Text-to-Speech software.

1.3 The use of Intelligent Personal Assistants (IPAs) among the Deaf and Hard of Hearing (DHH)

For the last decade or so, we have been able to ask smart devices different questions verbally to inquire about the weather and different time zones, to set alarms and reminders, play music, and make calls. We can even give more sophisticated commands, such as asking the machine to turn the lights on and off, and other commands related to the Internet of Things (IoT). These devices are called Voice Assistants or Intelligent Personal Assistants (IPAs) and began to grow about a decade ago, starting with Apple’s Siri which was released in 2010, followed by Microsoft’s Cortana in 2013, Amazon’s Alexa in 2014 and Google’s Assistant in 2016 [12]. According to Ford and Palmer [13], Amazon controls 70% of the IPA market around the world as they sold over 8.2 million Echo devices which run on Alexa since it was introduced in 2014.

These assistants, including the Amazon Echo Show device, which was used in this research, consist of software that is always listening for a specific trigger word, which is their “name”. For instance, Apple devices are continuously listening for “Hey Siri”, and Amazon’s Alexa is always listening for the trigger word “Alexa” to get activated. Once the device hears the keyword, it starts recording the voice and directs it to a server to process the task. Depending on the recorded commands, the server will respond by playing music, calling a specific contact number, answering a question, or any other task [12]. However, since speech is the main factor in using such technology, Deaf or Hard of Hearing (DHH) people have limited access to these devices [14].

Nevertheless, in 2018, a study by researchers in Washington [14] analyzed 346 verified reviews published on Amazon.com and written by Alexa users who are considered to be People with Disabilities. By “Verified Reviews”, the researchers meant that Amazon had confirmed that these reviewers purchased an Amazon device that is compatible with Alexa. Of these reviewers, 4.6% had some hearing loss and were considered either Deaf or Hard of Hearing. The reviewers’ use of Alexa ranged from asking simple questions to testing their speech after speech-therapy sessions. It is believed that the findings of the study were biased when it came to DHH reviewers because they may not have thought of using voice-controlled devices because of their hearing loss [14].

In another study, which took place at Gallaudet University [15], 24 participants tried interacting with Alexa using a gesture system, which was developed by the researchers using American Sign Language (ASL), and Text-to-Speech (TTS). The experiment setup included a Samsung Nexus tablet and a keyboard to test TTS, plus an Amazon Echo Show device and a Logitech webcam to capture gestures for the live interpreter. In all cases, an interpreter was available through a video call to interpret the participants’ signs and gestures. The researchers found that ASL was the most successful input method and, based on that, suggested repeating the experiment with a more diverse sample than Gallaudet University and a more stable setup [15].

Figure 1.

Research design.

1.3.1 Input methods studied for the Deaf and Hard of Hearing (DHH) while interacting with Intelligent Personal Assistants (IPAs)

There are three different input methods that DHH can use to interact with IPAs: Sign Language, Text-to-Speech (TTS), and Deaf Speech.

•
Sign Language: The DHH around the world communicate mainly through Sign Language, whether among themselves or with hearing people; therefore, Sign Language is considered to be quite significant [16, 17]. Moreover, technology has recently adopted Sign Language as an input method, whether through gestures, static signs, numbers, or alphabets, and since ASL is one of the most common Sign Languages around the world, it is the most recognized Sign Language as an input method [18].
•
Text-to-Speech (TTS) Applications: The TTS method is commonly used through mobile applications where a person types the text and presses a button for the speech to be produced. Although this input method has a high level of accuracy, typing the text and waiting for the application to speak is 3–4 times slower than speaking; therefore, this method is considered inefficient when dealing with IPAs [15, 19].
•
Deaf Speech: Currently, Deaf Speech has a high Word Error Rate (WER) (78%) compared to hearing speech (18%) [20]; therefore, IPAs are unable to recognize Deaf Speech. Few of the prelingual deaf (the ones who were born deaf) or the ones who became deaf at an early stage in life are fluent in the spoken language, whether it is English or not [21], and even Deaf speakers with a “good” accent had worse accuracy than the average hearing speech WER [20]. The reason is that IPAs were developed based on a massive database of hearing speech with different accents; for Deaf speech to be recognized by IPAs, there would need to be a similarly vast database of Deaf speech. This is impractical mainly because the number of Deaf speakers is far smaller than the number of hearing speakers; therefore, the accuracy of Deaf speech recognition will remain low. Despite the enormous efforts made using human computation and other solutions, the accuracy of this input method remains poor [19, 22].
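
To make the WER figures quoted for Deaf Speech concrete, the following minimal Python sketch computes WER in its standard form, that is, the word-level edit distance between what was said and what was recognized, divided by the number of words said. The function name and the example command are hypothetical illustrations; they are not taken from [20] and do not reproduce its measurements.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # reference word dropped (deletion)
                curr[j - 1] + 1,     # extra recognized word (insertion)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substituted word and one dropped word out of five.
spoken = "alexa turn on the light"
recognized = "alexa turn of light"
print(f"WER: {word_error_rate(spoken, recognized):.0%}")
```

Under this definition, a WER of 78% means that roughly four out of every five spoken words would need to be corrected, which is why a spoken command is unlikely to be executed as intended.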

2. Research methods

2.1 Study design and sample

In this study, the research sample included 2.8% of the registered Deaf population in the UAE from different organizations for the Deaf. All participants in the sample were knowledgeable in Arabic Sign Language (ArSL) and/or American Sign Language (ASL) and were chosen randomly from three different Emirates. The sample included 39 DHH from Dubai, 23 from Sharjah, and 8 from Ajman. Moreover, a research framework that combined three different use cases was designed to explain the relationship and interaction between the DHH in the UAE and Alexa through a live interpreter and the Tap-to-Alexa accessibility feature as shown in Fig. 1.

2.2 Data collection

The data was collected through a survey and an interview, after conducting an experiment that allowed the DHH to interact with the device.

2.2.1 Experiment

The research experiment was divided into multiple use cases, as follows:

•
Use Case #1: The participants were provided with a list of skills that are understandable by Alexa. They picked a few commands and signed them in ASL/ArSL while a live interpreter was available through a video call. The interpreter translated from ASL/ArSL to English in order for Alexa to understand the commands. This use case is illustrated in Fig. 2.

Figure 2.
Use Case #1.

Figure 3.
Use Case #2.

Figure 4.
Use Case #3.

•
Use Case #2: The second part of the experiment was to have the DHH command Alexa to turn on and off a specific light and to change its color to a certain color. Again, the commands were signed in front of a live interpreter who translated the signs from ASL/ArSL to spoken English. The use case is illustrated in Fig. 3.
•
Use Case #3: The third part of the experiment was to have the DHH interact with Alexa without using sign language. This was done through the new update in Alexa’s accessibility options, “Tap-to-Alexa”: once it is enabled, multiple tiles appear on the screen, allowing the DHH to tap to let Alexa execute different commands. In addition, Alexa Captioning was turned on to let the DHH read any response from Alexa, which they would not be able to hear. This use case is illustrated in Fig. 4.

2.2.2 Survey

For this study, closed-ended questions were provided as a survey for the participants once they completed all use cases in the experiment. The survey was written in both English and Arabic, and interpreted in Sign Language. The survey questions were based on the research conducted at Gallaudet University [15] and on another study of the DHH preference for wearable and mobile sound awareness [23].

Figure 5.

Training and prediction processing.

2.2.3 Interview

Participants were asked for consent to take part in the data collection including an interview after completing the survey mentioned above. The interviews were conducted using Sign Language and with the presence of certified interpreters provided by the supporting organizations to ensure the reliability and accuracy of the responses collected. The data collection process took an average of thirty minutes per participant. At the end of the process the interview responses were interpreted by the interpreters and transcribed for analysis purposes.

The purpose of the interview was to have more detailed feedback from the participants regarding their experience throughout the experiment which will assist the researchers in answering the research questions. Also, results from the interview highlighted any needed improvements to the input methods, the set-up or any other conditions that would result in a better experience for the DHH while interacting with Alexa.

2.3 Ethical considerations

This study was approved by the University of Wollongong in Dubai’s ethical committee. The data in this study was collected anonymously. Before the data collection, the researcher confirmed that the participants understood the purpose of the study and consented to take part in it, and then advised the participants that their participation was voluntary and that they could withdraw at any time. Data collected and notes are stored securely on a web-client at the researcher’s university.

2.4 Analysis

After completing the three stages of the data collection for both groups, the participants’ responses to the survey were analyzed using the statistics software SPSS [24]. Moreover, the responses in the interviews were coded and analyzed using a web-based Artificial Intelligence (AI) software, where the researchers used custom-trained machine learning models to interpret and classify the text based on sentiment analysis, a technique that detects the polarity (negative/positive opinion) of a text entered into the software. This type of analysis uses Natural Language Processing (NLP), which is a subfield of artificial intelligence, and various algorithms [25]. Note that the software reports the results of the AI analysis as positive and negative percentages.

The main algorithm used followed a Rule-Based Approach, which helped identify the polarity of the participant’s opinion by identifying negative and positive words in the collected data, such as “bad” and “ugly” versus “good” and “beautiful”. The second algorithm used followed an Automatic Approach, which relied on machine learning techniques including “training and prediction processes”, as described in Fig. 5. Lastly, a Hybrid Approach was used, which is a combination of the previous algorithms [25].
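
The Rule-Based Approach described above can be illustrated with a minimal Python sketch. It is not the software used in this study: the word lists contain only the four example words given in the text, and the scoring formula (the difference between positive and negative word counts, scaled to a signed percentage) is an assumption chosen to mirror the positive/negative percentages reported in Section 3.

```python
import re

# Illustrative lexicon: only the example words named in the text.
# A real rule-based system would rely on a much larger curated word list.
POSITIVE_WORDS = {"good", "beautiful"}
NEGATIVE_WORDS = {"bad", "ugly"}


def rule_based_polarity(text: str) -> float:
    """Return a signed polarity percentage in [-100, 100] (assumed formula)."""
    words = re.findall(r"[a-z']+", text.lower())
    positive = sum(word in POSITIVE_WORDS for word in words)
    negative = sum(word in NEGATIVE_WORDS for word in words)
    if positive + negative == 0:
        return 0.0  # no opinion words found: treated as neutral
    return 100.0 * (positive - negative) / (positive + negative)


for response in ("The device is good and beautiful",
                 "The screen is bad",
                 "I used the device yesterday"):
    print(f"{rule_based_polarity(response):+.1f}%  {response}")
```

A word-counting rule of this kind cannot handle negation ("not good") or context, which is one reason a rule-based step is usually combined with a trained classifier, as in the Hybrid Approach.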

Moreover, the percentages of the results reported in the Results section (Section 3) were calculated based on the average of the AI analysis of the different responses and the average of the responses of the participants themselves.

3. Results

3.1 The satisfaction of the Deaf and Hard of Hearing (DHH) with the device

The results of the interviews and surveys showed that 93% of the DHH participants were satisfied with the device and the average response of satisfaction was 88%. The reasons behind the high satisfaction percentage included the following: 1) the device is considered as a new technology; 2) the extensive features of the device; 3) the device is considered as an assistive technology for the DHH which adds a number of benefits; 4) the device is easy to use; and 5) there was a live interpreter during the experiment, and Sign Language was an input method for interacting with the device.

Figure 6.

Statistics on the device features.

Of the DHH participants, 82% stated that the concept of IPAs is relatively new to them and that they had never heard of such technologies prior to the experiment. Also, 90% of the DHH participants stated that they are usually interested in new technologies; hence, they were interested in exploring the device in this study. The participants’ comments in this regard were in line with participant 60, who stated that this device gives him a sense of inclusion: “I am satisfied because the advancement in technology is opening doors for us to feel included”; according to the AI sentiment analyzer system, this response scored +90%, indicating a very positive response.

In addition, most of the DHH participants provided positive feedback on the features of the device, and their responses were in line with participant 40, who said: “It’s a nice technology, I liked that I was able to turn on the light and change its colors using sign language, and there are many good features in the device. I found it beneficial” (with AI sentiment analysis output of +99%). However, some of the participants believed that the features of the device were incomplete, as participant 1 responded: “The technology should be complete to enable the Deaf to use it properly, there should be vibrations or light indication from the device to attract the Deaf attention.” (with AI sentiment analysis output of -87.8%).

These results are supported by the responses to the “reason of using the device” question of the survey, where the DHH were asked to select all features that they would use in the device. The list of features included: smart home, simple queries (e.g. weather, news, etc.), more complex queries (e.g. recipes, etc.), entertainment, utilities, or none. The descriptive statistics are shown in Fig. 6.

Moreover, 62.9% of the DHH participants mentioned that they were satisfied with the device because it is beneficial for them in their daily life, and their responses were in line with participant 2, who responded: “The device is helpful for us, the Deaf, especially at work where we are surrounded by hearing people, as it could provide us a lot of information with the help of the interpreter” (with AI sentiment analysis output of +88.4%). These results are supported by the responses to the survey question “Would you obtain this device?”, where results showed that 87% would obtain the device if it had suitable input methods.

Besides, 21.4% of the DHH participants stated that they were satisfied with the device because it is easy to use. Their positive feedback was in agreement with participant 5, who mentioned that the device “looks easy to use, it delivers the information easily” (with AI sentiment analysis output of +99.1%). This is supported by the ease-of-use ratings for both input methods, which averaged 4.34/5 (86.8%) based on the responses to the survey questions.

3.2 The perception of the input methods

To determine how the DHH perceived the input methods tested during the experiment, the participants were asked about the advantages and disadvantages, from their point of view, of both input methods (Sign Language and “Tap-to-Alexa”), and the results were as follows:

3.2.1 Sign Language as an input method

Of the DHH participants, 85.7% listed only advantages of Sign Language, with an average AI sentiment analysis output of +82.38%, while 11.4% of the participants listed both advantages and disadvantages, with an average AI sentiment analysis output of +11.21%; moreover, 2.8% of the participants listed disadvantages only, with an average AI sentiment analysis output of -48%. An illustration of the statistics is shown in Fig. 7.

Figure 7.

Perception of Sign Language as an input method.
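
As a rough consistency check, the reported percentages for Sign Language can be converted back into approximate participant counts. The Python sketch below assumes that all 70 participants (39 + 23 + 8, Section 2.1) answered this question, which the text does not state explicitly; the category labels and function name are illustrative.

```python
# Sample size from Section 2.1: Dubai + Sharjah + Ajman.
SAMPLE_SIZE = 39 + 23 + 8


def implied_count(percent: float, n: int = SAMPLE_SIZE) -> int:
    """Nearest whole number of participants for a reported percentage.

    Assumes every participant in the sample answered the question.
    """
    return round(n * percent / 100)


reported = {
    "advantages only": 85.7,
    "advantages and disadvantages": 11.4,
    "disadvantages only": 2.8,
}
counts = {label: implied_count(p) for label, p in reported.items()}

for label, count in counts.items():
    print(f"{label}: about {count} of {SAMPLE_SIZE} participants")
print("total:", sum(counts.values()))
```

Under that assumption the three categories add up to the full sample, so the small gap between the reported percentages and 100% is consistent with rounding.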

Advantages of this input method from the responses include the fact that “Sign Language is the Deaf’s mother tongue”, as participant 4 stated (with AI sentiment analysis output of +89.8%), and that “The interpreter made the interaction with the device much easier”, as mentioned by participant 10 (with AI sentiment analysis output of +71.9%). It also helped the DHH feel more independent, as participant 32 responded: “I will be able to ask so many questions that occur to my mind and have an interpreter to help me with that, I don’t need any other help” (with AI sentiment analysis output of +99.3%).

However, the disadvantages of the input method included the fact that it requires a strong internet connection in order to have the best video quality, as some of the participants faced a delay in the interpretation due to loss of connection. Also, the interpreter, who is supposed to be certified, must also be a good interpreter who does not keep asking for clarifications and wasting time, because “the whole point of the service is to provide help easily without consuming time”, as stated by participant 49 (with AI sentiment analysis output of -69.1%).

The interview responses were supported by the survey responses in terms of motivation to use, ease of use, and confidence while using the input method: the DHH were asked to rate these three aspects from 1 to 5, where 1 is the lowest rating and 5 is the highest, and the results are shown in Table 1.

Table 1

Perception of Sign Language as an input method

Perception of Sign Language	Average	Percentage
Motivation to use this input method	4.64/5	92.8%
Ease of using this input method	4.67/5	93.4%
Confidence while using this input method	4.94/5	98.8%

3.2.2 Tap-to-Alexa as an input method

As for the new accessibility input method provided by Amazon, “Tap-to-Alexa”, the DHH participants were asked the same interview and survey questions as for Sign Language as an input method. Based on the responses on the advantages and disadvantages of “Tap-to-Alexa”, 43% stated that this input method has advantages only, with an average AI sentiment analysis output of +87.9%, while 20% of the participants specified that the input method has both advantages and disadvantages, with an AI sentiment analysis output of +13.4%; on the other hand, 25.7% of the participants stated that the input method has disadvantages only, with an AI sentiment analysis output of -75.43%. Finally, 11.4% had no comments because they were not yet very familiar with the input method. Figure 8 summarizes the findings.

Figure 8.

Perception of Tap-to-Alexa as an input method.

Figure 9.

Potential suitable input methods.

The advantages of the input method included easily accessible features such as the alarm and the music, ease of learning due to the availability of the icons, and the fact that it will enhance the DHH’s English reading skills because of the closed captions. However, some found the English a bit challenging and felt that the screen and the words should be bigger since Deafness is associated with low vision. For example, participant 25 stated that “There are many good features with “Tap-to-Alexa” but the Deaf here wouldn’t like it, it’s all in English and many of them are weak in English. It would be better if it’s translated to Arabic” (with AI sentiment analysis output of +1.8%). Similarly, participant 16 commented: “Communication through “Tap-to-Alexa” might be an issue for some Deaf because of the language gap, they would have to get help from a family member to use the device, which is not practical.” (with AI sentiment analysis output of -53.3%).

Similar to Sign Language as an input method, discussed previously, the interview responses were supported by the survey responses in terms of motivation to use, ease of use, and confidence while using the input method, and the results are shown in Table 2.

Figure 10.

AI analysis for potential suitable input methods.

Figure 11.

Modified TAM.

Table 2

Perception of Tap-to-Alexa as an input method

Perception of Tap-to-Alexa as an input method	Average	Percentage
Motivation to use this input method	4.15/5	83%
Ease of using this input method	4.25/5	85%
Confidence while using this input method	4.41/5	88.2%
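
The percentage column in Tables 1 and 2 is the mean rating divided by the scale maximum of 5. The short Python sketch below reproduces that conversion from the tabulated averages and prints the gap between the two input methods for each aspect; the function and variable names are illustrative and are not part of the original analysis, which was carried out in SPSS.

```python
def likert_to_percent(mean_rating: float, scale_max: int = 5) -> float:
    """Express a mean rating on a 1..scale_max scale as a percentage of the maximum."""
    return round(100 * mean_rating / scale_max, 1)


# Mean ratings copied from Table 1 (Sign Language) and Table 2 (Tap-to-Alexa).
sign_language = {"motivation": 4.64, "ease of use": 4.67, "confidence": 4.94}
tap_to_alexa = {"motivation": 4.15, "ease of use": 4.25, "confidence": 4.41}

for aspect in sign_language:
    sl = likert_to_percent(sign_language[aspect])
    tap = likert_to_percent(tap_to_alexa[aspect])
    gap = round(sl - tap, 1)
    print(f"{aspect}: Sign Language {sl}% vs Tap-to-Alexa {tap}% "
          f"(gap {gap} percentage points)")
```

On all three aspects Sign Language with a live interpreter is rated higher than “Tap-to-Alexa”, with the largest gap in confidence.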

3.3 Potential suitable input methods for the Deaf and Hard of Hearing to use Alexa

To determine any potential input methods that are suitable for the DHH, the participants were asked to think of some suitable input methods for themselves, and the responses fell under three categories: 1) having a live interpreter through a video call, as experimented in use case #1; 2) having a robot/programmed interpreter through an avatar that is built through machine learning; and 3) using the “Tap-to-Alexa” feature, as the device is currently built.

The results are illustrated in Fig. 9, and the average of the AI sentiment analysis in this regard is illustrated in Fig. 10.

We can conclude from these figures (Figs 9 and 10) that, according to the DHH, having a live interpreter is the most suitable input method for them to interact with the device, while robots or software that translate sign language to another language are not perceived positively because, as participant 1 commented, “The human element is very important in this device, we need a human interpreter to interact better with the device”. On the other hand, the results showed that the current input method “Tap-to-Alexa” on its own is perceived negatively among the DHH, and the comments agreed with participant 49, who responded: “Without Sign Language, I will be back to depending on other people”.

Table 3
Hypotheses

#	Hypotheses	Decision on H-alt
H ${}_{1}$	The Majority of DHH prefer Sign Language as an input method rather than Tap-to-Alexa while interacting	Accept
	with IPAs.
H ${}_{2}$	There is no relationship between Educational Background and the Awareness of IPAs among the DHH.	Accept
H ${}_{3}$	The DHH are more motivated to use the sign language as an input rather than the Tap-to-Alexa.	Accept
H ${}_{4}$	The mean of needing Technical Support among DHH is high regardless of the Input Ease of Use.	Accept
H ${}_{5}$	DHH need to Practice using the input methods of the device regardless of their Satisfaction.	Insufficient evidence
H ${}_{6}$	There is a relationship between the External Variables and the Perceived Usefulness.	Insufficient evidence
H ${}_{\text{6a}}$	There is a relationship between the Hearing Identity and the Perceived Usefulness.	Insufficient evidence
H ${}_{\text{6b}}$	The DHH who prefer sign language as a communication method believe that it is easy to learn how to use	Insufficient evidence
	the device.
H ${}_{\text{6c}}$	Technology Awareness among the DHH significantly affects the Perceived Usefulness.	Accept
H ${}_{7}$	There is a relationship between the External Variables and the Perceived Ease of Use.	Insufficient evidence
H ${}_{\text{7a}}$	There is a relationship between the Educational Background of DHH and the Input Ease of Use.	Accept
H ${}_{\text{7b}}$	The younger generation of DHH finds Alexa easier to use.	Insufficient evidence
H ${}_{\text{7c}}$	There is a relationship between the Hearing Identity and obtaining an IPA device.	Insufficient evidence
H ${}_{\text{7d}}$	There is a correlation between the Technology Skills and the Perceived Ease of Use.	Accept
H ${}_{\text{7e}}$	DHH who are interested in technology are more likely to obtain the device.	Insufficient evidence
H ${}_{\text{7f}}$	DHH who own the device, or a similar one, find its inputs easy to use.	Accept
H ${}_{8}$	There is a significant relationship between Perceived Ease of Use and Perceived Usefulness.	Accept
H ${}_{\text{8a}}$	There is a significant relationship between the Input Ease of Use and its Complexity for the DHH.	Accept
H ${}_{9}$	Perceived Usefulness has a positive effect and impact on Motivation when Alexa is used.	Accept
H ${}_{\text{9a}}$	DHH who believe that other DHH will find it easy to learn had a significant Input Perception.	Accept
H ${}_{10}$	There is a significant relationship between Perceived Usefulness and Intention of Use.	Accept
H ${}_{\text{10a}}$	There is a negative proportion between the Input Complexity and the Reason of Use among the DHH.	Accept
H ${}_{11}$	There is a significant relationship between Perceived Ease of Use and Intention of Use.	Accept
H ${}_{\text{11a}}$	There is a positive correlation between the Input Ease of Use and its Intention of Use among the DHH.	Accept
H ${}_{12}$	There is a correlation between Experience and the Perceived Ease of Use.	Accept
H ${}_{\text{12a}}$	There is a negative correlation between the Input Ease of Use and its Awkwardness for the DHH.	Insufficient evidence
H ${}_{\text{12b}}$	There is a positive correlation between the Input Performance and its Ease of Use for the DHH.	Accept
H ${}_{\text{12c}}$	There is a positive correlation between the DHH Confidence while using the input and its Ease of Use.	Accept
H ${}_{13}$	Motivation can positively and significantly affect Experience in using Alexa.	Accept
H ${}_{\text{13a}}$	There is a positive correlation between the Input Perception and the Confidence while using the device among the DHH.	Accept
H ${}_{14}$	The Intention of Use is significantly related to the System Use.	Insufficient evidence
H ${}_{\text{14a}}$	There is a positive relationship between the Reasons of Use and the Satisfaction among the DHH.	Insufficient evidence

3.4 Validation for the Technology Acceptance Model (TAM)

The Technology Acceptance Model (TAM) is a theory that illustrates how the end user accepts and uses technology. The model mainly consists of three concepts (illustrated in black boxes in Fig. 11): Perceived Usefulness (PU), Perceived Ease of Use (PEOS), and the Intention of Use. Over the years, numerous researchers have modified the model to include more concepts (illustrated in green boxes in Fig. 11), such as External Variables, Motivation, and Experience, which are linked to the main concepts.

In this research, we expanded the TAM framework to include different variables, illustrated in circles in Fig. 11, that can be used to reflect the DHH as end users of such systems.

Therefore, based on the data analysis and the results, we propose a new modification of the TAM based on 14 main hypotheses and their test results. The hypotheses and the corresponding acceptance/rejection decisions are shown in Table 3. The research excluded the hypotheses with insufficient evidence to accept or reject (below 0.1), illustrated in purple circles in Fig. 11, and retained those with sufficient evidence to accept, illustrated in blue circles in Fig. 11. The final version of the modified framework is shown in Fig. 11.
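As a minimal sketch of the retain/exclude step described above, the main TAM hypotheses (H6 to H14) from Table 3 can be encoded as links between concepts and filtered by their decision. The tuple layout, the variable and function names, and the reading of each hypothesis as a simple link between two concepts are illustrative assumptions; they are not part of the study's own analysis.

```python
# Main TAM hypotheses H6-H14 as listed in Table 3:
# (hypothesis, concept A, concept B, decision).
# The tuple layout and all names below are illustrative assumptions.
HYPOTHESES = [
    ("H6", "External Variables", "Perceived Usefulness", "Insufficient evidence"),
    ("H7", "External Variables", "Perceived Ease of Use", "Insufficient evidence"),
    ("H8", "Perceived Ease of Use", "Perceived Usefulness", "Accept"),
    ("H9", "Perceived Usefulness", "Motivation", "Accept"),
    ("H10", "Perceived Usefulness", "Intention of Use", "Accept"),
    ("H11", "Perceived Ease of Use", "Intention of Use", "Accept"),
    ("H12", "Experience", "Perceived Ease of Use", "Accept"),
    ("H13", "Motivation", "Experience", "Accept"),
    ("H14", "Intention of Use", "System Use", "Insufficient evidence"),
]


def split_by_decision(hypotheses):
    """Separate hypotheses retained in the framework from those excluded."""
    retained = [h for h in hypotheses if h[3] == "Accept"]
    excluded = [h for h in hypotheses if h[3] != "Accept"]
    return retained, excluded


retained, excluded = split_by_decision(HYPOTHESES)

# Links that would remain in the modified framework.
for name, concept_a, concept_b, _ in retained:
    print(f"{name}: {concept_a} <-> {concept_b}")
```

Under this reading, the links for H8 to H13 remain in the modified framework, while H6, H7, and H14 are set aside for lack of evidence, which matches the decisions listed in Table 3.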

4. Discussion

4.1 The situation between the Deaf and Hard of Hearing (DHH) and Alexa

Previous research has shown that 52% of hearing users were satisfied with Alexa, while 42% were neutral [14]. However, since IPAs are voice-activated devices, they are not well known among the DHH; therefore, the interaction between the DHH and IPAs has not been studied in depth in the literature. Nevertheless, through this research, we concluded that the majority of the DHH (93%) were satisfied with the device, mainly because they found it beneficial and interesting.

Moreover, one of the reasons behind this satisfaction is that the device is considered a new technology by the DHH, which sheds light on the lack of awareness of IPAs among them. According to this study, the majority of the DHH participants (92%) had no experience with the device before the experiment. This is not aligned with the findings of Rodolitz’s study [15], which reported that 58% of the same group were familiar with the technology. The difference between these percentages seems to be due to geography, since this study took place in the UAE and Rodolitz’s study took place in the US, and we can conclude that the accessibility of assistive technology for the DHH varies between the two countries. On the other hand, a 2018 study of verified users of the Amazon Echo device found that “despite being highly accessible, challenges still arose, particularly for people with speech impairments and for users with hearing loss” [14]. That was mainly due to the ineffective input method that accompanied the device; when that research took place, the accessibility option “Tap-to-Alexa” was not yet among the input methods provided by Amazon.

Furthermore, according to our study, “Tap-to-Alexa” was not perceived as positively as Sign Language, for several reasons. Some participants pointed out that the language gap, and the need for assistance to use the device when sign language is not a valid input method, are problems for them, while others stated that the device is rather small for such an input method and is not clear enough. That said, many DHH perceived this input method positively, although the AI sentiment analysis showed a less positive perception of “Tap-to-Alexa”. Finally, since “Tap-to-Alexa” is the newest input method provided by Amazon, the literature does not yet provide any evidence on interaction with this input method.

In addition, the DHH participants in this study reported being more motivated to use Sign Language as an input method than “Tap-to-Alexa”: the survey results were 92.8% for motivation to use Sign Language compared with 83.2% for “Tap-to-Alexa”. The DHH were also more confident using Sign Language (98.8%) than “Tap-to-Alexa” (88.2%) while interacting with the device. These results are aligned with Rodolitz’s research in 2019 [15]: although that study tested input methods other than “Tap-to-Alexa”, the DHH interacted better with Sign Language, showing that this is the most suitable input method for the DHH.

4.2 Improvements for better perception and interaction between the Deaf and Hard of Hearing (DHH) and Alexa

Shortly after IPAs started booming, many researchers began studying different input methods that would suit the DHH [15, 19, 22]. While many input methods, such as Text-to-Speech (TTS), deaf speech, and gesture recognition, failed [15, 19, 22], some researchers suggested studying other possible input methods that would suit the DHH best [15, 19].

In this research, we believed that the best way to search for alternative input methods was to ask the DHH to imagine how they would be most comfortable interacting with IPAs after testing the current input methods provided for them (Sign Language through a live interpreter and Tap-to-Alexa). Even if we thought of programming sign-recognition or text-to-speech software, the DHH might not perceive or accept it as we expect. Therefore, during the interview, we gave the participants an open space to suggest the best way they would interact with IPAs.

According to the interview responses and the analysis, which was conducted using AI, the participants’ responses fell under four suggestions: a live interpreter; a robot or a program that recognizes signs; the current input method; and a change in the current language of the system. As expected, most of the DHH believed that having a live interpreter is the ideal solution, essentially because sign language is a combination of hand movement, facial expression, and body language [2]. As one of the participants stated, “The human element is very important”. This is in agreement with other research results in which the participants were more comfortable using Sign Language as an input method [15].

Furthermore, the responses regarding the other alternative input methods were classified as negative by the AI analyzer tool used in this study. Since most of the participants were in favor of the live interpreter, many of them thought that having a robot or a program that recognizes signs is a disadvantage for them because it removes the human element (the feelings) from their language, which, as stated previously, is one of the pillars of a complete sign language. This suggests that although a lot of research and development effort has been put into creating Sign Gloves and other sign-recognition software [6, 7, 18], the DHH might not perceive it well, at least not in this region.

Moreover, although Amazon has put great effort into creating the accessibility option “Tap-to-Alexa” and closed captions, only some DHH perceived it positively; others did not, mainly because they faced issues with the language of the system. Therefore, based on the results and the analysis, we believe that unless the current method is translated into Arabic and combined with a live interpreter, the DHH in this region will not perceive it positively or interact with it as expected.

4.3 Limitations

There are a number of limitations in our current study. First, we were unable to include Sign Gloves as an input method, since they are still under development and are unavailable in the region. Second, Alexa was not tested on commands for controlling complex devices, as the experimental set-up linked Alexa only to lights as Internet-of-Things (IoT) devices. Moreover, internet connection issues arose during the experiment, which affected the video call quality for some participants and may have affected the results. Lastly, since the data collection took place in public places, we had minimal control over background noise and other environmental conditions, which could have been handled better.

4.4 Further research

The research did not take into consideration all the possible input methods, such as TTS and sign-recognition software. Therefore, we encourage further investigation of the perception of IPAs among the DHH in different parts of the world using “Tap-to-Alexa” and other input methods, compared with Sign Language as an input method. Moreover, we encourage researchers to study the perception and acceptance of robots, sign-recognition software, and sign gloves among the DHH in different parts of the world, and not to rely only on the possibility of developing the solution.

We also recommend that Amazon further investigate including interpreters and closed captions/subtitles in different applications of Amazon Echo Show, such as the news. Additionally, we recommend studying the possibility of embedding live interpreters who can be connected to the device at the click of a button, noting that there are different sign-centric call centers available that could become third-party partners for Amazon.

Additionally, despite our negative results among the DHH for a programmed interpreter implemented through an avatar using machine learning, we recommend building such a solution in the future to reduce the dependency on the availability of live interpreters, noting that this would be a great innovation in accessibility technology for the DHH. Avatars could possibly mimic the interpreters’ body movements and gestures, including facial expressions, which might eliminate any lack of feelings or missing human elements in the communication.

Finally, further research should be conducted on applying TAM to the DHH and IPAs using the same variables, in addition to including and validating different variables, such as environmental preparation. TAM should also be applied to DHH who own an IPA, to validate the satisfaction concept of the model.

5. Conclusions

In an era where hearing people interact with smart devices through speech [12], the DHH are left out and do not benefit from this new technology. However, to reach better inclusion in society, there must be a middle ground between the two extremes of either being able only to speak to a smart device or not being able to interact with it at all. Therefore, in this research, we aimed to study (i) whether IPAs are feasible for the DHH in the UAE, by measuring the perception and interaction between the DHH and Alexa, and (ii) alternative input methods to help the DHH interact better with Alexa.

By measuring two input methods, Sign Language with a live interpreter and the accessibility option provided by Amazon, “Tap-to-Alexa”, we conclude that the DHH in the UAE interact better with and prefer Sign Language with a live interpreter over “Tap-to-Alexa”, because they are more motivated and confident while using Sign Language as an input method. The results also indicated that the Sign Language input is easier to use and to learn among the DHH. In contrast, most of the DHH found interacting with “Tap-to-Alexa” more awkward for several reasons, including device size, interface dimensions, and the language gap. However, the DHH in the UAE perceived the device, in general, positively, mainly because of its features, which can ease their daily routine and make them feel more independent and included.

Furthermore, we conclude that the ideal scenario for the DHH to interact with and benefit the most from Alexa, and IPAs in general, is to include Sign Language as an embedded input method in the device and provide live interpreters, which sheds light on the importance of the interpreters’ jobs around the world. Additionally, “Tap-to-Alexa” must be translated into different languages for a better perception of the input method. Also, despite the respected efforts put into developing sign-recognition software, we believe that the DHH might not benefit from the technology, although it might be handy in other situations.

Lastly, we conclude that TAM can help us understand the perception, experience, motivation, and intention of using technology among the DHH, and that different variables can be added and connected to different concepts in TAM, such as the educational level, the input’s ease of use, complexity, ease of learning, performance, perception, reason of use, and the confidence while using the input.

Acknowledgments

We would like to thank the teams at the Dubai Club for People of Determination and Sharjah City for Humanitarian Services, including Al-Amal School for the Deaf, for their ultimate support during the data collection and for providing us with the participants, interpreters, and a suitable environment for this research.

Conflict of interest

The researcher reports no conflict of interest.

References

1. Higgins. Outsiders in a hearing world. Urban Life. 1979 Apr; 8(1): 3–22.

2. Jay. Don’t just “sign” – communicate!: a student’s guide to ASL and the deaf community. Los Angeles, Ca: Judea Media; 2011.

3. Pedersen, Söderström, Kermit. Assistive activity technology as symbolic expressions of the self. Technology and Disability. 2019; 31(3): 129–40.

4. Abdallah, Fayyoumi. Assistive technology for deaf people based on android platform. Procedia Computer Science. 2016; 94: 295–301.

5. Maiorana-Basas, Pagliaro. Technology use among adults who are deaf and hard of hearing: a national survey. Journal of Deaf Studies and Deaf Education. 2014 Mar 24; 19(3): 400–10.

6. Catalano, Crimmins, Tsafasman, Werner. SMARTCLOSED CAPTION POSITIONING SYSTEM FOR VIDEO CONTENT [Internet]. 2019. Available from: https://patentimages.storage.googleapis.com/d5/b3/03/fc3908ca7755cc/US10235997.pdf.

7. Manzoor, Vimarlund. E-services for the social inclusion of people with disabilities: a literature review. Technology and Disability. 2017 Jul 7; 29(1–2): 15–33.

8. Halim, Abbas. A kinect-based sign language hand gesture recognition system for hearing- and speech-impaired: a pilot study of pakistani sign language. Assistive Technology. 2014 Oct 6; 27(1): 34–43.

9. Heera, Murthy, Sravanti, Salvi. Talking hands – An Indian sign language to speech translating gloves. 2017 International Conference on Innovative Mechanisms for Industry Applications (ICIMIA). 2017.

10. Bala, Song. Android app for improvising sign language communication in english and hausa. International Journal of Advances in Scientific Research and Engineering. 2020; 6(2): 15–24.

11. Kurniawan, VRB, Wijayanti. A House of Quality (HOQ) matrix of assistive technology for deaf students at elementary school to enhance basic-level language competencies. Journal of Physics: Conference Series. 2020 Jan; 1456: 012040.

12. Hoy. Alexa, siri, cortana, and more: an introduction to voice assistants. Medical Reference Services Quarterly. 2018 Jan 2; 37(1): 81–8.

13. Ford, Palmer. Alexa, are you listening to me? An analysis of alexa voice service network traffic. Personal and Ubiquitous Computing. 2018 Jun 28; 23(1): 67–79.

14. Pardhan, Mehta, Findlater. “Accessibility Came by Accident”: Use of Voice-Controlled Intelligent Personal Assistants by People with Disabilities. Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (CHI’18). 2018.

15. Rodolitz, Gambill, Willis, Vogler, Kushalnagar. Accessibility of voice-activated agents for people who are deaf or hard of hearing. Journal on Technology and Persons with Disabilities. 2019; (7): 144–56.

16. Khan, Mehdi. Sign Language Recognition using Sensor Gloves. [Internet]. 2002. Available from: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.216.456&rep=rep1&type=pdf.

17. Chai, Lin, Tang, Chen, Zhou. Sign Language Recognition and Translation with Kinect. [Internet]. 2013. Available from: http://vipl.ict.ac.cn/uploadfile/upload/2013123111173937.pdf [Accessed 20 Oct. 2019].

18. Ahmed, Zaidan, Salih, Lakulu, MM bin. A review on systems-based sensory gloves for sign language recognition state of the art between 2007 and 2017. Sensors [Internet]. 2018 Jul 9; 18(7): 2208. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6069389/.

19. Glasser, Kushalnagar. Deaf, hard of hearing, and hearing perspectives on using automatic speech recognition in conversation. Proceedings of the 19th International ACM SIGACCESS Conference on Computers and Accessibility. 2017. pp. 427–432.

20. Glasser. Automatic Speech Recognition Services: Deaf and Hard-of-Hearing Usability. [Internet]. 2019. Available from: https://arxiv.org/pdf/1909.02853.pdf.

21. McCleary. Technologies of language and the embodied history of the deaf. Sign Language Studies. 2003; 3(2): 104–24.

22. Bigham, Kushalnagar, Huang, Flores, Savage. On How Deaf People Might Use Speech to Control Devices. Proceedings of the 19th International ACM SIGACCESS Conference on Computers and Accessibility. 2017. pp. 383–384.

23. Findlater, Chinh, Jain, Froehlich, Kushalnagar, Lin, Carey. Deaf and Hard-of-hearing Individuals’ Preferences for Wearable and Mobile Sound Awareness Technologies. [Internet]. 2019. Available from: http://faculty.washington.edu/leahkf/pubs/CHI2019_WearableSoundSurvey.pdf.

24. SPSS Software [Internet]. www.ibm.com. Available from: https://www.ibm.com/ae-en/analytics/spss-statistics-software.

25. Sentiment Analysis [Internet]. MonkeyLearn. 2018. Available from: https://monkeylearn.com/sentiment-analysis.

26. Dubai Club for the Disabled [Internet]. dcd.org.ae. Available from: http://dcd.org.ae/dcdnew/index-en.php.

27. Sharjah City for Humanitarian Services [Internet]. www.schs.ae. Available from: https://www.schs.ae.
