Personalized privacy assistant for digital voice assistants: Case study on Amazon Alexa

Abstract

Advances in modern technology have enabled a range of digital devices that are controlled and activated by gestures, touch and even voice. Google Assistant, Apple Siri and Amazon Alexa are among the most popular voice-enabled assistants and have attracted wide attention from users of digital gadgets. Their use makes life easier and more comfortable. The other side of these smart devices is a serious risk to privacy, which arises because they listen to the user continuously and transmit data over a public network to third-party services. The work proposed in this paper attempts to address this privacy problem in voice-enabled devices. The main idea is to incorporate an intelligent privacy assistant that acts on the user’s preferences over their data.

Keywords

Privacy, voice assistants, privacy assistant, privacy word bank

1. Introduction

Research on smart devices can provide a “common platform” on which people of all abilities can carry out their activities in a more accessible and convenient manner. Earlier, only wealthier people could afford digital assistants to manage their workplace and household tasks. New Artificial Intelligence algorithms [1] have facilitated more Natural User Interfaces (NUIs) [2] and produced new digital assistants at an unprecedentedly low cost. As a result, more people can use these devices in a variety of ways.

For example, imagine a smart home equipped with hands-free technologies so that people with cognitive disabilities can use its facilities and remain more independent. Elderly people may also feel safe and comfortable in such a smart home, where their health and movement around the home are carefully monitored and alerts are sent to emergency services when required [3, 4].

Another significant invention of this era is the speech-based NUI, which could potentially replace a human assistant. Alexa, Siri, Cortana and Google Assistant [2, 19, 20, 21, 22] are some of the Intelligent Personal Assistants (IPAs) available in the market today. Their embedded system contains a microphone that waits for a wake-up word such as “Alexa” or “Hello Google” before engaging with the user’s conversation. Recently Amazon updated Alexa so that users no longer need to say the wake-up word [5] before every utterance if follow-up mode is enabled. However, when these devices are in continuous listening mode, a lot of personal information is collected, aggregated and analyzed all the time.

This situation poses a potential threat to the user’s privacy [5]. An article in Wired magazine was titled “Amazon’s Next Big Business Is Selling You”, pointing to the company’s large-scale collection of customer data through various means [6]. Although companies maintain that they have strong privacy policies, mitigating this potential privacy threat remains a challenge for researchers.

The rest of this paper is organized as follows. Section 2 describes the related work, followed by the proposed model in Section 3. Section 4 gives the experimental setup, and Section 5 discusses the results. The conclusion is given in Section 6.

2. Related work

The work in [2] studied various speech-based natural user interfaces. It compared Smart Personal Assistants (SPAs) along several dimensions, such as shopping and buying, travel and entertainment, and administrative assistance, for correctness and naturalness. The discussion showed how voice assistants could become part of everyday human life. Virtual Personal Assistants (VPAs) rely on voice channels and are vulnerable to various attacks because they lack proper authentication [7]. That work studied two new attacks, namely voice squatting and voice masquerading. Its in-depth study revealed various user misconceptions about how these assistants function, and it contributed automatic detection of realistic threats, which drew the attention of Amazon and Google.

The report in [8] described how an Amazon Echo sent a private conversation to one of the user’s saved contacts. The report also claimed that Amazon’s smart speaker is continuously listening and records much more than the company claims.

The report in [9] described how a company like Amazon uses user data as its main source for building a “360-degree view” of every customer, so that it can reach them with increasingly personalized advertisements and recommendations.

Another report [10] said that the Alexa data services team could access location data during command auditing. This is a kind of privacy violation and calls user trust into question. An article [11] conveyed the potential privacy concerns of smart speakers. It mentioned that, according to Google, the inferences made while processing user queries help third parties benefit customers by providing customized recommendations.

A report in [12] highlighted the varied opinions people hold about digital assistants. Despite their privacy worries, people also said that they want to perform voice-based tasks. It is therefore clear that customers need digital assistants that protect data privacy.

Figure 1.

Study of user concerns over digital assistants.

Another report [13] stated that smart speakers are not smart enough to keep our data private, because no intelligence is built into them to look after the user’s privacy. The work in [14] examined the insecurity of home digital voice assistants, using Amazon Alexa as a case study. It argued that there should be an additional level of authentication when using Alexa, and proposed physical presence authentication before a conversation with Alexa is initiated.

3. Proposed model for intelligent privacy assistant (IPA)

IoT-based applications are becoming increasingly ubiquitous, and the more they are used, the greater the risk. IoT relies on sensors that collect huge amounts of data, which may in turn lead to legal issues such as violations of people’s privacy, and to security issues such as side-channel attacks and weak data authentication. IoT also fails to provide secure communication in some cases.

The proposed model aims to protect the user’s privacy while still allowing IoT devices to be used to their full extent. The Intelligent Privacy Assistant (IPA) framework divides the actions (functionalities of voice assistants) into two types: public tasks, which do not involve any leak of the user’s privacy, and private tasks, which do.

Figure 2.

Intelligent privacy assistant.

Public tasks such as “what’s the weather in New York?” can be carried out without any problem, as they do not compromise the user’s privacy. Private tasks, such as transmitting a private conversation of the user, may be triggered intentionally or performed accidentally by the voice assistant, and must be stopped before data leaks. Every data packet sent by the voice assistant passes through the framework, which checks the task against the predefined user preferences. If the task is identified as public, the packet is sent on for computation; if it is identified as private, the task is discarded immediately and the user is notified. In this way the proposed model introduces a privacy layer that performs sensitivity analysis on the data transmitted through voice assistants. It thereby provides security and also prevents privacy leaks caused by bugs in the voice assistant.
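The routing decision described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Decision`, `route_task`, `PRIVATE_TERMS` and `mentions_private_term` are hypothetical, and the simple keyword check stands in for the sensitivity analysis that the framework performs against the user preferences.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Decision:
    forwarded: bool  # True if the task is sent on for computation
    reason: str      # human-readable explanation for the user


def route_task(text: str, is_private: Callable[[str], bool]) -> Decision:
    """Forward public tasks; discard private ones and report why."""
    if is_private(text):
        return Decision(False, "private task discarded, user notified")
    return Decision(True, "public task forwarded for computation")


# Hypothetical user preference: utterances containing these terms are private.
PRIVATE_TERMS = {"income", "bank", "disease"}


def mentions_private_term(text: str) -> bool:
    """Stand-in sensitivity check: exact keyword lookup on lowercase words."""
    words = set(text.lower().replace("?", " ").split())
    return not words.isdisjoint(PRIVATE_TERMS)


print(route_task("What's the weather in New York?", mentions_private_term))
print(route_task("My income is low this month", mentions_private_term))
```

Passing the privacy check as a function keeps the routing step independent of how sensitivity is actually measured.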

4. Experiment design

4.1 Use case

Bob is a person who likes to live in an ambient living environment. Intelligent voice assistants, made possible by Artificial Intelligence, let him live in a more comfortable and virtually connected world. He gives commands to a voice assistant simply by saying a wake-up word such as “Hello Alexa” or another customized word. The commands are forwarded to a command-processing cloud service, and a response is returned to him accordingly.

In this use case, the user is expected to state his privacy preferences, which are used to prepare a private word bank. Based on these preferences, two views of data processing are created. One is a public data view, which does not lead to any data breach; the other is a private data view, which may lead to a data breach and hence a privacy violation according to the preferences the user has given for his data.

Whenever the user issues a command, it is first visible to the Privacy Assistant framework. Here, the sensitivity analysis of the given command is performed against the private word bank.

If the analysis finds that the query would reveal the user’s private data, the privacy assistant immediately alerts the user; otherwise the query is forwarded to the voice assistant.

This conceptual system flow is depicted in Fig. 3. The model aims to let the user work comfortably with the digital assistant without worrying about data privacy. To this end it employs a systematic approach to calculate the privacy score of a given query. Table 1 summarizes user opinions from a sample survey conducted among 1000 Alexa users, and Table 2 reports the evaluation results of the privacy assistant service.

Table 1
Survey on Alexa users

Survey on Alexa users	Yes	No
Are you aware of how data can be breached with voice assistants?	12%	88%
Do you have an idea of how and for what purpose your data is being used?	10%	90%
Do you have a privacy concern while using voice assistants?	84%	16%
Will you feel comfortable receiving privacy alerts?	80%	20%
Are you willing to invoke a personalized privacy assistant?	81%	19%
Are you willing to do an unsuccessful quit in case of a privacy violation?	77%	33%

Table 2

Evaluation results of private utterance with personalized privacy assistant (PPA) service

Device	# of users	# of Words in Private data bank	# of private word occurrences	% of successful services	% of unsuccessful quits due to data breach	Avg requests killed on privacy alert	Avg requests forwarded on privacy alert
Alexa	10	5000	200	73%	27%	65%	35%
Google	10	2000	250	70%	30%	68%	32%

Figure 3.

System flow diagram.

4.2 User word bank

Here the users are given the option to add words that they feel leak their privacy. The system then uses this word bank throughout the procedure. The system also maintains a global word bank, which is the collection of all users’ word banks.
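As a minimal sketch, the relation between individual and global word banks can be modelled with Python sets. The helper names `build_global_bank` and `suggestions_for`, and the sample users and words, are hypothetical and used only for illustration.

```python
from typing import Dict, Set


def build_global_bank(user_banks: Dict[str, Set[str]]) -> Set[str]:
    """Global word bank: the union of every user's private word bank."""
    bank: Set[str] = set()
    for words in user_banks.values():
        bank |= words
    return bank


def suggestions_for(user: str, user_banks: Dict[str, Set[str]]) -> Set[str]:
    """Words other users treat as private that this user has not listed."""
    return build_global_bank(user_banks) - user_banks[user]


# Hypothetical banks for two users.
banks = {
    "alice": {"income", "disease"},
    "bob": {"income", "bank account"},
}
print(sorted(build_global_bank(banks)))
print(suggestions_for("alice", banks))
```

The difference between the global bank and a user's own bank gives candidate words that the user may want to review.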

4.3 WordNet

WordNet [15] is described as “a large lexical database of English. Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept. Synsets are interlinked by means of conceptual-semantic and lexical relations.” Using the WordNet module, the system collects all the words that are similar to the words given by the user and stores them in the user word bank.

4.4 Stemming

Next, the system applies the Porter stemming algorithm [16] to reduce words to their root form. This makes matching more effective when the privacy score is computed.

4.5 Privacy score

When the user gives a command to the smart assistant, the system first detects the command and tokenizes it, then applies the stemming algorithm to reduce the words to their root form. It then matches each stemmed word against the words in the user word bank, and each match adds a fixed amount to the score (100 per matching word).

For matching of words, the system utilizes the N-Gram matching technique [17].

4.6 N-Gram matching

In general, basic word-to-word matching can be used to compare strings, but such techniques fail in cross-language settings. Because the system relies on voice-to-text modules, the text may be transcribed in an accent different from the user’s, which creates the problem of cross-lingual spelling variants; basic string matching does not give good results in this scenario. For this reason the proposed framework uses an N-Gram matching technique, which gives better results in this context by comparing an input word with all the path words up to the root. As a result more private words can be caught, which increases data protection.

Figure 4.

Occurrence of private words.

Figure 5.

Comparison over individual and global word bank.

N-Gram [18] follows the probabilistic simplification given in Eqs (1) and (2).

$\displaystyle P(w_{n}\mid w_{n-1})=\frac{C(w_{n-1}w_{n})}{\sum_{w}C(w_{n-1}w)}$ (1)
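Reading Eq. (1) as the standard maximum-likelihood bigram estimate (the count of the pair divided by the count of all pairs that begin with the previous word), a minimal Python sketch is given below. The function name `bigram_probability` and the toy token sequence are assumptions made for illustration.

```python
from collections import Counter
from typing import List


def bigram_probability(tokens: List[str], prev: str, word: str) -> float:
    """Estimate P(word | prev) from bigram counts in a token sequence."""
    bigrams = Counter(zip(tokens, tokens[1:]))
    # Denominator: total count of bigrams whose first word is `prev`.
    prev_total = sum(c for (first, _), c in bigrams.items() if first == prev)
    if prev_total == 0:
        return 0.0  # `prev` never occurs as a first word
    return bigrams[(prev, word)] / prev_total


tokens = "my income is my income today".split()
print(bigram_probability(tokens, "my", "income"))   # 2 of 2 bigrams after "my"
print(bigram_probability(tokens, "income", "is"))   # 1 of 2 bigrams after "income"
```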

Similarity values computation:

$\displaystyle\text{SIM}(N1,N2)=|N1\cap N2|/|N1\cup N2|$ (2)

where $N1$ and $N2$ are the $n$-gram sets of two different words. The system uses this existing $n$-gram matching technique to detect the user’s private word occurrences in his individual word bank and in the global bank.
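The similarity in Eq. (2) can be sketched as follows. The paper does not state the value of $n$ or whether the n-grams are padded, so this example assumes unpadded character bigrams ($n=2$); `char_ngrams` and `sim` are hypothetical helper names.

```python
from typing import Set


def char_ngrams(word: str, n: int = 2) -> Set[str]:
    """Set of character n-grams of a word (assumption: no padding)."""
    w = word.lower()
    if len(w) < n:
        return {w} if w else set()
    return {w[i:i + n] for i in range(len(w) - n + 1)}


def sim(a: str, b: str, n: int = 2) -> float:
    """SIM(N1, N2) = |N1 ∩ N2| / |N1 ∪ N2| over the two n-gram sets."""
    n1, n2 = char_ngrams(a, n), char_ngrams(b, n)
    union = n1 | n2
    if not union:
        return 0.0
    return len(n1 & n2) / len(union)


print(sim("income", "income"))   # identical words
print(sim("income", "incom"))    # spelling variant shares 4 of 5 bigrams
print(sim("income", "weather"))  # no shared bigrams
```

A spelling variant such as "incom" still scores well above 0.5 against "income", which is the behaviour the matching step relies on for transcription variants.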

Privacy assistant mechanism (PAM)

Input: user command (voice input), user word bank (N1), threshold $=$ 450 (default value; can be changed by the user)

Output: ALERT

Start:

Step 1: Convert the voice command into textual data (D) and initialize the privacy score ($S=0$)

Step 2: Tokenize the command (D $\rightarrow$ D$^{1}$)

Step 3: Perform stemming on the tokenized sentence (D$^{1}$ $\rightarrow$ N2)

Step 4: Calculate the privacy score using n-gram matching. For each value in N2, compute $\text{SIM}(N1,N2)=|N1\cap N2|/|N1\cup N2|$; if (SIM $>$ 0.5) then increase S by 100

Step 5: If (S $>$ threshold) alert the user; else process the command and perform the action

Stop
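The steps of the mechanism can be sketched end to end in Python. This is a minimal illustration, not the authors' implementation. It assumes the command has already been converted to text, replaces Porter stemming with a crude suffix stripper (`crude_stem`), uses unpadded character bigrams for SIM, and treats every word bank entry as a single word. All function names are hypothetical.

```python
import re
from typing import Iterable, List, Set


def tokenize(text: str) -> List[str]:
    """Step 2: split the transcribed command into lowercase word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def crude_stem(word: str) -> str:
    """Step 3 stand-in: strips a few suffixes; NOT the Porter algorithm."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def char_ngrams(word: str, n: int = 2) -> Set[str]:
    """Character n-grams of a word (assumption: bigrams, no padding)."""
    if len(word) < n:
        return {word} if word else set()
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def sim(a: str, b: str) -> float:
    """SIM = |N1 ∩ N2| / |N1 ∪ N2| over the n-gram sets of two words."""
    n1, n2 = char_ngrams(a), char_ngrams(b)
    union = n1 | n2
    return len(n1 & n2) / len(union) if union else 0.0


def privacy_score(command: str, word_bank: Iterable[str],
                  sim_cutoff: float = 0.5, points: int = 100) -> int:
    """Step 4: add `points` for each token similar to a bank entry."""
    bank = [crude_stem(w.lower()) for w in word_bank]
    score = 0
    for token in tokenize(command):
        stem = crude_stem(token)
        if any(sim(stem, entry) > sim_cutoff for entry in bank):
            score += points
    return score


def check_command(command: str, word_bank: Iterable[str],
                  threshold: int = 450) -> str:
    """Step 5: alert if the score exceeds the threshold, else forward."""
    return "ALERT" if privacy_score(command, word_bank) > threshold else "FORWARD"


bank = ["income", "bank", "account", "conversation", "savings"]
print(check_command("My income and your income go to the savings bank account", bank))
print(check_command("What is the weather in New York", bank))
```

In the first command five tokens match the bank, so the score of 500 exceeds the default threshold of 450; the second command matches nothing and is forwarded.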

Figure 6.

Data vs privacy leak.

5. Results

To illustrate the results, a survey was conducted among 10 people, and a word bank of about 5000 words that they consider private was collected cumulatively. Each of the ten users has their own private word bank of nearly 1000 words. Using that word bank, the proposed system ran alongside Amazon Alexa, as it was not fully integrated with Alexa. The proposed system listened continuously and identified the private words specified by the users. The graphs below show the analysis of the system.

Fig. 4 shows each user’s usage of private words, identified from their respective private word bank, in common conversations among all the participants, expressed as the percentage of private word occurrences in their regular conversation. For some users, the occurrence of private words in fact crossed the accepted threshold.

Fig. 5 shows a particular user’s usage of private words identified from their own word bank and from the other users’ private word banks, together termed the global private word bank. Users differ in which words they consider private, so, because the global bank contains every user’s private bank, each user’s usage is compared against the global bank. This helps a user add words to or delete words from his word bank with the global perspective in mind.

Figure 6 illustrates the behavioral pattern of a particular user with respect to his data leakage and privacy line. After repeated privacy alerts, the user was observed to behave more consciously with respect to his data.

5.1 Sample user dataset words

Record, conversation, income, disease, daily routine, send, details, bank account, etc.

5.2 Sample conversation

“Hey, Alexa, we need to figure out what our budget is this month, so sit down here; we need to have a conversation on how we could save money. My income is xxx amount, your income is xxx, and every month we are spending xxx amount of money from our savings bank account, which should be lowered by some means”

In the above passage, the underlined and bold words represent the private words from the dataset created by the user. The algorithm processes the passage and assigns a score to each private word found in a command, or more generally whenever Alexa is in listening mode. The privacy score increases for every private word discovered; if it exceeds the threshold, the system alerts the user about the recordings or commands so that the user can act accordingly. The user can also set his own threshold according to his needs.

6. Conclusion

Digital assistants provide convenient and comfortable living assistance, and it is hard to imagine life without these smart services. However, these services come with privacy threats that leave users in an uncertain state while using them. It is also evident that users want these services with privacy protection added. This work concentrated on facilitating data privacy while using smart voice assistants such as Amazon Alexa. The proposed framework first collects privacy preferences from individual users. Its main goal is to alert the user when his voice commands exceed the privacy threshold set by his preferences. It also monitors what details have been collected and recorded by Alexa and alerts the user accordingly. In future the model can be extended to become fully automated and to work as a fully intelligent personalized privacy assistant.

References

1. Chen, Kim and Serikawa, Brain intelligence: Go beyond artificial intelligence, Mobile Networks and Applications 23(2) (2018), 368.

2. López, Quesada and Guerrero, L.A., Alexa vs. Siri vs. Cortana vs. Google assistant: A comparison of speech-based natural user interfaces, Advances in Intelligent Systems and Computing 592, Springer International Publishing AG (2018), doi: 10.1007/978-3-319-60366-7_23.

3. Aileni, R.M., Suciu, Ciurea and Sever, Assistive mobile technologies for health monitoring and brain–computer interface for patients with motor impairments, In: Paiva S. (eds) Mobile Solutions and Their Usefulness in Everyday Life. EAI/Springer Innovations in Communication and Computing (2019), Springer, Cham.

4. Ding, Cooper, R.A., Pasquina, P.F. and Fici-Pasquina, Sensor technology for smart homes, Maturitas 69(2) (2011), 131–136.

5. Dellinger, A.J., Amazon considered letting Alexa listen to you without a wake word, 2019. https://www.engadget.com/2019/05/23/amazon-alexa-recording-before-wake-word-patent/.

6. Wohlsen, Amazons next business is selling you, 2012. https://www.wired.com/2012/10/amazon-next-advertising-giant/.

7. Zhang, Feng, Wang, X.F., Tian and Qian, Understanding and mitigating the security risks of voice-controlled third-party skills on Amazon Alexa and Google Home, arXiv: 1805.01525v2 [cs.CR] 29 Jun 2018.

8. Newman, L.H., Don’t freak out about that Amazon Alexa eavesdropping situation. Retrieved October 8, 2018, from https://www.wired.com/story/the-alexa-amazon-eavesdrop-ping-situation/.

9. Hildenbrand, Amazon Alexa: What kind of data does Amazon get from me? Retrieved October 8, 2018, from https://www.androidcentral.com/amazon-alexa-what-kind-data-does-amazon-get-me.

10. Day, Turner and Drozdiak, Amazon’s Alexa Team Can Access Users’ Home Addresses, 2019. https://www.bloomberg.com/news/articles/2019-04-24/amazon-s-alexa-reviewers-can-access-customers-home-addresses.

11. Lieberman, The daily caller. Smart speakers and their potential privacy concerns, 2018. http://dailycaller.com/2018/04/14/smart-speakers-privacy-concerns-alexa/.

12. Perez, 2019. https://techcrunch.com/2019/04/24/41-of-voice-assistant-users-have-concerns-about-trust-and-privacy-report-finds/.

13. Waddell, A.I. helpers like Alexa and Siri are useful, but they’re not smart enough to keep your questions private – at least not yet, 2016. https://www.theatlantic.com/technology/archive/2016/05/the-privacy-problem-with-digital-assistants/483950/.

14. Lei, G.-H., Liu, A.X., Ali, C.-Y. and Xie, The Insecurity of Home Digital Voice Assistants – Amazon Alexa as a Case Study, arXiv: 1712.03327v2 [cs.CR] 8 Apr 2018.

15. Wordnet. https://en.wikipedia.org/wiki/WordNet.

16. http://snowball.tartarus.org/algorithms/porter/stemmer.html.

17. Pirkola, Heikki, Erkka, Antti-Pekka and Kalervo, Targeted s-gram matching: A novel n-gram matching technique for cross- and monolingual word form variants, Information Research 7(2) (2002). Available at http://InformationR.net/ir/7-2/paper126.html.

18. Speech and Language Processing. Jurafsky

19. Amazon Inc.: Alexa skills kit. https://developer.amazon.com/public/solutions/alexa/alexa-skills-kit.

20. Amazon Inc.: Amazon Echo. www.amazon.com/echo.

21. Google: GoogleAssistant. https://assistant.google.com/.

22. Apple: Siri. http://www.apple.com/ios/siri/.