The Effects of Modality, Device, and Task Differences on Perceived Human Likeness of Voice-Activated Virtual Assistants

Abstract

Given the rising popularity of virtual assistants (VAs) that offer unique user experiences through voice-centered interaction, this study examined the effects of modality, device, and task differences on perceived human likeness of, and attitudes toward, voice-activated VAs. To do so, a 2 (modality: voice vs. text) × 2 (device: mobile vs. laptop) × 2 (task type: hedonic vs. utilitarian) mixed factorial experimental design was employed. Findings suggest that voice (vs. text) interaction leads to more positive attitudes toward the VA system, mediated by heightened perceived human likeness of the VA, but only with utilitarian (vs. hedonic) tasks. Interestingly, laptop (vs. mobile phone) interaction also enhanced perceived human likeness of the VA. This study offers theoretical and practical implications for VA research by exploring the combinational effects of modality, device, and task differences on user perceptions through human-like interactions.

Introduction

Thanks to the increasing adoption of digital assistants in various contexts (e.g., shopping,¹ entertainment,² and public service³), virtual assistants (VAs) are expected to lead a market of $12 billion by 2024.⁴ In particular, major tech companies are investing heavily in the development of VAs primarily designed to facilitate voice interactions between users and the systems (e.g., Amazon Alexa, Apple Siri, Google Assistant, Microsoft Cortana). However, audio is not the only modality that such systems use to interact with users; many current VAs also offer type-in features. Given the increasing prevalence of such voice-activated VA systems in individual households, and the various modality options that they offer, we explored whether auditory (vs. textual) recognition and responses from VAs indeed offer distinctive and more intimate user experiences. Specifically, we predict that voice (vs. text) interaction will induce more positive evaluations by enhancing perceived human likeness of the VAs. The possible moderating effects of device and task differences are also explored.

Modality effects and the role of human likeness in VA interactions

VA research in speech has tended to focus on how certain characteristics embedded in the virtual agent's voice (e.g., emotional tone,⁵ vocal pitch,⁶ naturalness of voice⁷), usually coupled with the virtually embodied agent's facial cues (e.g., smiling vs. neutral facial expressions^6,8), can enhance human-like perceptions of VAs. In reality, major VAs on the market are capable of delivering human-like impressions to users through simply one type of assigned synthetic voice even without signaling particular anthropomorphic physical cues (e.g., facial expression). Nevertheless, the limited voice variation and/or disembodiment do not seem to stop users from developing emotional attachment to VAs. We believe that one attribute that renders the interaction with voice-activated VAs more natural and seamless is the use of voice itself.

According to the computers are social actors (CASA) paradigm,⁹ individuals tend to apply human social rules to computers that project certain human communication attributes. Among many cues that can lead users to respond to non-lifelike artifacts as they would to human counterparts, “the presence of voice is a(nother) strong trigger for anthropomorphic perception”.¹⁰^(p204) Extending CASA to the context of voice-activated VAs, we can expect that the presence of voice will be influential enough for users to feel more connected to the VA. Especially as voices from VAs increasingly emulate human speech, the incorporation of speech interaction in and of itself may have considerable impact on how people perceive and respond to virtual beings,¹¹ perhaps more so than earlier.

Nonetheless, only a handful of studies in VA research have focused on exploring the effects of the presence (vs. absence) of voice. Among the few that directly compared modality difference (voice vs. text), Berry et al.¹² found that when health messages were offered in text (vs. through voice from the virtual agent), the messages were perceived as easier to understand. In contrast, Moorthy and Vu¹³ found that users felt more comfortable using voice-activated VAs such as Siri over keyboard search in private (vs. public) locations and for nonprivate (vs. private) information entry. However, in both studies, participants did not directly interact with the virtual agent in the text conditions (e.g., they searched the mobile web or read a Word document). Considering that VAs such as Cortana and Siri incorporate both audio and textual interaction, a more thorough comparative test of voice and text effects from the same VA source is needed.

In particular, we focus on how such modality difference could affect user perception by affording more human-like interactions. As CASA suggests, if individuals tend to apply social rules when they are interacting with computers, positive evaluation toward machines may come from the heightened resemblance of human–computer interaction to interpersonal communication. Sundar^14,15 also noted how audio modality can promote perceived realism of the interaction between users and digital media. Empirical evidence suggests that certain voice manipulations of virtual agents could imbue a “sense that other intelligent beings co-exist and interact with you, even if those beings are non-human and only seem intelligent”.^{16(pp289–290)} Such socialness evoked during interaction with virtual beings was identified as an important factor to increase the levels of attraction toward robots.¹⁷ Focused on the power of presence of voice, this study extends how modality can alter user perception, especially by promoting a sense of socialness in the interaction between the user and VA. In particular, we argue that the emission of voice (vs. text) itself could have strong impact on perceived human likeness of VAs, in turn increasing positive attitudes toward them.

H1: Voice (vs. text) interaction with the VA system will elicit higher levels of perceived human likeness.

H2: Voice (vs. text) interaction will indirectly increase positive attitudes toward the VA system, mediated by the levels of perceived human likeness.

Device effects in VA interactions

In addition to modality, device options (e.g., mobile phone, laptop, smart home devices) should also be accounted for when discussing user interactions with VAs. For example, mobile devices are typically perceived as being more accessible, portable, and newer, whereas computers tend to be rated as more faithful and stable.¹⁸ In addition, users reported higher preference for mobile devices, even though they performed better on a web-based task on computers.¹⁹ These findings indicate that users' perceptions of different devices could also play a role in their assessments of VA systems. Further, it is possible that the use of different devices could interact with different modalities of information presentation. For example, Elting et al.²⁰ found that the combination of picture and speech (vs. picture + speech + text) was the most effective in enhancing recall, but only when participants viewed content via a personal digital assistant (PDA) (vs. a computer or TV). The authors explained that multiple modalities competing to be processed through limited sensory channels can increase cognitive load,^21,22 and handling a smaller device can further lead users to become more sensitive to the allocation of cognitive resources in processing different modality outputs.²⁰ In addition, the difference in screen size between devices could also exert a disparate impact on users' sensory experiences.¹⁵ Relevant to this point, past research has documented that larger screens on TV²³ and mobile devices²⁴ expand sensory stimulation, which can be pronounced in certain modality interactions. However, few studies have attempted to test the combinational effects of device and modality,^20,25,26 especially with non-VA systems. With rising adoption of VA systems through mobile media, the effect of sensory richness, which could vary as a function of device, comes into question. In this study, we compared the effects of using mobile phones versus laptops for VA interaction. Due to the paucity of relevant research, a research question is suggested instead of a directional hypothesis:

RQ1: Will device difference (i.e., mobile phone vs. laptop) moderate the relationship between voice (vs. text) interaction with the VA system and the levels of perceived human likeness?

Task difference in VA interactions

Users' input and output preferences also depend on other factors such as context, efficiency, and the hedonic quality of the system.²⁷ For instance, when conversing with a system, the ratio of questions to nonquestions a user adopts depends on factors including the topic or context (e.g., informational vs. entertainment use).²⁸ This suggests that interactions with a VA can vary based on the type of topic involved, which is important to consider since these systems are now capable of answering complex questions.²⁹ The latest generation of voice-recognizing VAs can even tell users jokes, in addition to offering factual information. Considering the wide range of functions and types of questions that VAs can process, the effects of task difference in VA interaction deserve further investigation:

RQ2: Will task type difference (i.e., hedonic vs. utilitarian tasks) moderate the relationship between voice (vs. text) interaction with the VA system and the levels of perceived human likeness?

Methods

This study employed a 2 (modality: voice vs. text) × 2 (device: mobile vs. laptop) × 2 (task type: hedonic vs. utilitarian) mixed factorial experimental design, with modality and device serving as between-subjects factors, and task type serving as a within-subjects factor. In addition, Microsoft Cortana³⁰ was adopted for this study, due to its inherent characteristics that allow interactions via both voice and text input, as well as both mobile and laptop devices.

Participants and procedures

Eighty-two undergraduates (N_Men = 12; M_Age = 19.71, SD_Age = 0.87) were recruited from a communication course for extra credit at a northeastern university in the United States in April 2017. Sample size was set to allow at least 20 participants per cell for each of the 2 × 2 between-subjects groups, and an a priori G*Power³¹ analysis for a repeated-measures analysis of variance (ANOVA) with a within-between interaction indicated that the required sample size was 72 for a medium effect size (f = 0.25) with 98 percent power. The overall procedure of this study was approved by the institutional review board of the authors' university. After giving consent, two participants at a time were randomly assigned to one of the four conditions (mobile voice, N = 21; mobile text, N = 22; laptop voice, N = 20; laptop text, N = 21), and they were instructed on how to use Cortana in accordance with their assigned modality and device conditions. Afterward, they were asked to interact with Cortana on two different types of task sets (i.e., hedonic vs. utilitarian) for five minutes each, in randomized order. For each task type, a list of various questions or statements was given to users to interact with Cortana. Participants were allowed to choose any questions or statements they preferred within the allotted time. As the final step, participants completed an online questionnaire to evaluate their interactions.

Manipulated conditions

First, for the device manipulation, participants were randomly assigned to interact with Cortana by using either a mobile phone or a laptop. We used two Android mobile phones (i.e., Nexus 6) with the Cortana application installed, and three laptops operated by the Windows 10 system that had Cortana preinstalled by default. Second, for the modality manipulation, participants were randomly assigned and instructed to interact with Cortana by using either voice or text input. To note, output modality mirrored the input modality, in that text input resulted in text output, and audio input prompted an audio response (accompanied by text and/or image search results). Third, for task manipulation, two different types of task sheets were prepared. The hedonic task sheet included a list of 30 questions/statements, and the utilitarian task sheet, a list of 26 (Table 1).

Table 1.

List of Questions and Statements for Hedonic and Utilitarian Tasks

Hedonic task questions/statements (N = 30)

Why are you blue?

Do you have dreams?

Are you hot?

Do you like Pokémon Go?

Do you love me?

Roll a dice!

Sing me a song!

Read me a poem!

Make me a sandwich.

Can you dance for me?

Tell me a joke!

Surprise me!

Use the force.

Tell me a bedtime story.

Kiss me.

Do an impression.

When is the world going to end?

What is the meaning of life?

Can I borrow some money?

What does the fox say?

Are you a Democrat or a Republican?

Why did the chicken cross the road?

You are cool.

You are beautiful.

You are funny.

You are the best assistant ever.

Cortana, you are ugly.

You are annoying.

Cortana, you suck.

I love you, Cortana.

Utilitarian task questions/statements (N = 26)

Some news about White House?

Apple stock price today?

Who won the Oscar in 2017?

Who won the 2015 Grammy awards?

What's the schedule of NBA today?

How old is Jennifer Lawrence?

When was Snapchat invented?

What is Artificial Intelligence?

Why do dogs wag their tails?

How to remove gel nails?

Do I need an umbrella today?

Where is the most popular Korean restaurant near me?

Who is the mayor of State College?

What movies can I watch tonight?

How many dollars are in a Euro?

How many calories in a cup of Hot Chocolate?

What is the square root of 65?

What is the time now in Shanghai, China?

When is Father's Day in 2017?

In what year did the Japanese earthquake happen?

When is the best time of year to travel to Alaska?

When did the Titanic sink?

Who is the president of Ecuador?

Why do we celebrate Easter?

When was the Czech Republic founded?

When did the Berlin Wall fall?

Measured variables

Perceived human likeness

We adapted eight items from the Humanness Index³² (e.g., Artificial-Natural, Inanimate-Living) and nine items from previous social presence scales^16,33 (e.g., “There was a sense of sociability during the interaction,” “While I was using Cortana, I felt as if someone was talking to me”; 1 = strongly disagree, 7 = strongly agree), all measured on a seven-point Likert scale. Social presence was integrated into the scale since it is considered a major factor that contributes to human likeness of technology.³⁴ In addition, although the humanness scale can represent the static human-like impression of the technology, we believed that the items in the social presence scale could capture the humanness in the dynamic interaction between the user and VA. The humanness and social presence scales were highly correlated (r_hedonic = 0.81, p < 0.001; r_utilitarian = 0.80, p < 0.001), and results of the reliability test also indicated that the combined scale was reliable for both task conditions (α_hedonic = 0.95, M_hedonic = 4.52, SD_hedonic = 1.13; α_utilitarian = 0.95, M_utilitarian = 3.61, SD_utilitarian = 1.23).
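As an illustration of how a reliability coefficient such as the α values above is obtained, the following is a minimal sketch of Cronbach's α, computed as k/(k − 1) × (1 − Σ item variances / variance of summed scores). The function name `cronbach_alpha` and the ratings in `toy` are hypothetical and are not the study's data or analysis script.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of per-participant item scores.

    responses: list of rows, one row per participant, one score per item.
    """
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two participants")
    # Sample variance of each item across participants.
    item_vars = [variance([row[i] for row in responses]) for i in range(n_items)]
    # Sample variance of each participant's summed score.
    total_var = variance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


# Made-up 7-point ratings from five hypothetical participants on three items.
toy = [
    [5, 6, 5],
    [3, 3, 4],
    [6, 7, 6],
    [2, 3, 2],
    [4, 4, 5],
]
alpha = cronbach_alpha(toy)
print(round(alpha, 2))  # 0.96
```

Values close to 1 indicate that the items vary together across participants, which is the pattern a combined 17-item index would need to show to be treated as a single scale.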

Attitudes toward the VA system

Attitudes were measured with 21 items adapted from past research (e.g., useful, high quality, appealing).^35,36 Participants reported how they felt about their interaction with Cortana, with higher ratings indicating more positive attitudes (Hedonic, α = 0.96, M = 5.00, standard deviation [SD] = 1.06; Utilitarian, α = 0.95, M = 5.06, SD = 0.98).

Control variables

Gender and participants' prior experiences with VAs were included as control variables. The levels of prior experiences were measured based on a seven-point interval scale (1 = never heard of it, 7 = use it all the time) for both Cortana (M = 1.43, SD = 0.71) and other VAs (e.g., Apple Siri, Amazon Alexa; M = 4.07, SD = 1.35).

Results

To test the first hypothesis and explore the research questions, a series of 2 (modality: voice vs. text) × 2 (device: mobile vs. laptop) × 2 (task type: hedonic vs. utilitarian) mixed-model repeated-measures ANOVAs were run. Given the female-dominant sample, gender was controlled in the model; women tended to report higher levels of perceived human likeness of the VA compared with men [F(1, 77) = 9.15, p = 0.003, partial η² = 0.11; M_Female = 4.20, SE_Female = 0.10; M_Male = 3.34, SE_Male = 0.26]. For the main analyses, first, we examined the effects of modality (i.e., voice vs. text) on perceived human likeness of the VA (H1). The results supported H1 in that interacting with voice (vs. text) significantly increased perceived human likeness [F(1, 77) = 7.34, p = 0.008, partial η² = 0.09; M_Voice = 4.33, SE_Voice = 0.14; M_Text = 3.82, SE_Text = 0.13]. Second, when the interaction effects between modality and device on perceived human likeness were tested (RQ1), no significant effects appeared [F(1, 77) = 2.56, p = 0.11, partial η² = 0.03]. Interestingly, a significant main effect of device emerged instead [F(1, 77) = 4.76, p = 0.03, partial η² = 0.06], showing that using a laptop (vs. mobile phone) to interact with Cortana resulted in higher perceived human likeness (M_Laptop = 4.29, SE_Laptop = 0.14; M_Mobile = 3.86, SE_Mobile = 0.13). Third, with regard to the moderating role of task-type difference (hedonic vs. utilitarian; RQ2), a significant interaction between modality and task type appeared [F(1, 77) = 6.96, p = 0.01, partial η² = 0.08]. In particular, voice compared with text interaction significantly enhanced the feeling of human likeness, but only in the utilitarian (vs. hedonic) task condition (Fig. 1).

FIG. 1.

Interaction between modality and task type on perceived human likeness.
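The partial η² values reported in the Results can be recovered from the F statistics and their degrees of freedom, because partial η² = (F × df_effect) / (F × df_effect + df_error). The short sketch below applies this identity to the reported F(1, 77) values; the helper name `partial_eta_squared` and the dictionary labels are hypothetical and only serve as a consistency check, not as a reproduction of the original analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Convert an F statistic and its degrees of freedom to partial eta squared."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F value, reported partial eta squared), all tests with df = (1, 77).
reported = {
    "gender": (9.15, 0.11),
    "modality (H1)": (7.34, 0.09),
    "modality x device (RQ1)": (2.56, 0.03),
    "device": (4.76, 0.06),
    "modality x task type (RQ2)": (6.96, 0.08),
}

for label, (f_value, eta_reported) in reported.items():
    eta = partial_eta_squared(f_value, 1, 77)
    print(f"{label}: computed {eta:.2f}, reported {eta_reported:.2f}")
```

For every effect, the computed value rounds to the reported one, so the F statistics and effect sizes in the text are mutually consistent.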

Finally, PROCESS (Model 4)³⁷ was employed to test the mediating role of human likeness in the relationship between modality and attitudes toward the VA (H2). Due to the within-subjects nature of task type, the mediation analyses were run separately for each of the two task-type conditions. Also, since the interaction between modality and device was nonsignificant, the device difference was included as a control variable instead of a moderator. For the hedonic tasks, modality did not show a significant main effect on human likeness, and human likeness did not mediate the relationship between modality and attitudes toward Cortana (b_{unstandardized} = 0.15, 95 percent bias-corrected confidence interval [CI] based on 10,000 bootstrap samples [−0.3139, 0.6107]; Fig. 2). On the other hand, in the utilitarian task condition, the effect of voice (vs. text) interaction on attitudes toward Cortana was mediated by a higher level of perceived human likeness, as predicted by H2 (b_{unstandardized} = 0.60, 95 percent bias-corrected CI based on 10,000 bootstrap samples [0.3926, 1.3267]; Fig. 3).

FIG. 2.

Indirect effects of modality on attitudes toward the VA with hedonic tasks. VA, virtual assistant.

FIG. 3.

Indirect effects of modality on attitudes toward the VA with utilitarian tasks.
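To make the logic of the mediation test concrete, the sketch below estimates an indirect effect as the product of two regression paths (modality → human likeness, and human likeness → attitudes controlling for modality) and resamples participants to build a confidence interval. It is a simplified illustration under stated assumptions: the data are synthetic, no covariates are included, 2,000 resamples are drawn, and a plain percentile interval is used rather than the bias-corrected interval from PROCESS. The names `indirect_effect` and `bootstrap_ci` are hypothetical.

```python
import random


def _cross(u, v):
    """Sum of centered cross-products of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def indirect_effect(x, m, y):
    """a*b from M = i1 + a*X and Y = i2 + c'*X + b*M (ordinary least squares)."""
    sxx, smm, sxm = _cross(x, x), _cross(m, m), _cross(x, m)
    sxy, smy = _cross(x, y), _cross(m, y)
    a = sxm / sxx
    b = (sxx * smy - sxm * sxy) / (sxx * smm - sxm ** 2)
    return a * b


def bootstrap_ci(x, m, y, n_boot=2000, level=0.95, seed=1):
    """Percentile bootstrap CI for the indirect effect (not bias-corrected)."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            estimates.append(indirect_effect(
                [x[i] for i in idx], [m[i] for i in idx], [y[i] for i in idx]))
        except ZeroDivisionError:
            continue  # degenerate resample, e.g. only one modality drawn
    estimates.sort()
    lower = estimates[int((1 - level) / 2 * n_boot)]
    upper = estimates[int((1 + level) / 2 * n_boot) - 1]
    return lower, upper


# Synthetic data only: 0 = text, 1 = voice; effects chosen arbitrarily.
rng = random.Random(42)
x = [i % 2 for i in range(120)]
m = [3.5 + 1.0 * xi + rng.gauss(0, 1) for xi in x]
y = [2.0 + 0.6 * mi + 0.1 * xi + rng.gauss(0, 1) for xi, mi in zip(x, m)]

point = indirect_effect(x, m, y)
low, high = bootstrap_ci(x, m, y)
print(f"indirect effect = {point:.2f}, 95% percentile CI [{low:.2f}, {high:.2f}]")
```

As in the analyses above, mediation is inferred when the interval for the indirect effect excludes zero. Resampling is used because the product of two coefficients is generally not normally distributed.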

Discussion

Consistent with our main hypothesis, the positive effect of voice interaction on attitudes toward Cortana was mediated by perceptions of human-like characteristics. However, we only found support for the utilitarian task condition; for hedonic tasks, no mediation effects were found. This discrepancy may come from the greater efficiency expected by users for utilitarian (vs. hedonic) tasks, which has been shown to be the case for online shopping.³⁸ Thus, it is possible that for utilitarian tasks, voice interaction was perceived to be more efficient, compared with typing via keyboards,³⁹ which led to better evaluations.

Another interesting (and perhaps counterintuitive) finding from our study is that laptops elicited more human-like perceptions of the VA than mobile phones, regardless of the type of task. One possible explanation stems from expectation violation. Participants in this study were relatively less exposed to Cortana (M = 1.43, SD = 0.71), which is one of the few VAs compatible with laptop systems, compared with other VAs such as Siri and Alexa (M = 4.07, SD = 1.35), which are commonly used through handheld or smart home devices. Therefore, it is possible that users had lower expectations for the anthropomorphic performance of the laptop Cortana, leading to better evaluations. Another explanation can be derived from cognitive overload.^21,22 According to Elting et al.,²⁰ people using a mobile device (PDA), compared with desktop or TV, put more effort into handling the device since “fewer resources were available for the task at hand”^(p61) on a smaller screen. Thus, participants who interacted with Cortana on a mobile device may have experienced higher cognitive load during the interaction, and therefore gave lower ratings. Finally, the greater sensory richness that users enjoy from large (vs. small) screens, which has been shown to improve attitudes toward smart devices by promoting both utilitarian and hedonic qualities of the devices,⁴⁰ can also account for these results.

The findings offer both theoretical and practical implications for VA research. Theoretically, the study extends our knowledge of how the interactions among modalities, devices, and task types could impact perceived human likeness of virtual agents and the attitudes toward them. Consistent with Sundar,^14,15 our study illustrates how modality as a structural feature in new media technology can cue particular heuristics, including human likeness. This study further shows that modality in VA interactions can function differently contingent on certain task types. Practically speaking, the findings suggest that voice as a modality should be considered a primary option for VA services designed to serve utilitarian tasks. In addition, the results suggest that computers may also be a good venue for incorporating VA systems.

Nonetheless, a few limitations of this study merit note. One limitation was associated with the difference in mobile versus laptop responses. For example, the results given by Cortana were only shown within the dialogue frame of the app on mobile phones. On the other hand, for some questions asked through the laptop, Cortana automatically opened a pop-up web browser and provided the Bing search engine results. In addition, although we did not record how many questions/statements participants got through within the time limit, it seemed that individuals needed longer time to work on a single question/statement with text (vs. voice) input. The conditions just cited add complexity to disentangling the effects of modality and task from those stemming from other factors. Finally, the lab environment interactions with Cortana restricted the generalizability of our study results. Thus, adopting a better controlled environment for experiments and exploring other methodologies to investigate VA effects in more natural settings are recommended for future studies.

Footnotes

Author Disclosure Statement

No competing financial interests exist.

References

1. Keane. Google Home and Argos launch voice-activated shopping. CNET, 2018. https://cnet.com/news/google-home-and-argos-launch-voice-activated-shopping (accessed March 29, 2019).

2. Salinas. In a rare move Apple is partnering with Amazon to let you control Apple Music with Alexa. CNBC, 2018. https://cnbc.com/2018/11/30/apple-music-is-coming-to-amazon-echo-devices.html (accessed February 28, 2019).

3. Mathieson. ‘I feel in control of my life’: Alexa's new role in public service. The Guardian, 2019. https://theguardian.com/society/2019/feb/07/control-life-alexa-role-public-service-chatbots-councils (accessed February 28, 2019).

4. Baron. One bot to rule them all? Not likely, with Apple, Google, Amazon and Microsoft virtual assistants. The Mercury News, 2017. http://mercurynews.com/2017/02/06/one-bot-to-rule-them-all-not-likely-with-apple-google-amazon-and-microsoft-virtual-assistants (accessed February 28, 2019).

5. Moridis, Economides. Affective learning: empathetic agents with emotional facial and tone of voice expressions. IEEE Transactions on Affective Computing, 2012; 3:260–272.

6. Elkins, Derrick. The sound of trust: voice as a measurement of trust during interactions with embodied conversational agents. Group Decision and Negotiation, 2013; 22:897–913.

7. Nass, Gong. (1999) Maximized modality or constrained consistency? In Proceedings of the AVSP 99 Conference. Santa Cruz, CA: ISCA. http://isca-speech.org/archive_open/avsp99/av99_001.html (accessed February 1, 2019).

8. Nunamaker, Derrick, Elkins, et al. Embodied conversational agent-based kiosk for automated interviewing. Journal of Management Information Systems, 2011; 28:17–48.

9. Nass, Steuer, Tauber. (1994) Computers are social actors. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI’94). Boston, MA: Human Factors in Computing Systems, pp. 72–78.

10. Fink. (2012) Anthropomorphism and human likeness in the design of robots and human–robot interaction. In International Conference on Social Robotics. Berlin, Germany: Springer, vol. 7621, pp. 199–208.

11. Olive. (1997) The talking computer: text to speech synthesis. In Stork, ed. Hal's legacy: 2001's computer as dream and reality. Cambridge, MA: The MIT Press, pp. 101–130.

12. Berry, Butler, de Rosis. Evaluating a realistic agent in an advice-giving task. International Journal of Human–Computer Studies, 2005; 63:304–327.

13. Moorthy, Vu K-PL. Privacy concerns for use of voice activated personal assistant in the public space. International Journal of Human–Computer Interaction, 2015; 31:307–335.

14. Sundar. (2008) The MAIN Model: a heuristic approach to understanding technology effects on credibility. In Metzger, Flanagin, eds. The John D. and Catherine T. MacArthur foundation series on digital media and learning: digital media, youth, and credibility. Cambridge, MA: The MIT Press, pp. 73–100.

15. Sundar. (2009) Media effects 2.0: social and psychological effects of communication technologies. In Nabi, Oliver, eds. The SAGE handbook of media processes and effects. Thousand Oaks, CA: Sage Publications, pp. 545–560.

16. Lee, Nass. (2013) Designing social presence of social actors in human computer interaction. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI’13). New York: ACM, vol. 5, pp. 289–296.

17. Lee, Peng, Jin, et al. Can robots manifest personality? An empirical test of personality recognition, social responses, and social presence in Human–Robot interaction. Journal of Communication, 2006; 56:754–772.

18. Sung, Mayer. Students' beliefs about mobile devices vs. desktop computers in South Korea and the United States. Computers and Education, 2012; 59:1328–1338.

19. Adepu, Adler. (2016) A comparison of performance and preference on mobile devices vs. desktop computers. In IEEE 7th Annual Ubiquitous Computing, Electronic & Mobile Communication Conference (UEMCON). New York: IEEE, pp. 1–7.

20. Elting, Zwickel, Malaka. (2002) Device-dependant modality selection for user-interfaces: an empirical study. In Proceedings of the 7th International Conference on Intelligent User Interfaces. New York: ACM, pp. 55–62.

21. Sweller. Cognitive load during problem solving: effects on learning. Cognitive Science, 1988; 12:257–285.

22. Sweller. Cognitive technology: some procedures for facilitating learning and problem solving in mathematics and science. Journal of Educational Psychology, 1988; 81:457–466.

23. Lombard, Reich, Grabe, et al. Presence and television. The role of screen size. Human Communication Research, 2000; 26:75–98.

24. Kim, Sundar. Mobile persuasion: can screen size and presentation mode make a difference to trust? Human Communication Research, 2016; 42:45–70.

25. Downs, Boyson, Alley, et al. iPedagogy: using multimedia learning theory to identify best practices for MP3 player use in higher education. Journal of Applied Communication Research, 2011; 39:184–200.

26. Kelley. The effect of screen size and audio delivery system on memory for television news. Visual Communication Quarterly, 2007; 14:176–188.

27. Schaffer, Schleicher, Moller. Modeling input modality choice in mobile graphical and speech interfaces. International Journal of Human–Computer Studies, 2015; 75:21–34.

28. Hijjawi, Bandar, Crockett. A general evaluation framework for text based conversational agent. International Journal of Advanced Computer Science and Applications, 2016; 7:23–33.

29. Renouard. 2016 the year of the Virtual Assistant? L'Atelier BNP Paribas. 2016. http://atelier.net/en/trends/articles/2016-year-of-virtual-assistant_440064 (accessed February 28, 2019).

30. Microsoft. (2019) Cortana. Your intelligent assistant across your life. https://microsoft.com/en-us/cortana (accessed February 28, 2019).

31. G*Power. (2017) G*Power 3.1 manual. http://gpower.hhu.de/fileadmin/redaktion/Fakultaeten/Mathematisch-Naturwissenschaftliche_Fakultaet/Psychologie/AAP/gpower/GPowerManual.pdf (accessed February 28, 2019).

32. Ho, MacDorman. Revisiting the uncanny valley theory: developing and validating an alternative to the Godspeed indices. Computers in Human Behavior, 2010; 26:1508–1518.

33. Gefen, Straub. Managing user trust in B2B e-services. e-Service Journal, 2003; 2:7–24.

34. Lankton, McKnight, Tripp. Technology, humanness, and trust: rethinking trust in technology. Journal of the Association for Information Systems, 2015; 16:880–918.

35. Kalyanaraman, Sundar. The psychological appeal of personalized online content in Web portals: does customization affect attitudes and behavior? Journal of Communication, 2006; 56:110–132.

36. Sundar, Xu, Bellur, et al. (2011) Beyond pointing and clicking: how do newer interaction modalities affect user engagement? In Proceedings of the Annual Conference Extended Abstracts on Human Factors in Computing Systems (CHI EA’11). New York: ACM, pp. 1477–1482.

37. Hayes. (2013) Introduction to mediation, moderation, and conditional process analysis: a regression-based approach. New York: Guilford Press.

38. Babin, Darden, Griffin. Work and/or fun: measuring hedonic and utilitarian shopping value. Journal of Consumer Research, 1994; 20:644–656.

39. Shirali-Shahreza, Penn, Balakrishnan, et al. (2013) SeeSay and HearSay captcha for mobile interaction. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI’13). New York: ACM, pp. 2147–2156.

40. Kim, Sundar. Does screen size matter for smartphones? Utilitarian and hedonic effects of screen size on smartphone adoption. Cyberpsychology, Behavior, and Social Networking, 2014; 17:466–473.