Speaking, but having no voice. Negotiating agency in advertisements for intelligent personal assistants

Abstract

With the popularisation of intelligent personal assistants (IPAs) like Amazon’s Alexa and the Google Assistant, natural language-based interaction with machines is increasingly becoming a part of everyday life. The conceptualisation of these tools as agentive assistants who help with a variety of tasks in both the household and at work is guided by their marketing: When Apple introduced the Siri-technology at their keynote event in 2011, the system responded to the question ‘Siri, who are you?’ with ‘I am a humble personal assistant’. This claim to a speaking subject position while at the same time locating this subject firmly in a servile social role has become a defining feature of the social place of IPAs: Designed to postulate agency, they do so not in equality with humans but as their servants. This paper offers an interdisciplinary analysis of video advertisements for IPAs, combining sociological and linguistic approaches. We treat agency and actors not as something given but as something that becomes visible through communicative acts, suggesting an understanding of these advertisements as socio-technical visions in which the negotiation of agency in human–machine interaction serves two functions: Firstly, the asymmetrical relationship between the human and the machine promises a symmetrisation of human–human relationships. In an imagined diversified world of equal human rights and relationships, social inequality is reconfigured in the relationship between human and non-human entities. Secondly, the negotiation of agency between humans and machines deflects from questions regarding the increasing agency and power of the companies behind these IPAs and their growing access to and influence on people’s private lives. 
Our paper will thus provide insights into how agency is ascribed in human–human and human–machine interaction considering social practices of symmetrisation and hierarchisation as well as a critical investigation into the triangular relationships between humans’, machines’, and companies’ agency.

Keywords

Advertisement, agency, human differentiation, human–machine interaction, performativity, social inequality, speech act theory, voice user interfaces

Introduction

Natural language interaction with machines is increasingly becoming a part of everyday life with the popularisation of intelligent personal assistants (IPAs) like Amazon’s Alexa and the Google Assistant (Hepp, 2020; Hoy, 2018). Guided by their marketing, these tools take on the role of agentive assistants who help with a variety of tasks both in the household and on the go: When Apple introduced their IPA Siri at their keynote event in 2011, it responded to the question ‘Siri, who are you?’ with ‘I am a humble personal assistant’.¹ The choice of interrogative pronoun in this question already presumes agency: Siri is a Who, not a What. The IPA confirms this anthropomorphism (Eyssel and Kuchenbrandt, 2012) not only by assigning itself the professional – and as such strictly human – role of personal assistant, but also by attributing to itself the similarly human-specific virtue of humbleness. This claim to a speaking subject position while at the same time locating the subject firmly in a servile social role has, as we will argue, become a defining feature of the social place of IPAs: Designed to postulate agency, they do so not in equality with humans but as their servants.

In this paper, we offer an interdisciplinary analysis of advertisements for IPAs, combining sociological and linguistic approaches. To grasp the changing of social boundaries, we shift the focus from action to communication (Esposito, 2017; Luhmann, 1996; Muhle, 2018). We thus treat agency and actors not as something given but as something that becomes visible through communicative acts. The central research questions this article will address are: How is agency ascribed to IPAs through advertising discourse? What are the attributes of this agency, and how does the marketing imagery of agency affect the positionings and relations of humans, IPAs, and the companies behind them? We are interested in the ways advertisements attribute human qualities to IPAs, particularly Amazon’s Alexa and the Google Assistant, and how this attribution of human(like)ness contributes to the ascription of agency in human–machine interaction. Amazon’s Alexa and the Google Assistant are chosen as the primary subjects of our investigation as these two systems are currently the best-selling voice user interfaces (Strategy Analytics, 2021).

We forward an understanding of these advertisements as socio-technical visions in which the transformation of agency in human–machine interactions facilitates an egalitarian human society: The asymmetrical relationship between the human and the machine symmetrises the human side. In an imagined diversified world of equal human rights and relationships, social inequality is reconfigured as the relationship between human and non-human entities. The normalisation of this social model takes place on the basis of a communicative symmetrisation of human and machine and, at the same time, an unquestioned social hierarchisation, which justifies the unequal treatment of the machine. The negotiations of human and machine agency in advertisements, finally, will be shown to deflect from the agency of the companies involved, who rather appear as third actors in the background.

Talking machines and the problem of agency

Theoretical considerations: From ontology to communication

Intelligent personal assistants are emblematic of the deep mediatisation of the social (Hepp, 2020). New media is seamlessly and continuously integrated into public and private society, with the internet in particular now being an irreplaceable infrastructure, hosting and connecting a diversity of media and enabling, channelling and controlling diverse forms of sociality (Dolata and Schrape, 2015). These technological innovations which characterise the digital age are no longer solely mediums of communication, but are also, in some instances, to be communicated with, as independent interlocutors. The extent to which machines can serve as interactive artefacts has been a focal point of research since the dawn of the computer (Suchman, 2007). The Turing Test, which examines whether machines can be intelligent on the basis of their performance in communication (Turing, 1950), is the most well-known and enduring example.

Not only can digital artefacts occasionally be treated like humans, but they can also be explicitly constructed for the social interaction with humans. While science fiction has offered manifold portrayals of such interactions for decades and computer games have frequently provided opportunities to practice the personification of digital interaction partners, the possibility of real-world interaction with talking machines has only recently become an everyday experience that extends far beyond the cyberspace of in-game worlds. Even though digital interaction partners date back to Weizenbaum’s chatbot Eliza in the 1960s, these artificial interlocutors only became more widespread and successful with the large-scale availability of the internet and increased computational power (Dokukina and Gumanova, 2020). Depending on the specific configuration and social function, these talking machines are referred to as chatbots, social bots, work bots or artificial companions. Andreas Hepp (2020) recently suggested the term ‘communicative robots’ as a hyperonym – that is, the superordinate term – for entities that serve as everyday digital interlocutors, understanding that these entities will only continue to play a larger and larger role in the social. IPAs like Siri, which recognise spoken utterances and offer voice-based responses and thus comprise both automatic speech recognition and speech synthesis, exemplify the cutting-edge of these communicative robots.

These systems provide a challenge for social theory because the ability to communicate has long been regarded as a defining and distinctive feature of human agency (Knoblauch, 2020; Lindemann, 2005). In social theory, the question of machine agency is often framed as an ontological issue: Do humans only project agency onto these machines (Cerulo, 2009)? If so, should we regard this projection as a mere human quirk, an illusion, a misconception, or as an important shift in human–machine relations (Voss, 2021)? The promise of such a shift might require us to move beyond anthropocentric notions of agency, and to treat communicative robots as ‘real’ agents in interaction.

Within social theory, the agency of artefacts with particular respect to technological entities has long been a crucial topic. While anthropocentric theories defend the ‘boundaries of the social world’ (Luckmann, 1970) against these non-human invaders, posthumanist theories (Barad, 2003; Braidotti, 2013) argue for the dissolution of a human-centred social world and for the explicit inclusion of non-human entities (Cerulo, 2009; Henrickson, 2018). Actor-network theory has perhaps been the most prominent avenue for discussions of non-human agency, though its use is arguably limited with regard to IPAs and other communicative robots. Following Latour’s principle of generalised symmetry, every entity that has an effect on others can be understood as an actor (Latour, 2005), which extends the field of possible actors arbitrarily. Possible differentiations between machines and other tools, and even for example trees, storms, words, and gods, move out of focus, rendering the new and intriguing qualities of communicative robots invisible (Muhle, 2018: 150).

This paper seeks to move away from such ontological questions, focussing instead on the social construction of ontologies and their respective notions of agency in contemporary socio-technical imaginaries. Following Muhle (2018), we understand the question of machine agency as a dilemma unsolvable within theoretical frameworks that regard agency as an ontological building block of the social world. What kinds of entities are regarded as legitimate actors changes in different cultural and technological conditions. Although modern societies have delegated complex chains of operations to technology, they have generally denied non-human entities a kind of agency that would be comparable to humans (Lindemann, 2005). The rise of communicative robots, however, poses a challenge to these boundaries. Where digital entities are assigned attributes like autonomy and the ability to communicate and to form relationships, an anthropocentric notion of agency comes under scrutiny – not only in academic discourse but also in everyday practices.

In order to grasp the change of social boundaries we shift the focus from action to communication (Esposito, 2017; Luhmann, 1996). We treat agency and actors not as something given but as something that becomes visible through communicative acts. From this theoretical perspective, agency is not an ontological precondition of communication but rather its product. Accordingly, it is not a property that entities possess but something that is formed and successively conventionalised through social attributions. ‘Agency […] is not inherent; it is permitted’ (Henrickson, 2018: 7). Thus, if we want to study agency as a social phenomenon, we need to investigate how agency is renegotiated in discourse; its definition is not material to this endeavour.

By empirically investigating the differentiation and interrelation of humans and machines, we follow Suchman in changing the question ‘from one of whether humans and machines are the same or different to how and when the categories of human or machine become relevant, how relations of sameness or difference between them are enacted on particular occasions, and with what discursive and material consequences’ (2007: 2). Hence, we do not ask if IPAs possess agency or not. Rather, we are shifting the perspective to a ‘second-order observation’ (Luhmann, 1996: xvii) of agency by asking how agency is attributed.

Methodological considerations: Investigating the attribution of agency

Our empirical material for this investigation is not the everyday interaction with digital entities (see Muhle, 2017; Porcheron et al., 2018 for studies on this) but rather the enactment of such interactions in advertisements for IPAs. This approach offers us access to the promises, plausibilisations and justifications that are supposed to further the introduction of these devices into the privacy of the home. Adverts are particularly suitable for the study of technological imaginaries (Jackman and Jablonowski, 2021) because they not only promote a product, but also enact a reality in which the product should function, demonstrating an idealised usage and setting (Cluley and Nixon, 2019). These advertisements thus manifest the very product they are advertising, similar to how economics ‘does not describe an existing external “economy,” but brings that economy into being: economics performs the economy, creating the phenomena it describes’ (MacKenzie and Millo, 2003: 108); marketing functions performatively (Callon, 1998; Cochoy, 1998). Studies on the performativity of marketing draw on Austin’s (1962) concept of performative utterances as well as Butler’s (2010) thoughts on the perlocutionary effects of performatives, which describe the effects a performative may have on the audience if particular felicity conditions are met. Butler points out that the application of performativity in economics so far has only taken into account the illocution of performatives (i.e., the intended meaning or effect of an utterance). The perlocutionary acts – the actual effects of an utterance on the recipient – on the other hand have been largely overlooked, even though ‘perlocutionary performatives alter an ongoing situation’ if ‘a sequence of events and a felicitous set of circumstances’ (Butler, 2010: 151) fall into place. 
In marketing communication, consideration should thus not only be given to the illocutionary power a commercial has, that is, what it is intended to communicate and how it is intended to affect its audience, but also to its perlocutionary effect, that is, what the actual effects of the commercial on its audience are and what felicity conditions had to be met to achieve this.

Analysing commercials for IPAs, we are interested in the performative construction of IPAs as agentive, humanlike entities and the illocutions and perlocutions of positioning them as such in marketing communication. By investigating the illocutionary functions and perlocutionary effects of IPA commercials, we take an analytical approach to our empirical material that is influenced by social positioning analysis (Hausendorf and Bora, 2006) and text linguistic applications of Searle’s (1969, 1975) speech act taxonomy (e.g. Brinker et al., 2018; Krieg-Holz and Bülow, 2016). Social positioning analysis is interested in how communicative acts construct images of the self and the other and articulate expectations and roles (Hausendorf and Bora, 2006). We use it to uncover what kind of agency is attributed to IPAs and how this agency resembles (or differs from) social roles usually reserved for human actors. Speech act analysis offers a more detailed insight into the constituents of an utterance and into how we do things by saying things (Austin, 1962). Particularly relevant here are the illocution and perlocution of a speech act, which describe what a speaker intends to do with an utterance (illocution) and what effect this has on the listener (perlocution). Text linguistics applies this approach, which was originally aimed at single utterances, to larger communicative events and assumes that there is an overall ‘global’ illocutionary intention behind a text or communication event and subsidiary illocutions that serve this overall function as well as a respective general perlocution (Brinker et al., 2018). These can be analysed, as will be done in this paper, by investigating the linguistic acts performed within this larger communicative event, in our case video commercials.

On this theoretical and methodological basis, we do not ask whether or to what degree machines possess agency, but to what extent agency is attributed to machines. Furthermore, we are interested in identifying the specific categories of actors that IPAs are considered to be in certain contexts, how these categories relate to the categorisations of human actors and how categories for human actors are used to make sense of machine agency.

According to Epley et al. (2007), the attribution of human qualities to non-human entities – their anthropomorphisation – takes place especially in those instances where these entities can act independently. This impacts not only the potential social status of these entities, but also how humans behave towards them: When we perceive someone, or something, as human, our behaviour becomes influenced by specific cultural and social norms. The categorisation of artificial entities in relation to categories of human differentiation plays a particularly important role with respect to anthropomorphisation: Eyssel and Kuchenbrandt (2012) and Kuchenbrandt et al. (2013) show that implicit membership of a humanoid robot in the national or gendered ingroup has a significant impact on participants’ perception of its anthropomorphisation. Participants are more likely to attribute characteristics like consciousness to a robot if they share its gender category and nationality (indicated by name and place of production).

The participation of artificial entities in categories relevant for human differentiation and the attribution of such participation are also evident linguistically. As early as 1982, McDaniel and Gong pointed out the anthropomorphising language use present in robotics, particularly regarding body metaphors: robot joints are regularly referred to as shoulders when the grip mechanism is attached to a ‘body’ with a hinge; the part that serves image processing swiftly turns into eyes that are located in the head of the robot, and so on. This language use, they argue, has the potential to communicate technological procedures to a lay audience, but risks producing imprecise descriptions and contributing to the fear of an increasingly automated world and the subsequent loss of jobs. Linguistic anthropomorphisation, especially in terms of metaphors, is a recurrent topic in robotics (e.g. Lohmann, 2014), and it appears to increase in step with technological progress: Focus has now shifted beyond mere physical similarities to cognitive and social abilities. Parallel tendencies can be found in the ways we speak and write about IPAs: Purington et al. (2017) use online reviews for Amazon’s Alexa and the corresponding Echo smart speakers to highlight the predominance of a variety of anthropomorphising linguistic items, ranging from primary reference with the personal name Alexa and the human-specific pronoun ‘she’ instead of ‘it’ to the attribution of human characteristics like family membership and friendship status. These attributions, as we shall see, are an ultimately unsurprising result of their explicit use in commercials, alongside the broadly anthropomorphised depictions of IPAs.

Our selection of cases followed the analytical strategy of identifying first a crucial case and subsequently exploring the most (dis)similar cases (Kelle and Kluge, 2010): We identified the introduction of Apple’s Siri as our first crucial case as this was the first large-scale public presentation of a voice-based IPA. We then took the introductions of the two main competitors, the Google Assistant and Amazon’s Echo/Alexa, as our most similar cases. We selected contrasting cases according to two dimensions: (1) product (Which IPA is advertised? Which company? Mobile or stationary artefact?) and (2) types of interpersonal relations presented (e.g. family context, organisational context; degree of realism). The video commercials for the Google Home/Nest were chosen for their similarity to the introductory video for Amazon’s Echo/Alexa; the video ‘Make Google Do It’ was then selected for its contrasting qualities to illustrate the vast difference in portrayals of the same IPA between different advertisements. Continuing this analytical approach, we then added two commercials for Amazon’s Alexa to our data material based on the diversity of social roles ascribed to the IPA in these videos. We decided against the further inclusion of Apple commercials as they focused on the respective generations of iPhones in general rather than advertising the integrated IPA Siri specifically. The material presented here is thus highly selective and can make no claim to being exhaustive. Nevertheless, the sample of videos presented in the following analysis shows a broad range of ways IPAs are staged in advertisements and as such provides a comprehensive insight into IPA commercials. Our methodology is informed by the maxim of interpretative social research to analyse small data samples extensively rather than analysing a large body of data.
We follow Sammet and Erhard (2018: 50) who argue that for generalisations based on interpretative research, quantifying arguments are not decisive and that the robustness and scope of theory-building is not linked to the number of cases. Instead, interpretative research assumes that general social meaning is mediated in each individual case.

In the analysis of these video advertisements, we address how agency is performatively constructed both linguistically and visually, and we consider the illocutionary functions and perlocutionary effects of these constructions of agency. For this analysis, we have gone through the following steps:

1. Situation: We reconstructed the social situation portrayed in the advert, investigating the defining features of these situations and asking what types of action they offer and what kinds of problems they pose for the actors.

2. Positions: What are the relevant social positions in the shown situation? How do actors enact these positions? How do social relationships unfold during the interaction?

3. Communication: What kind of utterances are made? How do they function? How are they embedded in the multimodality of interaction?

4. (Machine) Agency: What kind and degree of agency is ascribed to the IPA? How are IPAs positioned as actors in relation to humans?

5. Performativity: How does the presentation of IPAs function within the overall persuasive character of marketing communication? How does the negotiation of IPA agency serve the communicative practices between companies and potential consumers?

The staging of IPAs in marketing communication

Introducing IPAs

The keynote announcement of Apple’s IPA Siri for the iPhone model 4s (s for Siri) in 2011, as has already been noted, positioned Siri as an entity with the potential to occupy human positions, framing it as a who and not a what – ‘a humble personal assistant’.² In their 2016 keynote event introducing the Google Assistant for Android systems, Google shifted the focus: a voice interface alone was no longer innovative, so the company chose to use the already established familiarity with their search engine Google to emphasise the system’s attraction to potential users. The assistant was introduced as a ‘Google for your world’, something that would not only help communicate results from the search engine itself but that first and foremost aids one in the organisation of one’s everyday life.³ The Google Assistant could manage appointments, make reservations, and oversee one’s schedule, just as a human assistant would do. Both companies, then, promise their customers something that so far was largely the purview of the upper echelons of society: a personal assistant, constantly available, who can take on organisational tasks as well as the administration of your everyday life. Consumers were tempted with the prospect of authority over a subordinate and the accompanying association of raised social status: this promise of authority works as an imaginary of enhanced agency for the human consumers.

In their introduction of the smart speaker Echo and its inbuilt IPA Alexa in 2014, Amazon opted for a markedly different strategy than these two main competitors, presenting their system through a video commercial rather than at a live event. The video shows a white suburban middle-class American family with three children (two girls and a boy, the youngest girl narrating the video) who have just received a parcel with the smart speaker Echo, and how they explore this new device. Every family member gets a turn testing out its functions, playing music, delivering information, and announcing the time, with the father positioned as the technology expert, guiding his family in this exploration. The possible uses of the device are depicted through highly gendered, racialised and classist family clichés (cf. Phan, 2019 for an intersectional analysis on the depiction of the Amazon Echo): the mother uses the Echo to add items to the shopping list and calculate the correct amount of ingredients while baking for the family; the parents are woken up by Alexa in their marital bed; the children ask Alexa to tell jokes and to help them with their homework. The audience then learns that Echo really loves to play music – in excellent quality! – and knows many songs, and is useable from everywhere. At the same time, the viewer is introduced to the social functions that the IPA can fulfil: After the son interrupts the music that his older sister is currently playing on the Echo, she retaliates by asking Alexa to define the term ‘annoying’, her gaze making clear that she is well aware of the word’s meaning and is rather integrating the voice assistant and its functions to comment on her brother’s behaviour. 
In this, the Echo/Alexa becomes an actor in the siblings’ dispute similar to the ways pets are sometimes used as a communicative resource: Tannen (2004) describes interactions where a criticism directed at another family member or partner is voiced towards a pet while the respective family member is present, thus ‘using’ the interaction with the pet as a resource for communicating with another human. The last scene of the commercial shows the parents in their house dancing to music from the smart speaker, with a concluding comment from the youngest daughter that the Echo, with all its functions, has ‘really become a part of the family’. The performative construction of family displayed here is a prime example of what is described as doing family in praxeologically oriented social sciences: A family is not something one simply has but rather something that is done (Jurczyk, 2014).

This video commercial illustrates a very different marketing strategy from that of Google and Apple: Amazon’s Echo/Alexa does not merely fulfil tasks and service functions but is rather designed to become a participant in familial interaction, with whom people build social relationships and who influences and supports human–human relationships. While the primary targets of the Apple and Google keynotes were technology enthusiasts, Amazon’s commercial for the Echo/Alexa apparently had a broader and less specific target audience. What is also particularly notable in all three presentations is the constant use of the inanimate pronoun ‘it’ when referring to the IPA; at the time of release, no anthropomorphisation through the use of the largely human-exclusive pronouns ‘she’/‘he’ takes place. Indeed, Amazon’s IPA is referred to exclusively as Echo, with Alexa only used as a wake-word for activating the IPA. It is thus rather the device itself, the smart speaker, that is referred to, not the artificial interlocutor. This practice changes significantly in later Amazon commercials, whereas the remarkably unpersonalised Google Assistant remains in later advertisements an almost nameless servant integrated seamlessly into the flawless upper-middle-class lifestyle.

Amazon’s decision to frame the Echo/Alexa in a familial setting lends itself to an emphasis on affect, love and warmth, a strategy not solely to endear the product to potential consumers but also, Kopitz (2021) argues, an intentional move to deflect from the ongoing discourse around data security and privacy concerns regarding IPAs in private homes. The performative construction of Alexa’s human virtues and family membership serves to create positive connotations of doing family (Lind, forthc.), steering attention away from concerns about data collection and the increasing market power through data concentration of the key providers of IPAs (cf. Graef, 2018).

Advertising the Google Assistant and Google Home

While Amazon’s Alexa was designed for implementation in the smart speaker Echo, Google developed their Assistant primarily for use on Android phones, only later installing it in a smart speaker, the Google Home/Nest, to compete with the Echo. This dichotomous positionality, as a mobile assistant available wherever you are (or your phone is) or as a stationary IPA located in one’s home, is framed in entirely distinct ways, as can be seen through a comparison of two advertisements for the Google Home and one for the Google Assistant, all launched in 2018.

The commercials ‘Family Time’⁴ and ‘Mornings’,⁵ produced for the Google Home and the Google Home Mini, remind the viewer of Amazon’s introduction video for the Echo/Alexa in terms of setting: heteronormative, idealised middle-class families with two loving parents and three children. Both Google advertisements are recorded in the style of an interview, where the parents present their family lives and provide examples of the ways in which Google Home supports and enriches these lives, for example, by helping with homework, reminding them of appointments, playing music, or informing family members in different rooms that a meal is ready. In the video ‘Family Time’, which presents a nameless white family, the focus is entirely on creating an idealised family idyll of playtime, fun and laughter, illustrating how Google Home assists a family in spending quality time together. At the end of the commercial, the father comments: ‘We have this assistant at home now that helps us. This way, we just get to play with them. We just have fun’. The Google Assistant as the IPA in the smart speaker Google Home is crucial for doing family (cf. Lind, forthc., for an examination of doing family in the context of human–machine interaction) by providing the basic infrastructure (e.g. playing music or animal noises, giving information) for joyful family interactions. In taking over a variety of tasks, the IPA allows the parents to focus on interacting with their children, thus enabling more quality time together. The video commercial ‘Mornings’, which presents a black family named Bacon, provides an (additionally racial and onomastic) counterpoint to the idealised familyhood of ‘Family Time’: the parents want to sleep in, the children fight with each other, the father cannot find his phone.
In spite of these ‘struggles’, ‘family Bacon’ nevertheless lives in a spacious, well-furnished and immaculately clean house, a picture of the traditional Western heteronormative ideal of the father as sole provider and the mother managing the household. In this upper-middle-class version of ‘family struggles’, the Google Home is positioned as the perfect support system to smooth things out: it can set the alarm to snooze and remind the parents of their children’s appointments, it can beatbox to distract the children from their squabbling, and it can locate the missing phone. The parents aptly summarise the role that the IPA plays: ‘We are definitely a team. I have her back, she has my back. We got the kids’ back. Google Home has all our backs’. Similar to Amazon’s introductory video for the Echo/Alexa, Google Home’s IPA is constructed on the basis of the highly gendered cultural stereotype of the perfect, invisible maid who takes care of the mundane tasks necessary for the undisturbed life of the family in such a perfect manner that one could almost forget she is there (e.g. Nyamnjoh, 2005; Ryan, 2001). The decrease in size of the smart speaker from the original to the Google Home/Nest Mini, which is less than 10 cm in diameter and a little over 4 cm high, perhaps parallels this invisibilisation.

The 2018 campaign ‘Make Google Do It’,⁶ on the other hand, differs from the Google Home commercials primarily in terms of affect. The video advertisement shows the digital system as an anonymous, disembodied servant that can readily fulfil the tasks the human actor is reminding themselves to do, for example, take pictures or record a song. While the tasks the Assistant fulfils in this video do not differ much from those it performs in the Google Home advertisements, it never visibly appears as an actor, existing only implicitly in the slogan ‘Make Google Do It’. Across 13 scenes the viewer meets a variety of human actors and is presented with their thoughts, in each case an urge relating to the environment: ‘I should write this down’, ‘I should record this’, ‘I should take a selfie’. Each scene ends with the imperative ‘Make Google Do It’ fading in white font. An element of humour – usually understood as a tool to create positive brand associations (cf. Strick et al., 2013) – is added in one scene that shows a pair of feet touching in front of a fireplace and a female voice saying ‘I really gotta break up with my boyfriend’; instead of ‘Make Google Do It’, the scene ends with the overlay ‘That one’s on you’.

Here, it is the user that is constructed as the agent of every single act – the Google Assistant merely embodies its name, acting as the hands-free extension of any thought that might cross the human mind, a servile helper. The illocutionary promise lying in this construction is one of power over a mindless device whose agency is presented as little more than that of any other tool. Charlie Warzel highlights Google’s explicit staging of agency in this campaign in a BuzzFeed News article, where he writes: ‘“Make Google Do It” suggests we have agency. But increasingly it’s the other way around. Offloading human tasks to a computer feels empowering. It may well increase productivity and give us more time to do the things we love. But it requires sacrificing control of the many little things that make up our daily lives – our schedules, how we write our emails, which app to use next, and even when to call Mom. It’s hiding in plain sight, right there in the ad copy. “Make Google Do It” is most definitely a command – but it’s a command from Google, not its users’ (Warzel, 2018). In this quote, Warzel highlights what is left in the dark in the advertisements themselves: the role of Google as a market-dominating company whose agency will become the more prominent the more users grant it access to their data. ‘Make Google Do It’ thus means much more than delegating mundane tasks to an IPA, as doing so simultaneously translates to making Google as a company gain data access, influence, and power.

Advertising Alexa

While Amazon’s introductory commercial for the Echo smart speaker and the IPA Alexa centred around the smart speaker as an embodied device positioned in the family home, facilitating and enhancing family life, subsequent commercials treat the Echo as secondary to the IPA; it is no longer the Echo that ‘has become a part of the family’, but the personified Alexa who helps, answers, and fulfils tasks. This shift from a product to be purchased to a named person to be interacted with speaks to the desired perlocutionary effect of such advertising: firmly placing the IPA as a human(like) entity in the social world, letting customers forget that they are users of a technological device that constantly sends and receives data to and from Amazon.

The performative construction of Alexa as a much more humanlike personal assistant in Amazon’s advertisements is epitomised by the 2021 video ‘Alexa’s Body’,⁷ in which Alexa appears as a personified and sexualised fantasy. The video starts with four people standing around a table with the newest model of the Amazon Echo. A black woman admiringly says ‘I could literally not imagine a more beautiful vessel for Alexa to be inside’, while turning to the window, where she sees a bus displaying an advertisement for the 2021 film of Tom Clancy’s ‘Without Remorse’ starring Michael B. Jordan. The next 50 seconds of this 1-minute commercial show the increasingly sexually charged interaction of this female protagonist with an Alexa embodied through Michael B. Jordan. That this sexualisation takes place in the form of a woman fantasising about ‘Alexa’s body’ as male is evidently a response to discussions of sexist stereotypes built into voice assistance systems, which are almost exclusively designed with female voices as default and a whole female persona (cf. West et al., 2019). Despite this gender reversal, the commercial maintains the idea of voice assistants as always-available servants, waiting to please their human master.

Irrespective of which physical shape Alexa occupies in commercials, be it in the form of the smart speaker Echo, as actor Michael B. Jordan – named sexiest man alive in 2020 by the magazine People – or as one of the many human substitutes in commercials like ‘Alexa loses her voice’⁸ or ‘Before Alexa’,⁹ the digital system consistently appears in service roles. These depictions can be seen as a counterpoint to the disembodied and anonymous Google Assistant in Google’s commercial ‘Make Google Do It’ and as notably more humanlike servants than the Google Assistant in the video commercials ‘Family Time’ and ‘Mornings’. As a member of the household, the family, or even as a participant in a sexual relationship, the imagination of Amazon’s Alexa places it much closer to human social relationships than the mere – often in itself dehumanised (e.g. Ladegaard, 2013; Nyamnjoh, 2005) – role of domestic staff; Alexa appears much more as the beloved nanny or au-pair, who, while still strictly servile, might be almost considered family. While one might understand the illocution underlying this form of marketing communication – presenting the viewer with the promise of social companionship without the burdens of human equity – it remains open to speculation whether the marketing managers of this commercial have taken into account the potentially involuntary perlocutionary effect of necessarily disappointing customers whose expectations cannot be met by the reality of interacting with these devices (e.g. Luger and Sellen, 2016).

The advertisement ‘Before Alexa’, produced for the Super Bowl 2020, highlights Alexa’s professional superiority over its human equivalents specifically because Alexa is a technological artefact. Framed by Ellen DeGeneres asking her wife Portia de Rossi ‘What do you think people did before Alexa?’, the commercial consists of seven scenes of an imaginary time ‘Before Alexa’, ranging from a medieval fantasy world to Victorian Britain, from the ‘Wild West’ to shortly before Nixon’s resignation. In all these scenes, Alexa’s human – and in one scene animal – equivalents, who bear names of the same base as Alexa (e.g. Alexis, Alex, Alessa), fail to perform the task they are asked to perform (e.g. tell a joke, play a song) or fulfil it badly, something that, as the commercial suggests, would never happen with Amazon’s IPA. We do not need to worry that Alexa might be shy when asked to tell a joke. We do not need to worry that Alexa refuses to delete files when we order her to. Nor does Alexa need social recognition of a service performed, such as ‘Please’ and ‘Thank You’.

The social positioning of Alexa is thus shaped by ambiguity. It is a name-bearing, addressable member of the household, and simultaneously a non-human artefact. The IPA is placed in a liminal space between human and non-human, between being and thing, between subject and object. While Alexa can speak, it does not, figuratively, have a voice, and its agency is limited to such actions that others have thought out for it to perform.

Shifting agencies of humans, machines, and companies

While voice assistants are designed to postulate agency, they do so not in equality with humans, but as their servants, and are assigned a social status that varies between human companion and less-than-human service provider. We are dealing here with a formation that Elena Esposito (2017) has described as ‘artificial communication’ – a communication in which messages are ‘heard’, ‘understood’ and ‘responded to’, but without this being linked to the contingencies of human subjectivity. These advertisements, it is important to note, do not only categorise the IPA, but also the human actors depicted. They become employers of domestic staff whom they can address and give orders to at any time, and to whom they never have to justify themselves or engage in even the most basic forms of politeness. On a perlocutionary level, these advertisements thus engage in a twofold performance: constructing their product as a service agent organising and simplifying humans’ everyday lives, and transforming the social status of this product’s users.

The stratificatory asymmetry that is created in the commercials between the human and the artificial offers the promise of human symmetrisation: delegating tasks to artificial entities offers to free human actors from being placed in such service roles, thus promising a levelling effect to interhuman interactions. This asymmetrisation of human–machine relationships in Amazon and Google’s advertisements takes place in three different forms:

• In ‘Make Google Do It’, the agency of the IPA is rendered invisible by shifting the viewer’s focus entirely onto the human actors who have the power to command someone – or rather something – to perform a task for them. The Google Assistant is reduced to a tool like any other, only distinguished by the fact that it can be utilised by voice and does not require physical action. As invisible as it is, the Assistant can follow the customer wherever they go and enable the consumer’s agency at any time and any place.

• In the Google commercials ‘Family Time’ and ‘Mornings’ as well as in Amazon’s video ‘Introducing Amazon Echo’, the respective IPAs are embodied in the form of a smart speaker that is firmly located within the family home. Any potential privacy concern is erased by the idealised doing family that the Assistant/Alexa enables. The affective positioning in the family home crystallises the illocutionary message: Our product will maximise your quality of life by giving you more time for leisure, and less need to deal with inconvenient tasks. Creating an imagery of warmth, love, and joy, the IPA is constructed as a facilitator of a successful human family life.

• Advertisements like ‘Alexa’s Body’ and ‘Before Alexa’ postulate a machine agency that most closely resembles human agency. Embodied depictions of Amazon’s IPA invite its imagination as a social actor who offers affective relationships, while nevertheless staying firmly located in the servile role. It is in this service function that they show a superiority to their professional human counterparts: lacking intention and will, artificial servants have no reason to disobey or do things differently than is asked of them.

The agency performatively assigned to IPAs in these commercials deviates from an understanding of actors in the sense of Latour’s (2005) principle of generalised symmetry. In Latour’s view, agency is something that non-human actors possess (as part of heterogeneous networks). Our shift to an analysis of agency as something that is attributed in communication (Luhmann, 1996; Muhle, 2018) offers a more nuanced understanding of agency based on factors such as the ability to act (autonomously) and speak. IPAs may have agency – but only in a limited way. Their agency is bound to specific social positions (Hausendorf and Bora, 2006) which are service roles. The illocution across the analysed commercials is, however, internally consistent in its promise of human agency over non-human servants. The intended perlocutionary effect of these advertisements is twofold: first, customers are endowed with an increased quality of life, more time to spend with their loved ones, and fewer obligations with regard to managing their own schedules. Second, the agency of the companies behind the IPAs is rendered invisible through the more conspicuous negotiation of questions of human and machine agency. Through the advertisements’ affective domestic emphasis, public attention is deftly steered away from the in fact growing agency that companies like Google and Amazon achieve through data concentration (Graef, 2018) and increased platform power (Culpepper and Thelen, 2020). In discussions of agency in the relations and interactions between humans and IPAs, it is crucial to be cognisant of this third actor: How does the increasing ubiquity of company-specific IPAs impact our online behaviour, the websites we visit, the information we read, the items we buy? Is there a hidden risk of use and abuse of our personal, communicative data in our daily interactions with these IPAs and their involvement in household organisation? And how do these factors in turn impact power relations on the market? Human–machine interactions do not take place in a social sphere detached from economic considerations, and their analysis demands awareness of their localisation in late-stage capitalism.

Conclusion

Public as well as academic discussions of artificial intelligence are often concerned with the question of machine agency and the differences and similarities of humans and machines. We suggest that it is fruitful to shift the perspective and to perform a ‘second-order observation’ (Luhmann, 1996: xvii) of agency in order to analyse ‘how and when the categories of human or machine become relevant’ and ‘how relations of sameness or difference between them are enacted’ (Suchman, 2007: 2). While other interpretative studies of human–machine communication are often concerned with specific interactions between humans and machines, we analysed adverts as specific socio-technical imaginaries that enact an idealised reality in which IPAs should function (Cluley and Nixon, 2019).

Advertisements for the Google Assistant and the Amazon Alexa offer ambivalent renegotiations of human agency and machine agency. While in some instances, especially in the advertisements ‘Alexa’s Body’ and ‘Before Alexa’, the IPAs were depicted in humanlike roles or even as superior to their human counterparts in their commitment to fulfilling their tasks, their agency is – as the A in IPA implies – insistently limited to the ability to assist, not to act of their own accord or express free will. Human–machine relationships in these commercials are thus constructed as inherently asymmetrical: it is the human actor who has agency over the non-human; the human actor who demands, requests, and orders; the human actor who imagines, creates, and shapes the non-human. While the communicative agency assigned to IPAs might imply a level of autonomy and intentionality, it is exactly that which they are denied; their communicability is restricted to a positionality from where they can respond, but not act. The illocutionary promise of asymmetry between the human and the IPA at the same time offers the enticing promise of symmetrising the human side. This new form of social inequality, between the human and the machine, allows a certain stabilisation between the humans, who no longer need to occupy the lower rungs of the hierarchy.

It is not only the potential of equality in human–human relationships that is conferred in these video commercials, but also their affective quality. While security concerns of surveillance and data exploitation are discussed prominently in the press, the performative incantation of the commercials firmly positions the IPA as a welcome presence in the privacy of the family home, a harbinger of warmth, connection, and love (Kopitz, 2021). Large companies may spy on you through their surveillance technology, but they make you forget that over – as the heteronormative imaginary provided in these commercials suggests – being able to watch your wife lovingly play with your child over the IPA’s screen while you prepare a shared meal, and further use this technology to gather the family together for mealtime (cf. Strengers and Kennedy, 2020). Both the advertisements for the Google Home and Amazon’s Alexa construct the respective smart speaker’s function of playing music as synonymous with an ability to create emotionally close, connected, and happy bonding moments in the family (Jurczyk, 2014; Lind, forthcoming); illocutionarily, the evocation of the idealised family belies all potential real-life worries of letting one’s children interact with an IPA. Thus, the IPA is constructed as having the agency of impacting and shaping human relationships, of conjuring up feelings of home, belonging, and safety. These warm, familial imaginaries serve a clear perlocutionary function: dissipating any reluctance to let smart technology enter one’s house, and deflecting any security concerns in tow (Kopitz, 2021).

The ostensible negotiation of human and machine agency in IPA advertisements renders the ever-growing agency of a third actor almost invisible: the agency of multinational, market-dominating companies like Amazon and Google. ‘Make Google Do It’ is an imperative directed at the customer to not only let a servile assistance technology take over mundane tasks, but also to give agency – information, knowledge, and hence, power – to this third, concealed actor, whose agency is hidden by the commercials they produce.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by Deutsche Forschungsgemeinschaft (Grant no. SFB 1482).

Notes

Author biographies

Miriam Lind is a postdoctoral researcher in German linguistics at the University of Mainz. Their research interests include discourse and media linguistics, the interaction of social categorisation and language, and pragmatics and semantics in multimodal interaction.

Sascha Dickel is Professor of Sociology of Media and Theory of Society at the University of Mainz. His work focuses on technologies of future-making, novel modes of public participation in science, and implications of digital communication.

References

Austin (1962) How to do Things With Words. London: Oxford University Press.

Barad (2003) Posthumanist performativity: toward an understanding of how matter comes to matter. Journal of Women in Culture and Society 28(3): 801–831. DOI: 10.1086/34532.

Braidotti (2013) The Posthuman. Cambridge: Polity Press.

Brinker, Cölfen and Pappert (2018) Linguistische Textanalyse. 9th edition. Berlin: Erich Schmidt Verlag.

Hausendorf and Bora (2006) Reconstructing social positioning in discourse. Methodological basics and their implementation from a conversation analysis perspective. In: Hausendorf and Bora (eds), Analysing Citizenship Talk. Social Positioning in Political and Legal Decision-Making Processes. Amsterdam: Benjamins, 85–97.

Butler (2010) Performative agency. Journal of Cultural Economy 3(2): 147–161. DOI: 10.1080/17530350.2010.494117.

Callon (ed) (1998) The Laws of the Market. Oxford, Malden: Blackwell.

Cerulo (2009) Nonhumans in social interaction. Annual Review of Sociology 35(1): 531–552. DOI: 10.1146/annurev-soc-070308-120008.

Cluley and Nixon (2019) What is an advert? A sociological perspective on marketing media. Marketing Theory 19(4): 405–423. DOI: 10.1177/1470593119856645.

Cochoy (1998) Another discipline for the market economy: marketing as a performative knowledge and know-how for capitalism. The Sociological Review 46(1): 194–221.

Culpepper and Thelen (2020) Are we all Amazon Primed? Consumers and the politics of platform power. Comparative Political Studies 53(2): 288–318. DOI: 10.1177/0010414019852687.

Dokukina and Gumanova (2020) The rise of chatbots – new personal assistants in foreign language learning. Procedia Computer Science 169: 542–546. DOI: 10.1016/j.procs.2020.02.212.

Dolata and Schrape (2015) Masses, crowds, communities, movements. Collective action in the internet age. Social Movement Studies 15(1): 1–18. DOI: 10.1080/14742837.2015.1055722.

Epley, Waytz and Cacioppo (2007) On seeing human: a three-factor theory of anthropomorphism. Psychological Review 114(4): 864–886. DOI: 10.1037/0033-295X.114.4.864.

Esposito (2017) Artificial communication? The production of contingency by algorithms. Zeitschrift für Soziologie 46(4): 249–265. DOI: 10.1515/zfsoz-2017-1014.

Eyssel and Kuchenbrandt (2012) Social categorization of robots: anthropomorphism as a function of robot group membership. British Journal of Social Psychology 51: 724–731. DOI: 10.1111/j.2044-8309.2011.02082.x.

Graef (2018) When data evolves into market power: data concentration and data abuse under competition law. In: Moore and Tambini (eds), Digital Dominance. The Power of Google, Amazon, Facebook, and Apple. Oxford: Oxford University Press, 71–97.

Henrickson (2018) Tool vs. agent: attributing agency to natural language generation systems. Digital Creativity 29(2–3): 182–190. DOI: 10.1080/14626268.2018.1482924.

Hepp (2020) Artificial companions, social bots and work bots: communicative robots as research objects of media and communication studies. Media, Culture & Society 42(7–8): 1410–1426. DOI: 10.1177/0163443720916412.

Hoy (2018) Alexa, Siri, Cortana, and More: an introduction to voice assistants. Medical Reference Services Quarterly 37(1): 81–88. DOI: 10.1080/02763869.2018.1404391.

Jackman and Jablonowski (2021) Investments in the imaginary: commercial drone speculations and relations. Global Discourse 11(1–2): 39–62. DOI: 10.1332/204378920X16067521422126.

Jurczyk (2014) Doing family – der practical turn der familienwissenschaften. In: Steinfach, Hennig and Arránz Becker (eds), Familie im Fokus der Wissenschaft. Wiesbaden: Springer, 117–138.

Kelle and Kluge (2010) Vom Einzelfall zum Typus. Fallvergleich und Fallkontrastierung in der Qualitativen Sozialforschung. 2nd edition. Wiesbaden: Springer.

Knoblauch (2020) The Communicative Construction of Reality. Milton Park: Routledge.

Kopitz (2021) Alexa, affect, and the algorithmic imaginary. Addressing privacy and security concerns through emotional advertising. Screen Bodies 6(1): 1–17. DOI: 10.3167/screen.2021.060103.

Krieg-Holz and Bülow (2016) Linguistische Stil- und Textanalyse. Tübingen: Narr.

Kuchenbrandt, Eyssel, Bobinger, et al. (2013) When a robot’s group membership matters. Anthropomorphization of robots as a function of social categorization. International Journal of Social Robotics 5(3): 409–417. DOI: 10.1007/s12369-013-0197-8.

Ladegaard (2013) Beyond the reach of ethics and equity? Depersonalisation and dehumanisation in foreign domestic helper narratives. Language and Intercultural Communication 13(1): 44–59. DOI: 10.1080/14708477.2012.748789.

Latour (2005) Reassembling the Social. An Introduction to Actor-Network-Theory. Oxford, NY: Oxford University Press.

Lind (Forthcoming) Doing family on unfamiliar terrain: the constitution and contestation of kinship between two humans, two cats and a voice assistant. In: Muhle and Bock (eds), Social Robots in Institutional Interaction. Bielefeld: Bielefeld University Press.

Lindemann (2005) The analysis of the borders of the social world: a challenge for sociological theory. Journal for the Social Theory of Human Behaviour 35(1): 69–98. DOI: 10.1111/j.0021-8308.2005.00264.x.

Lohmann (2014) Von vermenschlichten maschinen und maschinisierten menschen: bemerkungen zur Wortsemantik in der Robotik. In: Brändli, Harasgama, Schister, et al. (eds), Mensch und Maschine – Symbiose oder Parasitismus? Bern: Stämpfli, 125–142.

Luckmann (1970) On the boundaries of the social world. In: Natanson (ed), Phenomenology and Social Reality. Amsterdam: Springer, 73–100. DOI: 10.1007/978-94-011-7523-4_5.

Luhmann (1996) Social Systems. Stanford: Stanford University Press.

Luger and Sellen (2016) “Like having a really bad PA”: the gulf between user expectation and experience of conversational agents. In: Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, San Jose, CA, United States, 7–12 May 2016, 5286–5297. DOI: 10.1145/2858036.2858288.

McDaniel and Gong (1982) The language of robotics: use and abuse of personification. IEEE Transactions on Professional Communication 25(4): 178–181. DOI: 10.1109/TPC.1982.6447798.

MacKenzie and Millo (2003) Constructing a market, performing theory: the historical sociology of a financial derivatives exchange. American Journal of Sociology 109(1): 107–145. DOI: 10.1086/374404.

Muhle, Florian (2017) Embodied Conversational Agents as Social Actors? Sociological Considerations on changing human-machine relations in online environments. In: Gehl, Robert W. and Bakardjieva, Maria (eds), Socialbots and Their Friends. Routledge, 86–109.

Muhle, Florian (2018) Sozialität von und mit Robotern? Drei soziologische Antworten und eine kommunikationstheoretische Alternative. Zeitschrift für Soziologie 47(3): 147–163. DOI: 10.1515/zfsoz-2018-1010.

Nyamnjoh (2005) Madams and maids in Southern Africa: coping with uncertainties, and the art of mutual zombification. Afrika Spectrum 40(2): 181–196.

Phan (2019) Amazon echo and the aesthetics of whiteness. Catalyst 5(1): 1–37. DOI: 10.28968/CFTT.V5I1.29586.

Porcheron, Martin, et al. (2018) Voice Interfaces in Everyday Life. CHI 2018, April 21-26, 2018, Montreal, QC, Canada. DOI: 10.1145/3173574.3174214.

Purington, Taft, Sannon, et al. (2017) “Alexa is my new BFF”: social roles, user satisfaction, and personification of the Amazon echo. In: Proceedings of the 2017 CHI Conference Extended Abstracts on Human Factors in Computing Systems, Denver, CO, United States, 6–11 May 2017, 2853–2859. DOI: 10.1145/3027063.3053246.

Ryan (2001) Aliens, migrants and maids: public discourses on Irish immigration to Britain in 1937. Immigrants and Minorities: Historical Studies in Ethnicity, Migration and Diaspora 20(3): 25–42. DOI: 10.1080/02619288.2001.9975021.

Sammet and Erhard (2018) Methodologische grundlagen und praktische verfahren der sequenzanalyse. Eine didaktische einführung. In: Erhard and Sammet (eds), Sequenzanalyse Praktisch. Weinheim, Basel: Beltz Juventa, 15–71.

Searle (1969) Speech Acts. An Essay on the Philosophy of Language. Cambridge: Cambridge University Press.

Searle (1975) A taxonomy of illocutionary acts. In: Gunderson (ed), Language, Mind and Knowledge. Minneapolis: University of Minnesota Press, 344–369.

Strategy Analytics (2021) Strategy Analytics: Another Record Quarter for Smart Speakers in 3Q21, Though Supply Chain Woes are on the Horizon. Available at: https://news.strategyanalytics.com/press-releases/press-release-details/2021/Strategy-Analytics-Another-Record-Quarter-for-Smart-Speakers-in-3Q21-Though-Supply-Chain-Woes-are-on-the-Horizon/default.aspx (accessed 3 November 2022).

Strengers and Kennedy (2020) The Smart Wife. Why Siri, Alexa, and Other Smart Home Devices need a Feminist Reboot. Cambridge/MA, London: MIT Press.

Strick (2013) Humour in advertising: an associative processing model. European Review of Social Psychology 24(1): 32–69. DOI: 10.1080/10463283.2013.822215.

Suchman (2007) Human-Machine Reconfigurations. Plans And Situated Actions. 2nd edition. Cambridge, NY: Cambridge University Press.

Tannen (2004) Talking the dog: framing pets as interactional resources in family discourse. Research on Language and Social Interaction 37(4): 399–420. DOI: 10.1207/s15327973rlsi3704_1.

Turing (1950) Computing machinery and intelligence. Mind 59(236): 433–460. DOI: 10.1093/mind/LIX.236.433.

Voss, Laura (2021) More than Machines? The Attribution of (In)Animacy to Robot Technology. Bielefeld: Transcript.

Warzel (2018) Make Google Do It… And Then What? In: BuzzFeed News. Available at: https://www.buzzfeednews.com/article/charliewarzel/make-google-do-it-and-then-what (accessed 21 February 2022).

West, Kraut and Chew (2019) I’d Blush If I Could. Closing Gender Divides In Digital Skills Through Education. Unesco Equals Global Partnership. Available at https://unesdoc.unesco.org/ark:/48223/pf0000367416.page=7