Abstract
Machine-age technologies, including automation, robotics, and artificial intelligence, are profoundly expanding the variety of service interfaces and therefore the possible ways that customers and firms can interact across customer journeys. This expansion challenges service firms’ capabilities to deliver coherent streams of interactions for effective customer engagement. This article develops a conceptual framework of firm capabilities that enable firms to operate with “one voice” to deliver seamless, harmonious, and reliable interactions across diverse interfaces in a customer journey. The proposed framework integrates three themes: (1) service interaction space to capture the interrelationship among devices, interfaces, interactions, and journeys; (2) learning and coordination as core capabilities for generating and using intelligence, respectively, to enhance customer engagement in subsequent interactions; and (3) one-voice strategy to configure learning and coordination capabilities in combinations that meet conditions of fitness and equifinality for effective customer engagement. We provide several research questions and priorities to guide research and practice.
Heralded as a strategic imperative, customer engagement is typically defined as a customer’s investment of valued operant and operand resources into interactions with a brand or firm within a service ecosystem (Brodie et al. 2011; Harmeling et al. 2017; Hollebeek 2019; van Doorn 2011; Verhoef, Reinartz, and Krafft 2010). While the imperative of customer engagement is well recognized by scholars and practitioners alike, securing customers’ engagement is a challenge for most service firms, and our knowledge about processes of customer engagement is still evolving such that “theoretical relationships remain nebulous, as well as debated” (Hollebeek, Srivastava, and Chen 2019, p. 163).
Current research and practice suggest three foundational insights for advancing the customer engagement literature. First, customer-firm interaction is the basic unit of analysis in the study of customer engagement (Singh et al. 2017); an interaction anchors and adjusts customer engagement by increasing it, decreasing it, or leaving it unchanged. Second, interactions are goal-directed events that occur over time and space; they constitute a “journey” of customer engagement (Lemon and Verhoef 2016). Third, customer-firm interactions rely on service interfaces that enable connections between disparate devices of customers and firms’ agents. Digital devices and interfaces continue to grow in number, diversity, capacity, and functionality (e.g., van Doorn et al. 2017). However, mismatches in customer and firm preferences for devices pose a threat to customer engagement. Mismatched preferences are key “pain points” for customers that interrupt interactions and cause delays. Service organizations are challenged to maintain and enhance customer engagement across increasingly complex and diverse interfaces.
Even with their promise, the use of powerful, flexible, AI-powered machine-age technologies in service organizations to enhance customer engagement can be perilous (Davenport et al. 2019). Service organizations risk losing customers unless they ensure seamless, harmonious, and reliable interactions throughout the customer journey. The description of “seamless” implies that, for an individual customer, a subsequent interaction picks up where the past interaction concluded; “harmonious” means that a subsequent interaction is in sync with past interactions and moves forward effectively; and “reliable” indicates that the pattern of harmony and continuity repeats across customers and time. Studies suggest that machine-age technologies in service frontlines challenge the continuity, harmony, and reliability of customer interactions. Rawson, Duncan, and Jones (2013, p. 92) show that new-customer onboarding for a cable TV provider involved a journey of 3 months (on average) and about “nine phone calls, a home visit from a technician, and numerous web and mail interactions.” Miscommunication or discordant notes across multiple interactions diminished customer engagement; though each interaction “had at least a 90% chance of going well,” average customer satisfaction levels in key segments “fell almost 40 percent over the…entire journey.” Thus, it is imperative for service firms to develop structures and processes that engage customers coherently and consistently over time and space and across a multitude of service interfaces.
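The gap between strong individual interactions and a weak overall journey can be illustrated with simple compounding arithmetic. The sketch below is only an illustration under stated assumptions: it treats interactions as independent, fixes each at exactly a 90% chance of going well, and counts ten interactions (the nine phone calls plus the home visit, ignoring web and mail contacts). None of these assumptions, nor the function name, comes from Rawson, Duncan, and Jones (2013), and the result is a probability, not the satisfaction measure they report.

```python
# Hypothetical illustration: chance that every interaction in a journey goes well,
# ASSUMING interactions are independent and share one success rate.
def journey_success_probability(per_interaction_success: float, n_interactions: int) -> float:
    """Probability that all n interactions go well under independence."""
    return per_interaction_success ** n_interactions

# Assumed inputs: 90% per interaction, ten interactions in the journey.
p_all_well = journey_success_probability(0.90, 10)
print(f"{p_all_well:.3f}")  # prints 0.349
```

Under these assumptions, only about one journey in three would be free of any discordant interaction, which is one way to see why interaction-level quality does not guarantee journey-level coherence.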
To address this imperative, we develop a conceptual framework that integrates three themes: (1) service interaction space (SIS), (2) intelligence generation and use capabilities, and (3) one-voice strategy. First, we build on the concepts of interfaces, interactions, and journeys to introduce the concept of SIS, that is, the universe of possible points of customer-firm interactions, in which each point involves a specific interface that permits firms and customers to connect over time and space using varying devices. In this conceptualization, we recognize that customers may use disparate devices in different interactions during their journey.
Second, by addressing the challenges of the expanding space of customer-firm interactions, we focus on service firms’ capabilities for intelligence generation and use to manage the multiple interactions and interfaces of the customer journey. Past research has hinted at such capabilities; Huang and Rust (2018, p. 158) note that, when deploying AI, service organizations need the “capability to process and synthesize large amounts of [interaction] data and learn from them.” We advance this line of thinking to theorize that, in the context of SIS challenges, learning capabilities enable intelligence generation from customer interactions and integration with available intelligence stocks, and coordination capabilities enable intelligent action by reconfiguring resources/assets to anticipate and respond effectively in yet-to-occur customer interactions. Learning and coordination thus represent sensing and responding capabilities. The firms’ success in acquiring intelligence from interaction data is predicated on their learning capabilities and their success in using intelligence to shepherd customer interactions is predicated on their coordination capabilities.
Third, we conceptualize a one-voice strategy as an organizing approach for a firm’s actions, communications, and exchanges to deliver a seamless, harmonious, and reliable stream of interactions for effective customer engagement. Central to the proposed organizing logics is the development and deployment of intelligence generation/use capabilities and the consideration of human and automated capabilities as alternative choices at the end points of the human-machine continuum. Past research has suggested various capabilities for harvesting intelligence but has not conceptualized strategy as configurations of human-machine organizing logics or designs for linking sensing (learning) and responding (coordination) capabilities to engage customers.
More broadly, our conceptualizations advance theoretical and practical understanding of how service organizations can navigate rich, complex, and rapidly expanding machine-age interactions to ensure continued customer engagement. To provide context and lend relevance to our contribution, we conduct several field interviews with industry leaders who are responsible for infusing digital and AI technologies to enhance customer interaction and engage customers. To develop our concept, we blend insights from these interviews with findings from past research. In turn, we make three main contributions to research and practice. First, we show that the concept of SIS not only provides a conceptually meaningful framework for mapping interrelationships among devices, interfaces, interactions, and journeys but also can guide future inquiries into how various individual and collective touchpoints enhance customer engagement. Second, we propose that the combination of capabilities within a coherent one-voice strategy reflects the theoretical constructs and mechanisms that underlie a firm’s efforts to synchronize interactions over multiple interfaces. Third, we advance conceptual ideas to support a theoretically rigorous research program to investigate the dynamics, outcomes, and challenges associated with engaging customers through automated service interactions, as noted in the call for the special issue in which this article appears. We begin with the SIS framework.
SIS Framework for Customer-Firm Service Interfaces
Scholars of customer engagement often view interactions between firms and customers as a basic unit of analysis (van Doorn 2011). Brodie et al. (2011, p. 258) advance a “fundamental” proposition of customer engagement as a psychological state that “occurs by virtue of interactive customer experiences with a focal agent/object.” Focusing on the need to orchestrate interactive experiences that engage customers, several researchers address the drivers of customer engagement (e.g., Verhoef, Reinartz, and Krafft 2010). Harmeling et al. (2017) advance the concept of customer engagement marketing to examine how service firms design and develop interactions and experiences to motivate and enhance customer engagement. From the perspective of service firms, a customer interaction is both an opportunity for building customer engagement and a threat of depleting customer engagement. Each interaction (at time t) potentially shapes, positively or negatively, the nature and intensity of a customer’s engagement with the service firm/brand (at t) that in turn affects, by building or depleting, future customer engagement (at t + 1). Moreover, an interaction requires an interface that serves as a point of contact between the firm and a customer and as a means to enable flows of communications and actions between them. Service firms (agents) and customers may use a (heterogeneous) variety of devices to interact, or they may interact face-to-face, such that their “devices” are human (homogeneous). Both constitute single, unique interfaces that enable customer-firm interaction.
Past research has examined the distinct role of interactions and interfaces in the process of customer engagement (Kuehnl, Jozic, and Homburg 2019; Singh et al. 2017). To interact, customers and firms must find a common interface between the communication devices they use to extend their human capabilities. We broadly define device as any artifact or entity that has a defined set of communication functions and capabilities. For example, a mobile phone is a wireless, handheld artifact that allows users to make and receive calls and send text messages (among other features). Customers may also choose between a mobile device and a human contact person (human device), where a human is an entity capable of producing and receiving spoken, written, signed, or gestured information using visual, auditory, and tactile senses. Although both enable communication, they have different strengths and weaknesses. By using a broad definition of device, we construct a conceptually meaningful framework for analyzing the increasing variety of communication options on the human-machine continuum.
Interfaces connect the disparate devices used by customers and service firm/agents. Past studies have noted the relevance of interfaces to the study of customer-firm interactions and proposed schemas for organizing the diversity of interfaces. For example, Patrício, Fisk, and Cunha (2008) compare service interfaces on three dimensions: (1) usefulness (e.g., information clarity, completeness), (2) efficiency (e.g., speed of delivery), and (3) personal contact (e.g., personalization). Wünderlich, Wangenheim, and Bitner (2012) use a two-dimensional schema: activity level of service providers (low/high) and activity level of customers (low/high). Huang and Rust (2018) distinguish interfaces using a three-dimensional schema of functional features: (1) autonomous functionality or the capacity to incorporate user (customer or front line) control (e.g., high-low user control), (2) learning functionality or the capacity to learn over time to adapt and change (e.g., cognitive computing), and (3) social functionality or a capacity to process and perform social cues (e.g., empathy, emotion). Yamakage and Okamoto (2017) instead classify interfaces according to (1) sensing and recognition (e.g., voice recognition), (2) cognitive processing (e.g., pattern discovery), and (3) decision making (e.g., real-time recommendations).
To conceptualize a simultaneous analysis of interfaces and interactions, we propose the concept of SIS, which is the universe of possible points of customer-firm interaction, in which each point involves a specific interface that permits a firm and customer to connect. An interface identifies a single point of contact in an SIS by linking a customer device with a firm device to enable the customer-firm interaction. A service interface indicates that the interface has a protocol of service functions that it performs, enables, or permits to overcome frictions (e.g., distance, intimacy) and facilitate interactions (e.g., communication flows).
The two axes of SIS represent the range of customer devices (x-axis) and service firm/agent devices (y-axis) used, such that each axis varies from “mostly automated” to “mostly human.” The intermediate point indicates a human-machine combination representing a situation where automation capabilities augment a human agent. For instance, Humana is deploying AI-assisted technology (“Cogito-Dialog”) to “listen-in” to service interactions and analyze conversations using natural language processing to detect signs of customer agitation and frustration and, if detected, to cue human service agents in real time with suggestions to resync and realign with the customer (https://www.technologyreview.com/s/603529/socially-sensitive-ai-software-coaches-call-center-workers/). In this example, machine technology augments human service agents for efficient problem-solving. A point in the SIS, which we refer to as a service interface, is the linking of customer and firm devices to permit an interaction. At one extreme, a human-to-human interface involves linking humans on both sides to enable interactions (e.g., in-person customer interaction with a retail store agent). At the other extreme, a machine-to-machine interface allows automated interaction with no human involvement (e.g., automatic updating of Tesla software; Larivière et al. 2017). This multitude of possible interfaces adds both flexibility and complexity to customer-firm interactions.
Our proposed conceptualization of interfaces as distinct points of contact in SIS incorporates and advances several ideas from the extant literature. First, it accepts that interfaces vary in complexity, features (Hoffman and Novak 2017; Huang and Rust 2018; Rafaeli et al. 2017), activities (Kumar et al. 2016; Wünderlich, Wangenheim, and Bitner 2012), and/or functions (Huang and Rust 2018; Yamakage and Okamoto 2017). That is, our conceptualization is pluralistic, because it considers the nature of interfaces, and particularistic, in that it conceives each interface as a distinct point of contact between a customer and a firm/agent.
Second, we reflect on past research by emphasizing the “means” function of interfaces; Ramaswamy and Ozcan (2018) note that interfaces combine with artifacts, people, and processes to provide the means for creating value in digitalized interactive platforms. Our concept expands on this idea by providing a conceptual separation between the means function of interfaces versus the broader, unconstrained consideration of the nature of different devices that enable functional interfaces. By conceiving of interfaces as a way to allow customer-firm interactions to flow separately from their “constitutions,” we focus attention on the functional qualities of a given interface and examine how they enable and constrain interactions.
Third, our development of SIS highlights the symbiotic relationship between interfaces and interactions in the process of customer engagement. Each location in an SIS is a possible point of contact, and a customer journey represents a series of related interactions that connect different contact points, each with a potentially unique interface and associated devices. Interfaces are critical to the flow of interactions, and the choice of interface in a subsequent interaction is partly conditional on the previous interaction. Thus, the interfaces and the interactions they facilitate along customer journeys are interdependent processes over time and are key to understanding the waxing and waning of customer engagement.
One-Voice Strategy Challenges of Expanding SIS
An insight from our conceptualization of SIS is that the expanding choice of devices available to customers and firms creates synchronization challenges for both firms and customers. For example, customers may use their smartphones to interact with human agents, access interactive voice response (IVR) systems, send emails and text messages, or videoconference in real time. Firms must be prepared to respond to these formats. Abundant format choice may require customers and firms to exercise preferences for selecting one or more devices for interaction. On-the-go consumers may prefer mobile devices because of convenience; efficiency-minded firms may prefer IVR-driven devices. Preference mismatches pose little challenge when the interfaces that link diverse choices are easily available; however, mismatches of interface connectivity can interrupt interactions, increase the probability of suboptimal interactions, or rule out interactions altogether. Rawson et al.’s (2013) study of customer interactions in a car rental context, for instance, reveals that airport pickups require a half-dozen interactions, each of which constrains interactions to human-to-human interfaces, even though customers prefer mobile apps or self-help kiosks. The result is mismatched interface preferences and unnecessary interruptions in the flow of interactions. As new devices with novel features (e.g., voice activation) and functions (e.g., convenience) become available, previously used devices may be abandoned. Therefore, the SIS is dynamic, and the risk of mismatched selection and preferences is likely to increase over time.
This challenge puts connectivity at risk due to spatial and temporal gaps in customer-firm interactions over the duration of the customer journey (Edelman and Singer 2015; Voorhees et al. 2017). The concept of a customer journey provides one way to study customers’ multiple interactions with companies during the process of product/service consumption including preconsumption, consumption, and postconsumption (Baxendale, Macdonald, and Wilson 2015; De Hann et al. 2015). Even fairly commonplace service consumption experiences involve numerous points of contact between customers and firms, often referred to as touchpoints (Lemon and Verhoef 2016; Richardson 2010; Rosenbaum, Otalora, and Ramírez 2017). In the customer journey, multiple interfaces along the human-machine continuum are likely to be engaged, such that some interfaces cause interactions to flow synchronously in time and space (e.g., home visits), while others occur asynchronously with gaps in time and space (e.g., mail).
Gaps in connections across time, space, or synchronicity can increase customer dissatisfaction and destroy value (Edelman and Singer 2015). In Rawson et al.’s study noted earlier, even though each interaction had a strong chance of a successful outcome, average customer satisfaction levels in key segments continually fell, by nearly 40%, over the course of the entire customer journey, which lasted 3 months on average. Furthermore, maintaining customer engagement amid the growing variety of devices in a dynamic system is a challenge for firms that must find ways to deliver seamless, harmonious, and reliable interactions from the beginning to the end of the customer journey. To do so, they must act and communicate with one voice to engage customers, regardless of the diversity and complexity of their SISs. Verhoef, Pallassana, and Inman (2015) emphasize the significance of coherent communication to omnichannel retail environments, though most researchers focus on understanding shopper (customer) behavior across channels and seek to attribute sales performance to individual channels (Cao and Li 2015).
To situate our conceptual development, we interviewed 10 leaders responsible for customer engagement across a wide range of service organizations (see Table 1). We asked them about the challenges of one-voice organizing. The interviews, conducted without leading or direction, focused on leaders’ open-ended answers to three questions: (1) How does your division and/or firm use the concept of customer journey in a customer engagement strategy? (2) How does your division/firm ensure a consistent and seamless customer experience during the journey? and (3) What are your current challenges, and how are you overcoming them? Table 2 summarizes the insights we extracted. In the following sections, we intersperse these insights with discussions of our conceptual development.
Table 1. Profile of Industry Leaders Participating in the Interview Process.
Note. AI = artificial intelligence; AWS = Amazon Web Services; API = Application Programming Interface; B2B = business-to-business; CIO = chief information officer; CX = customer experience; CRM = customer relationship management; ERP = Enterprise Resource Planning; IoT = Internet-of-Things; NLP = Natural Language Processing; OEM = Original Equipment Manufacturer; PaaS = Platform-as-a-Service; SCM = supply chain management; SMEs = Small-to-Medium-sized Enterprises; VP = vice president.
Table 2. Summary of Qualitative Insights From Interviews With Industry Participants.
Note. B2B = business-to-business; BU = Business Unit; CE = Customer Engagement; CX = customer experience; CRM = customer relationship management; f2f = face-to-face; KPI = Key Performance Indicators; NPS = Net Promoter Score; SCM = supply chain management; SLA = Service Level Agreement; AI = artificial intelligence; IT = information technology.
In general, the service leaders that we interviewed affirmed the importance of mapping customer journeys in developing competitive customer engagement strategies. However, they reported that their firms/divisions varied in the degrees to which they had effectively deployed such mapping in practice. Although the leaders emphasized that organizing for seamless, harmonious, and reliable firm-customer interactions is an imperative, they reported challenges in strategizing for this imperative and implementing it within their firms. For example, a customer experience leader in a logistics service firm explained:
We do use customer journeys…it helps us to focus; identify service moments of truth. But it is a fairly new thing…[and] is challenging to implement…. Customer data is not yet organized so that it can be shared. We do not yet collect local intelligence in a systematic manner and besides vehicle information we do not share local intelligence. Our shops are not connected.
System mismatch or gaps underlying the customer journey [pose challenges]; not having one homogenous system landscape supporting the customer journey end to end. Heterogeneous system landscapes hinder seamless customer experiences in most (80% of cases) due to different system owner and negative cost/benefits (perceived or real) by our customers (so they do not provide it to their customers although they could).
Customers don’t want to own anything…in fact, they didn’t even want to buy a service. They want to buy an outcome. And they’d pay more…to achieve a predictable, no problem outcome [that is] customers want an ecosystem of [connected] devices, technologies, data and intelligence that can provide reliable service performance…. It is a huge management issue for customers, especially when the devices [interfaces, intelligence] are spread all over. A superior experience is one that takes all of these hassles away and delivers a predictable outcome.
Intelligence Generation/Use Capabilities and Customer Engagement
We propose a conceptual framework of one-voice strategy for the delivery of seamless, harmonious, and reliable customer engagement in an expanding SIS (see Figure 1). Our framework draws on organization science and service design literatures that establish the central significance of learning and coordination as two design dimensions of firms’ capability for intelligence generation and use (Antons and Breidbach 2018; Dosi, Teece, and Winter 1992; Huber 1990; Kogut and Zander 1996; Nelson and Winter 1982). Kogut and Zander (1996, p. 503) propose that organizations are social communities that specialize in the “creation and transfer of knowledge [intelligence],” such that the creation function is conceptualized as learning and the transfer function as coordination. Other scholars also recognize learning (Argote and Miron-Spektor 2011; Grant 1996) and coordination (Kogut and Zander 1996; Srikanth and Puranam 2014) as core capabilities that represent two sides of the organizing challenge. Learning ensures that firms routinely generate new intelligence (e.g., from customer interactions), and coordination ensures that firms execute intelligent action (e.g., use in customer interactions). The infusion of novel knowledge motivates action that is intelligent, just as mindful action provides an opportunity to gain intelligence. We build on this literature to conceptualize learning and coordination as core capabilities for intelligence generation and use to maintain continuity in customer interactions across space and time.

Figure 1. A service interaction space framework for one-voice strategy, intelligence generation/use capabilities, and customer engagement.
We also advance past research on learning and coordination capabilities to conceptualize a human/machine duality that develops alternative forms of learning and coordination capabilities to achieve effective one-voice strategy. Whereas this duality permits clear development of contrasting approaches to core capabilities, our framework recognizes that combinations of contrasting approaches, in some cases and contexts, can offer compelling competitive advantages. From this perspective, we address human- versus machine-automated forms of learning and coordination capabilities. Next, we discuss how these capabilities can be organized into compelling combinations to provide a competitive, one-voice strategy. Throughout, we intersperse field interview data and provide prototypical case examples to illustrate the proposed capabilities and mechanisms.
To exemplify the significance of learning and coordination capabilities, we draw on Huang and Rust’s (2018) concept of collective intelligence and Marinova et al.’s (2017) notion of local intelligence. Opportunities for gaining both types of intelligence reside in individual customer interactions and in exploring patterns of interdependencies across interactions in the customer journey. Each customer interaction creates new data that reflect the dynamic context of the interaction and create a retrospective trace of a firm’s past interactions with the customer. These interaction data contain intelligence that can be used to make future service interactions more effective—both in the local context of individual agents (or teams) and in the broader context of firms’ interactions with customers. Local intelligence is highly contextualized, locally embedded, tacit, and heuristic knowledge; human agents/teams typically hold and apply it in local contexts of their service work. Collective intelligence consists of generalizable, explicit, and rule-based knowledge, which is typically held in algorithms and data storage systems and applied across a broad range of customer interactions. Collective intelligence ensures consistency and efficiency across customer interactions, whereas local intelligence ensures novelty and effectiveness in individual customer interactions.
Learning Capabilities and Collective/Local Intelligence
Learning capability relates to a firm’s ability to generate intelligence. Our prototypical use cases illustrate differences between human and automated learning. Emphasis on a human learning agent in customer interactions is exemplified by the Zappos shoe company’s culture of “WOW through Service” and its self-organizing “holacracy” structure with frontline employees, as part of the Customer Loyalty Team (CLT), as its empowered core (Frei, Ely, and Wining 2011). At Zappos, the context of human learning processes features a clear emphasis on experiential novelty and emotion in service interactions (“WOW service”); the human agent is encouraged to discover, share, and cocreate tacit knowledge during personal interactions with customers. Personalized attention in customer interactions, unfettered by administrative constraints or oversight, permits human agents to develop emotive connections and fill in gaps about customer needs that cannot be inferred from past transactions. For Zappos, “personalization” is not “making best guess recommendations”; rather, it is taking a personal interest in “holistically” understanding “what the [customer] is trying to do” in a specific instance and how this instance may present a deviation from a previous purchasing context (e.g., buying shoes for a first date versus buying shoes for office use; Howarth 2018). Such tacit and heuristic knowledge also comes from continuous dialog with other learning agents (through “serendipitous collisions”). Social interactions among agents are further encouraged by physical space designs (e.g., desks), common venues (e.g., lunchrooms), and company practices (e.g., cross-functional teams) to facilitate collective sharing, transferring, and sensemaking of individual tacit knowledge. Some tacit knowledge may be made explicit and embedded in practices and protocols, thereby contributing to collective intelligence for wider application. Nevertheless, human learning with an emphasis on personal attention in customer interactions, as it occurs at Zappos, favors uncovering, analyzing, and developing local intelligence.
Automated learning, in contrast, emphasizes sensors for automatic data capture and computational learning; it uses a wide range of algorithmic and AI processes to extract explicit knowledge from customer interactions (Huang and Rust 2018; Lim and Maglio 2018; Ting et al. 2017). Computational learning is especially effective when it is built into personalized apps, as exemplified by L’Oreal’s Makeup Genius app (Edelman and Singer 2015, p. 92) that empowers customers to autonomously “design” interactions for experimenting, exploring, and sharing to make their purchasing journey “seamless and fun.” The app creates a smart, continuous, and open connection with customers by first providing them with personalized augmented realities in which they can “try” various “looks” on their own face and then select and order, in real time, the combination of products needed to achieve the looks. When the products arrive, the app proactively reconfigures itself to guide customers in using the ordered products to achieve the selected look and encourages them to return repeatedly to the app to change looks in accord with fashion trends and occasion needs. The personalized interface is a computational learning algorithm that “learns [a customer’s] preferences, makes inferences [based on similar customer’s choices] and tailors its responses [to enhance customer engagement]” (Edelman and Singer 2015, p. 94). The algorithm extracts learning as explicit knowledge by mining for meaningful patterns and tagging them to customer profiles. Such detailed, personally tagged, explicit knowledge is a source of collective intelligence that firms can use locally to infer “where a customer is in a journey” and engage in a way that “draws the customer forward” to the next step (Edelman and Singer 2015, p. 94). Thus, collective intelligence fills in gaps about individual customer needs and journey points; however, unlike local intelligence, it is inferred from rule-based algorithms such as pattern matching, behavioral mapping, and interface tracking rather than from personal interactions with customers.
Human learning and automated learning ideally complement each other. For example, human agents may internalize or combine customer behavior patterns recognized through automated learning with local knowledge and apply it in ways that suit local contexts. Similarly, tacit knowledge externalized (i.e., made explicit) by service agents may inform the automated learning process and lead to more valuable collective intelligence. Our field interviews show that though service firms differ in their degree of mobilization of local and collective intelligence, they all recognize the significance of doing so. According to the chief strategy officer of a global CRM/SCM/financial solutions company:

“Local intelligence is the essence of our direct sales engagement with customers and it is very useful…but we use it in a very limited fashion [because] costs are high due to different systems and a low willingness to pay for this by our customers. Sales people in the field have their local knowledge, and they capture some of this in CRM/sales databases. But journeys are not planned on the basis of local intelligence for two reasons. First, salespeople want to control everything about their accounts; and they are busy, so they have lots of reasons not to update records until they register a sale to collect their commissions. Second, and most important, they don’t see the value in the kinds of data we need for insights about customer engagement and customers’ wants and needs. We do a lot of different initiatives to harvest data, but these different data are not often integrated to develop collective intelligence. Currently [there are] two collective intelligences rather than one. Our challenge is the integration of product tech and MarTech. This needs to be unified to arrive at true service interaction insights based on data generated both within our product (online web service) and our MarTech stack; plus closer alignment between self-service business intelligence and enterprise sales. [Collective intelligence] is the core of our business now. Data should pass quickly and correctly throughout the [service] journey. We use aggregate data to identify trends.”
Coordination Capabilities and Collective/Local Intelligence
Coordination capabilities focus on processes for governing intelligent action so that “interdependent [actors/entities] are able to act as if they can predict each other’s action” to ensure continuity in customer interactions across space and time (Srikanth and Puranam 2014, p. 1253). Managerial control and codified routines are typical levers that firms use to guide coordinated action; managerial control involves supervision, feedback, and incentives; and codified routines involve explicit service scripts, best practices/norms, and deviation control (Feldman and Pentland 2003; Kogut and Zander 1996). Coordination failures are breaches of intelligent action, that is, action that is not informed by collective and local intelligence to anticipate the next step in the customer journey (Srikanth and Puranam 2014).
Examining predominantly human versus predominantly automated instances of coordination capability is a useful way to assess these contrasting approaches and study their prototypical realizations in practice. Human-dominated coordination capabilities rely on human skills and improvisation to anticipate interdependence and execute intelligent action (Lages and Piercy 2012; Marinova et al. 2017). Human coordination capability, exemplified by Breidbach, Antons, and Salge’s (2016) illustration of “service orchestrators,” is well suited to human-centered service systems in which emergent “human behavior, human cognition, human emotion and human needs” are prominent (Maglio, Kwan, and Spohrer 2015, p. 2) and the “promise of seamless service remains elusive” (Breidbach, Antons, and Salge 2016, p. 458). In the context of a 700-bed German hospital, the service orchestrators in Breidbach, Antons, and Salge’s (2016) study are patient case managers who work outside the formal relationship between hospitals and patients during hospital stays and subsequent recoveries. Service orchestrators facilitate intelligent action by serving as single points of contact on behalf of patients to orchestrate action across multiple hospital interfaces (e.g., nurses, physicians, pharmacies, rehabilitation departments, and billing). That is, service orchestrators compensate for coordination failures that are endemic in an “increasingly efficiency-driven health care system” by infusing a human coordination system (Breidbach, Antons, and Salge 2016, p. 461). A key insight from this analysis is that local intelligence about an individual patient’s emergent conditions is a crucial input to intelligent action for human-centered service experiences. Service orchestrators do not follow standard scripts or best-practice protocols.
Rather, they attend to patients’ specific (physical/psychological/physiological) conditions at given points in time (local intelligence) and use their access and knowledge to (re)direct next steps in patients’ journeys according to patients’ present states. The work of service orchestrators is therefore conditional, improvisational, and novel. In this sense, human coordination, as exemplified by service orchestrators, enables what Salvato and Vassolo (2018) conceptualize as dynamic coordination capability: the capability for intelligent action based on rapid sensing of emergent “environments” (e.g., patient journeys) and reconfiguring or realigning existing resources (e.g., intelligence) for effective anticipation and response.
In contrast, automated coordination is typically rooted in collective intelligence and executed using algorithms to ensure intelligent action (Ivanov, Webster, and Berezina 2017; Lariviere et al. 2017). For example, RoboHotel developed by Henn Na Hotels 8 (Ivanov, Webster, and Berezina 2017) experimented with automated coordination to achieve efficiency-centered service systems in which quality is standardized, consistency is an objective, and productivity bestows a competitive advantage (Grönroos and Ojasalo 2004; Rust and Huang 2012). Service robots that are autonomous (e.g., self-agency), mobile (e.g., self-powered), sensing (e.g., self-sensing), and action taking (e.g., goal-directed self-acting) were central to RoboHotel’s experiment (Barrett et al. 2015; Chen and Hu 2013). Connected by a cloud computing system, multiple service robots at different touchpoints along the customer journey were auto-coordinated for efficient, intelligent action. 9 Other hotel chains have also experimented with robot use; for example, Aloft is testing a room delivery robot by Savioke, and Hilton has launched Connie, a robotic concierge (Ivanov, Webster, and Berezina 2017).
A key insight from automated coordination is that AI and computational algorithms can effectively connect numerous decentralized service robots to anticipate hotel guests’ journeys and execute intelligent action. Such coordination is effective when patterns of customer behaviors are readily identifiable and automated interactions are effective substitutes for human touchpoints. Automated coordination combined with collective intelligence provides a highly efficient approach to achieving consistency and productivity in service interaction sequences and ensuring harmonious customer interactions. More broadly, human and automated coordination lie on a continuum: automated coordination leans toward stability and efficiency, and human coordination leans toward flexibility and effectiveness. Nevertheless, human and automated coordination capabilities may complement each other. Human coordination permits intelligent action in response to emergent conditions, and such action may be captured and integrated with current explicit intelligence to improve automated coordination capabilities via new routines and service scripts. Input from our field interviews substantiates the challenges and significance of coordinating intelligent action. The president and country director of a retail merchandizing supplier noted coordination gaps and needs in practice:

“Coordination is planned and needed for seamless CX [customer experience]…. The key challenge is to instill a true, metrics-based customer-focused culture across the organization and functions. We are developing a one-voice strategy, and have some [initial] rule-based coordination…. But there are challenges including operational execution challenges. Other challenges include system/data access, having influence/access to systems/ecosystem, especially when third-party systems are involved. The reality is that companies still work in silos like sales and marketing, and they don’t integrate their systems or processes, which causes a lot of waste. Coordination, where machine and humans engage at key times, is the challenge for all of us now: how do we find the customer’s purpose and align everything we do with that?”
One-Voice Strategy: Configurations of Intelligence Generation/Use Capabilities
Whereas learning and coordination capabilities are fundamental to intelligence generation and use, a one-voice strategy requires jointly configuring these fundamental capabilities for effective customer engagement. Accordingly, we develop a 2 × 2 framework by intersecting learning and coordination capabilities to guide research and practice pertaining to a one-voice strategy (see Figure 2). Our framework does not include an exhaustive set of feasible configurations; rather, in accordance with configurational theory (Meyer, Tsui, and Hinings 1993), it focuses on prototypical combinations to conceptualize conditions of fitness—contextual and environmental features that favor a particular configuration—and equifinality—mechanisms that permit different combinations to be equally effective in terms of desired outcomes. Notions of fitness and equifinality are useful design parameters for the one-voice strategy. Fitness helps us understand contingencies that render some configurations more likely to fit an environmental context than others, whereas equifinality helps narrow design tasks to alternative configurations that may be equally effective but differ in their organizing logic. As we show in the next section, our framework offers fertile ground for theoretical advances and empirical research in understanding the challenges of the one-voice strategy.

Figure 2. One-voice strategy as configurations of human/automated capabilities for intelligence generation and use.
Figure 2 displays two broad types of configurations: consistent configurations that lie along the diagonal, where learning and coordination capabilities are configured to be intuitively consistent (e.g., human-human, automated-automated), and inconsistent configurations that lie along the off-diagonal, where learning and coordination capabilities are configured to be inconsistent (e.g., human-automated, automated-human). Consistent configurations offer design choices with predictable fitness, whereas inconsistent configurations provide equifinal alternatives with unpredictable fitness that results from their novelty. We illustrate both types with examples drawn from varied industries and market contexts.
Consistent Configurations, Predictable Fitness
Conventional research, along with intuitive prediction, shows that the human learning–human coordination configuration is favored for fitness in human-centered service systems (Quadrant 1; Figure 2). Conversely, the automated learning–automated coordination configuration is favored for fitness in efficiency-centered service systems (Quadrant 3; Figure 2). We elaborate on our illustrative examples of Zappos and T-Mobile (Quadrant 1) and RoboHotel and Tesla (Quadrant 3) to develop the intuition of predictable fitness.
At Zappos, CLT members, who interact on the front lines with customers to provide personalized attention for human learning (as previously discussed), are also service orchestrators in the sense of Breidbach, Antons, and Salge’s (2016) description of patient case managers; they work in a self-managing system of circles (teams) and lead links (coordinators) to coordinate action based on emergent needs of customers whose loyalty they seek to win. Whereas service orchestrators work outside the formal service system to coordinate patients’ journeys, Zappos’s CLT members work within the formal service system to coordinate an effective customer experience and a seamless purchase journey. Gaining local intelligence in personal interactions with customers is intuitively compatible with using novel local intelligence to coordinate the action that such intelligence demands. When barriers between generating local intelligence and executing intelligent action are removed, human learning and human coordination are harmonious for effective customer engagement. Zappos’s one-voice strategy strives to remove intelligence-action barriers by rejecting the “traditional command and control structure” for a “decentralized management where decision-making responsibilities are distributed throughout self-organizing teams” (Howarth 2018). As a result, CLT members are empowered to attend to local intelligence in customer interactions and coordinate autonomously for intelligent action in pursuit of customer loyalty.
Although the human learning–human coordination configuration is intuitively well suited to fitness in human-centered service contexts, such fitness is not guaranteed in practice. The effectiveness of the human learning–human coordination configuration depends on the level of trust and the nature of relationships among frontline employees. Strong relationships allow for greater levels of information sharing and more productive dialogue among employees “aimed at promoting change in the context of conflicting viewpoints and motivations” (Berkovich 2014; Salvato and Vassolo 2018, p. 1728). For example, T-Mobile reorganized its customer service to enhance frontline employee relationships (Dixon 2018). Service employees who cater to customers in a geographical market form a team, sit together (in shared “pod” spaces), and collaborate openly to resolve customer issues. Information sharing and dialogue take place both off-line (teams hold stand-up meetings three times a week to share best practices, lessons learned, and ideas for handling customer concerns) and online (team members collaborate in real time using an instant-messaging platform). Such dialogue and information sharing also allow managers and employees to act together to realign assets based on the local intelligence. Lack of internal cohesion (the extent to which unit members are attracted and committed to one another) is likely to make it difficult to develop dynamic routines from human learning and inhibit the coordination of future customer interactions.
Similarly, but in the opposite quadrant (Quadrant 3), an automated learning–automated coordination configuration is favored for fitness in an efficiency-centered service context. In the case of RoboHotel, an automated coordination mechanism was effective for linking together decentralized robots that digitize the interaction data at each touchpoint. When combined with computational learning, automated learning systems help extract collective intelligence from these customer interaction data. Such an automated learning–automated coordination configuration assumes particular salience when customers’ journeys can be digitized and the significance of highly contextualized tacit knowledge is limited. However, recent events at RoboHotel, which resulted in the decommissioning of many robots (https://www.wsj.com/articles/robot-hotel-loses-love-for-robots-11547484628), show what can go amiss when these underlying assumptions are invalidated. Current robotic technologies assume a degree of consistency and predictability of consumer behavior to permit their explicit modeling. However, human responses are flexible; they easily depart from past patterns and seek variety and new patterns. In the case of RoboHotel, hotel guests were intrigued by the main concierge robot’s ability to answer questions; they expanded their interactions by asking more varied and increasingly complex queries that required higher levels of contextual knowledge than were explicitly modeled. To address guests’ frustration with robots that gave unhelpful responses, RoboHotel resorted to human interventions, which led to a loss of efficiency and the eventual withdrawal of the robots.
The effectiveness of an automated learning–automated coordination configuration as a one-voice strategy thus is conditional on capture (digitization), flow (distribution), and processing (deployment) of customer interaction data gathered from diverse interfaces. For example, Tesla developed the capability to learn its cars’ performance autonomously and take corrective action as needed. In 2016, Tesla began to produce vehicles (Model S and Model X) equipped with battery packs built to have 75 kWh of capacity but constrained by software to have access to only 60–70 kWh. 10 In 2017, when Hurricane Irma hit Florida, Tesla remotely enabled a free software upgrade for vehicles in the path of the storm that would allow them to gain as much as 40 extra miles of range by using full battery capacity. This was done without any action on the part of customers or the company’s frontline employees; Tesla’s internal systems were able to monitor and learn from weather forecasts about the temporary need to enhance the range and autonomously deliver updated software to its cars using cellular connections built into each vehicle. Often, however, devices with different interfaces do not “speak” to each other, data are limited by inadequate capture, or available data are inadequately mined for insights to guide intelligent action. Few organizations have mastered the technological and computational challenges of providing frictionless, well-functioning automated service systems.
Inconsistent Configurations, Equifinal Possibilities
Although configurations that combine human and automated capabilities are unconventional, they align with the notion of functional duality. Theory and research contrast the features of human and automated capabilities in various terms, such as high touch versus high tech or flexibility versus stability, which are relevant for substantiating the conventional wisdom of trade-offs: Substituting automated for human capabilities involves trade-offs that favor process/cost efficiency over interactional/customer effectiveness (and vice versa). However, paradox studies propose an alternative combination of contrasts in dualistic configurations (Schad et al. 2016). In this view, contradictory, “even mutually exclusive” elements, can “exist simultaneously and persist over time” in a functional duality (Smith and Lewis 2011, p. 382). 11 Expanding on this assertion, we posit that inconsistent configurations of human and automated capabilities (Figure 2; Quadrants 2 and 4) may transcend their internal contrasts to yield functional design possibilities. Inconsistent configurations are novel combinations because convention or intuition does not anticipate them. Novel combinations provide alternative design choices that are equifinal (equally effective) with those suggested by fitness intuitions, thereby increasing the degrees of freedom of the one-voice strategy.
Human learning and automated coordination (Figure 2; Quadrant 2), exemplified by Recreational Equipment Inc. (REI), 12 is a functional combination that is likely equifinal with its adjacent human learning–human coordination quadrant (Quadrant 1). This combination is especially functional in situations that enable rapid conversion between local and collective intelligence, thereby augmenting the human capability to personalize customer interactions with automated capabilities to coordinate intelligent action across multiple interfaces. The core customers of REI are outdoor and fitness enthusiasts who use wearable devices to track fitness routines, access web sources to plan outdoor activities, and seek technical resources to keep up with latest technology in outdoor gear, among other outdoor activities. Known for personalized in-store attention from knowledgeable frontline agents (selected on the basis of their outdoor experience), REI aims to extend personalization to its digital experiences to provide customers with seamless “brand experience throughout the customer journey” and control over digital content and devices (https://www.retailcustomerexperience.com/articles/sxsw-spotlight-best-practices-from-nordstrom-rei-for-mobile-retail-integration/). By using a flexible proprietary app to channel its customers to curated and certified content of interest to outdoor enthusiasts and create communities of users around common interests, REI deploys automated coordination tools to understand “who, what, where, when and how consumers meet our brand…[so] we can shape the journey they take” (https://theblog.adobe.com/6-ways-rei-shapes-digital-consumer-experience/). Using the REI app, customers not only identify specific frontline employees in their local stores for help but also call them even when they are in a different location (store). 
As a result, frontline agents gain considerable local intelligence about individual customers’ emergent needs and in-process journeys, while automated coordination directs frontline agent action according to collective intelligence generated by user patterns. According to industry reports, 75% of REI’s in-store purchases are preceded by visits to the company’s digital properties in the previous 7 days (https://www.mytotalretail.com/article/rei-maps-out-its-digital-journey/all/). In REI’s case, automated coordination enhances human learning by making personal interactions feasible at critical points in the customer journey and arming frontline agents with customer data that are otherwise difficult to access.
The novelty of combining human learning and automated coordination is that automated coordination permits connection between the off-line and online worlds of the customer journey. In this connection, collective intelligence enriches local intelligence, and in turn, local intelligence contributes to collective intelligence. As a result, automated coordination uses collective intelligence dynamically to decipher and coordinate new paths for the customer journey. Companies can use insights from collective intelligence to customize mobile apps for individual customers according to past interactions and current local intelligence—for example, they can offer elegant combinations of “buy now” options via mobile and “find this in a store near you,” using real-time inventory.
In practice, executing a functional human-learning and automated coordination one-voice strategy is challenging. Frontline agents must be versatile in interacting with machines (absorbing collective intelligence) and humans (attending to local intelligence); they must possess cognitive skills to integrate collective and local intelligence for a functional combination. We know little about mechanisms for nurturing and developing such dexterity. Also, automation technology must be robust to interface with a range of customer devices, without imposing constraints or undue time/effort. It also must be capable of collecting a variety of online data and provide real-time analytics that generate useful insights for frontline use. Current research is rich in detailing technical features of automation technologies but lean in understanding when and how they generate useful collective intelligence from customer interactions.
Automated learning and human coordination (Figure 2; Quadrant 4) is another unconventional combination that offers fitness possibilities in contexts that capture abundant customer journey data but require human judgment to guide customer response, as is most evident in the digitization of health care delivery. For example, Arden Syntax (Hripcsak, Wigertz, and Clayton 2018; Seitinger et al. 2018) is a clinical decision support system that uses digital representations of structured medical knowledge in a way that is extensive (e.g., complex logic), nuanced (e.g., conditional trees), dynamic (e.g., easily updated), verifiable (e.g., physician-tested), and accessible (e.g., searchable, interactive). Computational learning and AI technologies enable the Arden Syntax to be organized as a “service-oriented architecture” that is “well integrated into routine clinical workflows” to provide “patient-specific” insights that help “improve the quality of clinical practice and contribute to patient safety” (Seitinger et al. 2016, p. 8). Digital and wearable devices allow the digital service architecture (powered by Arden Syntax) to secure abundant data for individual patients dynamically and analyze them in real time to decode patient treatment responses and predict their trajectories in relation to protocol and practice guidelines for various conditions. Few humans have comparable capabilities for processing such massive data and learning insights in real time. However, whereas medical data and knowledge are accurately structured in digital service architecture, medical judgment is not easily automated. Physician review of automated learning to coordinate next steps in an individual patient’s treatment journey is needed to ensure care efficacy and patient well-being.
When the underlying context (often health care) potentially involves emotive responses from customers, human coordination becomes critical to ensure that insights from automated learning are tempered by emergent and local intelligence (not currently coded or digitized). For example, the First Response Digital Pregnancy Test not only confirms (or disconfirms) pregnancy but also uses a mobile app to communicate that information to a data center that can provide a host of tailored information for customers (including a local doctor referral list). However, given the emotive nature of the context, further coordination necessarily involves humans and implies the limits of automated learning and coordination.
Execution challenges and nascent knowledge can undermine the payoffs from the novelty of counterintuitive combinations. In theory, real-time insights from automated learning can improve human judgment when collective intelligence makes sense of local intelligence to uncover interdependencies among different points on the customer journey, as demonstrated by retailers’ use of trackers, sensors, and AI. For example, Neiman Marcus, Kroger, and Ralph Lauren use a wide range of in-store tracking technologies to learn more about their customers’ preferences, behaviors, and experiences and track the status of on-shelf products (http://customerthink.com/kroger-ralph-lauren-and-the-location-of-things-can-ai-humanize-the-employee-experience/). This learning is communicated to store employees who can combine collective intelligence with local intelligence to guide their interactions with customers and enhance customer experience in the moment. As the relative importance of real-time data in personalizing the customer journey increases, so does the value of human coordination of automated learning insights. However, human agency requires mastery of computational learning approaches to decide when collective intelligence is given priority and when it can be passed over in the face of deviating local intelligence. Such mastery is currently not common. Moreover, the massive amount of individual-level data from personal and wearable devices will challenge current knowledge of collective and local intelligence and their interrelationship.
Discussion and Implications
This article’s primary contribution is a conceptual framework of one-voice strategy for securing customer engagement. The framework theorizes an SIS and identifies the conditions and configurations under which learning and coordination capabilities for intelligence generation/use contribute to a one-voice strategy for effective customer engagement. Our motivation is that the rise of machine-age technologies presents a mixed bag of promises and challenges for service firms. On the one hand, these technologies hold promise for enhancing customer engagement by increasing the diversity and convenience of powerful service interfaces for flexible, customized, and intelligent customer-firm interactions. On the other hand, the same technologies challenge service firms’ capabilities because it is increasingly evident that customer engagement is less an outcome of any single interaction than a result of harmonious, seamless, and reliable interactions throughout the customer journey, which are achieved by judiciously mixing and matching human and machine capabilities. We next discuss the key questions and issues that ensue from our conceptual and theoretical framework to guide future theory and practice, as outlined in Table 3.
Table 3. Research Issues and Questions Based on the Proposed Conceptual Framework.
Note. SIS = service interaction space; AI = artificial intelligence.
Implications of the SIS Framework
Our conceptualization of SIS has important implications for systematic analyses of the ever-expanding set of possibilities that firms face while interacting with their customers and developing deeper understanding of how, when, and why service interfaces facilitate customer-firm interactions. Three associated sets of research issues/questions emerge.
SIS characteristics and interactions
We must carefully examine the characteristics or features of service interfaces to evaluate their impact on interactions. Prior studies (Hoffman and Novak 2017; Huang and Rust 2018; Meuter et al. 2000; Patrício, Fisk, and Cunha 2008; Wünderlich, Wangenheim, and Bitner 2012) have identified some characteristics; yet, our SIS conceptualization, which builds on the human-machine continuum, indicates a more systematic approach for studying service interfaces. A key question for further research asks, what are important features and functionalities of service interfaces, and how do they shape the nature and effectiveness of customer interactions? As a starting point, we propose five dimensions for consideration: (1) cost, that is, marginal cost of a particular interaction to the customer and to the firm; (2) speed, that is, time required to complete a particular interaction in the customer journey; (3) quality, that is, quality of the customer experience in a particular interaction; (4) agency, that is, ability of the customer to control the interaction; and (5) affect, that is, firm’s capacity to detect and display emotion in a particular interaction. These five dimensions offer an initial basis for conceptualizing the different aspects and features of SIS. Other features may include interaction frequency, depth, and number of participants.
Another set of issues relates to how well such features mitigate the frictions associated with customer-firm interactions. For example, do features such as social functionality mitigate problems related to geographical or cultural distance and the extent of intimacy between customers and firms in their interactions? Similarly, do certain features facilitate more affective and emotional displays (on the part of both humans and machines) that enhance the effectiveness of service performance? Understanding the effect of specific features on important metrics associated with service interactions would facilitate better selection of service interfaces.
SIS trade-offs
Whereas the study of SIS characteristics is useful to describe service interfaces, an important question concerns trade-offs in the portfolio of a firm’s service interfaces. Specifically, how should firms make trade-offs (e.g., quality/convenience, complexity/cost) when they choose a portfolio of service interfaces to deploy? It is costly to deploy unlimited service portfolios to serve customers anytime, anywhere, on any device; firms need to take into consideration the particular customer segment(s) they intend to target. Which metrics should firms use to prioritize their investments in SISs for different target customer segments? Such SIS decisions are rarely made in a vacuum; it is important to align firms’ varied decisions and actions with other marketing functions. For example, the greater the extent of customer value cocreation, the greater the customer engagement and loyalty that firms will experience (e.g., Cossío-Silva et al. 2016; Jaakkola and Alexander 2014). Given that different service interfaces afford different levels of customer value cocreation, how should firms align SIS decisions with their goals related to customer value cocreation? Further, SISs are constantly evolving as new devices and new service interfaces continue to emerge at different points in the space; so, it is important for firms to make dynamic trade-offs as part of their marketing strategies. Trade-offs that worked in years past may be ill suited for changing times.
SIS and firm competitiveness
Our study also implies that firms’ SIS choices may shape their overall market competitiveness. For example, certain parts of their SISs may be sparsely populated (e.g., few available service interfaces), thereby discouraging firms from deploying those interfaces (e.g., due to cost and/or complexity). Would firms’ first-mover efforts and early presence in sparsely populated parts of their SISs enhance their market competitiveness? Firms’ SIS coverage decisions also may be predicated on specific industry characteristics. For example, the banking and hospitality sectors have taken leads in deploying robotic interfaces in customer service. Which industries and/or firms are likely to push into new SIS frontiers for competitive advantage? Insights on such issues could inform individual firms’ SIS decisions for their particular markets and industries.
Implications of the Intelligence Generation/Use Capabilities Framework
Another important set of research questions follows from our conceptual linking of learning and coordination capabilities for intelligence generation and use in service interactions.
Orchestrating effective customer journeys
A shift in focus from individual customer-firm interactions to the customer journey reveals several challenges that firms need to overcome. First, how does the customer journey perspective shape firms’ decisions about service interfaces? Second, what are the various interdependencies that exist among different service interfaces or different parts of SISs (journey touchpoints), and how do they shape customer engagement? Such questions could focus attention on issues related to the openness of technological architectures that underlie service interfaces, data sharing, and privacy policies adopted by firms (that own or operate interfaces) and the data-sharing controls exercised by customers. A consideration of these issues warrants an ecosystem perspective to acknowledge the varied entities that comprise the service ecosystem (e.g., Lusch and Nambisan 2015). Different types of interdependencies may have different impacts on the quality of customer-firm interactions and customer engagement. Deciphering the relative significance of different types of interdependencies could inform firms’ decisions related to the selection of service interfaces.
Learning capability
Firms’ ability to address the challenges related to interaction interdependencies may be conditional on their capability to learn from past interactions to generate local and collective intelligence. We discuss how humans and automated technologies generate local and collective intelligence, albeit in different ways. Yet an understanding of the conditions that enhance (or diminish) firms’ abilities to learn in different service contexts is lacking. What contextual- and individual-level factors facilitate the extent of human and automated learning? How can firms create favorable conditions for human learning? How can they leverage emerging AI techniques to enhance their capabilities for automated learning? And what are the complementary firm-level resources and conditions that amplify the learning potential from AI techniques? These questions assume significance as local and collective intelligence become central to filling in gaps about individual customer needs and journey points.
Coordination capability
Our discussion conceives of a coordination continuum, anchored by human and automated capabilities, that suggests two contrasting contexts for intelligent action: an emergent service context that calls for flexibility and dynamic capabilities and a predictable and stable service context that calls for efficiency. Several related questions arise for future research. How should firms evaluate the relative appropriateness of human and automated coordination of the customer journey? In other words, what factors determine the emergent or stable nature of the service context? The salience of affect and emotional displays in service contexts may imply the limitations of automated coordination and the need for more human intervention. Similarly, what customer-, service-, and market-related factors determine the effectiveness of firms’ coordination of customer journeys?
Implications of the One-Voice Strategy Framework
Our conceptualization of the one-voice strategy as a firm’s actions, communications, and exchanges to deliver a seamless, harmonious, and reliable stream of interactions for effective customer engagement, and the associated configurations of intelligence generation and use, implies another important set of issues for research (see Table 3).
Contextual salience of configurations
We propose that human learning–human coordination and automated learning–automated coordination configurations are more appropriate for human- and efficiency-centered service contexts, respectively. Beyond these contexts, a more nuanced understanding of different configurations requires a more detailed examination of the internal and external factors of contextual relevance. For example, which industry-/market-related or product-/service-related factors favor the human-centered approach over an efficiency-centered approach (or vice versa)? Similarly, which industry/market or service context aspects inform the appropriateness or superiority of automated coordination of interactions over human coordination (or vice versa), when both are informed by human learning? Which industry-/market- or service-related factors indicate the salience of insights acquired through human learning when the coordination task is automated? These and other questions related to configurational fit form avenues for research. Prior categorizations of service contexts—both service act and service recipient (e.g., Lovelock 1983)—may prove beneficial in developing and validating more generalized frameworks that address these questions. More broadly, these issues imply that insights from role theory literature could be applied to gain a deeper understanding of customers’ role expectations with regard to frontline agents (both humans and machines); perhaps, a “machine role theory” could be developed to study how machines can be effectively deployed in service interactions.
Firm mechanisms of configurational fit
The four configurations also imply the need to design firm mechanisms that enable realization of configurational fit. For example, in the case of REI, the human learning–automated coordination configuration illustrates an opportunity to make connections between the off-line and online worlds of customer journeys; local intelligence gathered by human agents needs to be rapidly converted into collective intelligence and acted upon by automated technologies to chart the next steps of a customer’s journey. Similarly, in a reverse configuration, collective intelligence from the diversity of interfaces that constitute the online world needs to be integrated at the single point of contact of the frontline agent (human) for coordination in the off-line world. Both configurations imply the following research question: Which individual-, group-, and firm-level mechanisms are crucial in the rapid conversion of local intelligence into collective intelligence for use in automated coordination?
Facilitating conditions of configurational fit
Beyond firm mechanisms, the characteristics of the service context assume significance in enabling configurational fit. Our discussion highlights some of these contextual attributes, such as the level of trust and the nature of relationships among human agents. Accordingly, a key research question asks: What contextual factors—both frontline-agent related and technology related—are salient, and how do they moderate the effectiveness of individual configurations? Technological advances and applications are key to expanding the possibilities and potential of configurational fit. The managerial challenge is to orchestrate a MarTech mix for each interaction at each touchpoint in a customer’s journey and across all customers in a way that provides fluid use of human or automated coordination and learning configurations based on fit with the context. Finally, the disparate configurations imply a broader question: How and when should firms mix and match them to design effective customer journeys in a specific service context? Such a line of inquiry would require bringing together the key elements of all three frameworks proposed in this article—SIS, intelligence generation/use capabilities, and one-voice strategy—and developing models that incorporate both mediating mechanisms and moderating contextual factors.
Concluding Notes
Orchestrating a one-voice strategy in a stream of interactions involving diverse interfaces across a customer’s journey is a compelling but challenging source of competitive advantage for service firms. Current research and practice show that machine-age technologies raise the competitive edge available from intelligence generation/use capabilities while intensifying the challenges of achieving a one-voice strategy advantage. Our study provides a conceptual framework that can serve as a useful starting point to guide future research and practice. We hope the concepts of SIS, intelligence generation/use capabilities, and one-voice strategy, as well as the conceptual framework that integrates them, will help advance the understanding of causal mechanisms and consequences associated with the dynamics of customer-firm interactions for effective customer engagement.
Supplemental Material
Supplemental Material, JSR-ASI-ONE_VOICE_STRATEGY-EXEC_SUMAMRY-FINAL for One-Voice Strategy for Customer Engagement by Jagdip Singh, Satish Nambisan, R. Gary Bridge and Jürgen Kai-Uwe Brock in Journal of Service Research
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Supplemental Material
Supplemental material for this article is available online.