Abstract
Seeking to enhance our understanding of organizational knowledge creation in multimodal polysynchronous contexts, this paper empirically explores a project team, within a UK-based international company, concerned with the development of new software. Our aim is to extend current dialogical approaches to organizational knowledge creation, largely developed in the context of face-to-face communication, into virtual contexts of communication. Through close analysis of the ICT-mediated dialogical interactions between the members of a project team and the occasional face-to-face interactions between certain members of the project team and other organizational members, we show how knowledge creation emerges via three core dialogical processes: dialogues with real others, quasi-dialogues with invisible others, and quasi-dialogues with virtual artifacts. Exploring these processes in more depth, we further argue that the dialogical processes at hand are crucially shaped by team members actively working with the materiality of technology used, which enables them to: (a) mobilize multiple task-related voices when simultaneously interacting in multiple contexts; (b) alter the boundaries of communication to suit the demands of the task at hand; and (c) textualize the ongoing experience of interaction with others and artifacts.
Introduction
The dialogical or conversational basis of knowledge creation in organizations has been well recognized in the relevant literature (Carlile, 2004; Hargadon & Fanelli, 2002; Majchrzak, More, & Faraj, 2012; Nonaka & von Krogh, 2009; Nonaka, von Krogh, & Voelpel, 2006; Obstfeld, 2012; Tsoukas, 2009a). The influential work of Nonaka and his associates underscores the importance of ‘dialogue’ (Nonaka & Toyama, 2007, pp. 20–1) and ‘conversational interaction’ (von Krogh, Erat, & Macus, 2000, p. 129) throughout the knowledge creation spiral (especially at the concept creation phase). Although the notion of dialogue is not theoretically developed in the work of Nonaka, dialogical interactions have received attention in subsequent research. Tsoukas (2009a) has outlined a dialogical theory of knowledge creation, in which the latter, defined as the making of new distinctions, arises from participants engaging in dialogical interactions, in the context of dealing with an organizational task. In more recent empirical research, the importance of dialogue for knowledge creation has been confirmed (Majchrzak et al., 2012).
Although this research has generated significant insights, some of its assumptions are questionable. The most important, often unstated, assumption is that dialogue occurs primarily in face-to-face contexts. For Nonaka this is typically assumed to be the case and, sometimes, it is explicitly stated (Nonaka, 1999, p. 69). Tsoukas (2009a) admits that his model is grounded on face-to-face dialogical interactions, although, in his suggestions for further research, he argues for attention to be paid to knowledge creation in virtual environments (Tsoukas, 2009a, pp. 942, 954). Majchrzak et al.’s (2012) cross-functional teams mostly operate through face-to-face dialogues, although one of the teams met virtually. However, the medium of communication does not feature in Majchrzak et al.’s (2012) account and, therefore, its impact remains unexplored. The researchers report their findings, while tacitly assuming that the communication medium was of no major importance.
The widespread use of information and communication technologies (ICTs) and the increasing virtualization of the workplace show that the aforementioned assumption is unwarranted. Increasingly, members of the same or of different organizations work interdependently in purely virtual or hybrid contexts, in which individuals are geographically and/or temporally dispersed and communicate via email, videoconferencing, teleconferencing, and several other means of virtual communication (Bailey, Leonardi, & Barley, 2012; Berry, 2011; Dixon & Panteli, 2010; Hinds & Kiesler, 2002; Sarker, Ahuja, Sarker, & Kirkeby, 2011). Moreover, in contemporary workplaces, it is not simply a question of face-to-face or virtual communication, but of both: organizational members may work with both types of media simultaneously, or at least their use of them is often interwoven.
Does the medium of communication matter for knowledge creation? There is increasing evidence that it does. Studies of ICTs, especially in organizational contexts, suggest that ICTs do not merely act as a substitute for face-to-face communication, but distinctly shape communication, thus enabling new kinds of interactions to take place (Berente, Hansen, Pike, & Bateman, 2011; Dennis, Fuller, & Valacich, 2008; Hinds & Bailey, 2003; Nicolini, 2006). Research to date has shown that, in ICT-mediated interaction, communicative boundaries expand considerably, thus extending the notion of co-presence, enabling multiple, often unobtrusive, types of communication in parallel (Dennis, Rennecker, & Hansen, 2010; Sarker & Sahay, 2003), and reconstructing well-known features of face-to-face communication, such as the reflexive monitoring of action (Giddens, 1990; Dennis et al., 2008). Moreover, communicative boundaries reshape communication patterns and the processes through which collaboration among virtual team members is achieved (Sarker & Sahay, 2003), and enable joint processing of virtual artifacts, made possible by the latter’s open-endedness (Kallinikos, Aaltonen, & Marton, 2013, pp. 358–61; Stigliani & Ravasi, 2012, p. 1234).
Those few studies that have explicitly focused on knowledge creation in ICT-mediated settings have shown that the latter facilitate user participation and the bringing together of multiple perspectives (Alavi & Tiwana, 2002; Lee & Cole, 2003; Lichtenstein & Parker, 2006; Markus, Manville, & Agres, 2000; Morner & von Krogh, 2009; Wickramasinghe & Lichtenstein, 2006). The insights these studies have generated call for further theorizing. How is extended interaction achieved through both face-to-face and virtual communication media, and with what effects? How do ideas develop when expressed through ICT-enabled dialogical interactions? How is the open-endedness of virtual artifacts (Boland, Newman, & Pentland, 2010; Dougherty & Dunne, 2012; Kallinikos et al., 2013; Yoo, Boland, Lyytinen, & Majchrzak, 2012) conducive to organizational knowledge creation? To address these questions we need to bring together insights from two distinct (and so far independently developed) literature streams: one on dialogical knowledge creation (which, as noted earlier, largely assumes face-to-face communication) and one on virtual communication (which, to a large extent, does not deal with knowledge creation). This is what we attempt to do in this paper.
More specifically, we aim to address the following question: How do virtual organizational settings shape the dialogical interactions that lead to knowledge creation? Conceptually, we adopt Tsoukas’s (2009a) dialogical approach to knowledge creation and seek to extend it to virtual settings. Empirically, we address our research question through a detailed ethnographic study of the work of a project team, charged with developing innovative software, geographically dispersed, and part of a UK-based international firm. Communication between team members is multimodal and polysynchronous. It is multimodal, insofar as several modes of virtual interaction between team members are interwoven with instances of the face-to-face interaction mode between certain team members and other organizational members. And it is polysynchronous, to the extent that different degrees of synchronicity are involved in the communication media used in the team (more about this below). By focusing on the ICT-mediated dialogical interactions between team members, as well as paying attention to the interactions certain team members have had with others, we show how knowledge creation emerges through three core dialogical processes: dialogues with real others, quasi-dialogues with invisible others, and quasi-dialogues with artifacts.
The paper is organized as follows. In the next section, we provide the conceptual background to our empirical study by discussing first the dialogical roots of organizational knowledge creation, second, the ICT-mediated contexts of communication, and third, dialogicality and virtuality by drawing on the work of Goffman. We then present the research design and findings of our ethnographic study. In a subsequent section, we further theorize our findings, followed by a more general discussion of issues related to organizational knowledge creation in multimodal polysynchronous settings. Finally, we conclude and suggest implications for further research.
Conceptual Background
To prepare the ground for our empirical study, in this section we bring together two hitherto separate streams of literature: dialogicality and knowledge creation on the one hand and virtuality and ICTs on the other. We further explore how insights about dialogicality and virtuality may be merged through drawing on the work of Goffman.
Dialogicality and knowledge creation
Drawing on Bell (1999) and Dewey (1934), Tsoukas (2009a, pp. 942–3) has defined the creation of new organizational knowledge as the drawing of new distinctions by individuals engaged in a task (see also Majchrzak et al., 2012; Obstfeld, 2012). Since new distinctions are likely to be made when existing knowledge is used, ‘knowledge use and knowledge creation cannot be easily separated. The interpretative use of an idea in a new context is itself a minor act of knowledge creation’ (Eraut, 1994, p. 54; for a similar remark see Calhoun & Starbuck, 2003, p. 477; Garud & Kumaraswamy, 2005, p. 9).
More specifically, building on earlier interpretive theories of organizational knowledge creation (Carlile, 2004; Cook & Brown, 1999; Hargadon & Fanelli, 2002; Nonaka et al., 2006; Orlikowski, 2002), and drawing on dialogical psychology (Hermans & Gieser, 2014; Markova, 2003a, 2003b; Shotter, 2011), Tsoukas (2009a) has explicitly focused on dialogical exchanges as the main process through which new distinctions emerge in face-to-face communication. What is important in productive dialogue, he argues, is that it enables participants to take a distance from their customary assumptions and understandings (what Tsoukas (2009a, p. 943) calls ‘self-distanciation’), and reconceptualize a situation at hand. Thus new distinctions emerge insofar as participants take a distance from their previously held views or assumptions. Reflexivity is the motor of such change.
Typically, a dialogical interaction involves the following three logical steps: individual A talks; individual B reciprocates; and individual A further talks (rearticulates his/her views), in light of the second interlocutor’s reply (Hermans & Hermans-Jansen, 2003; Markova, 2003a, 2003b; Tsoukas, 2009a). It should be noted that rearticulation (i.e. A’s response to B’s utterance) is a reflexive utterance, since it is made bearing in mind both participants’ previous utterances (i.e. A’s initial utterance and B’s response to it) (Markova, 2003a, p. 182; Tsoukas, 2009a, p. 944). This is important since, through each participant reflexively understanding his/her own utterances, prompted by the utterances of the other, participants may become aware of the taken-for-granted distinctions and assumptions they have been employing, thus enabling the making of new distinctions (Tsoukas, 2009a, p. 943).
New distinctions vary in terms of novelty and consequences. Some new distinctions may lead to new products (Garud, Gehman, & Kumaraswamy, 2011), while others lead to revamped organizational processes (Tsoukas, 2009a). For example, thanks to the new distinction ‘twisting stretch’, dialogically arrived at when a software developer who had served an apprenticeship with a master baker was prompted by a group of engineers, Matsushita was able to develop the first fully automated bread-making machine (Nonaka & Takeuchi, 1995, pp. 100–20; Tsoukas, 2011, p. 471). Majchrzak et al.’s (2012) study of knowledge integration-cum-creation in cross-functional teams has shown how new distinctions arose as a result of teams reframing an abstract representation (‘scaffold’). Those distinctions were new insofar as individuals addressed task demands in a novel way, concerning, for example, how space could be better used in an industrial design consultancy. Yet other distinctions, such as new methods of improving violin sound and tone, no matter how novel they may be at the time of creation, may take a long time (even centuries!) before they are recognized (see Cattani, Dunbar, & Shapira, 2013). When approaching knowledge creation dialogically, what matters is the making of new distinctions by organizational members while addressing particular tasks. How radical or consequential such distinctions are is, analytically, a different matter.
Although, in its paradigmatic form, dialogue is a face-to-face, language-based communication process, several authors influenced by the work of Bakhtin have noted that dialogue can be seen in broader terms (Lorino & Clot, 2011; Shotter, 2011; Wertsch, 1991). Volosinov (1986, p. 95) defines dialogue ‘in a broader sense, meaning not only direct, face-to-face, vocalized verbal communication between persons, but also verbal communication of any type whatsoever’. From this point of view, dialogicality is an essential feature of discourse, since the very capacity to think is grounded on otherness (Holquist, 2002, p. 18; Taylor, 1991a, p. 33). Indeed, the very use of language presupposes an ‘other’ whom one addresses and, in that sense, language use is necessarily dialogical. Although particular utterances are the unique products of language speakers, they are not ex nihilo constructions, since they inescapably draw on the utterances of others (Bakhtin, 1986, pp. 92–5). As academic researchers know all too well, when we write or speak about a certain topic, we are not the first ones to do so. We draw on others’ voices and perspectives, even if it is to critique them. The other inheres in one’s utterances.
If a broad view of dialogicality is adopted, particular utterances are seen not as mere individual expressions, but as interpersonal accomplishments. This is what Bakhtin brings to our attention with his notion of ‘hidden dialogicality’. He defines the latter as follows:
Imagine a dialogue of two persons in which the statements of the second speaker are omitted, but in such a way that the general sense is not at all violated. The second speaker is present invisibly, his words are not there, but deep traces left by these words have a determining influence on all the present and visible words of the first speaker. We sense that this is a conversation, although only one person is speaking. (Bakhtin, 1984, p. 197)
For example, an instance of hidden dialogicality is seen in the case of Cook and Yanow’s (1996, p. 442) flute maker who utters ‘it doesn’t feel right; it’s cranky’. It is as if the flute maker responds to a question put to him by an invisible other. This intra-mental hidden dialogue reflects earlier verbal exchanges that, most likely, occurred at the inter-mental level, namely between the flute maker and his supervisor or trainer (Wertsch, 1991, pp. 89–90).
If Bakhtin’s view of dialogicality is adopted, the dialogical partner is not necessarily a real other, but may well be an invisible other (e.g. an author) or an artifact (e.g. a drawing) (Holquist, 2002, p. 30; Tsoukas, 2009b). In such cases, we cannot speak of proper dialogue as such, since the other is not immediately available to reciprocate, but of quasi-dialogue. What is important to note, however, is that even quasi-dialogues preserve what is most distinctive in dialogicality, namely, sensitivity to otherness (Holquist, 2002, p. 41; Taylor, 1991b, pp. 310–14; Tsoukas, 2009a, p. 944). Quasi-dialogical exchanges ‘generate strangeness’ (Tsoukas, 2009a, p. 944), insofar as the individual tries to understand better his/her own earlier responses (not necessarily verbal) to invisible others or artifacts, which the individual then needs to assimilate, and by so doing he/she is stimulated to make new distinctions. As we will see later, in a virtual context of interaction, the capacity for making new distinctions is further enhanced by the affordances of virtual artifacts, insofar as they lend themselves to being editable, interactive and reprogrammable compared to physical entities (Kallinikos et al., 2013, pp. 358–60).
ICTs and virtuality
To better understand what ‘working virtually’ means, Bailey et al. (2012, p. 1485) have distinguished between ‘digitization’ and ‘virtuality’. ‘Digitization’, they note, involves ‘the creation of computer-based representations of physical phenomena’, while ‘virtuality’ involves ‘working with a [digital] representation of the physical rather than with the physical itself’ (Bailey et al., 2012, p. 1485). In virtual conditions, individuals both ‘operate with’ representations (they read and respond to emails, engage in teleconferences and take part in online chats) and ‘operate on’ them (they manipulate representations) (Bailey et al., 2012, p. 1487).
Virtual communication is marked by three interconnected features: (a) reduced referential function of language, (b) incompleteness and (c) the dialectic of presence and absence. Below we briefly describe each one of them:
Reduced referential function of language
Some virtual artifacts have no referent (namely, they do not ostensibly refer to something definite in the world), as, for example, sales statistics in a spreadsheet (i.e. the numbers do not point to concrete objects in the world) (Bailey et al., 2012, p. 1487; Turkle, 2009). When representations lack a referent, they may be ‘operated on’ in a way that is not limited by material constraints. When it is unclear what the referent is, work is required to establish its identity, as, for example, when it is unclear whether there is someone (and if so, who that is) on the other side of a collaborative software application. In such cases, the context that will enable the referential function of language to be activated must be established. ICTs that enable ‘presence awareness’ (i.e. make visible those who are available for communication) (Dennis et al., 2010, p. 849) facilitate the creation of such a context. When ‘silent interactivity’ (i.e. private communications that do not disrupt what is going on in a meeting) (Dennis et al., 2010, p. 849) is made possible by a particular ICT, the context of virtual communication is made more mutable, since what is going on is partly reconstituted by participants reacting to what is already going on (e.g. the use of instant messaging in a meeting).
Incompleteness
Virtual artifacts, insofar as they are objects ‘lack[ing] the plenitude and stability afforded by traditional items and devices’, are ‘incomplete’ (Kallinikos et al., 2013, pp. 357–8), ‘malleable’ (Yoo et al., 2012, p. 1399) or ‘underdetermined’ (Poster, 2001, p. 17) – namely, they ‘solicit social construction and cultural creation’ (p. 17). Virtual artifacts always have the capacity to become something other than what they currently are; they become actual through the operations people apply ‘on’ them (e.g. CAD drawings). As Poster (2001, p. 18) remarks, a virtual artifact ‘remains an invitation to a new imaginary’, thus being ‘editable’ and ‘reprogrammable’ (Kallinikos et al., 2013, pp. 358–9). Although the mutability of virtual artifacts is not entirely new, since all representations are partly susceptible to resignification (Taylor & Van Every, 2000), what makes ICT-mediated communication different is that it is far more immersive and open-ended (Dodgson, Gann, & Phillips, 2013; Kallinikos et al., 2013, p. 360).
The presence-absence dialectic
In contrast to communication in conditions of physical co-presence, in which time and space are linked through place, in conditions of virtuality time can be recombined with space at will; communication no longer presupposes sharing a ‘here and now’ (Giddens, 1991; Tsoukas, 1997). In virtual communication, social relations are lifted out from their local contexts of interaction and recombined across indefinite spans of time-space (Giddens, 1990, p. 21). Action at a distance thus becomes possible (Cooper & Law, 1995). The dialectic of presence and absence is crucial, since individuals, on the one hand, are present to one another through their communication in abstract space (i.e. they are present as ‘indices’), but, on the other hand, they are absent from one another, since they ‘lack a holistic sense of embodied interaction’ (Dreyfus, 2001, p. 58), occupy geographically different places, and may operate in different time zones (Hayles, 1999, pp. 247–50).
Particular types of ICTs have distinct features, affording users particular types of interactions, depending on the contexts and the purposes for which they are used (Leonardi, 2011, p. 153; Majchrzak & Markus, 2012; Yoo et al., 2012, p. 1399). Although technological features are independent of actors, the affordances of a particular technology are not, since they depend on how actors perceive those features. Affordances are relational: they are constituted by the way individuals relate to technology’s features (Gibson, 1986). As Leonardi (2011, p. 153) notes, people approach technology (and materiality at large) with diverse goals, thus perceiving technology as affording distinct possibilities of action (Carlile, Nicolini, Langley, & Tsoukas, 2013).
Since the project team we have studied made use of certain types of ICTs, we will explore here the features (and in the empirical section the associated affordances) that are relevant for these ICTs only. Drawing on Dennis et al. (2008, 2010) and Reinsch, Turner and Tinsley (2008), the use of email, teleconferencing and collaborative software (the three ICTs used in the project team under study) can be understood in terms of the following features: synchronicity, rehearsability and reprocessability. Each is briefly explained below.
Synchronicity is the extent to which a medium enables communication participants to communicate at the same time. Thus email-assisted communication can be made synchronous in various degrees, while teleconferencing and the use of collaborative software are high in synchronicity. Rehearsability is the extent to which a medium enables its user to rehearse or fine-tune the message before sending it. Both email and collaborative software technologies enable medium-to-high rehearsability, depending on the context of use, whereas teleconferencing is medium-to-low in rehearsability, since it approximates face-to-face conversation. Reprocessability is the extent to which the user is enabled to re-examine or reprocess the message, especially by relating it to other prior messages stored in some archive. Both email and collaborative software technologies are high in reprocessability, insofar as emails and traces of collaborative software-enabled work are archivable, thus providing the opportunity for participants to revisit earlier messages, work traces or drafts. In the case of teleconferencing, reprocessability depends on the availability of teleconferencing archives. The features of these ICTs are shown in summary form in Table 1.
Table 1. Features of ICTs.
A distinctive practice that is enabled by using email, teleconferencing and collaborative software in multimodal polysynchronous teams is that of ‘multicommunicating’ (Reinsch et al., 2008, p. 391), namely, the ability to engage in several overlapping, synchronous conversations at the same time. Virtual environments, especially those in which ‘silent interactivity’ is possible (Dennis et al., 2010, p. 849), enable multicommunicating as, for example, when a participant can have multiple conversations through teleconferencing, engage in chatting and send emails, all at the same time, with different people. How the above features of the ICTs at hand are used – how, that is, they are transformed into affordances – depends on the contexts in which they are used and the goals of participants in using them (Leonardi, 2011).
Dialogical interaction in virtual contexts: Insights from Goffman
Following Goffman (1959), more than being an exchange of utterances, dialogicality is a ‘performance’, namely a reciprocal activity carried out in the continuous presence of others, whom it seeks to influence in some way (Goffman, 1959, p. 32; see also Dennis et al., 2010, pp. 851–2). The part of an individual’s performance that is shared with others and occurs by following certain conventions and standards, in which the individual ‘expressively accentuates’ the image he/she seeks to project to others in a manner consistent with a role, is the ‘front region’ or ‘front stage’ of the performance (Goffman, 1959, pp. 110, 114). In the ‘front region’, both the performing individual and others are present. By contrast, the ‘back region’ or ‘back stage’ of the performance is the place where ‘the impression fostered by the performance is knowingly contradicted as a matter of course’ (Goffman, 1959, p. 114). In the ‘back region’ the performing individual is present, but others, for whom the performance is staged, are not, so the individual can step out of performance without fear of disrupting it. In the ‘back region’ individuals do not monitor their actions with the same degree of reflexivity as they do in the ‘front region’.
‘Back’ and ‘front regions’ are relative to one another. An example of a ‘front region’ is Orr’s (1996) photocopy technicians interacting with clients, seeking to project an image of competence and self-confidence. In the ‘front region’, actions are ‘accentuated’ and related to the image technicians project to clients, by whom they are directly observed. Actions that seem to be inconsistent with, or contradictory to, the projected image are suppressed and reserved for the ‘back region’ where others (in this case, clients) cannot intrude easily. In this example, the ‘back region’ relative to the ‘front region’ includes the ‘private’ conversations technicians have with one another, both at, and outside of, the site of a client, in which technicians discuss more openly the diagnostic challenges they face and the uncertainty the latter create for their work.
Virtual interaction critically reshapes the constitution of the ‘front’ and ‘back regions’ of interaction, and how they are related (Dennis et al., 2010). Since virtual interaction separates the contexts in which individuals are situated, an abstract shared region of interaction is established, enabling individuals, separated in space and perhaps in time, to be present in more than one front region, with each one having its own back region. For example, imagine individuals I1, I2 and I3 chatting electronically, thus forming a shared front region. At the same time, the practice of multicommunicating may be taking place. Thus I1 may be concurrently chatting on Facebook with someone else (I4) who is not part of the focal chat (i.e. the chat between I1, I2 and I3). So with respect to the focal chat, the shared region created between I1 and I4 constitutes a back region for I1. If, however, the chat between I1 and I4 is seen as the focal chat, its shared region constitutes a front region for I1 and I4, while the shared region between I1, I2 and I3 is now the back region.
Considered in light of Goffman’s model, Bakhtin’s notion of ‘hidden dialogicality’ acquires an additional meaning. Hidden dialogicality may reflect not only earlier or imaginary dialogues with others, but also concurrent, invisible dialogues with real others made possible by multicommunicating. For example, in their study of instant messaging in the context of collaborative decision making, Dennis et al. (2010) found that participants were able to influence front-stage decision making by engaging in ‘invisible whispering’ – namely, in multiple, parallel backstage conversations, through instant messaging. In this instance, hidden dialogicality takes the form of contributions to an ongoing communication process, which are invisible in the front region.
Research Setting and Methodology
In this section, we report the findings of an ethnographic study focusing on a single team that blends synchronous/asynchronous and virtual/collocated interactions through multimodal communication channels. The research design adopted follows an approach similar to that adopted by dialogical psychologists (see Mercer, 1995; Shotter, 2011) and communication theorists (Benoit-Barné & Cooren, 2009).
We have chosen to study a project team involved in the development of new software. Although we have tried to keep technical terminology to a minimum, some technical terms are inevitably needed; hence our use of endnotes. The team is based in McKay (a pseudonym to preserve anonymity), an international, UK-headquartered company with employees and clients around the world, that develops software to support the entire life cycle of engineering projects (document management and control, project collaboration, etc.). Following Griffith, Sawyer and Neale’s (2003) classification of virtual teams, we consider the project team (PT) under study as a virtual team, since the communication of its members is entirely ICT-mediated and some of its members have never met face-to-face. However, the virtual communication between PT members, punctuated with instances of face-to-face communication between some PT members and other McKay specialists who are occasionally brought in for assistance, renders the team under study a multimodal polysynchronous one, which is richer than a simply virtual or collocated team.
More specifically, the PT uses a range of ICTs that mediate the communications of its members, such as: (a) synchronous teleconferencing similar to Skype (with the camera option off due to connection speed delays once more than two members are connected); (b) collaborative software, designed by the company, allowing the collaborative creation, sharing and editing of files (such as documents, presentations, CAD drawings, etc.), managing events (i.e. scheduling online meetings) and posting comments; and (c) emails.
The PT consists of three members: Jack, based in Glasgow, UK; Rob, based in California, USA; and Mark, based in Washington, DC, USA. Another McKay software specialist, Tom, joins the team from time to time to help Jack when this is considered necessary. Tom is physically located at the company HQ in Glasgow and interacts face-to-face on a daily basis with Jack. The team has had a two-year history working on other projects. It is managed by Jack, who is responsible for setting the aims of the team; for charging each individual with specific tasks; for monitoring progress; and for coordinating the weekly meetings of the team. Jack and Rob have never met face-to-face. Mark has the highest technical expertise in the team.
Multiple forms of data have been collected, as part of a longitudinal ethnographic study (June 2002–February 2006), to capture as many of the interactions of the project team as possible.
The PT had been given the task of customizing existing software to meet the needs of a new client. The client was an engineering company, bidding for one of the UK’s largest construction projects. It needed the software to manage bidding documentation across all parties involved in submitting the bid over a ten-month period. The client had asked for a technology solution in the form of a document management and collaboration software system which would be accessible to the bid team and their partners, allowing 200 users to log in from 35 different partner organizations. For the PT this entailed customizing an existing document management and collaboration software system to the needs of the client, integrating various pieces of software and adding new functionalities (mainly related to editing and sharing technical drawings) to ensure fast turnaround of most documents and drawings and to reduce the risk of errors.
Having spent the first six months familiarizing herself with the research setting and earning the trust of the company, the first author was given extensive access to the workings of the team. Thus, she could now shadow Jack conducting routine work and observe, from Jack’s perspective, all PT interactions related to the task at hand. In total, 28 observations were conducted. Each observation lasted from three to five hours and consisted of observing Jack interact with his fellow PT members for the purposes of this particular project. These observations were done by the first author acting as a non-participant observer (see Leonardi, Neeley, & Gerber, 2012). Six teleconference meetings, varying in length from one to five hours (three hours on average), were audio-recorded. Each audiotape was transcribed verbatim, yielding 70 pages of transcript data. Moreover, 262 relevant emails were collected.
After each observation was completed, handwritten notes were immediately typed up, producing 8–10 pages of single-spaced text. In addition, a round of exploratory interviews with team members was held (see Dennis et al., 2010), focusing on interviewees’ perceptions of working on the particular task. Six hours of interview data were gathered and transcribed, producing 35 pages of transcript data. In total, 112 research hours were spent sitting behind Jack at his desk, watching him interact polysynchronously and multimodally (face-to-face and virtually) at the same time. Observations stopped when we felt that they had reached a saturation point at which few new insights were being obtained from further observations (Leonardi et al., 2012). Table 2 summarizes the data collected.
Table 2. Summary of data collection.
All the different forms of data gathered constitute what discourse theorists call ‘texts’, namely ‘any kind of symbolic expression requiring a physical medium and permitting of permanent storage’ (Taylor & Van Every, quoted in Phillips, Lawrence, & Hardy, 2004, p. 636; Leonardi et al., 2012; Benoit-Barné & Cooren, 2009). In this case, texts took a variety of forms, including written documents, email exchanges, recorded audio interactions, interview transcripts and field notes on how project team members interact with ICTs.
Data analysis consisted of iterative readings of all of the texts collected, and was conducted in three steps. Step 1 aimed at identifying the front and back regions of interactions, and how they are related. Focal interactions involving all PT members were identified as front regions. Then any interactions identified as related to the focal interactions, but not intended to involve all team members, were marked as back regions. In Step 2, we were interested in relating front and back regions with the use of specific ICT affordances that team members worked with to move between front and back regions (see also Dennis et al., 2010). In Step 3, we applied Tsoukas’s framework to identify ICT-mediated dialogues that included instances of distantiation (i.e. participants taking a distance from their hitherto ways of understanding and acting) in order to assess whether they led to new distinctions. We identified particular micro problem-solving episodes, consisting of initial puzzles which led individuals to exchange, critique, combine and/or reject ideas, invent new approaches, and finally, make new distinctions to improve software design. More specifically, we looked for instances where team members engaged in a communication act that further rearticulated their views in light of someone’s reply to their initial message. We marked the further articulation as an instance of reflexivity since, in line with the dialogical model discussed earlier, it had been made bearing in mind both the initial message and the reply to it. Then we looked for instances of distantiation, in which team members appear to become aware of their taken-for-granted assumptions and review a situation at hand, thus enabling the making of new distinctions.
We identified six micro problem-solving episodes, one of which, arguably the most creative, we present below. In this episode, PT members, having developed two new applications 2 (related to the editing and sharing of technical drawings) for the original document management and collaboration software system that was already in place, test for the first time the integration of these applications with the original software system.
It should be noted that software testing is widely regarded as a software development practice and is thus considered to be a creative project (Madeyski, 2008). A piece of software is seldom designed to operate as a standalone system, but rather to become integrated with other software applications, in order to add new features (Mookerjee, 2005). In this respect, the attempt to integrate and test the two applications at hand involves the making of new code, since it enables software developers to detect faults and add new functionalities and novel features, whose novelty varies from incremental refinements to completely new solutions (see also Rompf & Odersky, 2012). As Tiwana (2012) notes, software development is a process of iterative problem solving among experts, facing mainly complex tasks, until experts reach consensus on critical conflicting requirements and code testing yields the desired functionality. It should also be noted that, while what is presented below may seem to be a simple, linear conversation between colleagues, it is rather a complex dialogical interaction (see also Wan, Compeau, & Haggerty, 2012), involving the use of multiple media, parallel conversations, varying degrees of synchronicity afforded by technology, textualized discussions in the form of drawing on past dialogues, and a sense of reduced social presence. We explore these interactions below.
Considering the above, four synchronous teleconference meetings were held over a period of a week, and daily interventions to the software were observed as part of a multi-stage process of software testing. Emails were exchanged between team members and between Jack and other company members (Tom, Colin and Ian) on the same task.
The problem
As is common in studies engaging in conversation analysis (Boden, 1994; Edwards, 1999; Fox, 2008; Middleton, 1998; Samra-Fredericks, 2003, 2010) and, more generally, in exploring naturally occurring dialogical interaction (Ligorio, 2010; Mercer, 1995; Shotter, 2011), the mere presentation of naturally occurring talk among the team members necessarily involves some analysis of it, by commenting on how it is played out. Such presentation-cum-analysis of findings is later followed by further interpretation of the dialogical exchanges at hand, while, in the Discussion section, broader theoretical reflections are offered.
Findings and initial analysis
In our study of the PT, we have focused on a series of synchronous and asynchronous interactions, in which team members work to solve a particular technical problem (called here P1), concerning the integration of two independent existing software applications (S and W) into one new application (X), and then testing the new application (which, after successful testing, will become a new piece of software). The new application must contain a new functionality requested by the client. Thus PT members (Jack, Rob and Mark) engage in four synchronous meetings (M1, M2, M3 and M4), using teleconferencing, emails and an in-house-developed collaborative software (Xware 3 ), to address P1. Since, of the three media used, Xware was the least known to us, the first author spent some time familiarizing herself with it. Xware allows users to share files dynamically, and any changes to files made by one user are propagated through the network, so that all copies remain synchronized. Changes in any of the files can be observed by all users (provided they are given access to the shared virtual front region). Xware thus allows for both ‘presence awareness’ and ‘silent interactivity’ (Dennis et al., 2010, p. 869), since any user who wants to perform actions privately can switch their screen to ‘private’, perform those actions, and then switch their screen back to ‘public’. Team members are not able to view each other’s ‘private’ screens.
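The interplay of synchronized files and private screens described above can be made concrete with a minimal sketch. The code is purely illustrative and does not represent Xware’s actual design: the names `User`, `SharedWorkspace`, `set_private` and `edit` are hypothetical, Python is used only for brevity, and it is assumed (the study does not establish this) that file contents stay synchronized at all times, while other users can watch an action only when the acting user’s screen is ‘public’.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    private: bool = False                      # True while the screen is switched to 'private'
    view: dict = field(default_factory=dict)   # this user's synchronized copy of the shared files
    log: list = field(default_factory=list)    # other users' actions this user was able to watch


class SharedWorkspace:
    """Hypothetical stand-in for a shared workspace; not Xware's real implementation."""

    def __init__(self, names):
        self.users = {name: User(name) for name in names}

    def set_private(self, name, private):
        # Switch a user's screen between 'private' (True) and 'public' (False).
        self.users[name].private = private

    def edit(self, author, filename, content):
        actor = self.users[author]
        # Assumption: the change itself always propagates, so all copies stay in sync.
        for user in self.users.values():
            user.view[filename] = content
        # Assumption: others can watch the action only while the author's screen is public.
        if not actor.private:
            for user in self.users.values():
                if user.name != author:
                    user.log.append((author, filename))
```

In this sketch, an edit made by Jack while his screen is ‘private’ updates Rob’s and Mark’s copies of the file but leaves their logs empty, which loosely corresponds to silent interactivity; once his screen is ‘public’ again, each edit is also recorded in their logs, which loosely corresponds to presence awareness.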
All but one of the meetings were rather short, setting and checking individual tasks. Integrating the two applications into a new one had not been an easy task for the team, since it required new programming code to be produced and existing programming code to be modified. M3 was the longest virtual meeting, in which team members, using both teleconferencing and Xware, attempted to integrate the two existing applications into a new one, and then run the compatibility test for the first time. Thus M3 was the most crucial meeting of the four. In our terminology, it constitutes the front region (Chat 1) (see Figure 1).

Figure 1. ICT-mediated and face-to-face dialogical exchanges between team members during problem-solving P1.
The first author was located in the same office as Jack and was able to observe and take notes on his actions throughout M3. While the teleconference allowed videos to be transmitted, it was noticed that team members kept the camera option off, since having it on considerably lowered the speed and quality of sound transmission. Furthermore, Jack, being the coordinator of the teleconference meetings, initiates the calls, adds people to the conversation and can, at any point, mute out individuals during the conversation. The practice of muting out Rob or Mark, or both, was observed in all instances of teleconferencing, enabling Jack to often alter the boundaries of communication. Other team members could always mute themselves out during the meeting. While there is no evidence of how the other team members worked with this affordance, Jack often used muting out, in order to collect relevant information from individuals outside the virtual meeting.
Tom, a software developer, who physically shares the same office with Jack, was observed by the first author to help Jack in several synchronous PT meetings, without the other team members always being aware of it. For example, during M3, a face-to-face chat (Chat 2) between Jack and Tom, unknown to the other PT members, served as a back region to Chat 1 and helped Jack when he was struggling to work on a certain link 4 to the application about to be tested. Jack was also observed to send and receive emails, not always related to M3, or revisit past emails to retrieve critical information for M3. More specifically, during M3, Jack several times visited an email exchange (Chat 3) between Tom, Colin, Ian (all McKay members) and Jack, which took place prior to M3 and appeared to contain useful information related to the meeting at hand. Figure 2 summarizes how Jack is connected at various points in time, before and during M3, using specific media channels. All individuals had been informed that their meetings would be observed and tape-recorded for research purposes.

Figure 2. Timeline of Jack’s interactions before, during and after M3.
We begin our empirical exploration with excerpts from Chat 1. Jack begins to alter the programming code to integrate the relevant applications S and W, before testing starts. This is a very important step in the process of code testing, since two applications may work without problems separately, but a poor link may create conflict between them and thus block the integration altogether. A link is a piece of programming code that aims to put together separate pieces of software, in this case applications S and W. A simplified example of a link would be:
#include <boost/lexical_cast.hpp>
void Run(const std::string& w, int tgt, int s);
So, while in the dialogue below, natural language is mostly used, it should be remembered that apparently colloquial words, such as ‘link’ (e.g. links Y, N, M and V in the excerpt below) stand for technical terms and that PT members engage with complex programming language that has to be loaded into the running program (see Figure 3).
1. Jack: …I didn’t know to look under Y for application S and I looked under N and
2. now just wonder if we should move the archive or if they are put under
3. another folder so that mistake isn’t made again…
4. Mark: …The wide cycle performance 5 is also part of application S. Take a link
5. from the Y through that other one…That’s why the life cycle 6 closes under
6. application S.
7. Jack: Alright…So where are we? Can you see what I’m doing?…
8. Mark: Under M link!…
9. Jack: Don’t seem to be getting it… (Pause) Right! That was the first place I went
10. and there didn’t seem to be anything.
11. Mark: Ok the life cycle, link that under application W under V.
12. Jack: Application W?
13. Mark: Yea
14. Rob: Right. (Long Pause)
15. Jack: [Murmurs to himself: ‘I don’t know why it’s taking so long to work though, it
16. hasn’t taken that long before’] (pause). There we go, there is the template. 7
17. Mark: Mhm. That’s good.

Figure 3. Example of programming code.
In this excerpt, Jack articulates an utterance (lines 1–3), based on his understanding of the beginning of the integration process. He works with the affordances made available by the particular ICTs to concurrently interact with Rob and Mark. His initial actions are not seen by others. Jack’s screen is, at the time, switched to ‘private’, so Mark and Rob cannot view his actions. Jack has often been observed to use Xware’s affordance of silent interactivity, without warning Rob and Mark that he would switch his screen to ‘private’. At the same time, Jack disables the teleconference by muting out Mark and Rob, so that they do not hear anything that is being said in Jack’s office. In that way, using Xware’s and teleconference’s affordances, Jack controls information about his actions, reshapes participants’ communication boundaries and reconstructs the shared front region. It should be noted that the other two team members can see when Jack uses the ‘private’ button and they also know when they are muted out.
Jack first works on link N. Working on a link requires adding new, or modifying existing, programming code. Every character added to the link matters, since even a missing comma or an extra space can stop the program from running. Jack early on realizes that there is a technical problem that impedes integration of applications S and W, referring to a ‘mistake’ that has been made. Technically, this could mean that an important part of programming code is missing or is misplaced. His initial action to work on link N creates an artifact that incorporates what Jack focally knows so far in relation to this piece of software. This artifact serves as a display for Jack: it enables him to see what he has done and reflect on how he has done it. Jack looks back at the actions he took while working on link N and realizes it was a ‘mistake’ (line 3). According to Tsoukas’s (2009a, p. 944) model, this qualifies as a new distinction (see also Dodgson et al., 2013), since Jack has now formed a new understanding of something being wrong with the programming code. Xware alerts users to problematic parts of the programming code by highlighting them in red, so it is quite clear at this point that something is wrong.
When Jack says ‘I looked under N’ (meaning ‘I have worked on link N’), one might think that Jack’s work on N was entirely his own choice. Upon closer inspection, however, Jack’s utterance reveals another, hidden voice, not known to his front region audience. The hidden dialogue, in this case, is a dialogue that runs in Jack’s back region, thus bringing into play multiple voices in the shared front region (Chat 1). What happened is that, at the beginning of the meeting, while Jack’s screen was ‘private’, Tom, who had been physically present in Jack’s office, was guiding him on his work on the programming code. This practice is similar to what Dennis et al. (2010, p. 858) describe as an episode of ‘virtual ventriloquism’, whereby a person acting in the front region receives advice ‘like a prompter whispering instructions from backstage in a traditional theatrical environment’. Jack admits, in a later interview, that, as the PT coordinator, he wants to protect his leading status in the team and does not, therefore, reveal to his fellow PT members Tom’s role in the meeting.
Xware’s silent interactivity enables Jack to move between the front and back regions, without Mark and Rob being aware of Tom’s presence in Jack’s office. Consequently, they are unaware of Tom’s instructions that influenced Jack’s work on the programming code. When Tom leaves the room, Jack performs the action Tom had suggested. However, Jack is not successful: an error appears in the code. Working with Xware’s affordance of high rehearsability, he keeps working on link N, trying different actions (i.e. modifying parts of the code), but again without success (an ‘error’ message appears on the screen). Frustrated, Jack tries out different approaches to solving the technical problem that has come up. The first author, sitting next to Jack in the same office, is able to view Jack’s actions on his computer screen. These actions include modifications to the programming code. Each time a specific action is not successful, Jack thinks it over, reflects on previous actions taken and tries out something different. All this involves further modifying the programming code. After several attempts, having gone through an iterative process of micro problem solving to ensure desired functionality is achieved, Jack is successful: his programming code changes at that point of the integration process are accepted (the ‘error’ message has disappeared). At this point, working with Xware’s affordance of presence awareness, Jack alters, once more, the boundaries of communication, switches the screen to public, disables the mute option, and demonstrates what he has done to the other two PT members. The reconstructed front region allows artifacts to become commonly shared, and new opportunities for dialogical interactions to emerge.
Carrying on interacting with Jack (lines 4–6), Mark is able to assist him, since Xware allows Mark to monitor Jack’s actions synchronously on the screen. Since Mark views the same screen as Jack does, he now sees how Jack is actually playing with Xware’s affordances and guides him accordingly to perform specific actions (lines 4–6). Jack further rearticulates his thoughts (line 7), in light of Mark’s reply. However, Jack does not seem to follow Mark’s reasoning, so Jack continues the conversation with Mark, while, at the same time, wondering about the part of the code Mark has specified. Mark responds again to Jack’s utterance (line 8), also prompted by his own earlier utterances and those of Jack. Mark guides Jack with more detailed directions and specifies the M link, on which Jack must work. Jack follows Mark’s guidelines, but continues to express his puzzlement (lines 9–10). Any incorrect actions or wrong characters added are immediately marked red by Xware, producing an error message on the screen. As before, this affordance enables team members to reflect on their actions and potentially re-articulate their current understandings. Thus, when at this point an ‘error’ message appears on screen, it becomes an artifact that causes some puzzlement, and, as a result, Jack now avoids repeating the specific action. Jack realizes that he has to perform further changes in the programming code of the relevant link to solve the new problem that has come up. After he has made these changes, Jack manages to overcome the technical problem (lines 9–10). Having had his attention drawn to aspects of the process, through his dialogue with Mark and the ‘error’ message appearing on the screen, Jack has a richer account of his experience, since he now appears to understand what he has been doing (‘right!’, line 9).
Mark continues to explain his reasoning to Jack (line 11), to ensure that the ‘life cycle’ runs smoothly. For this to happen, Jack needs to tackle a few remaining programming code issues that keep impeding the integration of applications S and W. Jack rethinks the situation at hand and reflects (line 12) on whether he should try a different approach. With Mark’s help (lines 11–13), Jack realizes that he had better perform some further changes on application W, rather than application S: a new distinction has thus been created. Jack appears now to incorporate Mark’s reasoning into his own and, having moved to application W, performs other minor changes in the relevant part of the programming code. Xware still produces error messages, visible to all, in the front region. Everyone is silent, watching Jack perform these changes in the programming code or, more likely, as Rob and Mark admitted in an interview later, doing other things in their own back regions (e.g. responding to emails). Jack reflects on these error messages and performs certain corrections to the code, until he manages to create a template (lines 15–16), which is the first indication that the two independently existing applications (S and W) have been successfully integrated into the new application X. Subsequent testing of the template will show whether this is the case.
We continue our exploration of M3 by focusing on the rest of the conversation in Chat 1.
18. Rob: …So we should have had a quality plan 8 for application W? … (Pause)
19. Jack: No. That was just a patch 9 …It’s linked to unit testing 10 there as well…I
20. think that looks somewhat simplified from the unit test documents…
21. Mark: It would create a unit test, however we would always have to create
22. another one…You go through secondary sort of unit…
(Testing starts. Jack points to and explains specific functions as they go through testing. He performs any necessary modifications)
23. Jack: Yes, that’s the unit test 1 …
(Testing ends. ‘Passed’ message appears on screen)
In this excerpt, Rob articulates an utterance (line 18), based on his understanding of the need for a quality plan. When Jack replies (line 19) ‘that was just a patch’, the apparent answer to the question ‘Who is doing the talking?’ is Jack. However, Xware’s and the teleconference’s affordances of silent interactivity and presence awareness, as well as the email’s high reprocessability (due to the possibility of archiving emails), enable Jack to cross the communicative boundary of M3 and engage in his back region in quasi-dialogues with invisible others. He does so by revisiting past dialogues with minimal interruption to the front region activities, before replying to Rob. Thus additional voices are implicated in the production of Jack’s individual utterance.
Specifically, prior to the meeting at hand (M3), emails had been exchanged between three McKay project managers and Jack (Chat 3) to discuss whether ‘application W’ should be launched as a ‘patch’ (i.e. an updated extension of the existing software) or as a new product:
Ok the application W will it be a full release?
When we say full release we don’t mean actually full install from blank
I think this is a patch. We aren’t doing a product here
Can I just add something in there? I agree with you it can be done as a patch but application W is a beautiful piece of s/w that it would be a shame to lose the opportunity of getting sold and delivered now if we had S as well as 3.10
The question is are we going to have two releases?
Are we all agreeing that there are two separate things here and in order to satisfy the requirements we just need this patch that can be done very rapidly?
All right, that’s the information we need
At the beginning of this email exchange, Jack asks if W should be a ‘full release’ product (email 1). Tom is the first one who uses the term ‘patch’ to describe W (email 2), a term that is later appropriated by Ian and Jack. Thus, when in the virtual conversational context of M3 (Chat 1) Jack uses the term ‘patch’ to refer to application W (line 19), his statement ‘application W is a patch’ was very likely influenced by the earlier email exchange he had had with Tom, Colin and Ian, whose voices are now hidden in Jack’s statement.
When it comes to testing the newly created template, Jack performs the first test in Xware, allowing the other two participants to monitor his actions and reflect on any modifications he makes synchronously. This is possible due to the high presence awareness and the high velocity of feedback afforded by Xware. More specifically, the template is a new artifact that allows all team members to interact with it and with each other. Messages pop up on the Xware screen alerting team participants to specific parts of the programming code. When these messages highlight specific parts of the code in red, Jack can immediately track the problems and reflect on them by editing the code. When programming code is highlighted in blue, Jack verifies that this unit of code is working. If the test is successful, ‘Passed!’ appears on the screen; otherwise, ‘Failed!’ does. Team members ‘talk to’ the new template by ‘operating on’ it (in this case, testing it), but also the new template ‘talks back’ (Schön, 1987, p. 31) to team members by presenting to them a new state of affairs. Thus, the new artifact (the template) provides further opportunities for quasi-dialogical interactions: it embodies what individuals already know in terms of testing integrated applications and, at the same time, invites them to further interact with it, reflecting on and modifying any parts when this is considered necessary. When all code is highlighted in blue, a ‘Passed!’ message appears on everyone’s screen, indicating successful completion of testing. This means that the template now constitutes the new application X, which can become part of the document management and collaboration software system release, since it has been successfully tested, and may possibly be further used in new ways.
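The feedback loop described above can be sketched as follows. This is an illustrative simplification, not Xware’s actual test mechanism: the function names `highlight` and `run_test`, and the representation of each unit of code as a check that returns True or False, are assumptions introduced here for illustration.

```python
def highlight(unit_ok):
    # Working units are shown in blue, problematic ones in red.
    return "blue" if unit_ok else "red"


def run_test(units):
    """Run every check and return (colours per unit, overall verdict).

    `units` maps a unit name to a zero-argument callable that returns
    True when that unit of code works (a hypothetical representation).
    """
    colours = {name: highlight(check()) for name, check in units.items()}
    # The test passes only when every unit is highlighted in blue.
    all_blue = all(colour == "blue" for colour in colours.values())
    return colours, "Passed!" if all_blue else "Failed!"
```

Under these assumptions, each red unit marks a point at which the artifact ‘talks back’: it tells the team member where to edit before the test is run again, and the cycle repeats until every unit is blue.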
Further interpretation of research findings
Throughout the research findings presented so far, three types of dialogical interactions can be identified: (a) dialogical interactions with real others, (b) quasi-dialogical interactions with invisible others and (c) quasi-dialogical interactions with virtual artifacts. Below, we expand on each one of them to further theorize how they are shaped in the multimodal polysynchronous setting under study. We will pay particular attention to (c), since quasi-dialogical interactions with virtual artifacts were a pervasive feature of the use of ICTs in the team under study.
Dialogical interactions with real others
While in a face-to-face communication, physical proximity enables the manifestation of the full bandwidth of physical senses and psycho-emotional reactions, in a multimodal polysynchronous meeting such as the above, individuals have fewer symbolic cues at their disposal (Dixon & Panteli, 2010; Lee, 1994; Markus, 1994; Panteli, 2004). Throughout their communication, PT members actively construct a context to interpret what has been going on, by altering the boundaries of communication when they see fit. While interacting with others, Jack had the opportunity to move in and out of front and back regions, making use of presence awareness and silent interactivity afforded by the particular ICTs. The result was that, concurrently with the dialogue at hand (Chat 1), various other side dialogues (with real, invisible others, or virtual artifacts) occurred (shown in Chats 2 and 3). Team members could bring information from individuals outside the meeting into the meeting, or revisit past emails to retrieve critical information for the current meeting, or respond to emails not always related to the meeting at hand. While some of these practices may increasingly be found in a face-to-face meeting as well, ICT-mediated communication appeared to allow more time between responses, thus affording communication high reprocessability and enabling Jack (and possibly the other two members) to have greater control over his actions. The observed practice of muting out Rob or Mark, or both, resulted, at times, in uneven distribution of information and even team member exclusion (Hinds & Bailey, 2003), but also showed how altering the boundaries of communication encourages new opportunities for dialogical interactions to emerge.
Quasi-dialogical exchanges with invisible others
In a multimodal polysynchronous context, individuals appear to each other as ‘indices’ – namely, as voices, images, or words on screens. Such ‘indices’ enable individuals to construct multiple realities and try out different versions of self in a way that is inhibited in conditions of face-to-face interaction (Hayles, 2001; Whitty, 2003; Wroe, 2002). In the PT under study, we identified instances during which making use of multiple front and back regions enabled Jack to bring into play others’ voices, something which is done in a far more limited way in a face-to-face context (Dodgson et al., 2013).
Thus, the opportunity to engage in hidden dialogues with currently absent others was enhanced. We saw how Jack, at times, fell back to his own back region, thus making it possible for quasi-dialogical interactions to occur between himself and invisible others (like Tom). Making use of silent interactivity, multicommunicating and high reprocessability afforded by the media used, Jack had the opportunity and the time, before a response was required, to cross the communication boundary of Chat 1 (Dennis et al., 2010, p. 865) and reflect on what had been discussed so far, view emails, ask for Tom’s help, revisit past recorded dialogues, re-run them, reinterpret them and construct new accounts with minimal interruption to the front region activities. By drawing on, and appropriating others’ past voices, his responses were more deliberately constructed compared with face-to-face communication (Nicolini, 2007).
Quasi-dialogical exchanges with virtual artifacts
The virtual artifacts PT members create in the course of carrying out their tasks (be they ‘error’, ‘passed’, or ‘failed’ messages, a particular path 11 selected, or linking applications S and W) constitute ‘epistemic objects’ (Knorr-Cetina, 2001, p. 181; Nicolini, Mengis, & Shaw, 2012, p. 618; Stigliani & Ravasi, 2012, p. 1234; Ewenstein & Whyte, 2009, pp. 9, 12) or what Bamberger and Schön (1991, p. 192) call ‘reference entities’. Epistemic objects serve as knowledge carriers, having an ‘ambivalent ontology’ (Kallinikos et al., 2013, pp. 357–8): they are both stable and mutable. They are stable insofar as they constitute ‘a materialized “log” of the making process’ (Bamberger & Schön, 1991, p. 192), thus incorporating what actors focally know thus far. And they are mutable insofar as they incorporate knowledge individuals are not focally aware of, hence they are open to further development (Knorr-Cetina, 2001, p. 181; Nicolini et al., 2012, p. 618; Ewenstein & Whyte, 2009, p. 12). Thus, what is most distinctive about epistemic objects is the incomplete knowledge they incorporate (Ewenstein & Whyte, 2009, p. 12). While, on the one hand, they display what individuals focally know, on the other, they ‘embody what one does not yet know’ (Nicolini et al., 2012, p. 614). Lacking completeness, epistemic objects trigger a desire to keep exploring and developing them. In our case, new distinctions were made and thus new knowledge emerged when, in the process of carrying out their tasks, Jack, Mark and Rob made use of silent interactivity, high rehearsability and presence awareness, which the ICTs they used afforded them. While the three PT members interacted with the virtual epistemic objects created, the latter ‘talked back’ (Schön, 1987, p. 31) to them and helped them focally see things they could not see before (e.g. work on several links by altering programming code accordingly), thus helping them to re-articulate the problems they had been dealing with (i.e. running the test) and to form new understandings (Yanow & Tsoukas, 2009, pp. 1348–9).
When interactional experiences are turned into ‘texts’ (Hardy, Lawrence, & Grant, 2005, p. 60), their reprocessability is enhanced and they become available for further ongoing communication. Although textualization is an important feature of all human communication, it is particularly enhanced in an ICT-mediated context. Textualization involves the ‘(incomplete) textual rendering’ (Taylor & Van Every, 2000, p. 230) of experience. When experience in a textualized form, namely in an explicitly articulated form that has been recorded and made available, is fed back to communication, it helps elicit the drawing of further distinctions. Several types of multimodal polysynchronous communication shorten the gap between textualization and its ongoing incorporation into acts of communication, enabling greater interactivity between one’s thinking process and outcomes and, thus, creating opportunities for reflexivity (Giddens, 1990; Tsoukas, 2009a).
Thus, in our case, communicating via email or a synchronous ICT-mediated chat is a process of textualization. Sharing and ‘operating on’ files dynamically through collaborative software enabled the PT members in our study to turn their experience into texts available for further reflection. The texts created record what has been stated or done and thus provide members access to these texts for further reflexively structured communication. The textual rendering of experience captures more than what team members are focally aware of. Textualization makes the knowledge team members possess, but do not yet focally know, available to them for further reflexive reprocessing.
From the above, it follows that the process of quasi-dialogical interactions with virtual epistemic objects consists of three steps. First, PT members, concerned with a particular problem, have several opportunities for creating or modifying virtual epistemic objects, including textual records of their experiences. However, team members do so without being focally aware of how they do so (Polanyi, 1962, pp. 61–2; Tsoukas, 2005, p. 150). Second, the virtual epistemic objects created ‘talk back’ to team members, insofar as they cause the latter to become focally aware of what they had previously not known. And third, team members re-articulate aspects of what they now focally know in the virtual epistemic objects.
Discussion
The aim of this paper has been to advance understanding of how organizational knowledge is created in a multimodal polysynchronous context, through empirically exploring the dialogical interactions of a software development project team, housed in a UK-based international firm. In summary, our argument has been the following.
In line with an interpretive view of knowledge, and especially Tsoukas’s (2009a) model of dialogical knowledge creation, we have taken organizational knowledge creation to involve the making of new distinctions regarding a task at hand. We have argued that new distinctions may be developed through individuals engaging in dialogical interactions with real others, with invisible others and with artifacts. In a multimodal, polysynchronous context, in particular, we have demonstrated that dialogical interactions are shaped in distinctive ways.
More specifically, we have shown that: individuals (a) can move in and out of multiple front and back regions during dialogues with real others, thus availing themselves of the opportunity to draw on multiple voices beyond those they are focally engaged with; (b) alter the boundaries of communication and engage in quasi-dialogical interactions with invisible others, by actively working with the affordances of technology used; and (c) can operate on representations and manipulate them, as they engage in quasi-dialogical interactions with virtual epistemic objects, including textualized versions of their own personal experiences. We have shown that these types of dialogues and quasi-dialogues generate ‘strangeness’, as individuals try to reflect, understand better, and assimilate earlier responses to real others, invisible others and virtual artifacts, and by doing so are stimulated to make new distinctions.
We have reported several new distinctions PT members made, in their effort to integrate applications S and W into, and then test, new application X. More specifically, we have shown that through interactions with each other, invisible others and artifacts, team members come to realize and, therefore, draw the following new distinctions: (i) working on link N is a mistake; (ii) changes on links Y and M must be performed; (iii) Jack needs to perform actions on application W rather than application S; and (iv) programming code errors are spotted and subsequently reflected upon and corrected. It should be mentioned that these distinctions have been presented here not so much because of their major organizational importance, as for analytical purposes: to demonstrate how, in a multimodal polysynchronous context, new knowledge, no matter how mundane, may come about. As argued earlier, the organizational importance of these distinctions is a separate issue (Tsoukas, 2009a, p. 951).
Moreover, as is known from ethnomethodology and dialogical psychology, a detailed micro-analysis of dialogical interactions reveals, among other things, the very ordinariness of novelty (Boden, 1994; Llewellyn & Hindmarsh, 2010; Mercer, 1995; Sawyer, 2003; Shotter, 2011). For example, Schön and Wiggins’s (1992) study of architectural designing demonstrates how generative thinking inheres in dialogically shaped ordinary activities, aided by the use of artifacts (see also Boland, Lyytinen, & Yoo, 2007; Ewenstein & Whyte, 2009). In this paper, we have arrived at a similar conclusion: ordinary dialogical interactions may lead to new understandings – fresh distinctions – and, therefore, to new knowledge. More importantly, we have shown how this happens in an organizationally based multimodal, polysynchronous context of communication that blends face-to-face with virtual interactions.
Some scholars have argued that the experience of ICT-mediated interaction tends to be relatively impoverished, insofar as it removes embodied presence (Dreyfus, 2001, ch. 3). While there are elements of truth in this claim, we have found that multimodal polysynchronous communication opens up other opportunities as well (Dennis et al., 2008; Goel, Johnson, Junglas, & Ives, 2011; Leonardi, 2011). Since a virtual context of interaction is more discursive than a face-to-face one (i.e. more discursive work is needed to establish what the context is), it offers individuals more opportunities to draw on their textualized experience. While the holistic sense of embodied interaction may be missing, individuals strive to make up for this by discursively constructing a shared front region while, at the same time, working with technology’s affordances to participate in several other front and back regions, thus creating opportunities for multiple dialogues with real others (given the space), invisible others (given the time) and artifacts (given technology and the textualized record of communication). Our findings enrich Dennis et al.’s (2010) findings concerning ‘invisible whispering’ in collaborative decision making, by showing that the more asynchronous the communication, the greater the intervals between utterances, and the greater the ability of individuals to ‘rehearse’ and ‘reprocess’ what they are doing and how, and to bring into play more voices. Individuals can move to their back regions and draw on others’ voices (real and/or invisible) before they reply, with minimal interruption to the front region activities.
A virtual context creates more thinking space for individuals, by (a) enabling them to be more deliberative about what they do and how they do so; (b) helping them mobilize more voices than those present in the front region; and (c) facilitating the textualization of their experience and thus offering more opportunities for reflexive interactions with artifacts. While in a face-to-face interaction it is more difficult to revisit a past dialogue without interrupting the flow of the interaction, in a virtual context, communication tends to be textualized and thus available for further reprocessing and rearticulation. The latter enables, potentially, greater reflexivity.
As noted earlier in the paper, for Tsoukas (2009a) reflexivity is the force driving self-distanciation and is thus critical for the making of new distinctions. Yet, reflexivity is exercised in more complex ways in a multimodal polysynchronous setting than in a unimodal synchronous one, such as face-to-face. Writers on reflexivity have tended to underscore the retrospective self-questioning that reflexivity induces in individuals (Cunliffe & Easterby-Smith, 2004, pp. 38–40; Tsoukas, 2009a, p. 944). As we have shown here, a multimodal polysynchronous setting enables such self-questioning to be more than a merely retrospective exercise, since it affords participants more possibilities of encountering various degrees of dialogically generated ‘strangeness’ in real time, thus requiring reflection in (not merely on) action (Yanow & Tsoukas, 2009). Moreover, a multimodal polysynchronous setting provides participants with a broader repertoire of tools to be self-questioning with than a face-to-face setting, as, for example, through affording the mobilization of multiple voices not present in the front region and opportunities for reflecting on inherently incomplete virtual artifacts. Our study of dialogical interactions with virtual artifacts sheds further light on the use of artifacts in organizational knowledge creation. Prior research into the use of ‘boundary’ or ‘epistemic objects’ and of ‘cognitive artifacts’ or ‘scaffolds’ (Ewenstein & Whyte, 2009; Majchrzak et al., 2012; Stigliani & Ravasi, 2012) has largely taken place in face-to-face organizational contexts. Our findings suggest that the ability to share artifacts in a virtual context can create further opportunities for reflexive dialogical interactions and thus for making new distinctions. As we have shown, to be effective in ICT-mediated communication, individuals need to rely more on their own resources in attempting to address a problem. 
Particular features of ICTs, such as silent interactivity, high rehearsability and presence awareness, afford participants the possibility of taking advantage of the availability of multiple front and back regions to simultaneously participate in several dialogues. The turning of interactional experience into texts potentially makes interaction with virtual artifacts more dynamic, offering participants more opportunities to become aware of knowledge they are not focally aware of and, thus, the possibility of further reflexive reprocessing.
By focusing on a mainly virtual team, whose workings were interlaced with occasional face-to-face communication processes, our paper illustrates the increasing trend in organizations for face-to-face and virtual contexts of interaction to be interwoven. Dennis et al.’s (2010) study of ‘invisible whispering’, through the use of instant messaging during face-to-face, telephone and computer-mediated team meetings, is an example of how communication is increasingly conducted simultaneously via a variety of media, in which virtual and face-to-face interactions coexist and influence each other. Our findings contribute to a richer understanding of knowledge creation, since they not only explore ICT-mediated knowledge creation (a topic largely neglected in the relevant literature) but go beyond the face-to-face vs. virtual communication dichotomy to embrace complex forms of dialogical interaction, in which various modes of communication (virtual and face-to-face) as well as various types of synchronicity in communication are blended in the work of teams. Insofar as the latter increasingly interweave face-to-face interaction with ICT-mediated communication (e.g. via laptops, tablets and smartphones), how such interweaving takes place, and with what effects, becomes an important topic of research.
Future research can further explore how multimodal polysynchronous communication impacts organizational knowledge creation and, more broadly, learning, by considering different communication media, different types of tasks and different team compositions from those considered in this study. Information systems research so far has highlighted how individuals develop perceptions of media richness for a given channel based on their experiences (Carlson & Zmud, 1999), how even lean media can be used for complex communication (Markus, 1994) and how media capabilities relate to the communication needs of a task and communication performance (Dennis et al., 2008). While a limitation of our study is that we have focused on a team with a specific type of synchronous audio communication (i.e. teleconference with disabled visual cues), different communication media (e.g. Skype) and varying degrees of synchronicity generate different affordances and types of dialogical exchanges and thus effects, which need to be compared and contrasted. Thus, what difference would it make to how knowledge differences are transcended in cross-functional teams working entirely virtually, or in conditions of multimodal communication with varying degrees of synchronicity? How would ‘co-creating a scaffold’ or ‘dialoguing around the scaffold’ (Majchrzak et al., 2012) be conducted differently?
Furthermore, it would be interesting to explore knowledge creation in multimodal polysynchronous contexts within multiple organizational arrangements. In our study, all members of the project team were members of the same organization. However, as instances of organizational collaboration proliferate, it is increasingly common that project team members are drawn from multiple organizations (Bruns, 2013; Majchrzak et al., 2012). How are dialogical interactions shaped in such settings? For example, how would the five practices Majchrzak et al. (2012) have identified unfold, if teams, working multimodally and polysynchronously, were drawn from different organizations?
Finally, a topic that requires further research is the role of power in knowledge creation processes. We have only alluded to this here in addressing the manipulation of front and back regions by team members, but it is clearly crucial (Contu & Willmott, 2003). How are asymmetrical dialogical exchanges mediated by multimodal polysynchronous environments, and with what effects? How do participants manipulate their front and back regions, and for what purposes? More research on such questions will enrich our understanding of how politics shapes dialogical interactions.
Acknowledgements
We are grateful to former Editor-in-Chief David Courpasson and to the three anonymous reviewers for their extremely useful comments. Without the challenging and insightful comments we received, the paper would not have taken the particular shape it did.
Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
Author biographies
) is the Columbia Ship Management Professor of Strategic Management in the University of Cyprus, Cyprus and a Distinguished Research Environment Professor of Organization Studies at Warwick Business School, University of Warwick, UK. He obtained his PhD at the Manchester Business School (MBS), University of Manchester, and has worked at MBS, the University of Essex, the University of Strathclyde, and at the ALBA Graduate Business School (Greece). He has published widely in several leading academic journals. He was the Editor-in-Chief of Organization Studies (2003-2008) and has served on the Editorial Board of several journals. He was awarded the honorary degree Doctor of Science by the University of Warwick in 2014. With Ann Langley he is the co-founder and co-convener of the annual International Symposium on Process Organization and co-editor of the Perspectives on Process Organization Studies, published annually by Oxford University Press. His research interests include: knowledge-based perspectives on organizations; the management of organizational change and social reforms; organizational becoming; practical reason and the epistemology of practice; and meta-theoretical issues in organization theory. He has co-edited several books, including The Oxford Handbook of Organization Theory: Meta-theoretical Perspectives (Oxford University Press, 2003) (with Christian Knudsen). He is the author of: Complex Knowledge: Studies in Organizational Epistemology (Oxford University Press, 2005) and If Aristotle were a CEO (in Greek, Kastaniotis, 2012, 4th edition). He writes regularly for major Greek newspapers commenting on political and social issues.
