Abstract
Video has become a methodological tool of choice for many researchers in social science, but video methods are relatively new to the field of organization studies. This article is an introduction to video methods. First, we situate video methods relative to other kinds of research, suggesting that video recordings and analyses can be used to replace or supplement other approaches, not only observational studies but also retrospective methods such as interviews and surveys. Second, we describe and discuss various features of video data in relation to ontological assumptions that researchers may bring to their research design. Video involves both opportunities and pitfalls for researchers, who ought to use video methods in ways that are consistent with their assumptions about the world and human activity. Third, we take a critical look at video methods by reporting progress that has been made while acknowledging gaps and work that remains to be done. Our critical considerations point repeatedly at articles in this special issue, which represent recent and important advances in video methods.
New technologies often lead to advances in scientific research. For example, both the telescope and the microscope were invented in the 1600s, enabling scientists to suddenly see what had always been invisible to their unaided eyes. Within just a few decades, both technologies spread across Europe and led to dramatic increases in scientific knowledge about things distant (e.g., moons and comets) and small (e.g., biological structures and microorganisms). A more recent example—and the focus of this article—is the growing impact of video technology on the social sciences. Within just a few decades, video has become a prominent feature of social life and a tool of choice for many researchers within a variety of disciplines, from psychology to anthropology.
The prevalence and popularity of video technology are evident all around us. Recording equipment is relatively inexpensive and rather easy to use. Most people carry mobile phones with an ability to create and share digital video, producing a steady stream of messages and memes that pervade our social relationships and organizations, as seen on the Internet. Many business organizations are now using video as a workplace tool for internal and external consumers: video conferencing, quality control, internal knowledge management, training, and more. We also see this technological trend within public organizations, such as law enforcement agencies and police departments, which have started to record interactions with citizens: So-called “body cameras” have become part of the uniform for many police officers because people think that video recordings provide an objective and verifiable representation of what “really” happened. With the spread of video technology has come a proliferation of video data. Some organizations are creating archives of videos that may be an especially valuable resource for researchers. Access to video data thus complements a growing interest in the visual dimension of organizational life (Meyer, Höllerer, Jancsary, & van Leeuwen, 2013), and allows us to explore some of the issues that visually-based data, which may be archival or collected in real time, pose for management and organizational research (Ray & Smith, 2012).
Video technology is changing social science—not only how we do it but also how well we do it—because video enables researchers to look at the world differently. Depending on the positioning and setting of cameras, recordings may literally provide a different view or perspective of human activity, confirming, complementing, or contrasting what the researchers themselves can see. Video recordings constitute a permanent record that analysts can examine repeatedly and that peers can watch and verify, helping to boost the accuracy and validity of research findings. Recordings can also be slowed, zoomed, replayed, and juxtaposed, enabling analysts to look more closely and precisely, to see what has been invisible to their unaided eyes. Thus, video enables analysts to identify new patterns of human behavior, toward generating new scientific accounts of organizational activity—that is, who did what, when, why, and how.
This article provides an introduction to video methods in organization studies. First, we situate video methods relative to other kinds of research, suggesting that video recordings and analyses can be used in lieu of or to supplement other approaches, not only observational studies but also retrospective approaches such as interviews and surveys. Second, we describe and discuss various features of video data in relation to ontological assumptions that researchers may bring to their research design. Video technology involves both opportunities and pitfalls for researchers, who ought to use video methods in ways that are consistent with their assumptions about the world and human activity. Third, we take a critical look at video methods by reporting progress that has been made while acknowledging gaps and work that remains to be done. Our critical considerations point repeatedly at articles in this special issue, which represent recent and important advances in video methods. We also recommend future directions for video methods, which will continue to be popular and useful as new and allied technologies become available.
Video Methods: Dimensions of Dynamism and Modality
Video methods occupy a distinct place in the universe of approaches to social science and organizational research. Most obviously and practically, video methods are distinct from other popular approaches (e.g., ANOVA, regression) in that the data gathered in video-based research are largely delineated by the technology used to conduct it. The video-based approaches we describe here can be used with other exploratory methods, such as grounded theory (Bryant & Charmaz, 2007; Suddaby, 2006), ethnographic research (Zickar & Carter, 2010), or case studies (e.g., Yin, 2013), and indeed may supplement other data collected such as the field notes and interviews typical in such methods. However, video constitutes a particular type of data that entails specific collection and analysis issues that require us to go further, even in these known methods, as explained in this article and special issue (e.g., Slutskaya, Game, & Simpson, 2018). As video technologies continue to evolve and advance, they will give researchers additional and sometimes better tools for recording and analyzing human behavior and organizational activity to complement and extend these existing research methods. In particular, a more thoughtful and nuanced delineator of video methods is the nature of the data that are collected and analyzed: dynamic, audiovisual data. We use dynamic to denote the ever-changing flow of data that one is capturing, in contrast to methods such as archival studies and surveys, where the data are relatively static. Audiovisual refers to the modality of data. Different methods can capture different modalities of data, including textual, audio, visual, or audiovisual data. The dynamic-static element of the data and its modality are useful to keep in mind when comparing video methods with other methods, including other visual methods such as the use of photographs (e.g., Ray & Smith, 2012).
Video methods have a kinship with ethnographic methods when the latter involve extended participant observation in the field with a particular group or organization (Fetterman, 1998; Pratt & Kim, 2012). Video methods also bear a kinship to photographic methods, which emphasize the visual and perspectival aspects of organizations (Meyer et al., 2013). For example, Pratt (2000), in his ethnographic study of Amway distributors, engaged in extended observations over nearly a year (dynamic, visual), collected interview data (dynamic, audio), and took pictures (static, visual) of various Amway-related events. Thus, he was able to observe changes over time and capture the process whereby these distributors recruited and trained their members. In a similar vein, laboratory experiments and other observational (e.g., nonobtrusive, nonparticipant) studies also rely on the collection of dynamic, visual data. It should not be surprising, therefore, that video methods have been incorporated into these and similar methods that utilize observations, such as in video ethnographies (e.g., Fraher, Branicki, & Grint, 2017; Jarzabkowski, Burke, & Spee, 2015), microethnographies (e.g., LeBaron, 2005; Streeck & Mehus, 2005), video ethnomethodologies (e.g., LeBaron, Christianson, Garrett, & Ilan, 2016), and video-experimental studies (e.g., Kaplan, LaPort, & Waller, 2012; Waller, Zellmer-Bruhn, & Giambatista, 2002).
On the dimensions of modality and dynamism, video methods contrast strongly with some of the most popular research methods in organization studies. To illustrate, video methods are quite different from surveys, archival studies, and content analyses that involve static data in the form of text. Surveys usually rely on a subject’s ability to remember organizational phenomena that were dynamic when they originally occurred but not later when the subject is providing a retrospective and text-based account (e.g., Henry, Moffitt, Caspi, Langley, & Silva, 1994; Kraiger & Aguinis, 2001). People are not always cognizant of what they do and how they do it, even in the moment of performance, and they may be even less aware in hindsight. Archival studies may involve annual reports, memos, and other texts that have important information about organizations (e.g., Covaleski & Dirsmith, 1988; Jick, 1979), but such texts rarely capture visual and rich information about the dynamics of organizational activity. Content analyses that deliberately focus on organizational documents and texts—sometimes called “mute evidence” (Hodder, 1994)—only sometimes attend to dynamic interplay within the four corners of a particular text or between texts (e.g., Krippendorff, 2004; Weber, 1990). Similarly, discourse analyses tend to focus on transcriptions of dialogue or conversation, which may be highly dynamic when they were captured, but are usually not focused on visual behavior (e.g., Tracy, 2012). Video methods also differ from interviews that capture only dynamic auditory data, rather than audiovisual (e.g., Knoblauch, Schnettler, Raab, & Soeffner, 2012). With regard to methods that capture visual data, video methods differ from the analysis of pictures or other visible artifacts (e.g., corporate logos) in that the latter involves only visual data that are static in nature (Hatch & Schultz, 2017).
Although video methods do differ from other methods more commonly used in organizational studies, it is important to note some similarities as well. To begin, despite differences in the dynamism of the data collected, nearly any method can use the data it gathers to ask and answer dynamic, process-oriented research questions. To illustrate, while the data themselves may be static, historical archival studies often cover a broad period of time and thus can be used to examine dynamic changes (e.g., Anteby & Molnar, 2012); and panel designs utilizing surveys, while often utilizing a shorter time frame, can also be used to track dynamic processes (e.g., Ballinger, 2004). By contrast, while interviews in inductive studies may be used to get at process (e.g., Pratt, Rockmann, & Kaufmann, 2006), they may also be used to explore nondynamic processes, such as when they are used to tailor a questionnaire for a later study (e.g., Detert & Edmondson, 2011). In addition, the data gathered by each of these methods can be analyzed in different ways, either inductively or deductively. And ultimately, the “output” of data for each of these methods is usually text. To illustrate, much of the data collected in interviews ultimately get translated into transcripts. As we discuss later in the article, this is due, at least in part, to the nature of our research journals, which have a long tradition of text-based scholarship.
Beyond the dimensions of data modality and data dynamism, we emphasize that video methods are highly adaptable—ranging from how data can be collected to how they can be analyzed. With regard to collection, video data can be collected by the researchers themselves, by participants or research subjects, or by someone else who is not otherwise involved in the research project. Moreover, video data can be collected as the event or activity being studied is happening (synchronously) or after the events have already occurred (asynchronously). With regard to data analysis, video methods can utilize a variety of quantitative or qualitative approaches. Video methods can even differ on who analyzes the data. While data are typically analyzed by the researcher or the research team, collaborative video documentaries include participants as analysts (cf. Bartunek, 2007; Eden & Huxham, 1999; Jarrett & Liu, 2018; Lewin, 1946). The adaptability of video methods, in both data collection and data analyses, is discussed and demonstrated by the articles in this special issue, which now becomes the focus of our discussion.
Video Methods: Why and How?
This section of our article addresses two fundamental questions by providing a basic primer on video methods as applied to organizational studies. The first question is, “Why would you want to capture video (dynamic, audiovisual data) in a research project?” With all research projects, the methods employed should match the researcher’s investigative interests and research questions. At a fundamental level, if video methods are about gathering dynamic and audiovisual data, then one might expect that the research questions would involve both process (dynamic) and things that are audible (audio) and observable (video). For example, here are a few research questions from recent video-based publications in organization studies:
When physicians hand off patients at shift changes, how do they know when they have transferred enough information? (LeBaron et al., 2016)
When reinsurance traders execute deals to fill their strategic portfolios, how is their strategic work accomplished through an orchestration of material, bodily and discursive resources? (Jarzabkowski et al., 2015)
How do the emotional displays of top management team members shape their strategizing process during meetings? (Liu & Maitlis, 2014)
How do laughter patterns among teams affect their communication effectiveness? (Wang, Doucet, Waller, Sanders, & Phillips, 2016)
How is creative work coordinated in groups of modern dancers? (Harrison & Rouse, 2014)
Notice that each of these research questions relates to how members of organizations behave or interact (dynamic process) during face-to-face engagements (audiovisual). Video methods capture details of both sights and sounds, both verbal and nonverbal behavior, which can help to answer many research questions. Some of these research questions may be inductive. For example, the Harrison and Rouse (2014) article shows how researchers can use video data in conjunction with grounded theory. Indeed, given that a key strength of inductive field methods is their ability to capture realism through context-rich data (McGrath, 1981; Pratt & Bonaccio, 2016), video methods may be used to bolster data collection efforts. Other research questions may be more deductive: For example, Bartel and Saavedra (2000) used video recordings to show not only that work group moods may be manifested behaviorally but also how.
If you decide that video methods are right for your research interests and questions, you will then need to ask the second question: “How do I go about designing and implementing a study using video methods?” The opportunities of video methods come with potential pitfalls. Too often, researchers fail to appreciate that their most basic cinematic decisions constitute theories about organizations and members and how these should be studied. As Christianson (2018) points out in her review of video research in top management journals, extant work using video recordings has rarely discussed camera placement as a theoretical choice. We explicitly assert that researchers make theoretical decisions when they locate, point, and begin recording with a camera. And by using a camera to frame, focus, or crop a particular scene, researchers have already begun to analyze behavior and activity in progress. In this way, video recording is both empirical and interpretive work as researchers help to construct the very objects of their analyses. With secondary data—that is, when researchers use video data that someone else has recorded—the past intentions of the operator may be imposed on the present researcher who is unaware. Researchers using photographic methods (Ray & Smith, 2012; Warren, 2002, 2005) have emphasized similar considerations, arguing that “a photograph is not taken, but rather is made” (Ray & Smith, 2012, p. 290). Scholars planning to use video data, either inductively or deductively, should be thoughtful and deliberate in deciding where to place their camera and what to include (and exclude).
Organization scholars also need to be aware of underlying assumptions associated with various methods available. For example, theories with roots in psychology (e.g., attribution theory) may be incompatible with video data and/or methods informed by anthropology (e.g., microethnography). The practices that constitute various methods (recording, archiving, transcribing, analyzing, interpreting, triangulating, coding, counting, reporting, etc.) come imbued with the ontological, epistemological, and practical assumptions of their parent disciplines. Hence, a “one-size-fits-all” approach, analytic “borrowing,” or citing precedents from other disciplines will not be sufficient for an inherently interdisciplinary field such as management and organization studies without clear guidance about which methods apply to which types of research questions and disciplinary approaches.
In the paragraphs that follow, we provide more information about the why and the how of video methods. Specifically, we describe four key features of video data that both enable and challenge social scientists: multimodality, embodiment, materiality, and sequence. These features are interrelated and overlapping, but each has a distinct emphasis and each represents reasons for using video methods and guidance about how to proceed. We provide a handful of references to our own work and to work in other fields where video has been used more extensively than in the field of organization studies. We encourage researchers to be aware of these features, especially the ontological import of these features, as they develop their research designs, taking advantage of the opportunities while avoiding the pitfalls of video-based work. This section ends with a succinct description (see Table 1) of various cinematic decisions and the ontological issues in play.
Table 1. Cinematic Decisions Involve Ontological Assumptions.
Multimodality
The term multimodality has been used to describe different kinds of organizational activity. For several decades, experts in business operations have used the term multimodal to describe the movement of goods and products and people via two or more modes of transport: ships, planes, trains, trucks, drones, and so forth (e.g., Hammadi & Ksouri, 2014). More recently, experts in linguistics and semiotics have used the term multimodal when analyzing the audiovisual ensembles of our digital age, especially the texts and artifacts that emerge from organizations, such as advertisements on television and memes on the Internet that combine language, image, music, sound, texture, architecture, gesture, and so forth (Kress & van Leeuwen, 2001; Norris, 2004). In contrast, we use the term multimodal with regard to analyses of organizational behavior that can be captured on video recordings. We focus on language use and embodied interaction within the material contexts of organizations (Heath, Hindmarsh, & Luff, 2010; Stivers & Sidnell, 2005; Streeck, Goodwin, & LeBaron, 2011).
As we have noted, an obvious feature of video is that it combines both audible and visible channels, enabling people to both watch and listen to the behavior and activity of organizational members. Sometimes analysts deliberately parse these channels so that they can examine either the audio or the video in isolation. Video editing tools that are now popular and installed on most computers (e.g., Adobe Premiere and iMovie) make parsing easy, with separate on-screen displays for the video and audio tracks. Historically, psychologists have been particularly interested in visible behavior or so-called “nonverbal” communication as a form of emotional leakage that provides a window into the mind, such as Ekman’s (1973) famous studies of facial expressions and micro-expressions. Naturally, linguists have been particularly interested in spoken discourse, including the meaning and function of verbal behavior, such as Schiffrin’s (1988) well-known research on discourse markers. When these researchers parsed their subjects’ verbal and nonverbal behaviors, they proceeded in accordance with the ontological assumptions of their respective fields. They assumed that by isolating and analyzing particular audible or visible behaviors they could better understand component parts of human activity.
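The same parsing can be done outside a graphical editor. As a minimal sketch, assuming the free command-line tool ffmpeg is installed, the Python function below builds (but does not run) one command that keeps only the visible channel and one that keeps only the audible channel. The function name and the file name `handoff_01.mp4` are hypothetical and chosen only for illustration.

```python
import shlex
from pathlib import Path


def split_channel_commands(recording: str) -> dict[str, list[str]]:
    """Build ffmpeg commands that parse one recording into separate channels.

    Hypothetical helper for illustration: it only constructs the commands,
    so the analyst can inspect them before running anything.
    """
    src = Path(recording)
    video_only = src.with_name(f"{src.stem}_video_only{src.suffix}")
    audio_only = src.with_name(f"{src.stem}_audio_only.wav")
    return {
        # -an drops the audio stream; -c:v copy keeps the video unaltered
        "video": ["ffmpeg", "-i", str(src), "-an", "-c:v", "copy", str(video_only)],
        # -vn drops the video stream; the .wav extension selects the audio format
        "audio": ["ffmpeg", "-i", str(src), "-vn", str(audio_only)],
    }


if __name__ == "__main__":
    # Print the two commands for a hypothetical recording of a patient handoff.
    for channel, command in split_channel_commands("handoff_01.mp4").items():
        print(channel, "->", shlex.join(command))
```

Each command writes a new file and leaves the original recording untouched, so the combined audiovisual record remains available for later analysis.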
However, researchers who are working within other scholarly traditions may regard the artificial parsing of visible and audible behavior as ontologically problematic. For example, anthropologist Mead (1975) rejected nonverbal research as a “discipline-centric” neglect of vocal phenomena: She argued against Ekman’s (1973) claim that facial expressions have universal meanings, insisting that members of cultures derive meaning from facial expressions by relating them to the context in which they occur, which includes vocal behavior. Similarly, linguistic anthropologists have been critical of discourse analysts who ignore visible behaviors that may provide for the interpretation of language in use (Duranti, 1997). Thus, researchers who analyze video data need to proceed in ways that are consistent with their ontological assumptions about human and organizational activity.
Hypothetically, we can imagine different relationships between audible behaviors (such as those that might appear in a transcript) and visible behaviors that are co-occurring (such as those that might appear in still frames of video or “frame grabs”). At one extreme is the possibility that the audible and visible behaviors are redundant or mutually confirming—that is, they provide for the same interpretation or analytic finding. With this possibility, an analyst could examine either the audio, or the video, or both, and evidence would all point toward the same conclusion. Such alignment might help to strengthen research claims by enabling the analyst to triangulate across modalities. For example, Toraldo, Islam, and Mangia (2018) discuss and demonstrate the challenge of “elusive knowledges”—tacit, aesthetic, and embodied aspects of organizational life that are not well captured by traditional research methods. The authors use video data of a professional volleyball team to triangulate and thereby strengthen their research findings.
A different possibility, at the other extreme, is that the audible and visible behaviors are contradictory or that they provide for contrasting interpretations and findings, as when visible behaviors contradict what the transcription seems to indicate. For example, Jones and LeBaron (2002) showed that acts of leadership within a therapy group were visibly prompted by the therapists, despite the therapists’ spoken assertions that leadership was an individual achievement and a sign of an individual’s rehabilitation. In this way, a contradiction between audible and visible behavior went to the heart of this therapy program, and if the analysts had examined only a transcript of the therapy group’s dialogue they would have missed critical information about the organization’s practices. A video recording was indispensable to the success of the research project. Of course, the researchers did not know in advance of their recordings and analyses what they would find in their video data, so gathering video recordings of organizational activity from the outset, and not just audio, may be a prudent part of a research design.
In sum, multimodality is a key feature of video recordings and might be an indispensable part of a research project. Organizational activity is ontologically complex as people engage through a variety of modalities that must be altogether orchestrated: Talk, text, pictures, drawings, gestures, facial expressions, embodied maneuvers, and more can be recorded and analyzed. Some research interests and questions may require analysts to parse audible and visible channels so that they can be studied in isolation, but researchers need to be careful that they do not destroy the very phenomena that they want to examine. Especially during face-to-face interaction, visible and audible behaviors are made meaningful through their co-occurrence or coordination, which provides for their mutual performance and interpretation. In the paragraphs that follow, we provide a more nuanced consideration of key features of video data and multimodality.
Embodiment
When people gather for the purpose of manual labor, knowledge work, or creative effort, their visible, palpable bodies are inseparable from their activities. Indeed, not much can happen within a conference room or workspace until people walk through the door, locate their bodies (standing or sitting), continually turn in one direction or another, assume various postures and points of view, make facial expressions and hand gestures—all in close coordination with what people are saying and deciding. Although the human body is absent or only implied in most organization research, video recordings can capture the body at the center of social interaction and organizational work.
The forms and functions of a typical video camera have an ontological kinship to the physical attributes of the human body. Obviously, the audiovisual register of video recordings relates to the sensory abilities of our bodies to both see and hear (multimodality). At the same time, just as the placement and orientation of our bodies determine what we see and hear, so too must video cameras be placed and pointed to determine what visual and auditory data will be recorded. People and cameras tend to “see” whatever happens in the direction that they are “looking” (orientation); they tend to “hear” things that are said nearby (location); and their location and orientation may change for the purpose of improving their seeing and hearing. Thus, video cameras may serve as a prosthesis for the forms and functions of the human body, enabling researchers to see, to hear, to take up a position, and to have a point of view. By recognizing the ontological kinship between people and their cameras, researchers can design projects and collect data that are consistent with their underlying assumptions about the nature of being human within organizations.
A key distinction in video-based research is whether the data provide an “insider” or “outsider” perspective. When researchers want to have an insider view of organizational activity, with more access to the lived experiences and situated meanings of participants, they may plant a camera to be positioned and pointed like a coparticipant. For example, if members are gathered around a conference table, the camera might also “take a seat at the table” and be pointed in the same direction that people are looking. In some situations, a researcher might want to put a camera into the hands of a subject, who would then make ongoing decisions about where to put it and point it. The subject’s body might never actually appear in the video recording, but the data would embody that particular insider’s perspective. Scholars using photographic methods have found this insider perspective to be valuable (Ray & Smith, 2012; Warren, 2002), especially among marginalized groups (Warren, 2005). Video can likewise provide rich access to an insider’s perspective. For example, Whiting, Symon, Roby, and Chamakiotis (2018) demonstrate an insider’s perspective as they examine the dynamic relationship between researcher, participant, and video in a participatory video study. Also, Zundel, MacIntosh, and Mackay (2018) demonstrate the methodological affordances and limitations of video diaries, which provide an insider’s perspective.
In contrast, researchers who want to get an outsider perspective of organizational activity can put their cameras outside the participation framework (Goodwin, 2007) of their subjects. For example, instead of taking a seat at the conference table, the camera might hang from the ceiling or wall, able to capture details of human interaction such as spatial maneuvers, hand gestures, facial expressions, and other behaviors that may escape the conscious notice of the participants themselves. Hindmarsh and Llewellyn (2018) demonstrate an outsider’s perspective as they mount their camera on a wall that overlooks activity at an art museum. Thus, researchers need to design their research projects and operate their recording equipment in ways that are consistent with the analytic perspective that they seek.
By far, most video-based research assumes an outsider’s perspective as analysts examine the elements and patterns of people at work—including their spatial maneuvers, which create relative distance, orientation, and posture among participants. When people want to work together, they may literally decrease the distance between themselves, making it possible to see and hear each other, at the same time showing to other people that they are already engaged or involved (Streeck et al., 2011). Sometimes people orient toward each other, making each other a mutual and ongoing object of attention (Goffman, 1961; Goodwin, 2000; Kendon, 1990); other times they turn away from each other and toward some other object of joint attention (Goodwin, 2000). Postural shifts may be used to alternate the intensity of their engagement and to demarcate the unfolding phases of their involvement (Scheflen, 1964). We emphasize the relativity of embodied behaviors (LeBaron, 2005) and especially spatial maneuvers: The movements of one person may have consequences for the spatial relationships of all, such as when two people are made proximate after only one of them moves. Video recordings capture the details of people’s spatial maneuvers, even when such maneuvers escape the conscious awareness of the participants themselves.
In addition to spatial maneuvers, people may also employ a host of sometimes subtle displays, such as facial expressions, eye gaze, and hand gestures. For example, in a video-based study of a top management team, Liu and Maitlis (2014) found that the participants’ emotional displays, especially their facial expressions, shaped the strategic conversations and decisions of the organization. LeBaron et al. (2016) analyzed video recordings of patient handoffs between attending physicians, which showed that transitions from one handoff to the next were signaled through subtle shifts in eye gaze. In recent decades, hand gestures have become a focus of video-based research within a variety of disciplines (e.g., McNeill, 2005), including organization studies (e.g., Gylfe, Franck, LeBaron, & Mantere, 2016). Gestures do much more than punctuate speech—they attract and direct attention, moving within three-dimensional space, thereby having an affinity with a material world within reach (LeBaron & Streeck, 2000; Streeck, 2009). In the words of Goodwin (2007), gestures are “environmentally coupled.” With this acknowledgment that the human body is, fundamentally, a dynamic material entity, we now turn our attention to materiality as another important feature of video data.
Materiality
The focus of video-based research has evolved over the years. Initially, researchers from various disciplines emphasized the multimodality and embodiment of everyday interaction (e.g., Goodwin, 1979, 1980), using video to capture and analyze the details of what Goffman (1982) called the “interaction order” of social life. Gradually, research interests and questions turned toward particular settings, especially places of work and expertise (e.g., Heath, 1986) where physical objects, artifacts, tools, technologies, and representations abound (e.g., Lynch & Woolgar, 1988). Suchman’s (1987) study of copy machines and technicians was especially influential: She showed how the use of material objects and technologies was an ongoing, situated, and contingent accomplishment, subject to the unfolding and moment-to-moment interpretations of professionals. Video recordings became the staple of a growing program of research called “workplace studies” (e.g., Luff, Hindmarsh, & Heath, 2000), which examines embodied interaction within complex material and technical environments.
Within the field of organization studies, attention to materiality has been increasing. Dameron, Lê, and LeBaron (2015) discussed three contrasting views of materiality that are active within the field of organization studies (see also Lê & Spee, 2015):
Object focus (weak): Researchers foreground objects and how their properties impact behavior. Objects are seen as relatively passive and enduring. This view regards materiality as mere physicality. For example, Morgeson and Humphrey (2006).
Object and subject (moderate): Researchers regard objects and their social contexts as distinct and separable, but also mutually dependent. The focus is on the possibilities for action that materiality affords. For example, Leonardi and Barley (2010).
Entanglement (strong): Researchers regard the social and material as entangled and inseparable. Material objects, artifacts, tools, technologies, and so forth cannot be understood outside their social context. Materiality may be seen as performance rather than substance. For example, Orlikowski and Scott (2008).
Video recordings can be useful to researchers who have any one of these views, because video data capture the social and the material as mutually embedded—it is analysts who sometimes parse one from the other. Early research on materiality within organizations effectively used ethnographic methods (e.g., Bechky, 2003; Carlile, 2002; Orlikowski, 2002) to observe material objects within unfolding work practices. But we recommend video methods, which can augment traditional ethnographic observation by creating a permanent record that can be closely and repeatedly analyzed, allowing new insights to emerge.
One challenge of video-based research is deciding what is important and worthy of analysis. Video data are so rich with social and material phenomena that analysts are sometimes overwhelmed by the endless possibilities for digging in. Based on our experience, we offer this advice: Do not try to analyze everything because you cannot. Rather, focus on those features or phenomena that directly relate to your research questions and findings. Too often, analysts are distracted by novel or intriguing details that are outside the scope of their study. Experimental researchers deliberately limit or control the social and material aspects of their video data in advance of their recording (e.g., Congdon, Novack, & Goldin-Meadow, 2018), which helps to narrow the scope of their analysis. Ethnographers and ethnomethodologists—whose research questions may emerge only through their analyses of video data already collected—can focus on what the research subjects themselves show to be important. For example, Hindmarsh and Llewellyn (2018) study ticket purchases at an art gallery, which is a kind of knowledge work at the boundary of an organization. Consistent with the ontological assumptions of conversation analysis, the authors deliberately limit their analysis to the social and material phenomena that the participants show to be important in the moment. Thus, researchers must focus their analysis of video data in ways that are consistent with the ontological assumptions of their research tradition.
Sequence
Another key feature of video is its sequential organization of audible and visible phenomena. A couple of decades ago, video was a linear technology because recordings were put onto tapes: If an analyst wanted to go from minute 3 to minute 5 of a videotape, it was necessary to play or fast forward through minute 4. Today, video usually takes the form of a digital media file, which is a nonlinear technology: An analyst can instantly jump from minute 3 to minute 5 without traversing the stuff between. But notice that video players continue to look like a linear technology, with traditional buttons for play, fast forward, and rewind. The reason is that sequence is ontologically fundamental to how people experience and make sense of their world. Organizational activity unfolds through time and space, and a particular behavior is made meaningful largely through its temporal and spatial location within unfolding sequences of activity.
First, we consider the importance of sequence for understanding discourse. Using basic rules of grammar, people organize the words of their written and spoken sentences (e.g., subject-verb-object) so that participants can understand each other. Sociolinguists, who are committed to the study of language use beyond the boundaries of sentences, have shown that the sequence of our utterances is essential to our ability to participate in everyday conversations (Thompson, Fox, & Couper-Kuhlen, 2015). Some of the most rigorous video-based research on organizational behavior and activity is being done by conversation analysts from a variety of disciplines. Conversation analysts have shown convincingly that speech acts (Austin, 1962) are not necessarily performed by single utterances, but are instead accomplished across pairs or sets of utterances (Schegloff, 2007). For example, a question does not function as a question until it is answered; an invitation does not do inviting until it is accepted or rejected; a “question” may become an insult, if that’s how the next speaker responds; and so forth. In other words, the meanings and doings of our utterances may largely depend on their location within and relationship to prior and subsequent utterances within unfolding sequences of interaction.
Second, we emphasize that embodied behaviors constitute sequences of visible activity that video recordings can capture (e.g., Kendon, 1990). The conversational floor is a scarce resource if only one person is talking at a time, and visible behaviors enable people to participate continually, albeit quietly. Although visible behavior is often silent, it is not secondary to talk and should not be overlooked by analysts. Embodied actions are often first in the order of things that happen in a meeting, as when the participants enter the room, find a place to stand or sit, and then turn toward each other or some other object of attention before anyone has said a word. While embodied behaviors must be coordinated with each other in a sequence, they are—at the same time—coordinated with co-occurring sequences of talk. For example, Koschmann, LeBaron, Goodwin, and Feltovich (2011) analyzed a video recording of a surgical team at a teaching hospital: Through the sequential unfolding of their conversation, the participants were able to distinguish between the instructional work and instrumental actions of their hands. Deictic words such as “this” and “that” invite hearers to observe the location, orientation, and stance of the speaker’s body, which immediately makes embodied behavior relevant to what people are in the process of saying. Video recordings, of course, can easily capture the complexity of interaction sequences and their coordination.
Video data help to highlight the ontological importance of sequence (temporal and spatial) for people’s experiences and behaviors within organizations. When researchers are collecting video data within an organization, we recommend that they avoid turning their camera off and then on again, because this disrupts the sequential organization of activity. Similarly, we caution researchers against using television shows or documentaries as raw data for research and analysis, because such recordings are usually edited in ways that disrupt the natural sequence of audible and visible phenomena (Wieder, Mau, & Nicholas, 2007).
In summary, we have provided answers to a couple of fundamental questions about video methods. First, why would researchers want to use video data in a research project? Second, how can researchers collect good video data that are consistent with their underlying assumptions about the nature of human behavior and organizational activity? In response, we have identified four interrelated features of video data that are especially useful: multimodality, embodiment, materiality, and sequence. We emphasize that video technology has an ontological kinship with the human body, which talks and moves within time and space in ways that constitute organizational activity. A video camera is a prosthetic device that enables analysts to better hear, see, and examine the empirical details of human interaction (dynamic process) during face-to-face, organizational activities (audiovisual). Video data are rich and may stand alone as evidence for research findings (e.g., experimental studies), or they can also be used to augment more traditional methods of observation (e.g., ethnography or even case studies). However, we caution researchers to be careful in their research design because even the most basic cinematic decisions have fundamental and ontological consequences (see Table 1). Having addressed the why and how questions, we conclude with a critical discussion of video methods in organization studies: What issues have been addressed, what questions remain unanswered, and how can researchers improve video methods going forward?
Video Methods: Taking Stock
Our purpose in this article and special issue is to address the potential of video methods and highlight some of their challenges. In particular, we are excited about the promise of abundant video data, available both in the public domain as secondary data and through primary data collection, which is facilitated by the increasing convenience and accessibility of video-recording technologies and by greater social acceptance of video use, for example on social media. Essentially, video is increasingly part of the everyday life of organizations. It is available on corporate websites as a means of projecting the organization to the external world, communicating with shareholders and investors, and selling to customers and potential employees. Video is equally ubiquitous in projecting messages inside the organization, from video conferencing between distributed team members to video broadcasts by top managers in the central headquarters of multinationals to regional managers, the use of video for training and development, and various uses for culture building. Thus, organizations are using video extensively.
Yet, as we reflect on the corpus of papers initially submitted to the special issue, especially the select articles that made it through to publication, and as we reflect on our suggestions (above) about the possibilities of video, we think it is time to critically assess video methods in management and organization studies. We hope our critique will serve as a call to arms: While organizations are using video to conduct their internal work practices and communicate with multiple stakeholders, relatively few scholars in management and organization studies are focused on why and how video is used in organizations. Our critique is threefold: (a) that we have not gone far enough in accessing the rich sources of video data, (b) that we have yet to exploit and further develop possible methods for analyzing video, and (c) that we still need to demonstrate why video methods matter theoretically. We now address each of these issues in turn.
First, as the review article by Christianson in this special issue shows, only 8.6% of all articles in six of the leading management and organization journals over a 25-year period even mention the word video, and of these a mere 0.6% actually incorporate video as central to their research design. Clearly, video as a research method is in its infancy, at least within the field of management and organization studies. Given that secondary video data are often available on the websites of organizations—along with other forms of secondary data such as annual reports and press articles—it is somewhat surprising that scholars are not making greater use of such data. Of course, secondary video data have issues, such as how the data were collected and for what purpose (see Table 1 for a description of ontological issues associated with basic cinematic decisions). However, these are issues with any secondary data (Cowton, 1998; Stewart & Kamins, 1993), and video at least counteracts some of these limitations through phenomena of multimodality, embodiment, materiality, and sequence, which can help to make retrospective and remote accounts more vivid, bringing some of the actors to life.
Issues of access may be a big concern for some researchers and a major reason that so few publications involve primary video data, collected expressly to answer a specific research question at the center of a research design. Organizations must protect their proprietary information, for both legal and financial reasons, and video recordings have the potential to capture and compromise confidential matters. Many organizations have security desks and protocols to prevent strangers from wandering around the building and gleaning valuable information. Although all forms of data collection have a possibility of leakage, video can capture distinguishing and sometimes proprietary designs and architectural layouts in fuller detail than written notes can. Indeed, a doctoral student of the third author was barred from taking even still images of a co-working space because members of the executive team did not want anyone copying their unique layouts. Leaders of organizations may need strong and binding assurances that their participation in a research project will not inadvertently disclose organizational secrets.
However, we assert that video access may be easier than people think, and that access is becoming less of a problem. We have successfully negotiated access to a variety of confidential and sensitive sites, such as the interrogation rooms of police stations (LeBaron & Streeck, 1997) and the operating rooms of hospitals (e.g., Koschmann et al., 2011). Recent studies show remarkable levels of access to commercially confidential sites (e.g., Jarzabkowski et al., 2015; LeBaron, Glenn, & Thompson, 2009; Liu & Maitlis, 2014), including the articles in this special issue.
We recommend three tactics that have helped us to overcome the objections of organizations to video-based research. First, be flexible about storage. Sometimes organizations are primarily concerned about video data being distributed or broadcast. When the data are securely stored, under lock and key, within a room that only administrators and researchers can access, the organization’s concerns may be alleviated. Second, consider a time capsule. When video data are time-sensitive, as with a 5-year strategic plan, video access may be possible, on condition that the data not be used or analyzed for a period of months or years. Third, seek a “win-win” agreement. Your primary and secondary research findings might be valuable to the organization you are studying. Sometimes organizations are eager to grant video access, in exchange for full access to research findings.
In addition to permission from the organization, researchers usually need permission from the individual research subjects. Some people simply don’t like to have their picture taken. In rich audiovisual detail, video seems to “catch people in the very act” of being themselves, including relatively private moments and mistakes. Nonetheless, researchers have managed to obtain informed consent from people in extremely sensitive situations (e.g., group therapy for sex offenders; MacMartin & LeBaron, 2006), and people dealing with sensitive personal issues (Mengis, Nicolini, & Gorli, 2018). Sometimes, research subjects may be willing to invest their own time and energy in a recording project. For example, Zundel et al. (2018) examine samples from 28 video diaries, recorded by managers in an engineering firm, over a period of 3 months. In addition to taking a lot of time, the diaries included relatively personal information related to three broad topics in organizational research: bodily expression, identity, and practice.
Informed consent from subjects may require adjustments to the research design so that the participants are protected at the same time that research objectives are achieved. People are often concerned about their privacy, wanting to remain anonymous if they are going to participate in a research project. We recommend three techniques that have helped us to reassure subjects by protecting their privacy. First is masking: All of the visible features of a video recording, including the faces of research subjects, can be altered or made unrecognizable. Video editing software enables faces to be reshaped, discolored, blurred, or blotted. Second is vocal disguise: Audio editing software enables the volume, pitch, and/or speed of a subject’s voice to be changed, making it unrecognizable. Third is deidentification: The audio track of a video recording can be scrubbed to remove identifiers, such as the names of people and places. These techniques can be combined and varied so that the privacy of participants is protected while the research objectives are achieved. For example, when Beach and LeBaron (2002) analyzed a doctor-patient consultation, they blurred the patient’s face to mask her identity—except for the patient’s eyes, which were not blurred because eye gaze was a particular focus of the analysis.
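As a minimal sketch of the deidentification step described above, the following Python function locates the time spans in a recording where listed identifiers are spoken, so that those spans can later be muted or bleeped in audio editing software. It rests on assumptions that are not part of the original discussion: that a word-level, time-stamped transcript is already available, that identifiers are single words, and that the function name `find_mute_intervals` and the sample data are purely hypothetical illustrations. It does not edit audio itself.

```python
def find_mute_intervals(words, identifiers):
    """Return (start, end) spans, in seconds, where a listed identifier is spoken.

    `words` is a list of (start, end, word) tuples from a time-stamped
    transcript; `identifiers` is a list of single-word names to scrub.
    Both formats are assumptions made for this illustration.
    """
    # Compare case-insensitively and ignore trailing punctuation.
    targets = {name.lower() for name in identifiers}
    return [
        (start, end)
        for start, end, word in words
        if word.strip(".,!?;:").lower() in targets
    ]


# Hypothetical transcript fragment, invented for illustration.
words = [
    (0.0, 0.4, "Hello"),
    (0.4, 0.9, "Maria,"),
    (0.9, 1.2, "welcome"),
    (1.2, 1.5, "to"),
    (1.5, 2.1, "Lakeside."),
]
print(find_mute_intervals(words, ["Maria", "Lakeside"]))
# -> [(0.4, 0.9), (1.5, 2.1)]
```

A list of spans like this would still need to be reviewed by a researcher, because automated matching can miss nicknames, multiword place names, and indirect identifiers that only someone familiar with the setting would recognize.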
Most universities have institutional review boards (IRBs) to help protect organizations and human subjects from negative consequences in academic research. When video access and/or permission seem difficult, we encourage researchers to work with their IRB in adjusting their research design and/or modifying their video data in ways that overcome the objections of organizations and address the concerns of research subjects, while still meeting research objectives. Are video data worth the hassle? Solutions such as storing data remotely and masking people’s faces may be inconvenient, but they might also make a video-based project possible. If researchers want to analyze dynamic and audiovisual data, then video methods are a singular option. When multimodality, embodiment, materiality, and/or sequence are not a specific focus of analytic attention, researchers may opt for audio recordings, interviews, surveys, or some other method that makes access easier or less complicated.
Because they are relatively new to the field of management and organization studies, video methods face many of the challenges that all underutilized methods face. To begin, absent requisite training in a graduate program or in other venues, researchers may not have the skills or even the confidence to conduct video-based studies. In addition, it may be difficult to find editors or reviewers who are comfortable evaluating studies that use video data. Fortunately, as our special issue highlights, an increasing number of scholars are using this method and thus can serve in these evaluator roles. We also hope that the articles in this special issue will help to surmount some of these obstacles to video methods, which have led to the underuse of a readily available and vivid source of data on organizational phenomena. But beyond simply a liability of newness, video methods also suffer from book and journal formats that are often stuck in a “print” mentality. Indeed, one cannot “show” video in a hardcopy book or journal. Although most journals now have an electronic presence as well, relatively few journals (e.g., Academy of Management Discoveries) allow different modalities to be presented in their publications (LeBaron, 2017). We hope that more journals will allow the presentation of multiple types of data in the future.
In a related vein, our field lacks templates about the types of studies for which video methods may be best utilized. Our special issue demonstrates the broad scope of possible methods for video analysis. To illustrate:
Congdon et al. (2018) explain how analysis of gesture provides critical insights into nonverbal aspects of knowledge, cognitive patterns, and psychological “readiness-to-learn” that are critical in learning and development, and that shape the communicative context. Their explanation is important because it shows us how the study of videotaped gesture can take us beyond the analysis of verbal reasoning in explaining cognitive processes and adds to our understanding of communication beyond spoken words. In particular, gesture does not always mirror language, providing insights into mismatches between what people say and what they may believe, even when such mismatches are not apparent to the speakers themselves.
Waller and Kaplan (2018) address four key decisions that researchers need to consider when using video recordings to generate quantitative data. Drawing on examples from their own work as well as others’, the authors make helpful recommendations about choosing a data collection site, selecting and using a coding scheme, and undertaking statistical analysis of the data derived from video. They also identify tools that can aid in the coding and analysis process, many of which have received little attention in management research, such as linguistic programs that can analyze speech patterns (e.g., pitch, intensity, voice breaks, etc.).
Heath and Luff (2018) take a more qualitative approach to video analysis by discussing and demonstrating how “quasi-naturalistic” experiments have informed workplace studies in the fields of sociology and anthropology. Quasi-naturalistic experiments, like Garfinkel’s (1963, 1967) famous breaching experiments, are focused on forms of action or interaction that emerge when there are changes to an activity or context. These experiments are not intended to provide causal evidence or tests of existing theories; rather, they are exploratory, providing researchers with phenomena and findings that can be investigated through a larger program of research. The authors recommend these methods as a way to jumpstart new perspectives and insights within the fields of management and organization studies.
Slutskaya et al. (2018) propose to improve the practical impact of ethnographic work through the benefits of collaborative ethnographic documentary. This video method enables researchers to give “voice” to their informants, especially those in low status occupations; and it enables analysts to present their research findings in a way that drives managers to action.
These four articles deal with phenomena of specific interest to management scholars, such as teamwork, workplace interactions, and the cognitive processes of learning and communication. More germane to our interests in video methods, these pieces provide formalized techniques for video analysis that organizational researchers may want to adopt.
One challenge for management scholars is to immerse themselves in some of these techniques, as pertinent to their research questions, and thereby advance their study of management problems. For example, strategic management and organization theory have long been concerned with the way mental models and cognitive maps shape the behavior of actors across an industry, using cognitive processes to explain a range of outcomes from strategic groups (Porac, Thomas, & Baden-Fuller, 1989) to predispositions to strategic change or inertia (e.g., Huff, 1982; Johnson, 1987). Imagine how such studies could be expanded by the examination of gesture as an indication of cognitive processes (e.g., Congdon et al., 2018), illuminating both knowledge and beliefs that are not apparent to the participants themselves, as well as highlighting mismatches between the verbal and nonverbal cognitive processes that might further explain cognitive barriers to the pursuit of organizational goals.
Going beyond this, we suggest that organization scholars need to develop their own video methods. This special issue already provides some compelling new ways of thinking about collecting video data, according to who holds the camera (Whiting et al., 2018), how the lens is positioned (e.g., Mengis et al., 2018), and what constitutes the object of focus (e.g., Jarrett & Liu, 2018).
Whiting et al. (2018) use a paradox framework to explore the roles of participant and researcher in a relatively new form of video methods: participatory video studies. In participatory video studies, research participants control the camera, with varying levels of supervision from the researcher. Drawing on their own study of work-life boundaries in the digital age, Whiting et al. explore the enactment of participation-observation and intimacy-distance by both participants and researchers, showing how these tensions are produced and managed in the moment of performance. Their work has important methodological implications for how scholars use video data collected by participants.
Mengis et al. (2018) examine the important role that space plays in creating video data—as they note, these seemingly technical issues have important methodological relevance in video methods. Using data collected from an outpatient emergency room, they demonstrate four potential views of space based on camera angle, focus of the shot (panoramic vs. focused), and movement (stationary vs. moving camera). Each of these foregrounds particular elements of space while placing others in the background. Importantly, they find that the traditional perspectives used in video data may be partially responsible for static, inert understandings of space in previous research.
Jarrett and Liu (2018) begin with a review of literature on video ethnography, which has focused on examining patterns of micro-interactions (“zooming in”) as well as how these patterns evolve and endure (“zooming out”). They then add a new contribution that they call “zooming with”: Rather than the analytical focus being on observed behaviors and patterns, zooming with incorporates participants’ perspectives of what they were feeling and thinking during key moments in a video recording. This method gives voice to the participants and at the same time gives researchers insight into their subjects’ “motivations, intentions, and experienced emotions.” The authors demonstrate the value of this method in a study of a top management team undergoing strategic and structural changes.
In addition to methods of data collection, these articles provide some initial insights into analyses of video data that are produced in different ways (e.g., Toraldo et al., 2018; Whiting et al., 2018; Zundel et al., 2018). However, our field needs to go further. Management scholars have the luxury of not being bounded by a specific disciplinary approach (Agarwal & Hoetker, 2007), allowing scholars to import different methods and adapt them to the phenomena of interest. Similarly, we may generate new methods of analyzing video data that borrow from different disciplinary approaches, such as sociology, ethnomethodology, and psychology, as demonstrated by Liu and Maitlis (2014), who analyzed the emotional displays of a management team with regard to their organization’s different strategy processes. While such borrowing runs the risk of ontological inconsistency, as discussed in this article, it might also open new avenues for video methods in organizational research.
Third, the novelty of video in organization and management studies is not a theoretical validation in and of itself. That is, we need to work on robust reasons to use video data, and generate clarity about what difference the analysis of video data makes to existing theoretical understandings. For example, simply showing that nonverbal communication is also part of group interactions, without furthering understanding about how groups interact, is not sufficient for a contribution. This was perhaps one of the most challenging aspects of our special issue: helping authors to show why video made a difference and to what—as well as puzzling about that ourselves. While it was clear that video could shed light on different phenomena within a workplace (e.g., Jarrett & Liu, 2018; Toraldo et al., 2018), or open up the way we conceptualized workspaces (e.g., Mengis et al., 2018), or invite participant reflexivity (e.g., Zundel et al., 2018), it was harder to substantiate why that difference mattered theoretically.
We recognize that our special issue is about research methods—specifically video methods—not what video methods contribute to a particular body of theory. Nonetheless, Christianson’s (2018) review of video use in leading management journals found that most articles tend to rely on what subjects say, supplemented by the subjects’ facial expressions, which shies away from the full potential of video to demonstrate how multimodality, embodiment, materiality, and sequence can help to extend our theoretical understandings. Just as Congdon et al. have shown how the study of gesture can extend our theoretical understanding of cognitive processing and communication beyond the spoken word, so must management scholars show how video data and methods can take organization theory beyond its existing boundaries. This remains a key challenge going forward.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
