Abstract
Many phenomena of interest to management and psychology scholars are dynamic and change over time. One of the primary impediments to the examination of dynamic phenomena has been challenges associated with collecting data at a sufficient frequency and duration to accurately model such changes. Emerging technologies that produce nearly continuous streams of big data offer great promise to address those challenges; however, they introduce new methodological challenges and construct validity concerns. We seek to integrate the emerging big data technologies into the existing repertoire of measurement techniques and advance an iterative process to enhance their measurement fit. First, we provide an overview of dynamic constructs and temporal frameworks, highlighting their measurement implications. Second, we discuss different data streams and feature emerging technologies that leverage big data as a means to index dynamic constructs. Third, we integrate the previous sections and advance an iterative approach to achieving measurement fit, highlighting factors that make some measurement choices more suitable and viable than others. In so doing, we hope to accelerate the advancement of dynamic theories and methods.
Many phenomena of interest to management and psychology scholars are dynamic (Langley, Smallman, Tsoukas, & Van de Ven, 2013). Dynamic phenomena can be found at the individual, team, and organizational levels in varying manifestations as they emerge and change over time. Although these phenomena are theorized to change over time, they are often empirically examined using “snapshots” of the phenomena in the form of survey data collected at a single point in time. As a result, research on dynamic phenomena has been largely one of statics rather than dynamics (Kozlowski, 2015; McGrath & Tschan, 2007).
One of the primary impediments to the examination of dynamic phenomena has been challenges associated with collecting data at a sufficient frequency and duration to accurately model such changes. Emerging technologies that produce nearly continuous streams of big data offer great promise to liberate researchers from these constraints by producing “unprecedented volume, micro-level detail, and multifaceted richness” (George, Haas, & Pentland, 2014, p. 324) but bring with them a host of new methodological challenges and construct validity concerns. We seek to integrate the emerging technologies into the existing repertoire of measurement techniques and advance an iterative process to enhance their measurement fit (i.e., the degree of alignment between how a construct is conceptualized and measured). More broadly, we feature measurement considerations associated with dynamic phenomena and in so doing, our overall goal is to accelerate the advancement of dynamic theories and methods.
We begin with an overview of dynamic constructs and temporal frameworks (i.e., developmental, episodic, and event-based models) and highlight their measurement implications. Second, we discuss the data streams (i.e., physiological responses, words, behaviors) that represent the elemental contents, or indicators, from which constructs are derived. We then provide illustrative examples of emerging measurement technologies and the big data streams they capture. Consistent with George et al. (2014), we suggest that “big data” is not only large in volume (e.g., terabytes, petabytes), but “the defining parameter of big data is the fine-grained nature of the data itself, thereby shifting the focus away from the number of participants to the granular information about the individual” (p. 321). Discussing emergent technologies in terms of their functions (e.g., collecting, indexing, interpreting) and constituent data streams creates a common language and facilitates identifying similar and unique aspects of newer (e.g., wearable sensors) and more traditional tools (e.g., surveys). This in turn promotes integration of emergent as well as future technologies into the existing repertoire of measurement techniques.
Next, drawing from and expanding current measurement fit considerations, we advance a measurement fitting process involving three core components: (a) construct elements, (b) measurement features, and (c) contextual considerations. We offer guidelines for achieving measurement fit, which highlight factors that make some measurement choices more suitable than others and provide scaffolding for researchers to simultaneously and iteratively consider competing demands. Unlocking insights from big data and ensuring construct validity hinges on researchers’ conceptions concerning what data streams to collect and what makes sense in their context as considerations in a measurement fitting process. The utilization of this iterative approach promotes a measurement refinement process, which improves construct validity. We conclude by discussing implications and future directions for dynamic theories and methods.
Advancing the Theory of Dynamic Constructs
All dynamic constructs share the unifying feature of being theorized to change over time. Metaphorically, examination of dynamic phenomena is about the study of the movie that emerges from stringing together scenes of actions (Mathieu & Luciano, in press). It is the ordering and interconnectedness of those scenes that provide meaning to the story. Scenes are self-contained segments that each have a purpose, a beginning, and an end. And scenes in turn are understood through stringing together numerous frames or snapshots of activity. Notably, the frames of activity are important as without valid and reliable indices of dynamic constructs at any given time, one cannot discern the patterns that emerge and change over time. Therefore, to measure dynamic constructs, it is important to have a temporal theory in mind, which in turn guides decisions about when to assess a construct and what constitutes “frames” of measurement.
We define dynamic constructs as phenomena that change over time, which are generally conceptualized as behavioral processes and emergent psychological states. For example, in the teams domain, Marks, Mathieu, and Zaccaro (2001) defined team processes as “members’ interdependent acts that convert inputs to outcomes through cognitive, verbal, and behavioral activities directed toward organizing task work to achieve collective goals” (p. 357), whereas emergent states are “properties of the team that are typically dynamic in nature and vary as a function of team context, inputs, processes, and outcomes” (p. 357). In this sense, behavioral processes are action verbs and describe behaviors that members exhibit. In contrast, emergent states describe affective and cognitive properties of the team as they exist at any given time. The dynamics refer to the emergence, patterns, and changes of processes and states over time and may exist at multiple levels (e.g., individual, dyad, team, unit, organizations).
Although studies that examine the static state of dynamic constructs are useful for answering many research questions, tests of temporal relationships necessitate longitudinal approaches. In general, these questions are concerned with examining the core of the dynamic phenomenon itself: questions about how and why these constructs emerge and change over time. For example, research on dynamic elements may focus on the initial development of a construct including the process of its emergence, formation, or congealing (Morgeson & Hofmann, 1999). Alternatively, research may focus on the pattern of a phenomenon over time, including its ebbs and flows, rhythms, or developmental trajectories. Moreover, dynamic theories may benefit from the exploration of questions related to tipping points, termination, and other nonlinear patterns over time (cf. Mitchell & James, 2001). Other important dynamic questions involve construct changes in response to different events, including both the shape of the change and why it occurred. Notably, understanding why things change may prove informative for anticipating when they will likely change. We believe the examination of these types of questions will enable major theoretical advancements in our understanding of dynamic phenomena. Due to the central role of time in these phenomena, building theories of dynamic constructs requires both articulating the relationships between constructs and positioning those relationships within the relevant temporal frameworks (Cronin, Weingart, & Todorova, 2011).
Temporal Frameworks
Time comes in many different forms and varieties that have different yet related implications for measuring dynamic constructs (Ancona, Okhuysen, & Perlow, 2001). Numerous temporal-based theories have been advanced over the years, but they generally fall into three primary types: (a) developmental models, (b) episodic models, and (c) event-based models (see Mathieu & Luciano, in press). Herein we provide a sampling of theories that utilize each of these temporal models at the individual, team, and organizational levels.
Developmental models
Developmental theories suggest that entities have a life span and change as a function of maturation over time. Further, they presume that earlier occurring phenomena set the stage for later development and create a path dependence that can be tracked and analyzed over time. Individual-level developmental models include ones pertaining to training and development (e.g., Ford, 2014) and socialization (Kammeyer-Mueller, Wanberg, Rubenstein, & Song, 2013) processes, whereas team developmental models assume that groups progress in linear (e.g., Tuckman & Jensen, 1977), discontinuous (Gersick, 1988), or multiple-sequence (Poole, 1983) fashions from one stage to another over their life span. At the organizational level, there are developmental models of organizational change and development (e.g., Langley et al., 2013) as well as models specifically pertaining to entrepreneurship and new venture creation (e.g., Huang & Knight, 2015). The common theme among developmental models is that the entity evolves through qualitatively different stages over time, with each later stage built on the earlier stages. This suggests that measures of dynamic constructs need to be aligned with when the entity is within, and transitioning between, such stages, and that researchers need to appreciate that different entities may develop at different paces.
Episodic models
Episodic models suggest that entities do different things at different times directed at some goal or end state and that these patterns of activity reoccur in a cyclical pattern. Button, Mathieu, and Aikin (1996) defined performance episodes as “distinguishable periods of time over which performance accrues and is reviewed” (p. 1085). Episodic theories are prominent at the individual level of analysis, including social cognitive (Bandura, 1986) and control (e.g., Carver & Scheier, 2012) theories. McGrath’s (1991) theory of time, interaction, and performance (TIP) submits that teams juggle multiple bundles of activities over time, which creates challenges in terms of how to best represent the “complex matching of bundles of activities to particular periods of time” (p. 163). Marks et al. (2001) suggested that teams perform different activities during performance episodes (i.e., action processes) than they do between episodes (i.e., transition processes). Episodic organizational level models of change and development (Bartunek & Woodman, 2015) and interorganizational cooperation (Jarvenpaa & Majchrzak, 2016) have also been advanced. The common theme among episodic models is that various states and processes can be expected to be activated at different times on the basis of the temporal rhythms of the work procedures, equipment, or other cyclical pacers. The implications of these theories are that investigators need to measure different constructs—at different times—in sync with their occurrences in the cyclical recurring pattern of activities.
Event-based models
A third temporal approach to the study of dynamic constructs is to associate them with environmental events. Generally speaking, an event refers to a particular environmental circumstance within time and space that disrupts the normal course or rhythm of activities (Morgeson, Mitchell, & Liu, 2015). Events are external stimuli (i.e., occur in the environment) but serve to trigger internal processes and states. Individual-level theories along these lines include affective events (Weiss & Cropanzano, 1996) and stress event theories (Park, 2010). At the team level, for example, Morgeson and DeRue (2006) examined the relationship between different external leadership behaviors and teams’ adaptability to different types of disruptive events. In addition, Burke, Stagl, Salas, Pierce, and Kendall (2006) advanced a theory of adaptation that featured team responses to environmental challenges. At the organizational level, events have been examined as reactions to natural disasters (e.g., Linnenluecke, Griffiths, & Winn, 2012) as well as the introduction of new technologies (e.g., Barley, 1986). The common theme in event-based theories is that dynamic constructs are presumed to vary over time in meaningful ways as related to environmental triggers. The implications are that scholars need to assess such constructs before, during, and following such events to fully discern their dynamic nature and influences.
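The before, during, and after logic of event-based measurement can be illustrated with a minimal sketch. It assumes a time-stamped series of construct scores and a known event window; the function name `phase_means`, the data, and the event boundaries are hypothetical and serve only to show how repeated measurements might be partitioned around an event.

```python
from statistics import mean

def phase_means(observations, event_start, event_end):
    """Average a construct score before, during, and after an event.

    observations: list of (time, value) pairs (hypothetical data format).
    event_start, event_end: inclusive boundaries of the event window.
    """
    phases = {"before": [], "during": [], "after": []}
    for time, value in observations:
        if time < event_start:
            phases["before"].append(value)
        elif time <= event_end:
            phases["during"].append(value)
        else:
            phases["after"].append(value)
    # A phase with no observations is reported as None rather than guessed.
    return {name: (mean(vals) if vals else None) for name, vals in phases.items()}

# Hypothetical conflict scores at six time points, with an event at times 3-4.
observations = [(1, 2.0), (2, 2.0), (3, 4.0), (4, 5.0), (5, 3.0), (6, 3.0)]
result = phase_means(observations, event_start=3, event_end=4)
```

A phase that returns `None` signals that the measurement schedule missed that part of the event, which is itself a measurement fit concern.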
Co-occurrence
Although certain constructs may be more aligned with specific temporal models, it is important to consider that some research questions will evoke multiple temporal models. For example, while building a dynamic theory of conflict, a researcher may want to investigate whether its level or impact varies based on a dyad’s (or team’s or organization’s) stage of development (e.g., perhaps instances of conflict that occur early in a relationship are more detrimental than ones that occur later), the task episode (e.g., perhaps conflict during planning vs. action stages influences group dynamics differently), or external event (e.g., perhaps there is more conflict after a major performance setback—such as a company product recall). The potential for these temporal dynamics to co-occur increases the frequency and duration of measurements necessary to tease out different temporal patterns (Mathieu & Luciano, in press). Modeling such higher order relationships requires at least three waves of comparable data and significantly benefits from far more. Inherent in all of these temporally based theories is that constructs, such as affective states, behaviors, and cognitions, need to be indexed repeatedly over time to fully understand and model the phenomena. Whether such constructs are thought to emerge and change over time or whether they are presumed to be activated in response to natural rhythms or environmental events, it is critical to have sufficient data streams to create valid construct measurements that can be compared over time.
Data Streams and Emergent Measurement Technologies
Large amounts of data seem to be increasingly pervasive and accessible. “Big data is generated from an increasing plurality of sources, including Internet clicks, mobile transactions, user-generated content, and social media as well as purposefully generated content through sensor networks or business transactions such as sales queries and purchase transactions” (George et al., 2014, p. 321). Some data are produced specifically for research purposes (e.g., behavioral streams from wearable sensors), whereas other data may have been generated originally for other purposes (e.g., personal: social media; business: financial statements). A half century ago, Webb, Campbell, Schwartz, and Sechrest (1966) discussed the potential use of naturally occurring unobtrusive trace measures as indicators of psychological constructs. They focused primarily on measures of erosion (e.g., resources consumed during activities) and accretion (e.g., evidence of activities such as email traffic). Modern big data technologies are consistent with this historical philosophy of exploiting naturally occurring byproducts of activities as indicators of relevant psychological constructs. Consistent with George and colleagues’ (2014) conceptualization of big data, we focus on data streams that generate fine-grained information about individual and collective processes. While we acknowledge that there are many sources from which to retrieve big data (e.g., public, private, physical trace, community, and self-quantifiable; George et al., 2014), herein we focus on the elemental content of the data generated by individuals, which can be combined to offer insights about teams and organizations.
Types of Data Streams
Dynamic data streams reflect three general types of activities: (a) behaviors, (b) words, and (c) physiological responses. Behavioral data streams include factors related to movement (e.g., acceleration), position (e.g., body orientation, spatial propinquity), posture (e.g., slouching, leaning forward), gestures (e.g., hand wave), and facial expressions (e.g., smile, wink). Word data streams include factors related to the pattern of speech (e.g., tone, interruptions) and the content of speech and writing (i.e., meaning, pronoun use). Alternatively, physiological response data streams reside within an individual and include factors such as the content of thoughts, pattern of brain waves, and vital signs (e.g., heart rate, blood pressure). These data streams are the elemental content of constructs, which need to be combined and contextualized to generate meaning.
Distilling various technologies (e.g., video and audio recorders, wearable sensors, software packages) into their core functions (e.g., data collection, indexing, and interpretation) will likely aid in their appropriate utilization. Many of the new technologies shift the source of the data stream collection and/or aggregation from a person (e.g., participant, observer) to a device (e.g., wearable sensor, computer). For example, survey-based techniques often implicitly ask participants to mentally aggregate across their experiences over time. More specifically, when participants respond to a survey item such as “how much relationship tension is there in your work group” (Jehn & Mannix, 2001, p. 243)—in essence, to make an overall inference about team relationship conflict—they are being asked to consider all potential data streams (e.g., snarky comments: word, eye rolls: behavior, increase in heart rate: physiological) across all group members and across some specified (and often unspecified) time period. Observer ratings are another form of a mental data aggregation technique, in which nonmembers (e.g., observers, supervisors, customers) are asked to perform the same sort of synthesis in their heads and render some summary judgment. Conversely, emergent technologies often use explicit algorithms to aggregate data across individuals, time, and sometimes data streams.
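To make the idea of explicit algorithmic aggregation concrete, the following sketch averages a time-stamped stream first within each member and time window and then across members. The sample format, the 60-second window, and the choice of the mean as the combination rule are all assumptions made for illustration; a different rule (e.g., maximum or variance) could be more appropriate for a given construct.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical samples: (member, second, value), e.g., an activity level.
samples = [
    ("ana", 0, 1.0), ("ana", 30, 3.0), ("ana", 70, 5.0),
    ("ben", 10, 2.0), ("ben", 65, 7.0),
]

def aggregate(samples, window_seconds=60):
    """Average within member and window, then across members per window."""
    per_member = defaultdict(list)
    for member, second, value in samples:
        window = second // window_seconds  # index of the time window
        per_member[(window, member)].append(value)

    per_window = defaultdict(list)
    for (window, _member), values in per_member.items():
        per_window[window].append(mean(values))  # one score per member

    # Team-level score per window: unweighted mean of the member scores.
    return {window: mean(vals) for window, vals in sorted(per_window.items())}

team_series = aggregate(samples)
```

Averaging within members first keeps a member who generates more samples from dominating the team-level score, which is one example of an aggregation decision that should be made deliberately rather than left to a default.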
The different aggregation options (i.e., participant, observer, algorithm) have different strengths and weaknesses. For example, although people are prone to unreliability and biases, they are able to incorporate contextual considerations and nuance in a more seamless manner than can formal algorithms. This highlights the critical point that the process of creating indicators and indexing variables, even with the assistance of computer algorithms, must be considered in context. For algorithms to help produce valid constructs, it is imperative that researchers understand the origin and nature of their indicators (e.g., what data streams were collected, how were those streams combined, which combinations make sense in this context) and not fall prey to relying blindly on default algorithms in methodological black boxes. Accordingly, we emphasize the underlying data streams to facilitate the creation of valid construct measures and offer insights that are generalizable across devices and techniques.
Emerging Measurement Technologies
New and emerging measurement technologies are capable of generating massive amounts of time-stamped big data that may be leveraged to index dynamic constructs. Indeed, George et al. (2014) submitted “evolving practices—using big data—can allow us to study entire organizations and workgroups in near-real time to predict individual and group behaviors, team social dynamics, coordination challenges, and performance outcomes” (p. 325). Herein, we provide a few illustrative examples of emerging technologies, the data streams they generate, and variables that may be indexed while noting some salient contextual considerations.
Behavior-related data streams
Behavioral data refers to observable actions that convey relevant constructs. These may come in the forms of individuals’ whereabouts, their body position and posture, and even subtleties such as gestures and facial expressions. They may be indexed using traditional methods, such as surveys or observer ratings that apply structured coding schemes in real time or to video recordings. Newer technologies, ranging from personal cell phones to dedicated devices, can also yield valuable data streams.
Wearable sensors are small electronic devices that can be worn on the body to generate indices such as members’ spatial propinquity, body movements, and posture (Pentland, 2007). By leveraging technologies such as global positioning, infrared, or Bluetooth signals, members’ whereabouts in time and space can be easily monitored and recorded. Using accelerometer technology, individuals’ rates of movement, overall activity levels, and consistency of motion can be monitored and recorded for analysis. These types of measures may be particularly applicable for situations where team members’ physical proximity and movement are salient. For example, knowing where firefighters are in a structure and their rate of movement reveals their coordinating actions and potential danger (Voirin, 2015). In more confined and controlled environments, certain movements such as sitting up (or back), nodding, gaze, body orienting, arm and leg positioning, and gesturing while speaking or listening have been associated with team cohesion (Hung & Gatica-Perez, 2010) and contentious or collaborative meetings (Bousmalis, Mehu, & Pantic, 2013; Gatica-Perez, 2009; Zahn, 1991).
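As a simple illustration of a spatial propinquity index, the sketch below flags which pairs of members are within a given distance of one another at a single time stamp. It assumes that sensor signals have already been converted to two-dimensional coordinates in meters; the member names, coordinates, and the 2-meter threshold are hypothetical and are not taken from any particular device.

```python
import math
from itertools import combinations

# Hypothetical positions (x, y) in meters at one time stamp.
positions = {"ana": (0.0, 0.0), "ben": (1.0, 1.0), "cy": (6.0, 8.0)}

# Assumed threshold for being within conversational range.
FACE_TO_FACE_RANGE_M = 2.0

def pairs_in_range(positions, max_distance):
    """Return the member pairs whose Euclidean distance is within max_distance."""
    close = []
    for (a, pos_a), (b, pos_b) in combinations(sorted(positions.items()), 2):
        if math.dist(pos_a, pos_b) <= max_distance:
            close.append((a, b))
    return close

close_pairs = pairs_in_range(positions, FACE_TO_FACE_RANGE_M)
```

Repeating this computation at each time stamp would yield a time series of proximity ties. Whether such ties are meaningful indicators depends on context, for example whether members are mobile or seated at fixed stations.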
Individuals’ faces reveal a host of emotions that they are experiencing (e.g., anger, disgust, fear, joy, sadness, surprise), and there are likely some consistencies within and across cultures (Ekman & Oster, 1979; Valstar, Mehu, Jiang, Pantic, & Scherer, 2012). However, incorporating facially transmitted emotions in research has traditionally involved the use of highly trained subject-matter experts (SMEs) and extensive coding of video-taped sessions. Since the 1990s, there have been efforts to automate such coding, which have now advanced to the point that “we are able to apply it to spontaneous expressions. These systems extract a sufficiently reliable signal that we can employ them in behavioral studies and to begin to develop applications that respond to spontaneous expressions in real time” (Bartlett & Whitehill, 2011, p. 489).
Word-related data streams
The analysis of employees’ communications has been a mainstay of organizational behavior research since the dawn of the discipline. However, such indexing has typically been a painstaking endeavor involving recording communications, transcriptions, developing coding schemes, training coders, multiple revisions, and so on (Fischer, McDonnell, & Orasanu, 2007). In other words, while such a process yields a rich understanding of interpersonal dynamics, it is very laborious and prone to unreliability. However, there have been recent developments in the area of linguistic analyses of text communications that convert written passages into substantive dimensions (Pollach, 2012; Yilmaz, 2016).
Automated analyses of text materials are often referred to as computer-aided text analysis (CATA) in the social sciences. CATA can function as a sophisticated form of content analysis to quantify word use and patterns to make inferences from the text in an objective and systematic manner (Krippendorff, 2004). These techniques can be viewed on a continuum of sophistication ranging from simple word counts or the percentage of we versus I pronoun use, to complex algorithms that score multiword phrases and passages to derive semantic meanings from text (e.g., Carley, Columbus, & Landwehr, 2013), to neural network natural language processing type applications (e.g., Honnibal, 2016). For example, some software iteratively parses written passages into phrases and then clusters them into more general content themes in a fashion similar to exploratory factor analysis or cluster analysis (e.g., AutoMap, Carley et al., 2013; IBM, 2015). Other software packages use prespecified categories or dictionaries of words or phrases and score their use in text passages (e.g., Linguistic Inquiry and Word Count, Pennebaker, Mehl, & Niederhoffer, 2003; Diction, Short & Palmer, 2008).
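The simplest end of this continuum, dictionary-based scoring of we versus I pronoun use, can be sketched as follows. The two word lists are hypothetical miniature dictionaries created for illustration; they are not the categories of any published package, whose dictionaries are far larger and validated.

```python
import re
from collections import Counter

# Hypothetical mini-dictionaries (assumed for illustration only).
DICTIONARIES = {
    "we": {"we", "us", "our", "ours", "ourselves"},
    "i": {"i", "me", "my", "mine", "myself"},
}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def score_text(text, dictionaries=DICTIONARIES):
    """Return the percentage of tokens that fall into each dictionary category."""
    tokens = tokenize(text)
    if not tokens:
        return {name: 0.0 for name in dictionaries}
    counts = Counter(tokens)
    return {
        name: 100.0 * sum(counts[word] for word in words) / len(tokens)
        for name, words in dictionaries.items()
    }

message = "We finished our draft, but I think my section needs work."
scores = score_text(message)  # two "we" words and two "i" words out of 11 tokens
```

Note that this tokenizer treats contractions such as "I'm" as single tokens that match neither dictionary. Details of this kind are exactly what researchers need to inspect before trusting a default scoring routine.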
Whereas analysis of text yields information about what is being said, analysis of how things are being said can also be informative. In other words, speech characteristics and patterns both within and across individuals may be used to infer a variety of constructs. Generally speaking, sophisticated analyses of group speech patterns adopt a two-layered Hidden Markov Model (HMM) (Pentland, 2007). First, the speech patterns of individual members are indexed over time. Second, the interrelationships of members’ actions are derived from the co-varying sequences of speech patterns between them. For example, individuals’ amount, frequency, and amplitude of talking can be indexed per temporal unit (e.g., seconds) over a given duration (e.g., an hour-long meeting). This yields indices such as the percentage of time talking versus listening (which signals relative dominance in a collective; Jayagopi, Hung, Yeo, & Gatica-Perez, 2009) and consistency of a member’s speech (which signals emotional stability vs. arousal; Scherer, 2003). Pairing speech patterns of two or more people can yield a variety of interactional indices such as turn-taking, interruptions (both those that are granted and those that are not), mirroring, overlapping, and the variation of speaking time across members, which have been associated with constructs such as team roles (e.g., Zancanaro, Lepri, & Pianesi, 2006), cohesion (e.g., Hung & Gatica-Perez, 2010), conflict (e.g., Pesarin, Cristani, Murino, & Vinciarelli, 2012), and different group interactional patterns (e.g., Jayagopi & Gatica-Perez, 2009). Most work to date has been done with relatively small groups (∼3-5 members) in very controlled environments (e.g., conference meetings, Gatica-Perez, 2009). Extrapolating to larger collectives in dynamic physical environments will require very specific grounding and indexing (Pentland, 2007).
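Some of the simpler indices mentioned above can be computed directly from time-stamped speaking segments, without a Hidden Markov Model. The sketch below assumes that audio has already been segmented by speaker into (speaker, start, end) triples; the segment data and function names are hypothetical. It computes each member's share of total talk time, the number of speaker changes, and the number of changes in which the new speaker started before the previous one finished.

```python
from collections import defaultdict

# Hypothetical speaker segments: (speaker, start_second, end_second).
segments = [
    ("A", 0.0, 10.0),
    ("B", 9.0, 15.0),   # starts before A finishes, so it counts as an overlap
    ("A", 15.0, 20.0),
    ("C", 20.0, 22.0),
]

def speaking_shares(segments):
    """Return each speaker's proportion of the summed talk time."""
    totals = defaultdict(float)
    for speaker, start, end in segments:
        totals[speaker] += end - start
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {speaker: total / grand_total for speaker, total in totals.items()}

def count_turns_and_overlaps(segments):
    """Count speaker changes and how many of them began with overlapping speech."""
    ordered = sorted(segments, key=lambda seg: seg[1])
    turns = 0
    overlaps = 0
    for previous, current in zip(ordered, ordered[1:]):
        if current[0] != previous[0]:
            turns += 1
            if current[1] < previous[2]:
                overlaps += 1
    return turns, overlaps

shares = speaking_shares(segments)
turns, overlaps = count_turns_and_overlaps(segments)
```

An overlap count of this kind cannot distinguish interruptions that are granted from those that are not; drawing that distinction would require additional information about what happens after the overlap.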
Physiological response–related data streams
Physiological data refers to indices reflecting the state of an individual’s body or its subsystems, such as brain activity, respiration and heart rates, electrocardiograms, blood pressure and oxygenation, and skin temperature (Imani et al., 2016). These measures might be gathered at certain times (e.g., as by taking one’s pulse or drawing a blood sample) or continuously (e.g., by wearable sensors such as wrist bands or heart monitors). For example, Waldman, Wang, Stikic, Berka, and Korszen (2015) described a sophisticated neuroscience technology for indexing group dynamics. They describe software and lightweight portable hardware with sensors placed on individuals’ scalps, which can reliably record the electrical activity of their brains. The resulting quantitative electroencephalography (qEEG) data have been used to model neural patterns of human interactions such as leader emergence and team members’ engagement (Waldman et al., 2013). Comparing members’ qEEG patterns may well reveal their collective cognition, affective contagion, and a variety of other team dynamics. Whereas qEEG scalp caps may be less feasible for field research, this technology may be suitable for traditional laboratory environments and perhaps real-world simulation environments such as those used in aviation and medical training.
Singularities, similarities, and synergies
The new and emerging measurement technologies have some particular singularities, similarities, and synergies. In terms of singularities, different techniques afford different insights. For example, CATA can capture the contents of communication (what people are saying) but not nuances in terms of tone, volume, and so on (i.e., how they are saying it). Alternatively, auditory speech analysis is sensitive to tone, frequency, overlaps, interruptions, and a variety of other audio cues but not the content of information. Physical movements, gestures, and posture, as well as a variety of direct measures, can convey whether and the degree to which members are reacting to others but not exactly why. Therefore, different modalities may be more appropriate for indexing different constructs or different facets of constructs.
In terms of similarities, many speech patterns, physiological reactions, facial expressions, and bodily movements are thought to signal the same underlying psychological states such as interest, joy, or frustration. In this sense, different modalities may be used as multiple measures of the same underlying construct(s) such as one might employ in a multitrait, multimethod investigation. And in terms of synergies, different aspects of different technologies may be used in combination. For example, gestures and physical activity signify different things depending on whether one is speaking or listening in a group (i.e., dominance vs. interest, respectively). For instance, in some settings, members may choose whether to collocate in the same area or work in separate areas (e.g., offices). Their physical proximity (determined via infrared and Bluetooth signals) can be used to determine whether they are within range of face-to-face conversations. When members collocate, there may be a premium on speech and gesture analysis to index their degree of cohesion, conflict, and so on. However, when members do not collocate, then their team dynamics might be better indexed via CATA analyses of their text, chat, and email messages—or perhaps through facial recognition analysis of video communications. And of course, different employees may interact with different others simultaneously using different mediums. Big data techniques offer great promise for enabling the collection of data at a sufficient frequency and duration to accurately model changes over time. However, the effective utilization of these techniques hinges on the researcher’s careful consideration of the suitability of different measurement choices in context.
Measurement Fit
Measurement fit reflects the degree of alignment between how a construct is conceptualized and measured. Measurement fit is a complex issue, which precludes one-size-fits-all recommendations or lock-step cookbook recipes—even for specific constructs. For example, using emails to index knowledge sharing may be a good fit for virtual teams but less so for collocated teams—it depends on the proportion of correspondence that occurs via email. Similarly, the proximity data stream from wearable sensors may only be a meaningful indicator of coordination when individuals are mobile, as opposed to sitting at designated stations. We suggest that achieving measurement fit requires an iterative approach involving three core components: (a) construct elements, (b) measurement features, and (c) contextual considerations (see Figure 1). In the following, we outline each of those components and discuss how they should influence measurement decisions. We then discuss their alignment and advance an iterative approach to achieve measurement fit.

Figure 1. An iterative approach to measurement fit.
To assist in the application of this iterative fitting process, we use affect, behavior, and cognition, often referred to as the ABCs of psychology, as a grouping mechanism to illustrate differences between types of constructs in a way that is generalizable across levels (e.g., individual, team, organization). Examples of dynamic constructs of each type include affect: satisfaction, conflict, emotional contagion, mood, cohesion, and morale; behavior: efforts, actions, coordination, and communication patterns; and cognition: efficacy, mental models, strategic orientation, and work climates. There are also some dynamics such as empowerment, transactive memory systems, ambidexterity, and agility that include a blend of the ABCs. Tables A1 through A3 in Appendix A contain examples of different studies in which their measurement features exhibited strong alignment with the construct features and contextual considerations. Examples are included from the individual, team, and organizational levels, sampling across emergent technologies and more traditional methods. The studies were selected to demonstrate appropriate use of different measurement techniques and how strong measurement fit can yield interesting insights. In addition, Table 1 contains examples of measurement fit considerations more specifically targeted at big data techniques. These examples showcase considerations across particular streams, constructs, and study contexts but should also be contemplated in light of one’s temporal theory and the other measurement fit components articulated in the following.
Table 1. Examples of Measurement Fit Considerations for Big Data–Style Collection of Data Streams.
Construct Elements
Prior research has emphasized the importance of providing a clear and unambiguous definition of the construct and noted the numerous issues generated by insufficient clarity (cf. Aguinis & Vandenberg, 2014; Baumgartner & Steenkamp, 2006; Chen, Mathieu, & Bliese, 2004; Gerring, 2012). Recently, Podsakoff, MacKenzie, and Podsakoff (2016) offered an in-depth discussion of the importance of concept clarity and a series of recommendations for creating better concept definitions. They devised a detailed four-stage model to develop good conceptual definitions. For the investigation of new constructs or modification of existing constructs, the importance of the full rigorous four-stage model is clear. Drawing from and expanding their framework, we suggest the four overarching elements that should be included in the explication of each dynamic construct are space, nature, structure, and appearance.
Space
The element of construct space encompasses both the content and dimensionality of the construct and where it resides in the larger nomological network (Chen et al., 2004). This involves articulating the boundaries of the construct and the rules for inclusion and exclusion. Notably, listing the necessary and sufficient attributes of a construct can help clarify the underlying theme or essence of the construct (MacKenzie, 2003; Podsakoff et al., 2016). Furthermore, if the construct is multidimensional (e.g., transactive memory systems), scholars must also articulate the underlying dimensions (MacKenzie, Podsakoff, & Podsakoff, 2011).
Nature
The element of construct nature describes its conceptual domain “by specifying the type of property the concept represents and the entity to which that property applies” (Podsakoff et al., 2016, p. 184). The entity component simply describes the person(s), event, or object to which the construct applies, such as a dyad or team. The type of property refers to the general nature of the construct (e.g., individual characteristic, thoughts, performance evaluation; Podsakoff et al., 2016). When articulating the nature of a dynamic construct, we suggest researchers include whether the construct is (a) a process and/or emergent state and (b) an affect, behavior, and/or cognition. These two distinctions are salient grouping categories that notably influence a construct’s appearance and, in turn, measurement considerations.
Structure
The structural element describes the construct’s shape and change over time (Luciano, DeChurch, & Mathieu, 2015). Morgeson and Hofmann (1999) described the conceptual importance of construct structure. They suggested that “focusing on the interactions that define [italics added] and reinforce [italics added] the collective phenomena can provide a better understanding of how collective phenomena arise [italics added] and continue [italics added]” (p. 257). The structural shape illuminates where the meaning resides in the combination of information and interactions. Central to this element is determining whether meaning resides in the central tendency or some element of variability (e.g., averages, variances, configurations; Chan, 1998; Kozlowski & Klein, 2000; Mathieu & Luciano, in press). Notably, Chen et al. (2004) describe a variety of ways that measures might be combined to represent collective phenomena. For example, affective constructs such as team efficacy are often more appropriately aggregated using an additive or referent-shift consensus approach, whereas cognitive constructs such as shared mental models are often more appropriately indexed using a dispersion approach (Chen et al., 2004).
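The distinction between meaning that resides in the central tendency and meaning that resides in variability can be illustrated with a minimal sketch. The ratings, team labels, and function names below are hypothetical and serve only to show that two units with identical means can differ sharply in dispersion.

```python
from statistics import mean, pstdev

# Hypothetical 1-7 ratings from four members of two teams (illustrative data only).
team_a = [4, 4, 4, 4]
team_b = [2, 6, 2, 6]

def additive_index(ratings):
    """Central-tendency index: the unit-level mean of member scores."""
    return mean(ratings)

def dispersion_index(ratings):
    """Variability index: the within-unit (population) standard deviation."""
    return pstdev(ratings)

for name, ratings in (("A", team_a), ("B", team_b)):
    print(name, additive_index(ratings), dispersion_index(ratings))
```

Both teams have the same mean, yet only the dispersion index separates them. An additive operationalization would therefore treat the two teams as equivalent, which is appropriate only if the construct's meaning is theorized to reside in the average.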
Mathieu and Luciano (in press) submitted that it is important for scholars to articulate the anticipated form of different emergent constructs. For example, is the variability in the construct presumed to decrease/converge (e.g., cohesion), increase/diverge (e.g., leader emergence), move toward a specific pattern (e.g., a particular communication flow), or systematically vary in response to particular episodes or events (e.g., flux in coordination) over time? Kozlowski (2015) also advocates for consideration of growth trajectories and fluctuations over time. Notably, the temporal framework(s) guiding the investigation (e.g., developmental, episodic, event-based) may influence the particular shape of the anticipated changes. In addition, Mitchell and James (2001) encouraged scholars to describe the time lag, duration, and rate of variable change (to the extent possible) as each facet has important implications for when a construct is measured. Finally, for multidimensional constructs, the structure element also involves articulating how the subdimensions combine to form the higher order construct, including the relative strength and direction (e.g., reflective, formative; MacKenzie, Podsakoff, & Jarvis, 2005) of the relationships.
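One rough way to check whether observed data match an anticipated form of emergence is to track within-unit dispersion across measurement occasions. The sketch below is an assumption-laden illustration, not a method from the cited sources: the function names, the tolerance `tol`, and the data are hypothetical, and a simple least-squares slope stands in for whatever growth model a study would actually use.

```python
from statistics import pstdev

def dispersion_by_wave(waves):
    """Within-unit standard deviation of member scores at each wave."""
    return [pstdev(scores) for scores in waves]

def slope(ys):
    """Ordinary least-squares slope of ys on wave index 0..n-1 (needs n >= 2)."""
    n = len(ys)
    x_bar = (n - 1) / 2
    y_bar = sum(ys) / n
    num = sum((x - x_bar) * (y - y_bar) for x, y in enumerate(ys))
    den = sum((x - x_bar) ** 2 for x in range(n))
    return num / den

def emergence_form(waves, tol=0.05):
    """Label the trend in dispersion; tol is an arbitrary illustrative cutoff."""
    s = slope(dispersion_by_wave(waves))
    if s < -tol:
        return "converging"
    if s > tol:
        return "diverging"
    return "stable"

# Hypothetical cohesion ratings for one team across four waves.
cohesion_waves = [[1, 5, 2, 4], [2, 4, 3, 4], [3, 4, 3, 3], [3, 3, 3, 3]]
print(emergence_form(cohesion_waves))
```

Patterns such as movement toward a specific configuration or event-driven flux would require different indices, so this sketch covers only the converge/diverge case.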
Appearance
Constructs are abstractions used to explain a phenomenon and are not directly observable; their existence is therefore inferred from observable indicators (Kozlowski & Klein, 2000; Nunnally & Bernstein, 1994). The appearance element of construct explication entails how the construct is likely to manifest (e.g., increased heart rate, text, proximity) and any conditions required for the manifestation to occur (e.g., resilience may be best observed when the entity is under stress; adaptability may require changing circumstances). Combining considerations from appearance and structure, researchers should articulate when the indicators are likely to be observable. This element is likely different for emergent states as opposed to team processes because “emergent phenomena require sufficient time and team interaction before coalescing as perceptible team properties” (Carter, Carter, & DeChurch, 2015, p. 1). This distinction is particularly important as attempting to capture construct indicators when they are not observable can lead to inaccurate conclusions (Carter et al., 2015). For example, the level of cohesion present in a team after their first meeting may not be particularly meaningful as it has not yet coalesced. Notably, one’s temporal theory provides important guidance on when the construct is likely to manifest. The appearance element is the most contextualized of the four and most directly expands beyond the borders of a construct definition to explain the construct in context.
Summary
Explicating the elements of space, nature, structure, and appearance provides a comprehensive understanding of the construct. These four elements work in concert with one another, each informing the other. They should be considered separately and together, with reference to but not constrained by the existing literature. More complete construct explication facilitates measurement alignment and helps build a stronger knowledge base for the domain.
Measurement Features
Construct measurement includes both how the data are collected and operationalized. Herein, we more prominently feature considerations regarding how the data are collected as it has previously received less attention than operationalization/aggregation concerns (cf. Bliese, 2000; Bliese & Lang, 2016; Chen et al., 2004; Fisher & To, 2012; LeBreton & Senter, 2008; Uy, Foo, & Aguinis, 2010). Incorporating and expanding beyond current measurement considerations, we detail five measurement features that should be explicated in the study design: content, technique, source, timing, and aggregation.
Content
Content reflects what is included in the measure. Although content is likely the most straightforward measurement feature to align with the construct explication, the threats to construct validity in the case of misalignment warrant a brief mention. The content of the measure should be aligned with the conceptual construct space such that the measure adequately samples and represents the domain of the construct at the appropriate level of specificity (i.e., content validity). Misalignment of construct space and measurement content can threaten construct validity and lead to construct contamination (if extraneous factors are included) or deficiencies (if important aspects are omitted; Bagozzi & Edwards, 1998; Schwab, 1980).
Technique
Measurement technique refers to the “instrument” that is used to collect data. The feature is particularly important as misalignment of construct elements and measurement technique can threaten construct validity as the data collection may fail to capture the intended phenomenon. The majority of dynamics research utilizes surveys to capture the phenomenon of interest; however, many studies have utilized interviews (e.g., Fauchart & Gruber, 2011; Pache & Santos, 2013), observations (e.g., Bruns, 2013; Homan, van Knippenberg, Van Kleef, & De Dreu, 2007), and digital/archival traces (e.g., Bänziger, Patel, & Scherer, 2014; Davison, Hollenbeck, Barnes, Sleesman, & Ilgen, 2012). Emerging technologies provide additional options, including wearable sensors and CATA. Each technique has strengths and weaknesses and is better suited for different circumstances. In Table 2, we provide a sampling of studies utilizing the newer measurement devices for the different data streams and further comment on the strengths and limitations of the different techniques.
Table 2. Resources and Examples of Continuous Measures of Different Data Streams.
Interviewing or surveying key informants are widely used methods to index dynamics; as these individuals are participants in the dynamics, they clearly have an important perspective to consider. However, their perceptions may be tainted by the role(s) that they occupy as well as self-serving and other forms of biases (Martell, Guzzo, & Willis, 1995). Moreover, they may not be privy to certain information, be aware of, or be in a position to accurately report on certain aspects of their functioning. In short, interviews and surveys often become the methods of choice not necessarily because they are inherently ideal but because they are familiar and often relatively easy to use. Although such considerations certainly play a role in determining the best measurement technique to use, there are other important considerations.
The type of construct is one of the most important aspects of appropriate measurement technique considerations. Behavioral constructs are visible activities that may be recorded, observed by others, and often leave trace measures. For example, transition activities often involve information searches and situational analyses and culminate in formal plans for action. Coordination involves the synchronicity of actions that are often readily apparent and may be evident in communications and physical movements. However, members who are caught up in the heat of the action are not likely to recall the flurry of interactions accurately, therefore rendering interviews and surveys as less suitable measurement techniques.
Cognitive constructs are “in the heads” of individuals and not easily accessible by others, increasing the suitability of techniques that rely on self-report. Interviews and survey techniques can elicit how individuals structure information but may inadvertently impose a structure on participants. Alternatively, techniques such as analysis of natural language use and perhaps some forms of direct measures such as neuroscience technologies may be able to reveal individuals’ actual structuring of knowledge. These techniques may minimize reactance and be particularly well aligned for indexing knowledge structures in an ongoing fashion.
Affective constructs are about an individual’s internal experiences. Naturally, it follows that individuals are the best source of such information; but asking interview or survey questions about potentially “loaded topics” (e.g., conflict, social loafing) can be threatening, evoke a variety of response biases, and be seen as intrusive. Some affective constructs may be revealed through spatial positioning, such as individuals’ propinquity, the extent to which they face one another, and the nature and reciprocity of their speech. Other affective constructs such as conflict are often readily observable, conveyed in language and heated or asymmetrical speech patterns. Facial emotional recognition techniques are particularly well suited for unobtrusively detecting affective states, and neurological and psychophysiological patterns are also associated with certain emotions and affective states (Waldman et al., 2015).
Some investigations may employ a number of different methods of measurement for different purposes. First, it may be the case that researchers employ different methods of measurement to index different substantive constructs. Second, researchers might employ a mixed-methods approach where one approach is used to inform another. For example, O’Neill and Rothbard (2017) explored high-level emotional components of organizational cultures using semistructured group interviews, which they then quantitatively tested with survey data. Pavlou and Fygenson (2006) sought to understand the adoption of e-commerce. They began by using exploratory belief elicitation techniques, followed by qualitative open-ended questioning, to inform a confirmatory quantitative study. In these instances, one method, typically a more qualitative one, is used to inform the construction of a different measurement method. Alternatively, a pattern of big data measurements could suggest something important changed in the dynamics, warranting closer scrutiny of audio or video recordings or interviews with participants to discern the underlying reasons for the change. In other words, different methods may be suitable to address different questions that researchers have at different times in an investigation (cf. Grant & Wall, 2009; McGrath, 1964; Scandura & Williams, 2000).
A third application of multiple methods of measurement is the classic multimethod investigation where researchers simultaneously use different techniques to index the exact same construct(s). For example, Venkatraman and Ramanujam (1987) compared senior executives’ perceptions of business economic indicators (e.g., sales growth, net income growth, profitability) with actual values derived from archival measures. In addition, Amabile, Barsade, Mueller, and Staw (2005) examined temporal dynamics of individual affect using both a narrative content coding protocol of daily diaries and self-report daily questionnaires from members. Typically, the idea behind a multimethod investigation is to be able to gauge the extent to which the observed scores are a product of the constructs (true scores) versus a byproduct of the method of measurement. Whereas we applaud and encourage such approaches, we caution that not all methods of measurement are likely to be equally suitable for indexing all dynamic constructs. Spatial positioning measures are not likely to correspond with survey measures of empowerment, and interviews are not necessarily going to parallel the emotions that people reveal through facial expressions during events. In other words, multiple methods of measurement are only likely to converge if they are equally aligned with the target construct(s). Alternatively, researchers may use multiple techniques to capture different facets of the same construct. For example, cohesion could be examined using a combination of spatial positioning and word analysis (i.e., how close the team members were standing to each other and what they were saying). Particularly for the newer measurement techniques, identifying ways to combine them to provide richer information than any one source would provide is an important area for future research.
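A first, coarse check on whether two methods converge is the correlation between the scores they produce for the same units. The sketch below assumes hypothetical team-level scores from a survey and from a sensor-based proximity index. The variable names and values are illustrative, and a single Pearson correlation is a simplification of a full multitrait, multimethod analysis.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical cohesion scores for five teams from two methods.
survey_cohesion = [3.1, 4.2, 2.5, 4.8, 3.9]
proximity_index = [0.40, 0.55, 0.30, 0.70, 0.50]

print(round(pearson_r(survey_cohesion, proximity_index), 2))
```

A low correlation would not by itself show that either method is flawed; as noted above, the two methods may simply be differently aligned with the target construct.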
Source
Measurement source refers to where or from whom the data are collected. Interviews may be conducted with and surveys may be completed by individuals, their managers, or others with whom they come in contact (e.g., subordinates, customers, members of other work units). Consistent with the 360° feedback literature, we note that some parties are more knowledgeable and better suited for providing information of different types (Tesluk, Mathieu, Zaccaro, & Marks, 1997). We suggest the key concerns when gathering information from people are: (a) whether they have sufficient knowledge/perspective to provide valuable information and (b) whether they are motivated enough to provide accurate information. Without those two factors, measurements are likely to be distorted, reflecting implicit theories, halo, or social desirability errors rather than true scores. For some research questions (e.g., the influence of differences in or convergence of perspectives over time), triangulation of data from multiple sources will be particularly important (Cohen, Manion, & Morrison, 2007).
Additionally, the appropriate source(s) of information should be considered in conjunction with measurement techniques. For example, samples of members’ communications can come from emails, texts, video conferencing, or transcriptions of face-to-face exchanges. It is possible that individuals express affect or conflict differently through an email than they would in person. At issue is, for example, which communication(s) to sample, which physical traces to collect, and what spatial positioning to track, all of which have implications for the resulting measurements and must be aligned with the constructs of interest. Determining the best source of information for any given construct requires a grounding in the study context and the elements of the construct.
Timing
The temporal theory guiding one’s investigation prescribes a sampling frame for the timing of measurements (Ancona, Goodman, Lawrence, & Tushman, 2001). For example, developmental theories suggest that entities go through different stages or phases over their life cycles (Morgan, Salas, & Glickman, 1993). The timing and duration of stages may not be and are not likely to be uniform across entities. However, to adequately model the evolution of stages, one needs to collect data across time, when each entity is at different stages and transitioning between stages. This will not occur if one gathers measurements from all entities at one point in time. Similarly, episodic theories suggest that transition processes such as interpreting feedback and forming plans occur first, after which there is execution of action processes (Marks et al., 2001). Here again, simultaneously gathering measures about transition processes and action processes will result in misalignment with one or the other process. For emergent states, it is particularly important to consider whether the construct is in the process of emerging versus is established but changing over time (cf. Carter et al., 2015; Mathieu & Luciano, in press).
In a similar vein, researchers may also pursue what we refer to as an embedded event evaluation (EEE) methodology. EEE approaches focus on certain temporal windows in the stream of ongoing phenomena that afford opportune times to examine certain dynamics. Morgeson et al. (2015) argued that event-based theories provide a bridge between attribute-based research (i.e., between entity comparisons of statics) and process-based research by concentrating on dynamics surrounding particular events or occurrences and their features. For example, in the context of experimental and simulation research, Fowlkes, Lane, Salas, Franz, and Oser (1994) proposed that stimulus events (e.g., the appearance of an enemy aircraft or a critical system failure) can be embedded in a larger scenario and designed to trigger certain processes. Alternatively, in a retrospective field design, Morgeson (2005) had team leaders describe previous novel and disruptive events and then asked team members to report on how their leader and team handled those very events. Also, Swider, Liu, Harris, and Gardner (2017) explored the implications of rehiring former employees by comparing performances at boomerang employees’ key transition events (i.e., initial departure from an organization and return to the organization he or she left). In short, certain periods within an ongoing stream of behavior, akin to particular scenes in a movie, provide especially vivid opportunities to examine dynamics.
The fact that entities engage in different activities at different times suggests at least two overarching measurement timing strategies. First, researchers may attempt to synchronize the measurement of different constructs with when they are actually occurring. Aligning measurements with the occurrence of certain events is relatively straightforward in EEE style experiments or simulations as well as controlled laboratory investigations but much harder to accomplish in the field. However, if entities tend to follow a particular pattern of behavior (e.g., seasonal businesses, preparation for an initial public offering), then researchers may be alerted to when each entity is about to transition from one phase to another and time their measures (observations, interviews, video samples, etc.) accordingly. Alignment is also achievable using a two-stage retrospective design in the field and perhaps by synchronizing measurement with anticipated upcoming events (e.g., anticipating work process or technology interventions). Notably, this strategy requires an ipsative approach where each entity’s progressions trigger measurement occasions that may not align across entities.
The second strategy for aligning measurement timing is to gather data continually, which enables one to model the natural ebbs and flows of dynamics and/or sample from the continuous stream instances that correspond to certain developmental or episodic periods or meaningful events. For example, customer interactions may be video- or audio-taped and then sampled and coded, or perhaps particularly revealing exchanges (e.g., encounters with problematic customers) can be reviewed and coded. Note that this alternative is particularly suitable for constructs such as interpersonal processes and affective states that may be salient and activated at any point in time. Continuous data gathering is not feasible for traditional interviews or surveys, although abbreviated versions may be viable for diary or event sampling style investigations. The emerging techniques that we reviewed generate ongoing streams of data. In these cases, the challenge becomes the application of automated scoring protocols and indexing, lest the volume of the data overwhelms the resources of the researcher(s).
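Sampling from a continuous stream around meaningful events can be sketched as a simple windowing operation. The example below assumes a stream of time-stamped values and a list of event times. The function name, the window widths, and the data are hypothetical choices that a researcher would replace with values derived from the temporal theory guiding the study.

```python
def event_windows(samples, event_times, before, after):
    """Collect stream values that fall within [t - before, t + after] of each event.

    samples: iterable of (timestamp_in_seconds, value) pairs.
    Returns a dict mapping each event time to the list of values in its window.
    """
    return {
        t: [value for ts, value in samples if t - before <= ts <= t + after]
        for t in event_times
    }

# Hypothetical sensor readings every 10 seconds and one disruptive event at t = 20.
stream = [(0, 1.0), (10, 2.0), (20, 3.0), (30, 4.0), (40, 5.0)]
windows = event_windows(stream, event_times=[20], before=10, after=10)
print(windows)
```

Only the extracted windows then need to be coded or scored, which keeps the volume of material manageable while preserving the periods in which the construct is expected to manifest.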
Aggregation
The measurement of a construct includes both how data are collected and operationalized. The operationalization of dynamics constructs usually involves some form of aggregation. Aggregation considerations include the appropriate combination of data across (a) indicators/data streams, (b) individuals/sources, and (c) time/instances. The justification for aggregation should also be included when appropriate (e.g., intraclass correlation estimates, interrater agreement, scale reliability; Bliese, 2000; LeBreton & Senter, 2008).
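The aggregation statistics mentioned above can be computed in a few lines. The sketch below is a minimal illustration under stated assumptions: ICC(1) is computed from a one-way ANOVA with equal group sizes, and the single-item agreement index uses a uniform null distribution with variance (A² − 1)/12 for A response options. The function names and ratings are hypothetical.

```python
from statistics import mean, variance

def icc1(groups):
    """ICC(1) from a one-way ANOVA; assumes all groups have the same size k."""
    k = len(groups[0])
    n_groups = len(groups)
    grand_mean = mean(score for g in groups for score in g)
    ms_between = k * sum((mean(g) - grand_mean) ** 2 for g in groups) / (n_groups - 1)
    ss_within = sum((score - mean(g)) ** 2 for g in groups for score in g)
    ms_within = ss_within / (n_groups * k - n_groups)
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

def rwg(ratings, n_options):
    """Single-item agreement: 1 - observed variance / uniform-null variance."""
    expected_variance = (n_options ** 2 - 1) / 12
    return 1 - variance(ratings) / expected_variance

# Hypothetical 5-point ratings from three teams of four members each.
teams = [[4, 4, 5, 5], [2, 2, 3, 3], [3, 3, 4, 4]]
print(round(icc1(teams), 3))
print(round(rwg(teams[0], n_options=5), 3))
```

Unequal group sizes or a non-uniform null distribution would require adjusted formulas, so these functions should be read as a starting point rather than a general-purpose implementation.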
Although the appropriate form of aggregation was likely determined while explicating the construct structure, data that are collected multiple times (e.g., via wearable sensors, trace measures) may also require additional decisions regarding meaningful units of time. Determining the appropriate temporal unitization requires exploring the smallest meaningful samples of behavior, cognition, or affect that are necessary to yield valid snapshots of the construct, which guides decisions as to how to aggregate the data over time. For example, how much communication is required to yield valid and reliable CATA scoring? Wearable sensors and many direct measures generate streams of data in seconds or less. How long a period must be tracked and aggregated to provide meaningful snapshots of a phenomenon that can then be analyzed on an ongoing basis? Generally speaking, larger samples of spoken or written words, behaviors, or physiological measures yield more reliable and stable measures. However, larger samples may gloss over important nuances or dynamics inherent in the different data streams. Here again, achieving optimal alignment will likely be an iterative process taking into consideration the nature of the construct in question, along with properties of the measurement instrument(s) intended to represent it, and informed by one’s theory of time and events. Notably, there is no one way to determine the most suitable units of aggregation; such decisions must be informed by the measurement fit process. The key question to consider for the aggregation measurement feature is whether the variance that exists in reality is appropriately represented in the operationalization of the construct.
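The trade-off between stability and lost nuance can be seen by aggregating the same stream at two window sizes. The sketch below assumes a hypothetical sequence of evenly spaced sensor readings and uses non-overlapping window means. The function name, window sizes, and data are illustrative only.

```python
from statistics import mean

def unitize(stream, window):
    """Average a stream over non-overlapping windows; a trailing partial window is dropped."""
    n_full = len(stream) // window
    return [mean(stream[i * window:(i + 1) * window]) for i in range(n_full)]

# Hypothetical activity readings with a brief spike in the middle of the first half.
readings = [1, 1, 1, 1, 9, 9, 1, 1, 1, 1, 1, 1]

fine = unitize(readings, window=2)    # short windows preserve the spike
coarse = unitize(readings, window=6)  # long windows smooth it into the average
print(fine)
print(coarse)
```

With the short window the spike remains visible as a distinct unit, whereas the long window absorbs it into a moderately elevated average. Which of the two better represents the construct depends on whether brief episodes are theorized to be meaningful.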
Summary
Collectively, decisions concerning the appropriate content of the measurement, most suitable measurement techniques to employ, source(s) from whom or where data are gathered, and assorted timing and aggregation issues should be guided by the elements of the construct and the temporal theory underlying the investigation. Even though there are no one-size-fits-all solutions, there are relatively better (and worse) aligned combinations. Invariably, these decisions will also be guided by contextual considerations.
Contextual Considerations
The idea that context matters is axiomatic in management research. Generally, such assertions refer to contextual influences on substantive relationships (e.g., Bamberger, 2008; Eisenhardt & Graebner, 2007; Johns, 2006), but context can also be viewed as a framing factor for making decisions about how to conduct a research investigation. In short, different reasons for conducting an investigation change the relative priority of multiple considerations, including what measurement approaches to employ. Additionally, some contexts afford or prohibit different strategies. Accordingly, we focus on two sources of context, the research context and the study context, as they pertain to measurement alignment issues in dynamic investigations.
Research Context
The research context considerations include the state of the existing knowledge and the purpose of the investigation. Prior research has extensively discussed the state of existing knowledge as a key determinant of the appropriateness of different research methodologies (e.g., Bouchard, 1976; Lee, Mitchell, & Sablynski, 1999; McGrath, 1964). For example, Edmondson and McManus (2007) posited that nascent areas are more suitable for qualitative research methods whereas quantitative approaches are more suitable for mature areas. Similarly, less structured techniques such as observations or unstructured interviews are better aligned with ill-defined or nascent constructs, whereas structured measurements are better aligned for indexing well-defined and well-understood constructs. In short, the state of the existing knowledge informs construct explication and thereby suitable measurement techniques.
The purpose(s) of the investigation also has direct implications for measurement alignment issues. Although we have focused on the value of measuring emergent states and processes repeatedly over time, there are many instances where a simpler approach may be suitable. For example, if an investigator is simply interested in whether an empowerment intervention has been successfully implemented, two well-timed surveys may be adequate. However, if the investigator is interested in the evolution and development of team empowerment over time in response to that intervention, dynamic measurement protocols are called for. And naturally, the extent to which a dynamic construct is the focal variable of an investigation, as compared to a covariate, places a premium on its measurement alignment.
Study Context
Every study context affords certain opportunities and constraints. Two key study context considerations include the ability to capture the construct of interest and whether the context alters the construct. The question of whether a construct can be captured in a particular setting subsumes several more specific questions. For example, does the topic of interest manifest in the context in such a way that it can be captured? This issue includes concerns about variance in behavior (e.g., situational strength), frequency of occurrence (e.g., how frequently do novel events occur?), and logistical concerns regarding the level of intrusion/access the host setting and participants will tolerate (e.g., will the organization and prospective participants allow video recordings of interactions?) and amount of time they are willing to give (e.g., a maximum of two surveys, less than 15 minutes each).
Along similar lines, scholars must also consider whether study contextual factors may alter the construct. Changes can pertain to both the manifestation (e.g., information elaboration may largely manifest as verbal exchanges in collocated groups and as text in virtual groups) and trajectory (e.g., the trajectory of organizational empowerment may be different depending on the levels of other constructs that make up human resource management bundles; Subramony, 2009). To provide a hybrid example, field investigations often study groups, units, or organizations with ongoing memberships that might be at various stages of their life cycles. Various events, such as membership changes (Hausknecht & Holwerda, 2013; Hausknecht & Trevor, 2011) or environmental challenges (Morgeson et al., 2015), may tax an entity’s processes and states and trigger changes. Additionally, entity states and processes may have long since emerged but may fluctuate as a function of circumstances. Such dynamics may be the focal variance of interest or represent confounding factors depending on one’s research question. Either way, researchers must consider the potential impact and whether it can be captured or necessitates a different study setting to achieve measurement fit.
Summary
Research context incorporates consideration of why the research is being conducted, whereas study context incorporates considerations related to where. These represent important factors that can enable or undermine a study. Researchers must determine whether a given context represents an opportunity to study their intended construct(s) and, if so, how they can optimize the measurement alignment. Building on the discussion of the construct elements, measurement features, and context, we advance a measurement fitting process that adopts an iterative approach to achieving measurement fit.
Measurement Fitting Process
Consistent with the methodological fitting process advanced by Edmondson and McManus (2007), we suggest that measurement fit is “achieved through a learning process…that centrally involves feedback and modification at many stages” (p. 1173). Our measurement fitting process integrates and expands beyond current fit considerations to provide guidance for future research. Our goal is to provide a scaffolding for researchers to simultaneously and iteratively consider competing demands while making measurement decisions to achieve optimal fit and enable modeling of dynamic constructs. Table 3 presents an overview of the guidelines for achieving measurement fit, including the main steps, components, and considerations. A sample illustration of how the guidelines can be applied to a specific measurement challenge is provided in Appendix B.
Guidelines for Achieving Measurement Fit.
Prior to beginning the measurement fitting process, the researcher(s) should determine the topic, phenomenon, or problem of interest. Naturally, dynamic constructs will also need to be embedded in a temporal framework (developmental, episodic, event-based) as the ideas take shape. At this early stage, it is useful to consider the research context (i.e., state of the literature and purpose of the investigation) and have a sense of the research question. A well-defined research question with clear constructs of interest will help minimize the number of iterations, but it is not a prerequisite to begin the process. Particularly for more nascent research areas, novel questions, or inductive approaches, refinements are likely to occur along the way.
Iterative steps
The fitting process starts with fully explicating the construct, including the space, nature, structure, and appearance. These elements should be considered separately and in combination, with reference to the existing literature. Initially thinking more broadly about the construct can assist in the exploration of new areas and articulating the temporal dynamics more fully than in the existing literature. However, there ultimately needs to be clarity and alignment between the granularity and nature of the construct and its measurement. For example, is the researcher interested in multiple types of conflict or only task-based conflict? The former requires a multidimensional measurement plan, whereas the latter is focused on a single dimension. Such alignment helps clarify the construct space of an investigation in the larger nomological network (Schwab, 1980).
The next step in the fitting process is to determine the ideal measurement features, including the content, source, technique, timing, and aggregation. Again, each feature should be considered separately and in combination, frequently referencing back to the construct elements to confirm alignment. In particular, construct type (affect, behavior, cognition) and temporal theory (developmental, episodic, event) should play prominent roles in measurement decisions.
Next, incorporating the specifics of study context can be both enlightening and frustrating. In particular, the study context may influence the appearance of the construct and inform the selection of measurement techniques. For example, many organizations will not permit video-taping or the monitoring of members’ physiological states, yet some high-fidelity simulation environments (e.g., medical, space, or aviation) may welcome the use of such measures. The key question to consider is how the study context alters the manifestation or development of the construct. Consideration of these factors will assist in choosing between viable measurement techniques, collected from which sources, and at what times. The study context also assists in measurement customization (e.g., determining survey items to assess task SMMs) and brings specificity to the study procedure (e.g., after “x” event means on April 15). However, to the extent to which these contextual considerations require modification (rather than clarification) of the study plan, the researcher should revisit previous decisions to ensure they remain appropriate. In addition, the research context should also be revisited to confirm alignment. Although presented here in a relatively linear fashion, this is an iterative process, and revisions can originate from any of the elements, which necessitates revisiting other elements.
Additional verifications
After the researcher has completed the iterative alignment steps, four additional verifications should be conducted. The first three are designed to help reveal whether the researcher can realistically capture an accurate and complete depiction of the dynamic construct. The fourth involves ethical considerations. These additional steps reaffirm the importance of measurement precision and a holistic assessment of the measurement considerations to maximize construct validity.
The first verification involves assessment of completeness. The researcher should consider whether the construct has been fully explicated and will be comprehensively captured utilizing the selected measurement features. In particular, the researcher should consider whether sufficient attention has been given to how the construct will manifest in the study context and whether the selected measurement techniques are likely to capture the relevant facets of the construct throughout the relevant time period in a way that fully addresses the research question. Well-articulated constructs help ensure completeness and minimize measurement contamination and deficiencies.
The second verification involves accuracy. The researcher should consider whether the measurement features are calibrated to accurately capture all relevant facets of the constructs throughout the relevant time period in the study context with the degree of precision required to establish convergent and discriminant validity from related constructs. Furthermore, sufficient consideration needs to be given to the potential influence of the measurement techniques on the construct of interest. Stated differently, researchers should be aware of validity threats, such as social desirability, evaluation apprehension, experimenter expectancies, and the Hawthorne effect, that are likely to undermine the accuracy of the data. Although indicators are inherently imperfect representations of the latent construct, to the extent to which there are inaccuracies or biases in the construct conceptualization or measurement, the study results become suspect.
The third verification involves an assessment of feasibility. In addition to the constraints imposed by the study setting, it is important to consider feasibility from the perspective of the researcher and participant. For example, is the measurement protocol so daunting that it would overwhelm the researcher(s) (e.g., real-time observation of 50 co-occurring interactions, with no video tapes)? Additionally, researchers must consider how much they can ask of participants before the quality of data becomes compromised and/or interferes with the participants’ accomplishment of their primary work. For example, many individuals would be unable to accurately recall every instance of conflict between every dyad of team members over the past year and unwilling to chronicle every instance of conflict over the next year. Even for studies of relatively shorter duration, participants may become fatigued, disinterested, or resentful.
The fourth verification involves ethical and privacy issues. Studying the development of and changes in behaviors, cognitions, and affective reactions means that individuals need to be monitored frequently. Evolving minimally intrusive big data techniques offer exciting opportunities for researchers but also the potential for abuses. Video recording has become ubiquitous in modern society, but data mining of employees’ conversations, text and email messages, and movements throughout the day leaves little unrevealed. In addition, the monitoring of employees’ emotional and psychological states through direct measures may make employees feel as though they must sign away their privacy rights as a condition of employment, an arrangement that would rarely be approved by institutional review boards. Moreover, the potential use of continuous monitoring devices often triggers extra scrutiny by host organization lawyers and employee unions.
Review
Although some studies of dynamic constructs may exclusively examine one construct as it emerges and changes over time, many scholars will desire to study multiple dynamic constructs simultaneously. After completing the measurement fitting process for each variable, it is important to consider the sum of the decisions with respect to the overarching study design and research question. Regarding the overarching study design, researchers should consider whether the combination of the intended measurement protocols is tenable and suitable for the context and assure alignment between the level of theory and method (Krasikova & LeBreton, 2012). This final review links the measurement fit concerns with the overall methodological fit concerns of the investigation. It also affords the opportunity to terminate an investigation before too much is invested if measurement alignment and other requirements (e.g., statistical power) cannot be sufficiently addressed or the proposed study protocol does not sufficiently address the research question. Although measurement fit is an important component of high-quality research, it is not the only component. The question investigated must be worth answering.
Philosophy of Research
The measurement fitting process advanced previously suggests the researcher has identified a construct, phenomenon, or problem of interest, which is compatible with both deductive and inductive approaches to the research. Big data techniques may be used in deductive approaches but are often promoted as tools for inductive research (e.g., McAbee, Landis, & Burke, 2017; Roski, Bo-Linn, & Andrews, 2014). Because emergent technologies produce nearly continuous streams of big data, they provide great flexibility in the examination of changes over time, which may be critical when the appearance or timing of the construct is not known a priori. However, we suggest that breaking the boundaries of what is known will likely require a spirit of discovery and embracing both inductive and deductive techniques. Stated differently, theory building may benefit from a combination of or iteration between inductive and deductive approaches. For example, theory building may begin with more inductive forms of reasoning, via the observation of patterns in the real world, which prompts the development of tentative hypotheses and theorizing. That theorizing is then further refined, deducing more specific hypotheses that are systematically tested to further inform the theory. It is important to keep in mind that big data streams will not generate knowledge in a vacuum; they have to be used and interpreted in the context of a theoretical framework and research paradigm (Mathieu, 2016). In a related vein, we believe that problem-focused research can guide a process called abduction, which can involve inductive and/or deductive inferences (Mathieu, 2016). Viewing empirical examination as a way to inform rather than exclusively confirm (or not) the original theorizing represents an important step forward for theory development. This suggestion is aligned with recent calls for transparently engaging in post hoc analysis of scientific data to “promote the effectiveness and efficiency of both scientific inquiry and cumulative knowledge creation” (Hollenbeck & Wright, 2017, p. 5).
Discussion
Dynamic phenomena that emerge and change over time are important topics of interest to numerous scholars. Although investigations of the static states of dynamic phenomena have generated a wealth of important information, numerous research questions about the core of the dynamic phenomenon itself (e.g., how and why these constructs emerge and change over time) can only be addressed using dynamic research methods. One of the primary impediments hindering the examination of dynamic phenomena has been the challenges associated with collecting data at a sufficient frequency and duration to accurately model changes over time. Emerging technologies that produce nearly continuous streams of big data offer great promise to address those challenges and enable examination of dynamic research questions. However, they also introduce new methodological challenges and construct validity concerns. With the overarching goal of accelerating the advancement of dynamic theories and methods, we strove to integrate the emergent technologies into the existing repertoire of measurement techniques and offered an iterative approach to achieving measurement fit to reduce threats to construct validity.
Implications
First, we provided an overview of dynamic constructs and temporal frameworks (i.e., developmental, episodic, and event-based models), highlighting their measurement implications. Second, we discussed the data streams (i.e., behaviors, words, physiological responses) that represent the elemental content from which constructs are derived. We then offered illustrative examples of emerging measurement technologies and the big data streams they capture. Discussing emergent technologies in terms of their functions (e.g., collecting, indexing, interpreting) and constituent data streams creates a common language, facilitates identifying similar and unique aspects of newer and more traditional tools, and thereby promotes integration of emergent technologies into the existing repertoire of measurement techniques. Furthermore, understanding the singularities, similarities, and synergies of different data streams is critically important to determine when the different measurement techniques should be used.
No measure is a perfect representation of its intended construct. Consequently, researchers need to consider a number of competing demands to achieve optimal measurement fit for their particular application. Therefore, we advanced a measurement fitting process, which took an iterative approach involving three core components: (a) construct elements, (b) measurement features, and (c) contextual considerations. For each of these facets, we integrated and expanded on existing guidelines. We then advanced an iterative process for achieving measurement fit, including key considerations at each step and full process verifications, including completeness, accuracy, feasibility, and ethics. In so doing, we offer guidance on how to comprehensively think through important measurement fit considerations.
Measurement fit enables better tests of theory by refining and expanding construct validity considerations and in turn, builds a stronger knowledge base for the domain. Achieving measurement alignment is key to unleashing theoretical and empirical advancements in the study of dynamics. This article builds a bridge between the theoretical advancements in the conceptual understanding of dynamic constructs (e.g., Cronin, 2015; Kozlowski, 2015) and advancements in the analytical tools to examine those proposed relationships (e.g., Bliese & Lang, 2016; Putka, Beatty, & Reeder, 2018; Tonidandel, King, & Cortina, 2016). Absent construct-valid and useable measures of dynamic constructs, the sophistication of those theoretical and analytic frameworks will remain disconnected, and progress will be impeded.
Practical Considerations
Examinations of dynamic constructs require a different mindset and different skill sets than static studies (Cronin, 2015). Due to the complexity of the temporal dynamics, future research on dynamic changes may look very different from current research on static relationships, as an examination of a single construct over time offers numerous areas for theoretical contributions. In addition, conducting longitudinal research, particularly using some of the newer devices, may require further development of research site relationship management skills. Researchers typically discuss gaining access to organizations to conduct one or two survey administrations, perhaps paired with some external criteria data, in exchange for summary feedback reports. However, the examination of dynamic phenomena requires a much greater level of intrusion and involvement with the research site, and these so-called “unobtrusive” measures can at times be very obtrusive, even invasive, to the potential study participants. Gathering such data will likely require greater relationship building and management skills with client organizations than is typically practiced at present.
In addition, big data has been proffered as a means of studying entire organizations in near real time (George et al., 2014). However, the financial costs and logistical challenges of equipping hundreds (or thousands) of individuals with wearable sensors that require daily data downloads and charging are likely prohibitive for most researchers. Even when equipment costs can be minimized, there are still costs associated with the storage and processing of massive amounts of data. Beyond financial costs, each of these new techniques and technologies requires a substantial investment of personal time and effort to learn. As these investments into new techniques and technologies may not be feasible or desirable for all researchers, the examination of dynamic phenomena may require collaborations between larger groups of researchers.
Future Directions
Big data techniques are likely to play an important role in accelerating the theoretical advancement of dynamic phenomena. Arguably the biggest advantage of big data is that they allow the investigation of temporal features of dynamic phenomena by capturing the entire movie reel as opposed to trying to draw inferences on the basis of a few snapshots. Such data are suitable for new and evolving statistical techniques such as growth modeling (e.g., Bliese & Lang, 2016) and multilevel dynamic network analyses (Zappa & Lomi, 2015). Moreover, having the capability to model dynamics beyond what a limited number of measurement occasions affords will encourage scholars to advance and test new theories pertaining to emergence, sustainability, discontinuous change, and a host of other time-related phenomena.
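The contrast between a movie reel and a few snapshots can be made concrete with a small sketch. The example below is purely illustrative and rests on assumptions that are not part of the original discussion: the inverted-U `trajectory` function is hypothetical, as are the helper names `net_change` and `peak` and the choice of 11 measurement occasions. It is not a growth model; it only shows how two measurement occasions can imply no change when a denser series reveals a rise and decline.

```python
def trajectory(t):
    """Hypothetical inverted-U trajectory: rises until t = 5, then declines."""
    return 25 - (t - 5) ** 2


def net_change(times):
    """Change inferred from the first and last measurement occasions only."""
    return trajectory(times[-1]) - trajectory(times[0])


def peak(times):
    """Time and level of the highest observed value across all occasions."""
    return max(((t, trajectory(t)) for t in times), key=lambda pair: pair[1])


snapshots = [0, 10]      # two measurement occasions (assumed)
dense = list(range(11))  # eleven measurement occasions (assumed)

print(net_change(snapshots))  # change implied by the two snapshots
print(peak(snapshots))        # highest level visible in the snapshots
print(peak(dense))            # highest level visible in the dense series
```

Under these assumptions, the two snapshots suggest a stable construct, whereas the dense series locates a peak midway through the observation window. Real trajectories are noisy and would call for the statistical techniques cited above rather than a simple maximum.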
As technology and the use of big data techniques continue to advance, it will be important for these techniques to become increasingly reliable and replicable. Currently, the level of reliability of big data–related technologies varies notably. For example, McKenny, Aguinis, Short, and Anglin (2016) have outlined particular reliability models applicable to CATA. Furthermore, CATA dictionaries can be developed and their content validity assessed and applied across situations. Alternatively, wearable sensors, although evolving rapidly, still have reliability challenges associated with the technology itself (e.g., calibration issues), participants’ use (e.g., inadvertently blocking signals), and contextual issues (e.g., salience of face-to-face or other relative positions; see Chaffin et al., 2017). Many of the device-related issues will likely be resolved as technologies mature; yet, researchers should remain vigilant and conduct controlled tests of all equipment prior to deployment. Accordingly, important directions for future research include the continued development of structured processes or methods to facilitate contextualizing constructs as well as the generation of guidelines for ensuring reliability and validity in the development of big data–style measures (e.g., McKenny et al., 2016), similar to the guidelines that exist for survey scale development (e.g., Hinkin, 1995; MacKenzie et al., 2011).
There is also a need to advance theory and methods to integrate data from multiple streams. For example, individuals’ nonverbal cues can substantially enhance or undermine the meaning of their words. Further investigation is required to determine (a) which combinations of data streams are particularly potent, (b) which combinations are associated with the greatest confusion or misinterpretations, and (c) what the absence of certain data streams signals in terms of processes. For example, sometimes the absence of communications in cockpits or surgical suites signals a lack of situational awareness and dire circumstances, whereas in other instances, it stems from highly developed implicit coordination processes. In short, future theory and research need to address how to combine multiple streams of data (e.g., behaviors, words, physiological responses) via different methods of measurements as aligned with different constructs (e.g., affect, behavior, cognition) and temporal theories (e.g., developmental, episodic, event).
In conclusion, it is an exciting time to be researching dynamic constructs. New and emerging technologies are rapidly developing and providing researchers with continuous streams of data at an unprecedented volume and detail. These microlevel data enable researchers to examine a vast variety of temporal trajectories—ranging from more simple linear and curvilinear trajectories to cycles spiraling up or down, intensifying or waning over time.
Appendix A
Sample Studies of Organizational-Level Dynamic Constructs.
| Construct | Description of Measurement Fit |
|---|---|
| Emotional culture (affect) | O’Neill and Rothbard (2017) examined the emotional component of masculine organizational cultures using 1- to 2-hour semi-structured group interviews and observations at 27 fire stations. Since the culture of organizations can be vastly diverse and context specific, the authors first explored high-level themes with qualitative method (i.e., interviews and observations). Then they quantitatively tested developed themes and hypotheses in Study 2 with survey data. This study demonstrates a mixed-method study exhibiting a high measurement fit for a context-specific construct, which enabled the authors to generate a more nuanced understanding of organizational culture, emotions, and gender. |
| Knowledge transfer (behavior) | Maurer, Bartsch, and Ebers (2011) examined the mediating role of knowledge transfer between social capital and organizational performance by administering a survey to key informants (project leaders) of 218 projects in the German engineering industry. In this paper, knowledge transfer was conceptualized as comprising mobilization (i.e., sharing between actors), assimilation, and use of knowledge within the organization (beyond the transfer of knowledge itself). Accordingly, their measurement of knowledge transfer is well designed to capture the three subprocesses, with items accounting for the degree to which particular knowledge (i.e., market knowledge and technology knowledge) was discussed between project and other organization members and the degree to which this knowledge increased recipients’ knowledge or led to actions. By doing so, the authors were able to reconcile the mixed findings in the social capital–organizational performance relationship. |
| Long-term orientation (cognition) | Flammer and Bansal (2017) investigated how the long-term orientation of organizations impacts firm value, using a textual analysis of the firm’s 10-K filings from the EDGAR database. Specifically, they counted the number of keywords referring to the short term and long term, respectively. They then computed the long-term (LT) index as the ratio of the number of long-term keywords to the sum of long- and short-term keywords. This study demonstrates the use of text analysis on archival data to generate a proxy variable to represent an abstract construct in a systematic and consistent way. By doing so, this study generated insights on the time-based agency conflict between managers and shareholders. |
| Organizational ambidexterity (blended) | Andriopoulos and Lewis (2009) examined organizational ambidexterity (exploitation and exploration) in their comparative case study of the leading new product design (NPD) consulting firms. The study setting enables examination of the research questions because each firm had demonstrated consistent profitability and repeat clients (i.e., exploitative innovation) while also being highly ranked for cutting-edge design and receiving prominent design awards (i.e., explorative innovation). An intensive set of data was collected over more than four years, comprising (a) 86 semi-structured interviews, (b) archival data (industry reports and internal documents), and (c) observations (informal, nonparticipant site visits). By utilizing data from multiple sources and selecting a study context in which the construct of interest manifests, this study offers a deeper understanding of an alternative framework for examining exploitation-exploration tensions and their management across organizational levels. |
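
The long-term (LT) index described in the table above for Flammer and Bansal (2017) is a simple ratio: the count of long-term keywords divided by the sum of long-term and short-term keyword counts. The sketch below illustrates that ratio only. It is not the authors' implementation: the keyword lists are hypothetical placeholders rather than the dictionaries used in the study, the function names `count_keywords` and `lt_index` are invented for illustration, and returning `None` for a text with no keywords is an assumption.

```python
import re

# Hypothetical keyword lists for illustration; not the study's dictionaries.
LONG_TERM_KEYWORDS = ("long term", "long run", "years", "decade")
SHORT_TERM_KEYWORDS = ("short term", "short run", "quarter", "month")


def count_keywords(text, keywords):
    """Count whole-word occurrences of each keyword in the text."""
    # Lowercase and replace punctuation (including hyphens) with spaces,
    # so that "long-term" and "long term" are counted alike.
    normalized = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return sum(
        len(re.findall(r"\b" + re.escape(keyword) + r"\b", normalized))
        for keyword in keywords
    )


def lt_index(text):
    """Ratio of long-term keywords to all long- and short-term keywords."""
    long_count = count_keywords(text, LONG_TERM_KEYWORDS)
    short_count = count_keywords(text, SHORT_TERM_KEYWORDS)
    total = long_count + short_count
    if total == 0:
        return None  # assumption: index is undefined without any keywords
    return long_count / total


sample = (
    "Our long-term strategy spans years; this quarter and next month "
    "matter less in the long run."
)
print(lt_index(sample))
```

In this sample, three long-term and two short-term keywords are found, so the index is three fifths. A study-grade measure would additionally require validated dictionaries and reliability checks of the kind discussed for CATA.
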
Appendix B
Authors’ Note
The views, opinions, and findings in this article are those of the authors and shall not be construed as an official Department of the Army position, policy, or decision.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was conducted with support from the United States Army Research Institute (Contract: W911NF-15-1-0014; The Development of Construct Validation of Unobtrusive Dynamic Measures of Team Process and Emergent States).
