Abstract
Organizational scholars frequently rely on experiments using human confederates or descriptions of vignette characters to study a range of phenomena. Although experiments with confederates allow for realism and rigor, human confederates have several critical limitations. We present a novel and efficient alternative: the use of responsive electronic confederates for manipulating constructs in dyadic, group, and team contexts. Specifically, we (a) define electronic confederates in an organizational research context, identify their optimal qualities, and review studies that have used them; (b) discuss challenges of utilizing human confederates and how electronic confederates may address these; (c) identify boundary conditions around using electronic confederates and, within these conditions, identify the many types of inquiry that can be aided by electronic confederates; (d) discuss validation strategies for electronic confederates, while increasing their believability to study participants; (e) provide materials for two versions of an adaptable research platform involving electronic confederates; and (f) identify future opportunities for developing novel tools for behavioral research. Our article thus provides a toolkit for organizational researchers that empowers them to utilize electronic confederates in their own research.
While organizations race to capitalize on sophisticated artificial intelligence (AI) to increase efficiencies, minimize labor costs, and decrease error rates (Ford, 2015), organizational researchers across a wide range of areas may also reap the benefits of simpler forms of AI—namely, “electronic confederates” (ECs). ECs are computer-based “actors” that simulate interpersonal responsiveness to human inputs in a dynamic and humanly authentic manner; more succinctly, ECs are responsive electronic agents who behave convincingly like humans.
ECs are used to manipulate study participants’ interpersonal experiences (e.g., social facilitation, social interaction, and/or observations of others’ social behavior) and contextual experiences (e.g., exchanges with others who vary in their levels of politeness, creativity, agreeableness, leadership behaviors, and myriad other traits and behaviors) in realism-maximizing ways that heighten researchers’ ability to infer generalizable causal effects from manipulated variables. Accordingly, experiments utilizing ECs offer significant opportunity to scholars in organizationally relevant fields, including human resources, organizational behavior, entrepreneurship, and strategic decision making. In this article, we introduce a range of EC technologies and recommendations for next-generation EC technology for organizational research (including “chatbots”). To expedite use of ECs in organizational researchers’ experiments, we present the following: First, we introduce three optimal qualities of ECs. Second, we review organizational research to date that has used ECs, including evidence of their efficacy as a research tool. Third, we highlight strengths of ECs, which may address some limitations of human confederates. Fourth, we discuss constraints surrounding the use of ECs. Fifth, we identify broad areas of research inquiry in which ECs may be useful to organizational scholars. Finally, we close by discussing future opportunities for developing more advanced EC technologies. To facilitate quick identification of key points, each subsection begins with a “section preview” highlighting the specific “take-home” message.
Optimal Qualities of ECs
Section Preview: ECs Are Optimized by and Can be Evaluated on Qualities of Evocativeness, Responsiveness, and Conceptual Faithfulness
We propose that EC technologies can be evaluated by three qualities, with the goal for any EC-based study to maximize all three to the greatest extent possible. We propose these qualities as a guiding heuristic for scholars adapting or developing EC platforms. We note that this rubric is nonexhaustive (i.e., other considerations are likely to emerge as ECs proliferate in organizational scholarship).
The first quality, evocativeness, refers to ECs’ ability to elicit natural interpersonal responses from naïve experimental participants. Evocativeness reflects that the participant is engaging with the confederate as if it were human, often predicated on the belief that the EC is human.¹ Evocativeness implies that naïve experimental participants will either (a) take for granted that they are interacting with another person or (b) believe that they are interacting with other people and thus respond as if this is occurring. Making ECs’ behavior seem as authentically human as possible requires balancing two seemingly contradictory goals: the need to constrain the behavior of the EC(s) so that it operationalizes the variables of interest, and the need to provide sufficient dynamic interaction opportunities for naïve experimental participants to feel interdependently involved with their study’s other “participants” (actually ECs). To illustrate the importance of evocativeness, it is useful to contrast ECs with a similar technology used in behavioral research. “Agents” within agent-based modeling simulations are intended to simulate autonomous behavior and interact only with other simulated agents (Fioretti, 2013) and are constructed for the sake of formalizing mathematical models to describe how individual behavior may impact a large and complex system (e.g., Huang, Parker, Filatova, & Sun, 2014). Unlike EC-based experiments, agent-based models are purely simulations, and agents produce “data” in the form of end system states that can be compared against other observed outcomes that are generated by modifying system inputs or the “behavior” of individual agents (Vancouver & Putka, 2000). However, because ECs are designed explicitly as experimental stimuli, they are useful only to the extent to which they can elicit naturalistic data from human participants.
The second quality, responsiveness, refers to the level of interpersonal adaptation with which ECs reply to inputs of experimental participants. This responsiveness is lacking when experimentalists use nondynamic stimuli, as occurs in the common experimental practices of using alternative versions of vignettes or video recordings (Brooks, Huang, Kearney, & Murray, 2014; Hekman et al., 2010). However, researchers can make such video vignettes more adaptive and interactive. For example, pairing segments of videos with conditional branching “decision points” can allow participants to choose from a set of options to “interact” with actors in a video, representing the simplest form of ECs.² A study of venture capital funding decisions, for instance, might allow a participant to choose from a set of questions to “ask” an entrepreneur in the video (which would then trigger a video segment with the entrepreneur providing a corresponding answer).
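The conditional branching described above can be sketched in a few lines of code. The following is a minimal illustration only, not the implementation of any published platform; the segment names, clip file names, and question wording are all hypothetical and chosen to mirror the venture capital example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Segment:
    """One video segment plus the questions a participant may 'ask' next."""
    clip: str  # hypothetical file name of the video segment
    # Maps each question shown to the participant onto the id of the next segment.
    choices: Dict[str, str] = field(default_factory=dict)


# Hypothetical vignette: a participant questions an entrepreneur after a pitch.
VIGNETTE: Dict[str, Segment] = {
    "pitch": Segment("pitch.mp4", {
        "What is your revenue model?": "revenue",
        "Who are your competitors?": "competitors",
    }),
    "revenue": Segment("revenue_answer.mp4", {
        "Who are your competitors?": "competitors",
    }),
    "competitors": Segment("competitors_answer.mp4"),  # no choices: vignette ends
}


def run_vignette(vignette: Dict[str, Segment], start: str,
                 choose: Callable[[List[str]], str]) -> List[str]:
    """Play segments from `start` until one offers no further questions.

    `choose` receives the questions on offer at a decision point and returns
    the one selected. Returns the clips shown, in order.
    """
    shown = []
    current = start
    while True:
        segment = vignette[current]
        shown.append(segment.clip)
        if not segment.choices:
            return shown
        question = choose(list(segment.choices))
        current = segment.choices[question]
```

In a real study, `choose` would present the options on screen and record the participant's selection; for a dry run, `run_vignette(VIGNETTE, "pitch", lambda qs: qs[0])` walks through the pitch, revenue, and competitor clips in that order.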
The third desirable quality of ECs, conceptual faithfulness (including both construct validity and content validity), describes the ability of ECs to faithfully represent the theoretical conceptualization of the construct being investigated. For example, ECs have been designed to behave rudely (Schilpzand, Leavitt, & Lim, 2016), to display personality traits (Erez, Schilpzand, Leavitt, Woolum, & Judge, 2015), and to ostracize others (Williams & Jarvis, 2006). While construct validity and content validity represent a goal for all experimental manipulations, the ability of ECs to standardize participants’ experiences while removing irrelevant confederate characteristics (such as attractiveness or age) may allow researchers to maximize correspondence between the desired theoretical construct and the operationalized manipulation. Later in this article we describe strategies for ensuring and maximizing the construct validity and content validity of EC-facilitated manipulations.
Review of Previous Experimental Research Utilizing ECs
Section Preview: While ECs Represent a Nascent Tool Within Organizational Research, the Published Use of Two Platforms Highlights the Promise of ECs as a Research Tool
The experimental EC platform Cyberball (Williams, Cheung, & Choi, 2000) was first used in psychology research nearly two decades ago. Cyberball was initially developed to demonstrate that social ostracism is such a broad concern for humans that it could be experienced in a relatively impoverished social environment (i.e., an online game with strangers; see Williams et al., 2000). In Cyberball, a naïve focal participant is told by the experimenter that he or she will be engaging in a virtual team task, interacting online with two other team members (who, unbeknownst to the participant, are actually ECs). The virtual team task involves a simple ball-tossing exercise. The ball-tossing actions of the ECs are manipulated by the experimenter to create the experience of either social inclusion or social exclusion for the focal participant.
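The logic of such a manipulation can be illustrated with a short sketch. This is not the Cyberball source code; it is a simplified model under stated assumptions: three players, ECs that toss at random in the inclusion condition, and ECs that stop tossing to the participant after a small number of warm-up tosses in the exclusion condition. The function names and the `warmup` parameter are hypothetical.

```python
import random


def ec_toss_target(holder, players, participant, condition,
                   tosses_to_participant, rng, warmup=2):
    """Choose whom an EC throws the ball to.

    In the "inclusion" condition the EC picks any other player at random.
    In the "exclusion" condition the EC stops throwing to the participant
    once the participant has received `warmup` tosses (an assumed value).
    """
    candidates = [p for p in players if p != holder]
    if condition == "exclusion" and tosses_to_participant >= warmup:
        candidates = [p for p in candidates if p != participant]
    return rng.choice(candidates)


def simulate(condition, n_tosses=30, seed=0, warmup=2):
    """Run one simulated game; return how many tosses the participant receives."""
    rng = random.Random(seed)
    players = ["participant", "ec_1", "ec_2"]
    holder = "ec_1"
    received = 0
    for _ in range(n_tosses):
        if holder == "participant":
            # A real participant chooses freely; a random pick stands in here.
            target = rng.choice([p for p in players if p != holder])
        else:
            target = ec_toss_target(holder, players, "participant",
                                    condition, received, rng, warmup)
        if target == "participant":
            received += 1
        holder = target
    return received
```

Under these assumptions, a participant in the exclusion condition never receives more than `warmup` tosses, whereas a participant in the inclusion condition continues to receive the ball throughout the game. Because the ECs' behavior is fully determined by the condition and the random seed, every participant in a given condition encounters the same toss policy.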
Social psychologists have long been using this paradigm to investigate a variety of research questions regarding ostracism. However, only recently have organizational scholars begun using Cyberball to address questions about workplace behavior. For example, Whitson, Wang, Kim, Cao, and Scrimpshire (2015) used Cyberball to test more nuanced circumstances that explain individuals’ choice to exclude or include others—specifically, whether individuals’ tendency to use social exclusion with norm violators and to use social inclusion with norm conformers is stronger for individuals with greater (rather than less) job mobility and is weaker for individuals from collectivist (rather than individualistic) cultures. Narayanan, Tai, and Kinias (2013) used Cyberball to explore how individuals use social connections to recover from ostracism—specifically, whether individuals with higher power or position levels are more likely to seek social connections after experiencing social exclusion. Kouchaki and Wareham (2015) used Cyberball to test whether individuals experiencing social exclusion on work tasks feel more psychologically aroused and, in turn, behave unethically. Although many insights have been gleaned via Cyberball, its singular focus on social exclusion has limited its application.
Researchers interested in other domains of inquiry have adopted a tool called “Synergize” (Erez et al., 2015; Schilpzand et al., 2016). Similar to Cyberball, Synergize consists of a focal participant interacting on a timed team task with team members who (unbeknownst to the focal participant) are actually manipulated ECs. In contrast to the ECs in Cyberball who can only toss a ball, the ECs in Synergize can engage in greater social interaction (more closely resembling human confederates), by using simple communications (i.e., text or voice messages) during the task. Furthermore, researchers are able to manipulate ECs’ attributes, personalities, emotional displays, messages, and task performance.
Erez et al. (2015) used Synergize to examine whether and how emergent biases due to team members’ personalities can affect performance evaluation and reward allotment given by peers. In the study, an EC was manipulated to behave as highly agreeable, disagreeable, extroverted, or introverted. The researchers found causal evidence that extroverted and disagreeable individuals were rated as less competent and given less credit for team performance by their introverted peers. Schilpzand et al. (2016) used Synergize to examine the effects of rudeness directed at a single target versus multiple targets. The EC’s level of incivility was manipulated to target the focal participant, another EC, or both. The researchers found that study participants who experienced uncivil treatment targeting them and another individual (rather than only themselves) generally experienced less psychological disruption (i.e., stress, rumination, and task withdrawal).
Evaluating the relative efficacy of EC-based versus human confederate-based studies is challenging for two reasons. First, because ECs represent a nascent approach for organizational and behavioral research, no systematic or meta-analytic data comparing their efficacy to that of human confederates have been published. Second, given the need for ECs to interact with participants exclusively through electronically mediated communication, it is impossible to create commensurate conditions with a human versus electronic confederate. Despite these challenges, we offer triangulated evidence from studies involving ECs. More specifically, these findings suggest that manipulations using ECs can meet our three criteria: (a) evocativeness, (b) responsiveness, and (c) conceptual faithfulness (construct and content validity).
First, ECs appear capable of the goal of evocativeness. Evidence of this is provided by an article published in Science (Eisenberger, Lieberman, & Williams, 2003) that showed that manipulating ostracism using ECs was sufficient to trigger activation in the anterior cingulate cortex (ACC), an area of the brain associated with physical pain and social pain and isolation. Consistent with this, this study’s participants with higher ACC activation levels tended to report higher levels of distress. Thus, ECs appear sufficient to trigger neurological substrates associated with real social rejection.
Second, evidence suggests that ECs can meet the goal of responsiveness, in that participants in studies to date were generally unaware that they were interacting with ECs. As evidence, non-naïveté rates hover below 3% within both Synergize (Erez et al., 2015; Schilpzand et al., 2016) and Cyberball (Zadro, Williams, & Richardson, 2004). However, in some contexts the effectiveness of ECs may even be robust to participant awareness. When participants playing Cyberball were told ahead of time that the other players were ECs, they found ostracism about as aversive as did participants who were told that the other players were real people (Zadro et al., 2004).
Third, ECs appear capable of sufficiently manipulating constructs of interest (the goal of conceptual faithfulness, specifically construct validity). To wit, two semianalogous studies (Porath & Erez, 2007; Schilpzand et al., 2016) both directly manipulated rudeness via statements directed at participants. In Porath and Erez (2007, Study 1), participants were treated rudely by the experimenter, and in Schilpzand and colleagues (2016), participants were treated rudely by an EC within the Synergize platform. Although the studies used slightly different scales for their manipulation checks, the test statistics comparing the manipulated rudeness condition to the control condition were similar: F(1, 94) = 16.62, p < .01, effect size r = .38 in Porath and Erez (2007), compared to F(1, 125) = 34.88, p < .01, effect size r = .46, calculated for the two equivalent conditions in Schilpzand et al. (2016). While this is not a formal or direct comparison, the similarity in the test statistics preliminarily suggests that the effectiveness of the manipulation is likely comparable for the two methods. Next, we discuss how ECs may address many of the challenges associated with using human confederates.
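For readers who wish to relate the reported F statistics to the reported effect sizes, the sketch below applies a conventional conversion for F tests with one numerator degree of freedom, r = sqrt(F / (F + df_error)). The text does not state which formula was used to obtain the reported values, so this choice is an assumption; under it, the computed values (about .388 and .467) fall within roughly .01 of the reported r = .38 and r = .46.

```python
import math


def r_from_f(f_value, df_error):
    """Effect size r for an F test with one numerator degree of freedom.

    Uses the conventional conversion r = sqrt(F / (F + df_error)); this
    formula is assumed here, not taken from the studies being compared.
    """
    return math.sqrt(f_value / (f_value + df_error))


# Manipulation-check statistics as reported in the text.
r_human = r_from_f(16.62, 94)   # Porath and Erez (2007): rude experimenter
r_ec = r_from_f(34.88, 125)     # Schilpzand et al. (2016): rude EC

print(f"human source: r = {r_human:.3f}")
print(f"EC source:    r = {r_ec:.3f}")
```

As with the comparison in the text, this is descriptive only; the two studies used different manipulation-check scales, so the resulting values should not be read as a formal test of equivalence.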
How ECs May Address Complexities of Studying Social Behavior Using Traditional (Human) Confederates
Section Preview: Compared to Utilizing Trained Human Confederates, Utilizing ECs Often Reduces Costs and Research Time Horizons, Reduces Unwanted Systematic and Random Variance, and Helps Protect Confidentiality
Before discussing human confederate-related complexities and how ECs overcome these, we clarify that the term confederate in the context of experimental research was historically used to describe a real person in the study context who, unbeknownst to the naïve experimental participants, was trained and paid by the experimenter to behave in ways that matched the study’s intended experimental manipulations (e.g., Asch, 1951; Mori & Arai, 2010).
Limitations of human confederates include the possibility that they (a) may feel emotions or moods or prefer behaviors that conflict with their behavioral instructions (e.g., behaving like an extrovert when they are actually highly introverted; cf. Nestel, Mobley, Hunt, & Eppich, 2014), (b) may spontaneously react to experimental participants, for example, if they personally know them or feel physically (un)attracted to them (Landy & Sigall, 1974), (c) may behave in ways that reflect hypothesis-guessing (Kuhlen & Brennan, 2013), (d) may forget (some of) their behavioral instructions, (e) may fatigue, and/or (f) may be difficult to schedule at times that are compatible with the experimenter and/or participants.
These six limitations of human confederates may be at least partially addressed by ECs because electronically programmed actors lack emotions or moods, behave only in ways that are programmed, lack the ability to make judgments about naïve experimental participants, lack hypothesis-guessing ability, and are indefatigable. Therefore, ECs’ ability to consistently and indefatigably evoke interpersonal responses from and respond to the actions of study participants enables experimenters to fully standardize the intended construct manipulations across many experimental sessions. ECs’ behavioral standardization removes many possible sources of both systematic and random noise variance. For example, hypothesis guessing by confederates may introduce systematic variance as confederates “direct” the behavior of participants toward the expected outcome (cf. Kuhlen & Brennan, 2013), which may thus increase the likelihood of Type I errors (i.e., false-positive findings). Similarly, when human confederates’ behavior is altered for reasons that are hypothesis-irrelevant (e.g., modifying their behavior due to physical attractiveness of some participants; cf. Landy & Sigall, 1974), they may introduce unintended noise variance into the study. Such “noise” variance may influence participant responding in unforeseen ways, increasing the likelihood of Type II errors (i.e., false-negative findings). The standardization allowed by ECs may thus greatly reduce both systematic and random unintended variance in an experiment.
Utilizing ECs may also involve a useful trade-off: greater up-front development time and resources (i.e., programming and validating new EC-based manipulations) in exchange for reduced marginal time and resource costs for gathering large quantities of data. ECs do not need to be paid for training time or for showing up to experimental sessions, nor do their schedules need to be accommodated. Indeed, ECs’ lack of physical space and time constraints enables experiments with ECs to work effectively and economically in offline (laboratory) and/or online (web-based) environments. Because many existing platforms (such as Synergize) are easily adaptable, researchers with limited coding skill may modify existing EC platforms and avoid high-cost invoices from a professional programmer (requiring only that the new manipulations be validated; see subsequent section in this article). Although the software licenses to use these platforms are not always free, the cost is often much lower than the cost of using human confederates. Moreover, while adapting and developing new platforms may require significant time, the marginal cost for collecting additional data once developed is especially low. Thus, the time and resource savings of ECs (compared to human confederates) are likely greatest over the span of a research program rather than an individual study. Additionally, like deploying an established survey scale from the literature, using prevalidated EC-based manipulations from prior studies can dramatically reduce the time from study design to completed data collection, as all experimental sessions can be run in tandem. When coupled with online data collection services such as Amazon’s Mechanical Turk or Prolific Academic, researchers may collect vast amounts of relatively high-quality data at a low cost from global or U.S. participants in a matter of hours (Buhrmester, Kwang, & Gosling, 2011; Crump, McDonnell, & Gureckis, 2013; Mason & Suri, 2012).
Finally, compared to human confederates, the inherent naïveté of ECs makes it easier to ensure that core information about the research and its participants is not disclosed to anyone other than the researchers. This offers the additional benefit of supporting compliance with the regulations set by institutional review boards (IRBs) and safeguards the rights and interests of participants.³ In summary, there are many limitations and commensurate risks associated with using human confederates to deliver “treatments” in behavioral experiments that can often be addressed by using interpersonally responsive ECs. That said, the benefits of ECs have boundaries (as do all research tools) that we discuss next.
When Are ECs Most (and Least) Likely to Aid Experimental Organizational Research?
Section Preview: As Summarized in Table 1, EC-Based Studies Are Appropriate for Many Research Questions But Limited by Three General Boundary Conditions
Table 1. Three Boundary Conditions (Constraints) for Using Electronic Confederates.
The realism-maximizing goal of using ECs is undermined when naïve experimental participants perceive these confederates to be nonhuman. Thus, the crux of using ECs in behavioral experiments is structuring the research design so that ECs’ communications are both responsive and evocative. Meeting this design challenge requires understanding that two of the communication capabilities that are distinctly human are (a) being able to choose (rather than simply pursue prespecified) goals and (b) being able to express imaginative thoughts inspired by idea sharing (cf. Hill, Ford, & Farreras, 2015). Understanding ECs’ limitations in true human capability enables us to identify three types of constraints within which organizational scholars must work.
The Interaction Mutuality Constraint
First, EC-aided experiments are suitable for studies of phenomena regarding socially elicited responses that are captured at the individual level. Related to this, ECs may be an especially useful tool when the goal of the study is to examine natural judgments and behaviors that arise from immersion in a social context, including the social judgments a focal individual makes about another individual’s traits or behaviors. Due to ECs’ inability to communicate in ways that are distinctly human, EC-aided studies are not suitable for examining mutual interactions as seen in studies involving jointly or mutually developed goals, feelings, perceptions, opinions, or decisions; hence, they are inappropriate for examining jointly produced outcomes or emergent processes. We refer to this as ECs’ interaction mutuality constraint. At present, they are not an appropriate choice for studying reciprocal or emergent behaviors within teams or dyads, such as breaking conflict spirals (Brett, Shapiro, & Lytle, 1998), team structure and climate (Edmondson, 1999), or the development of reciprocal relationships (Mayer, Davis, & Schoorman, 1995). Thus, ECs are a poor choice when the outcomes of interest lie at the dyadic or team level. For example, if the outcome of interest is a candidate for aggregation to a team level of analysis (Quigley, Tekleab, & Tesluk, 2007) and a study trial includes only one individual participant interacting with ECs, it is likely an inappropriate choice to use ECs. If emergent outcomes such as these are the focus of behavioral experiments, then a more appropriate research tool is the minimal dyad or minimal group technology (i.e., creating randomly assigned ad hoc groups of real people in a laboratory environment; see Brewer, 1979).
The Interaction Complexity Constraint
Second, EC-based designs are suitable for studies examining relatively simple communication-based interactions; we refer to this as ECs’ interaction complexity constraint. ECs are effective when communication between participants and ECs can be structured and limited, such that the focus of the social interaction is directed toward an ostensible task or goal. ECs are thus an appropriate choice when an experimental task is intended to unfold in a linear and predictable manner to maintain participants’ belief that the ECs are real people (the goal of evocativeness), and ensure that most EC behavior is directed toward making the intended manipulation salient (the goal of conceptual faithfulness). To achieve these goals, EC-aided studies should involve an interaction context that is relatively impoverished.
The Interaction History Constraint
Third, EC-aided studies are appropriate for areas of inquiry that involve new acquaintances (e.g., working as part of newly forming teams) or strangers (e.g., the earliest formation stages of trust judgments). ECs are an appropriate choice when examining “first impression” judgments, or when the construct of interest would be “clouded” by interaction histories, friendships, or loyalties. We refer to this as ECs’ interaction history constraint. EC-aided studies that have effectively placed real participants in (allegedly) stranger or new acquaintance interactions include situations involving social judgments relevant in team formation or judgments relevant to hiring (Biesanz, West, & Millevoi, 2007; Funder & Colvin, 1988). By contrast, scholars examining phenomena that require longer interaction histories should rely on trained human confederates, and those interested in manipulating phenomena within the context of established groups would be well advised to consider a quasi-experimental approach (see Grant & Wall, 2009). In summary, these three constraints of ECs illuminate when EC-based research designs will likely aid organizationally relevant inquiries. Next, we identify broad areas of inquiry that are conducive to EC-aided research designs. While this list is not exhaustive, it is intended to inspire organizational researchers to consider EC-based experiments for their research efforts.
Organizationally Relevant Inquiries Conducive to EC-Based Experiments
Section Preview: As Seen in Table 2, Broad Categories of Inquiry Within Organizational Behavior, Human Resources, Entrepreneurship, and Strategic Decision Making Are Well Suited to EC-Based Research
Table 2. Organizationally Relevant Inquiries Conducive to EC-Based Experiments.
ECs are useful for myriad areas of inquiry. For convenience, we divide this section into organizationally relevant inquiries conducive to EC-based experiments for two different sets of scholars: (a) scholars who study “micro-level” phenomena (e.g., organizational behavior/human resources) and (b) scholars who study more “macro-level” phenomena (e.g., entrepreneurship and strategic decision making).
How ECs Can Aid Organizational Behavior and Human Resources Research
Processes and biases in employment interviews and evaluations of work
The first area of inquiry that may benefit from the use of ECs pertains to employee selection and evaluation as a function of biases of social judgment (Funder, 1987), wherein ratings or judgments of others may be unintentionally influenced by a rated target’s personality, demographics, gender, or other performance-irrelevant characteristics (Skowronski & Carlston, 1987). The study of biases in social judgment is inherently broad, including research in domains ranging from biases in customer satisfaction ratings (Hekman et al., 2010) to judgments within employment interviews (Madera & Hebl, 2012) and evaluations of team member contributions (Erez et al., 2015). A carefully designed study using ECs allows scholars to manipulate the desired characteristic of a target while holding actual performance-relevant behaviors more constant than is possible with human confederates or commensurate versions of a video. Moreover, factors that may amplify or attenuate such biases (i.e., moderator variables) can easily be co-manipulated, such as task interdependence with the individual, level of task performance of the rated target, shared identities or attributes with the real participant, or salience/distinctiveness/relative level of the key attribute.
Responses to interpersonal behavior within teams
The second area of inquiry that would benefit from the use of ECs is interpersonal behavior within teams. ECs can be easily programmed to display (via their communications sent to focal participants) interpersonal behaviors ranging from subtle to overt, and negative to positive. Moreover, the chat/dialogue boxes that are used to send ECs’ communications also receive focal participants’ specific responses directed at the EC; as such, the content of these chat/dialogue boxes enables content coding of focal participants’ reactions to stimuli from ECs. ECs are thus useful for studying responses to the individual behavior of others (e.g., judgments or behaviors that occur after witnessing courageous individuals or cheating team members). This may be especially relevant for researchers interested in studying the effects of moral elevation (Aquino, McFerran, & Laven, 2011), punishment within teams (Fehr & Gachter, 2000), whistleblowing (Dozier & Miceli, 1985), gratitude (Emmons & McCullough, 2003), or other responses to observable (mis)behavior. Thus, by changing the task parameters in conjunction with manipulating the behavior of the ECs, researchers can readily allow participants to reward, ostracize, or punish the behaviors of others. For example, one version of Cyberball allows for throwing bombs to eliminate a player (“Cyberbomb”) and another version allows the player to deduct money from other players (“€yberball”) (see Van Beest & Williams, 2006).
Social facilitation effects in “public” performances of work
The third area of inquiry that is conducive to study via ECs pertains to social facilitation effects that stem from basic social pressures, such as feeling watched (e.g., monitoring; see Niehoff & Moorman, 1993), working in a team (e.g., social loafing), or experiencing pressures or encouragement from peers. Via the use of ECs, researchers can also easily manipulate a virtual team’s (supposed) demographic fault lines (Li & Hambrick, 2005) or virtual team members’ (supposed) attributes of similarity. This was done, for example, by Erez et al. (2015) by simply utilizing and directing attention toward pretask profiles of (supposed) team members who were actually ECs.
Spillover effects from interactions with coworkers or customers
Fourth, we suggest that ECs will be useful for studying spillover effects from perceived characteristics or behaviors of coworkers. For example, examining effects such as emotional contagion (Barsade, 2002), emotional labor (Morris & Feldman, 1996), or effects on participant information processing following interactions with a promotion- or prevention-focused team member could all be readily studied via interactions with ECs. Modeling spillover effects may be an especially fruitful area of inquiry using ECs, as virtual interactions with an ostensibly remotely located individual should provide a conservative test of how limited social interactions may impact an individual’s own downstream behavior.
How ECs Can Aid Research on Entrepreneurship and Strategic Decision Making
While experimental studies are historically less common in entrepreneurship and strategic decision making research (Hsu, Simmons, & Wieland, 2017), new best practices for enhancing ecological validity, preferences for multistudy papers, concern for causal claims, and more sophisticated thinking around generalizability have motivated macro-level researchers to incorporate experiments (Stevenson & Josefy, 2019). Specifically, researchers have argued that experiments are particularly useful for entrepreneurship research when studying early venture processes, when studying elements of theory which should generalize across contexts, or when seeking additional control for studying late-stage processes (such as funding decisions) with expert populations such as venture capitalists (Hsu et al., 2017). Thus, we now discuss areas of inquiry within entrepreneurship and strategic decisions where ECs may prove useful.
Drivers of entrepreneurial intentions
The decision to become an entrepreneur is a critical step in the entrepreneurial process, and scholars have devoted significant energy to understanding what factors will ultimately drive that choice (Krueger, Reilly, & Carsrud, 2000; Murnieks, Klotz, & Shepherd, in press). Experimental designs have been utilized to capture in-situ cognitions, including implicit and explicit gender stereotype activation (Gupta, Turban, & Bhawe, 2008). More structural features of decisional environments have also been recently explored to study entrepreneurial intentions, including manipulations of venture progress and autonomy (Gielnik, Spitzmuller, Schmitt, Klemann, & Frese, 2015). We suggest that utilizing ECs can allow entrepreneurship scholars to manipulate information about a potential venture’s feasibility, stereotype threats, or social cues such as role modeling. Moreover, other relevant states (such as learning orientation or emotions; see Hayton & Cholakova, 2012) may be manipulated by priming participants via ECs’ behavior.
Early entrepreneurial processes
Scholars are increasingly interested in studying both the social and cognitive processes associated with entrepreneurs’ opportunity recognition and exploitation (Gish, Wagner, Grégoire, & Barnes, 2019; Grégoire, Barr, & Shepherd, 2010; Singh, 2001). Moreover, valid measures of opportunity recognition have made capturing such processes in an experimental context more tractable (Grégoire, Shepherd, & Schurer Lambert, 2010). To wit, scholars have used experiments to manipulate “in the moment” drivers of opportunity recognition, including self-efficacy (Krueger & Dickson, 1994), idea solicitation framing (Rigtering, Weitzel, & Muehlfeld, 2019), perspective taking (Prandelli, Pasquini, & Verona, 2016), and self-framing (Mitchell & Shepherd, 2010). Such psychological states can be manipulated effectively and inconspicuously through interactions with ECs. Thus, ECs can be a useful tool for exploring the roles of feedback, emotions, and other cues (Cardon, Foo, Shepherd, & Wiklund, 2012) on entrepreneurial recognition and exploitation intentions.
Strategic performance following social interactions
Differences in the strategic decision making processes of professional managers versus entrepreneurs have drawn significant attention (Busenitz & Barney, 1997), with emphasis on decisional processes across levels of risk and uncertainty. While many of these studies rely on quasi-experiments within simulations and scenario-based approaches (Hsu et al., 2017), we suggest that such quasi-experiments can be augmented through the use of ECs. Accordingly, ECs could be used to manipulate factors related to overconfidence among team members, information quality, stakeholder salience, or decision-making frames (e.g., cooperative versus competitive).
Risk perceptions and uncertainty in a venture space
Decisions to take action in the face of uncertainty are often influenced by cognitions and perceptions at the individual level (McMullen & Shepherd, 2006; Palich & Bagby, 1995). For example, Foo (2011) manipulated emotions to study risk perceptions, as decisional biases, distortions, and heuristics often drive entrepreneurial decision making (Busenitz, 1999). Many factors that influence such processes may be especially well suited for study designs utilizing ECs. For example, information quality, tolerance for rule breaking, and social support can all be readily manipulated via ECs to study aspects of entrepreneurial decision making related to in-situ social and cognitive processes.
Judgements, evaluations, and funding decisions based upon entrepreneurial attributes
Because ECs are especially useful for manipulating confederate attributes while holding other aspects of confederate behavior consistent across experimental conditions, ECs may be extended to studying biases in investment and funding decisions related to entrepreneur attributes. To wit, discrete behaviors (Pollack, Rutherford, & Nagy, 2012) and demographic features (Brooks et al., 2014) of entrepreneurs have been shown to influence venture capitalists’ investment choices. More recently, scholars have begun to examine emergent biases between entrepreneur characteristics and venture capitalist/angel investor characteristics (Drover et al., 2017), such that precision in manipulating entrepreneur behavior in controlled ways would greatly benefit study design. Thus, researchers seeking to manipulate entrepreneur educational background, race, or gender would be served well by using ECs. Moreover, attributes and behaviors of focal entrepreneurs, such as entrepreneurial passion (Chen, Yao, & Kotha, 2009), can be engagingly manipulated using ECs. Finally, scholars have utilized experiments to examine investor judgments about likely venture survival as a function of managerial decision making (Shepherd, 1999). However, because access to and time with investors is often limited, these studies have historically relied on scenario approaches (as opposed to full pitches). We suggest that future research may leverage ECs, such that participants can watch an interactive investment pitch unfold in “real time,” with an EC entrepreneur responding to participant questions as the pitch unfolds.
Best Practices for Validating and Using ECs in Organizational Research
Section Preview: Scholars Considering ECs Should Prevalidate and Postvalidate for Construct Validity, Maximize Believability, Improve Engagement When Designing Confederates and the Task Environment, and Consider Ethical Concerns Related to Deception
All experimental designs must consider the validity of manipulations, focusing on the extent to which the constructs of interest are precisely manipulated within a study. Further, psychological realism is a goal and criterion for evaluating the quality of all laboratory experiments, including experiments using ECs. Researchers must primarily consider three important questions when designing studies using ECs to achieve high psychological realism: First, are the ECs sufficiently believable as to appear human (i.e., are they sufficiently responsive)? Second, are the ECs (and the task environment) correctly designed and sufficiently engaging to activate the intended social judgment processes (i.e., are they evocative)? Third, does the presentation or behavior of the ECs precisely represent the intended construct of interest (i.e., are they conceptually faithful)? In this section, we describe several key techniques that behavioral researchers have used or can use to increase the validity, believability, and engagement of ECs and the interaction context. Table 3 identifies concerns related to believability and engagement, potential solutions for increasing both (through design of confederates and context), and the type of poststudy checks relevant to each.
Table 3. Concerns Related to Psychological Realism and Strategies for Addressing Them Within Research Paradigms Using Electronic Confederates.
How to Ensure Conceptual Faithfulness and Increase Believability and Engagement
Ensuring conceptual faithfulness (construct and content validity)
To be useful, ECs must represent a faithful manipulation of the construct of interest (construct validity), while sufficiently covering the essential domains of the construct (content validity). First, because the behavior and comments/statements of ECs are prescripted, scholars may consider the use of expert panels in a given research domain to verify that the confederates’ behaviors converge with the intended construct (definitional correspondence; see Colquitt, Sabey, Rodell, & Hill, 2019), include behaviors related to all dimensions of the target construct (content validity; see Wilson, Pan, & Schumsky, 2012), and sufficiently diverge from related constructs (definitional distinctiveness; Colquitt et al., 2019). For example, Schilpzand and colleagues (2016) provided ten topically expert scholars with common definitions of multiple forms of interpersonal mistreatment. The experts were then asked to classify each potential confederate comment as rude/uncivil, undermining, abusive, or fitting multiple categories. Comments with high interrater agreement for rudeness/incivility (and low correspondence to other “nearby” constructs) were retained and scripted to manipulate the focal EC’s behavior.
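The expert-panel screening described above can be sketched in a few lines of Python. This is a minimal illustration only, not the procedure used by Schilpzand and colleagues: the function names, the example ratings, and the 0.8 retention threshold are all assumptions introduced here.

```python
from collections import Counter


def agreement_by_category(ratings):
    """Return the share of raters who chose each category for one comment."""
    counts = Counter(ratings)
    total = len(ratings)
    return {category: n / total for category, n in counts.items()}


def retain_items(item_ratings, target, threshold=0.8):
    """Keep comments whose share of raters choosing `target` meets the threshold.

    The 0.8 default is an illustrative assumption, not a published cutoff.
    """
    kept = []
    for item, ratings in item_ratings.items():
        shares = agreement_by_category(ratings)
        if shares.get(target, 0.0) >= threshold:
            kept.append(item)
    return kept


# Hypothetical classifications from ten experts for two candidate comments.
example = {
    "comment_a": ["rude"] * 9 + ["undermining"],
    "comment_b": ["rude"] * 5 + ["abusive"] * 3 + ["multiple"] * 2,
}
retained = retain_items(example, target="rude")
```

In this hypothetical data, only the comment that nine of ten experts classified as rude would be retained; the comment with split classifications would be dropped as too close to neighboring constructs.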
Additionally, descriptions of and self-statements by ECs may be constructed to use the language associated with the target manipulation. For example, Erez and colleagues (2015) asked participants to view feedback about each team member ostensibly derived from a personality inventory. Accordingly, participants read that the ECs were described using exact subtraits from Saucier’s (1994) big five inventory to manipulate high/low agreeableness and high/low extraversion. By using the exact traits from a scale intended to measure the manipulated construct, the researchers were able to ensure that the manipulation was faithful to the target construct. As with all experimental manipulations, scholars should consider calculating Lawshe’s content validity ratio for all versions of the manipulation (see Wilson et al., 2012).
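For readers unfamiliar with the statistic, Lawshe's content validity ratio for a single item is CVR = (n_e - N/2) / (N/2), where N is the number of panelists and n_e is the number who rate the item as essential. The short sketch below computes it; the function name and the example panel size are illustrative assumptions.

```python
def content_validity_ratio(n_essential, n_panelists):
    """Lawshe's CVR: (n_e - N/2) / (N/2), ranging from -1 to +1."""
    if n_panelists <= 0:
        raise ValueError("n_panelists must be positive")
    if not 0 <= n_essential <= n_panelists:
        raise ValueError("n_essential must be between 0 and n_panelists")
    half = n_panelists / 2
    return (n_essential - half) / half


# Hypothetical panel: 9 of 10 experts rate a scripted EC comment as essential.
cvr = content_validity_ratio(9, 10)
```

A CVR of 0 means exactly half of the panel rated the item as essential; values closer to +1 indicate stronger agreement that the item belongs in the manipulation.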
Further, researchers should calibrate the intensity of their manipulations to map appropriately onto the construct of interest (e.g., incivility represents a lower-intensity behavior than workplace abuse; Pearson, Andersson, & Porath, 2000), remain salient against the backdrop of the task itself (i.e., stronger manipulations should be employed if the task environment is distracting), and consider how nascent the research topic is (i.e., stronger manipulations when the goal is to demonstrate that a new effect can occur; subtler manipulations when attempting to maximize external validity). Accordingly, organizational scholars should embed manipulation checks in their studies to ensure that participants accurately perceived the construct that was intended to be manipulated. For example, scholars intending to manipulate “entrepreneurial passion” might ask participants to rate the EC on an existing scale of the construct (Cardon, Wincent, Singh, & Drnovsek, 2009) immediately following their interaction. Finally, research designs relying on ECs should employ subtle naïveté and suspicion checks at multiple stages of the study (following the principles of a funneled debriefing procedure; see Chartrand & Bargh, 1996). Checks should begin shortly after the participants’ interactions with ECs and before they complete surveys about the EC’s behavior, as extensive posttask surveys may reduce naïveté.
Increasing believability
We have emphasized how the mechanization of EC-based programs can reduce systematic and random error derived from the use of human confederates. Ironically, the technique used to minimize such systematic error can itself be a problem: overly mechanizing ECs can backfire as participants become suspicious if ECs are too robotic. One obvious concern is the potential for participants to recognize that ECs are not real people. We suggest that non-naïveté matters less when the behavior of confederates is intended to evoke an emotional response in participants—indeed, it is well established that fictional characters evoke real emotional responses from those consuming media (Mar & Oatley, 2008). Moreover, non-naïveté may be less problematic when the goal of the study is measuring or evoking social judgments about the confederate, as “paper people” vignettes are a common and valuable tool for capturing judgments and behavioral intentions (Aguinis & Bradley, 2014). However, confederate believability is especially critical when the outcome of interest is embedded within the social context. For example, if participants are given opportunities to cheat, whistle-blow on unethical team members, or voice objection to racist comments, awareness that other players are not real would call the validity of the data into question. To that end, we now discuss strategies for increasing believability and measuring non-naïveté. Notably, these practices are intended as initial considerations and do not represent empirically validated and exhaustive best practices (which will likely emerge as ECs proliferate in organizational research).
First, study designs employing ECs should be structured to maintain only task-relevant communications between participants and confederates. Within the Synergize paradigm, the text and voice messaging functions allow participants to send messages only to the entire team and only on their own turn. Other strategies for structuring interaction include asking participants to limit communication to task-relevant issues, creating perceived incentive structures against unnecessary communication (e.g., sending comments erodes limited time that could otherwise be used for earning money), or allowing communication only during certain stages of the study (e.g., chat functions are only enabled during a specified time). Limiting dialog box length or seconds available for voice messages among participants/confederates may similarly reduce threats to believability. In addition, ECs are more believable if elements of the study design ostensibly demonstrate that time passes as other “real people” (actually ECs) are joining the task. In the case of Synergize, online participants view several screens suggesting that the program is checking the network for potential players, and another screen requires them to wait as the task populates (e.g., “waiting on final player to join…”; see Erez et al., 2015).
Second, the believability of ECs can be increased if they directly engage the participants in a responsive way. For example, in a study manipulating rudeness, the rude confederate used the participant’s username (e.g., “[participant’s entered username] picked the lamest avatar!”) and reacted to the participant’s actual performance on a creativity task (e.g., “[most recent creative use for a brick entered by participant]?!?! Really?!?!”) as part of the manipulation (Schilpzand et al., 2016).
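Responsive comments of this kind amount to prescripted templates filled with participant input. The sketch below shows one way this might be done with Python's standard `string.Template`; it is not the Synergize implementation, and the dictionary keys, function name, and example inputs are hypothetical.

```python
from string import Template

# Prescripted comment templates modeled on the examples quoted above.
# The keys ("avatar", "answer") are hypothetical labels for illustration.
RUDE_TEMPLATES = {
    "avatar": Template("$username picked the lamest avatar!"),
    "answer": Template("$last_answer?!?! Really?!?!"),
}


def render_comment(kind, username, last_answer):
    """Fill a prescripted template with the participant's own input."""
    template = RUDE_TEMPLATES[kind]
    # Unused keyword arguments are ignored by Template.substitute.
    return template.substitute(username=username, last_answer=last_answer)


# Hypothetical participant input.
avatar_comment = render_comment("avatar", username="sam42", last_answer="doorstop")
answer_comment = render_comment("answer", username="sam42", last_answer="doorstop")
```

Because the wording of each template is fixed, the manipulation stays identical across participants while still appearing to react to each participant individually.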
Third, believability can be increased through designing subtle features of the EC’s persona and behavior that humanize them. For example, to simulate “thinking time,” ECs within both Cyberball and Synergize vary how long they take to respond during their turns based upon probabilistic distributions (i.e., sometimes taking three seconds to respond, other times taking seven). Additionally, to simulate how real individuals behave in an online environment, ECs should be programmed to occasionally make simple typographical or punctuation errors consistent with a computer-based task (e.g., “nice wrok team!!”) and use shorthand for communication (e.g., “pass 2 me next!”). Similarly, if multiple ECs are used, one of the confederates (not central to the manipulation) might be programmed to give mediocre answers, interject light humor, or have a username that includes some benign cultural reference (e.g., “StarbucksAddict”).
Fourth, believability of ECs may also be increased by adding voice chat options to the functionality and employing voice actors to provide comments by the confederates. This feature can be supported by requiring participants to confirm that they have speakers and a microphone enabled at their own computer terminals before the task begins. Related to this, one version of Cyberball allows player photos to be displayed next to player avatars, increasing the humanization of ECs (Williams & Jarvis, 2006). This approach can even be extended in future EC-based experiments to include fictitious (prerecorded) “live video” feeds of other team members sitting at their computer terminals or introducing themselves to the team (with the real participant being asked to do the same).
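The two humanizing features just described, variable response times and occasional typing errors, might be sketched as follows. This is an assumption-laden illustration rather than the Cyberball or Synergize code: the uniform distribution, the three-to-seven-second bounds, the 20% typo probability, and the function names are all choices made here for demonstration.

```python
import random


def thinking_delay(rng, low=3.0, high=7.0):
    """Draw a response delay in seconds; a uniform distribution is assumed."""
    return rng.uniform(low, high)


def add_typo(message, rng, probability=0.2):
    """With the given probability, swap two adjacent letters in the message."""
    if rng.random() >= probability:
        return message
    # Candidate positions where both neighboring characters are letters.
    positions = [
        i for i in range(len(message) - 1)
        if message[i].isalpha() and message[i + 1].isalpha()
    ]
    if not positions:
        return message
    i = rng.choice(positions)
    chars = list(message)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


rng = random.Random(0)  # Seeded so that a session can be reproduced.
delay = thinking_delay(rng)
maybe_typo = add_typo("nice work team!!", rng)
```

Seeding the random generator per session is one way to keep such variation reproducible, so that any two participants in the same condition can be shown to have received equivalent treatment.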
Increasing engagement for the goal of evocativeness
While believability is likely a salient concern for experimenters considering using ECs for the first time, equally worthy of consideration is making sure ECs and the research context are sufficiently engaging to encourage participants to notice the manipulated attributes or behavior of the ECs. Starting with the original studies of coordination losses in teams (Ringelmann, 1913), researchers have demonstrated that merely believing that others are present can profoundly affect both motivation and arousal. Accordingly, ECs must be sufficiently salient to facilitate social perceptions and behavior. Moreover, to the extent that the attributes or behavior of the ECs (as opposed to other features of the task) represent the manipulated factors within the study, those attributes and behaviors must be sufficiently salient for participants to react to them.
One factor relevant to increasing engagement is the use of cover stories that will direct participants’ attention in a certain desired direction. For example, a version of the Synergize task developed for capturing cheating and whistle-blowing (described in the next section) instructs participants that the researchers are focused on how team members build from one another’s ideas (including when an idea gets “scooped”). To wit, the task provides a noticeable and exploitable “loophole” to cheat and maximize earnings. Regardless of the context, a well-constructed cover story for the study can direct participants to focus on key factors within the study to improve the salience of manipulations.
Second, to the extent to which attributes of the ECs are critical components of the study, producing brief “personality profiles” at the beginning of the study can bolster the salience of confederate behavior manipulations. For example, Erez et al. (2015) had participants complete personality survey items and write a brief paragraph describing their own personality. Once the computer assigned participants to their teams (all ECs), they were able to view personality profiles of all team members (including themselves). When participants interacted with the ECs during the actual task, the ECs manifested these traits through relatively subtle comments and actions, and participants were already primed to utilize these comments as confirming information about their beliefs about the ECs.
Third, additional features unrelated to the manipulation itself can increase engagement. For example, providing time limits to participants’ responses can increase arousal. Similarly, additional on-screen details (e.g., a score counter or sound effects) can make the task environment more engaging. Another way to increase engagement is to make the task environment “feel” like other applications that participants are familiar with, since people tend to attribute an object or experience to a group based on how well the object matches a particular prototype (i.e., representativeness bias; Kahneman & Tversky, 1972). For example, the Synergize task environment uses shades of blue and gray similar to the hallmark colors of Facebook, to subtly remind participants of social media platforms that they routinely use for virtual interactions.
Considerations for Research Ethics, Institutional Review, and Deception
Scholars who design EC-aided studies will generally aim for their study’s focal participants to be unaware that the “people” with whom they interact are not people. Thus, studies using ECs include deception, invoking a potential ethical concern. Accordingly, we present some considerations for using ECs ethically.
Given that being deceived is typically an unpleasant experience, studies that rely on deception will have additional hurdles gaining approval from IRBs. Scholars need to explain two things as part of the research proposal they submit to IRBs: (a) why the deception is essential for enabling the study’s hypotheses to be tested; and (b) when they plan to give this same explanation to their study’s participants.
We recommend addressing the first issue by decoupling the “identity” of the ECs from other potential sources of deception within the study. Specifically, researchers should seek to minimize additional sources of deception whenever practical to do so (e.g., in a study about personality, inform participants that the study is, in fact, about personality). Second, ethical standards (and IRB approval) often require that debriefing occurs immediately after participants complete the experiment, and IRBs are reluctant to delay debriefing until after all sessions have been completed. However, this “instant debrief” risks compromising an EC-aided study if study participants share information from the debriefing with other potential subjects. When this risk of participant interdiscussion is high (as occurs when the subject pool involves students who share classes and/or residence-housing together), then one best practice for EC-based studies is to have participants all take part relatively simultaneously (as opposed to scheduled sessions with a human confederate). Thus, debriefing may occur immediately after completion of the study, with reduced concern for contamination.
With considerations for research ethics in mind, we turn attention next to providing two running examples of the use of ECs for behavioral research, including links to supplemental materials for two adaptations of the Synergize task.
Adaptable Materials for EC-Based Research: The Synergize Paradigm
Section Preview: All Necessary Materials for Previewing, Adapting, and Implementing Two Versions of the Synergize Platform Are Described Below, and Linked in Table 4
Table 4. Supplemental Resources Directory.
Synergize is built to allow for an engaging and attractive user interface. The primary feature of Synergize is the ability to manipulate the behavior of the ECs through their comments, the traits of ECs through ostensibly autobiographical personality profile pages, their high/low performance within the task and ethical/unethical game play, and their preferences for sharing turns with the real participant or others. While Synergize is adaptable for studying a variety of behavioral phenomena, we have provided two maximally different versions of the paradigm to allow readers to broadly apply features from both. The current section includes a brief overview of these two versions, along with links to supplemental materials. Notably, these scripts include user-friendly “switcher panels,” which allow the user to substantially modify attributes of the scripts to fit their own needs without significant programming skills.
Overview Features of Both Versions of the Synergize Script
In both versions of the Synergize task, participants are ostensibly working as a team (with ECs) to generate as many uses as possible for a common object (e.g., paper clip, brick) within a limited time (or limited turns, as desired by the experimenter). All three ECs can be quickly modified to communicate via text, via voice messages, or remain silent. The ability to readily adjust both the communications and the provided answers of the confederates allows control over each confederate’s behavior, displayed emotions, and dispositional characteristics. Both versions include an embedded posttask naïveté check, and researchers interested in capturing reactions to the confederates can enable a follow-on module.
All supplemental materials to this article (including descriptions, links to demonstration versions, and videos of the tasks in use) are linked in Table 4. The first version of the task is a faithful remastering of the Erez et al. (2015; Study 2) task, which was used to manipulate confederate personality traits within a team context. The second version of the task is a new version for examining participant behavior in an unethical team context, wherein two teammates discover, announce, and exploit a “loophole,” resulting in a greater financial payment. This version is intended for studying voice, whistle-blowing, and participant unethical behavior.
Future Opportunities for Using ECs in Organizational Research
While this article provides a springboard for organizational scholars to use ECs in their own research, there is substantial opportunity to develop new and more sophisticated tools. We conclude by identifying opportunities for methodologists interested in developing the next generation of ECs. All technologies and tools referenced here can be found in Table 4.
First, as described earlier, researchers currently using vignette or video stimuli can use conditional branching with video stimuli to create simple EC-like technologies and increase both responsiveness and evocativeness. Moreover, while believable “fake” videos create broad ethical concerns for society, emerging technologies can enable researchers to make more commensurate versions of experimental conditions. For example, “deepfake” technology allows for the believable swapping of faces within videos. As such, an experimenter manipulating the race of a video-based EC could swap the face of one actor for that of another, ensuring that tone, inflection, posture, and other features remain identical. Similarly, researchers have developed algorithms that can change words spoken by an individual in a video by simply typing replacement text. This could allow researchers to literally “put words in the mouths” of ECs (e.g., editing video statements made by real executives to create different conditions).
Second, researchers should develop EC platforms that allow multiple naïve participants (minimal group technique) to interact with ECs to get the “best of both worlds.” In this way, ECs can be manipulated to “set the tone” in a newly formed, otherwise naïve team, and data from the non-EC team members would readily be aggregated to form team-level measures.
Third, scholars can use existing AI technologies to create ECs that “learn” to respond more organically to participants, in ways that are still fully consistent with the intended manipulations. Such complex AI technologies are already widely available for commercial purposes. For example, chatbots (e.g., MegaHal, CONVERSE, ELIZABETH, HEXBOT, ALICE) are advanced AI programs that can conduct conversations with individuals via both auditory and textual methods (Shawar & Atwell, 2007). Given their ability to use sophisticated language systems to generate natural replies to human users, they have been broadly used in practical fields such as customer service. Moreover, commercial services already exist to modify existing chatbots for specific needs (such as IBM’s Watson Assistant or Pandorabots). Developers of ECs can use preexisting lexical dictionaries (such as Linguistic Inquiry and Word Count; liwc.wpengine.com) to ensure that the EC utilizes statements representing intended manipulations. This will also allow a wider variety of task environments, as constraints on interactions during the task can be lessened as ECs become more human-like.
Fourth, there is potential for coupling ECs with virtual reality (VR) to achieve an ideal level of psychological realism. The surroundings associated with organizational characteristics impact individuals’ behavior, and thus, contextualization is meaningful for organizational research (Ilgen, Hollenbeck, Johnson, & Jundt, 2005; Johns, 2006). By coupling ECs with VR, organizational researchers can effectively address the limited contextualization of typical laboratory tasks. The vividness and interactivity of VR may allow it to be used in multiple domains that require simulation of interpersonal interaction. Previous research has provided support for the close parallelism between virtual and laboratory environments (Innocenti, 2017). However, little organizational research has adopted VR, although such recommendations were made over two decades ago (Pierce & Aguinis, 1997). We believe that organizational research will benefit greatly from the combined advantages of ECs and VR.
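As a rough illustration of the dictionary-based check mentioned above, the sketch below screens candidate EC statements against a small word list. The word list, function names, and 0.1 cutoff are hypothetical; this does not reproduce the LIWC dictionary or software, which would need to be obtained separately.

```python
import re

# Hypothetical mini-dictionary for illustration; NOT taken from LIWC.
POSITIVE_EMOTION = {"great", "nice", "love", "awesome", "good"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def category_share(text, lexicon):
    """Share of tokens in `text` that appear in the lexicon."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in lexicon)
    return hits / len(tokens)


def flag_off_manipulation(statements, lexicon, minimum=0.1):
    """Return statements whose lexicon share falls below the assumed cutoff."""
    return [s for s in statements if category_share(s, lexicon) < minimum]


# Hypothetical EC replies intended to convey positive emotion.
candidates = ["Nice work team!", "Pass it to me next."]
flagged = flag_off_manipulation(candidates, POSITIVE_EMOTION)
```

A check of this kind could be run on each generated reply before it is shown to the participant, so that more flexible ECs remain consistent with the intended manipulation.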
Finally, as ECs become more prevalent in organizational research, methods-focused scholars should consider targeted work to more directly compare ECs to alternative designs. Scholars may directly assess practical issues such as when non-naïveté is a necessary (or not necessary) condition for evocativeness, the extent to which increased engagement from using ECs (versus static stimuli) will reduce “noise” variance within studies, and the amount of time participants may interact with an EC before evocativeness diminishes.
Concluding Remarks
ECs hold much promise for researchers interested in studying phenomena within a controlled experimental context. While a lack of information about ECs and perceived barriers to entry (such as cost or required programming skills) may have limited their use to date, an analysis of their strengths and limitations coupled with increased availability of readily adaptable tools should allow for more widespread use by both macro and micro organizational scholars.
Acknowledgments
All authors contributed equally to this article, and author order is consequently alphabetical. We would like to thank Charles Murnieks and Pauline Schilpzand for their helpful feedback on this article.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
