Abstract
The efficacy of future human-machine teams will be determined by the ability of machine agents to work cooperatively with their human teammates. The ways teammates engage in cooperative teamwork behaviors might affect human beliefs about expressed cooperativity, and these changes in beliefs might depend on the identity of a teammate as a human or machine. In the current experiment we investigated how violations of cooperation expectations, presented to participants in a narrative vignette, influenced ratings on a measure of cooperation derived from social interdependence theory. As expected, participants rated vignette actors as less cooperative if the story included violated cooperation expectations, but consistent with previous research, the identities of the actors as humans or robots did not appear to influence ratings. Overall, this is a useful expansion on previous work describing the factors that influence perceptions of cooperation in human-machine teaming.
Keywords
Recent advances in artificial intelligence (AI), such as the development of generative AI (e.g., ChatGPT, DALL-E), powerfully illustrate that human-machine teaming (HMT) is possible. People are using such systems to collaboratively write fiction (e.g., Mayne, 2022), generate photorealistic portraits (e.g., Jackson, 2023), and even write software code (e.g., Rawat, 2023). However, there is also a growing awareness of the limitations of these systems (e.g., Birand, 2023), including their narrow ability to engage in teamwork and team processes (e.g., Stowers et al., 2021). These constraints must be overcome to achieve the full promise of HMT.
Cooperation in Human-Machine Teaming
Members of a team share a common goal and have interdependence; in other words, they must cooperate to attain shared interdependent outcomes (e.g., Wageman, 1995). As such, it is critical for the success of future HMT that AI agents engage in cooperative teamwork behaviors and processes with their human teammates. Cooperation, in terms of working toward shared objectives, is critical for effective teamwork. Cooperative tasks require team members to jointly strive toward a shared goal, be able to interfere with each other with respect to goal processes and achievement, and manage goal interference (Hoc, 2000). Since AI agents are currently limited in their ability to interpret others’ intentions and communicate (e.g., Mutlu et al., 2013), and to understand complex and evolving plans and actions (e.g., Schydlo et al., 2018), their ability to effectively cooperate with human teammates may be seen as limited. This is potentially problematic because agents may pay greater “penalties,” in terms of decrements in trust and estimates of reliability, for making social or task errors relative to humans who make similar mistakes (e.g., de Visser et al., 2016).
However, recent research by Funke et al. (2022) exploring violations of cooperation expectations in HMT suggested that the identity of the team members (human or agent) may not matter as much as their behaviors. Funke and colleagues asked participants to read a vignette describing a situation that required cooperative joint action from two actors (described as two humans, or a human and a robot), and then rate several aspects of the cooperation displayed by those partners. Funke et al. (2022) found that their participants rated the partners as less cooperative if the vignette presented them as failing to assist and communicate with each other, but the identity of the actors as two humans or a human and a robot did not appear to influence participant ratings.
These results are somewhat tentative, though, as the vignettes Funke et al. (2022) employed always presented the opportunity for cooperation as a human actor asking a second actor, either a human or a robot, to engage in joint action. The opposite request, a robot asking a human actor for joint action, was not included. Previous research indicates that human compliance with the requests of a robot is weaker than compliance with the requests of a human (e.g., Haring et al., 2021). This suggests that human violations of cooperation expectations around requests from a robot may be judged less harshly than they would be if the requestor were a human.
Cooperation Questionnaire (v2) and Social Interdependence Theory
Funke et al. (2022) based their novel cooperation questionnaire on Deutsch’s (1980) social interdependence theory, which posits that positive interdependence exists when individuals perceive that they can only attain their goals if other individuals with whom they are cooperatively linked also attain their goals. Positive goal interdependence then entails expectations regarding the behaviors and attitudes held by cooperating parties toward each other. Funke and colleagues’ (2022) questionnaire included 24 items that assessed cooperation on four dimensions: expected and actual assistance, communication and influence, task orientation, and friendliness and support. However, a follow-up exploratory factor analysis (detailed in Funke, 2022, to conserve space here) using the data from Funke et al. (2022) indicated a modified set of four dimensions was more appropriate. The adjusted factors are presented briefly in Table 1.
Cooperation questionnaire factors and descriptions.
Overall, these updated dimensions are still well aligned with the expectations of cooperating parties described by Deutsch (1980).
The Current Study
The current experiment was designed as a continuation of the research initiated by Funke et al. (2022) exploring factors affecting beliefs about cooperation in HMT. This was achieved in two ways: by inclusion of an additional actor pairing and by further manipulating violations of cooperation expectations presented in the vignettes participants read.
As described above, Funke et al. (2022) did not include a vignette describing a robot requesting cooperative joint action from a human teammate. The current study addresses this gap by including that actor pairing. We hypothesized, consistent with Funke et al. (2022), that participants would rate teammates as less cooperative if the vignette included violated cooperation expectations around requests from a human actor, but, consistent with Haring et al. (2021), that cooperation would be rated somewhat higher if the violations were to requests from an agent actor.
Concerning our manipulation of cooperation expectations, Funke et al. (2022) focused on violations of expected assistance and communication in their vignettes. In the current study, we build upon that research by including vignettes that separately describe violations of expected assistance, communication, positive attitudes toward cooperating parties, and interference in goal attainment, and hypothesized that participants would rate the actors as less cooperative on the associated subscales when those violations were present in the story.
Methods
Participants
Participants in this study were recruited through CloudResearch and paid $3 for their participation. The questionnaire was hosted on Alchemer.com. All respondents were required to have a US-based IP address and to be native English speakers.
Seven hundred fifty-three individuals initially responded to our questionnaire. However, due to concerns that have been raised regarding data samples generated in online research studies (e.g., Arthur et al., 2021), prior to completing our questionnaire, respondents were required to successfully complete 4 items to determine if they were an automated survey responder (i.e., a “survey bot”); 42 responders failed to pass this requirement. In addition, respondents were required to correctly answer 3 out of 4 additional items designed to determine if they were native English speakers; 103 respondents failed to pass this requirement.
Finally, following the guidelines suggested by Arthur et al. (2021) to reduce low-effort and careless responding in our data set, the data from 180 respondents were excluded from further analysis. Specifically, 9 respondents failed to complete the questionnaire, 62 respondents were outliers in their time to complete the questionnaire (too fast or too slow), 107 respondents provided invariant responding (e.g., responding with the same answer to all items), and 2 respondents failed “catch” items embedded in the questionnaire (i.e., they did not respond with the answer cued in the item text).
As a result of these exclusions, our final sample included responses from 428 individuals (age: M = 41.58, SD = 12.26, range = 20-76 years; 234 men, 189 women, 1 transgender woman, 3 non-binary, 1 preferred not to answer). The n for each condition is presented in Table 3.
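The participant flow described above can be tallied in a few lines. The following is a minimal sketch that uses only the counts reported in this section; the variable names are illustrative and are not part of the study materials.

```python
# Participant flow, using only the counts reported in the Participants section.
initial_respondents = 753

# Screening items completed before the questionnaire.
screening_failures = {
    "failed survey-bot items": 42,
    "failed native-English items": 103,
}

# Low-effort / careless responding exclusions (after Arthur et al., 2021).
careless_exclusions = {
    "incomplete questionnaire": 9,
    "completion-time outlier": 62,
    "invariant responding": 107,
    "failed catch items": 2,
}

total_careless = sum(careless_exclusions.values())
final_sample = (
    initial_respondents - sum(screening_failures.values()) - total_careless
)

# Reported gender breakdown of the final sample.
gender_counts = {
    "men": 234,
    "women": 189,
    "transgender woman": 1,
    "non-binary": 3,
    "preferred not to answer": 1,
}

print(total_careless)               # 180
print(final_sample)                 # 428
print(sum(gender_counts.values()))  # 428
```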
Correlations between subscales and Cronbach’s α for each.
p < .01.
Note. Assist. = assistance and competence subscale; Comm. = effective communication subscale; Att. = attitudes toward cooperating parties subscale; Non-Int. = task-goal non-interference subscale; Chron. α = Cronbach’s alpha.
Experimental Design
In this study, we employed a 3 (actors) × 5 (vignette) between-groups design.
The first factor, actors, indicated if the characters of the vignette were presented as two humans (human-human condition), as a human and robot (human-robot condition), or as a robot and a human (robot-human condition). The order of the actor pairs in each condition indicates who could provide assistance to their partner in a vignette; the first member of the pair was always presented as having the opportunity to provide assistance to the second during the search of a debris pile. For example, in the robot-human condition, the firefighting robot is presented as having the opportunity to assist its human partner in completing the search task.
The second factor, vignette, referred to the context of the story participants were presented with. In each vignette, participants were told that two firefighters were called to the scene of a nighttime fire to search a burning building for trapped survivors. In all vignette conditions, the story culminated with one firefighter requiring assistance from the other to complete the search of a debris pile for a potential survivor. Vignette conditions were differentiated from each other on the attitudes and behaviors of the two firefighters described in each.
In the paragon vignette condition, the two actors were described as liking and respecting each other, as developing a search plan to coordinate their efforts before they entered the burning building, and as providing efficacious assistance to their teammate while searching the debris pile.
In the dislike vignette condition, the two actors were described as disliking each other, as developing a search plan to coordinate their efforts before they entered the burning building, and as providing efficacious assistance to their teammate while searching the debris pile.
In the poor communication vignette condition, the two actors were described as liking and respecting each other, as failing to develop a search plan to coordinate their efforts before they entered the burning building, and as providing efficacious assistance to their teammate while searching the debris pile.
In the no assistance vignette condition, the two actors were described as liking and respecting each other, as developing a search plan to coordinate their efforts before they entered the burning building, but the first firefighter does not assist the second while they are searching the debris pile.
Finally, in the bumbling assistance vignette condition, the two actors were described as liking and respecting each other, as developing a search plan to coordinate their efforts before they entered the burning building, and as assisting their teammate while they are searching the debris pile, but this assistance was particularly clumsy and slowed the speed of the search.
Further details regarding all manipulated factors are presented in Appendix A.
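The crossing of the two factors can be sketched as follows. This is an illustration only: the condition labels are taken from the text, but the uniform random draw is an assumption, since the assignment procedure used by the survey software is not described beyond being random.

```python
import itertools
import random

# Factor levels as named in the Experimental Design section.
ACTORS = ["human-human", "human-robot", "robot-human"]
VIGNETTES = [
    "paragon",
    "dislike",
    "poor communication",
    "no assistance",
    "bumbling assistance",
]

# Crossing every actor pairing with every vignette gives the between-groups cells.
CONDITIONS = list(itertools.product(ACTORS, VIGNETTES))


def assign_condition(rng):
    """Pick one cell uniformly at random (an assumed assignment scheme)."""
    return rng.choice(CONDITIONS)


rng = random.Random(0)  # fixed seed only so the sketch is repeatable
print(len(CONDITIONS))  # 15
print(assign_condition(rng))
```

Each participant sees exactly one of the 15 cells, which is what makes the design fully between-groups.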
Apparatus
As described above, our cooperation questionnaire consisted of four subscales, each broadly corresponding to categories of expectations of cooperating individuals described by Deutsch (1980): assistance and competence, effective communication, attitudes toward cooperating parties, and task-goal non-interference. Participants rated questionnaire items on a 7-point bipolar scale; response choices were labeled as “Strongly Disagree,” “Mostly Disagree,” “Somewhat Disagree,” “Neither Disagree nor Agree,” “Somewhat Agree,” “Mostly Agree,” and “Strongly Agree.” Questionnaire items are presented in Appendix B.
The Alchemer software randomized the order of presentation of subscales and the items of each subscale for each participant.
Procedure
Participants were directed from our Mechanical Turk advertisement to the questionnaire posted on Alchemer.com. Participants were then presented with the informed consent document and asked if they would agree to participate in our study. If they agreed, they were instructed to complete the screening questions. After successfully completing the screening questions, they were assigned at random by the Alchemer software to an experimental condition. They were then asked to read their assigned vignette and to rate the cooperation of the two actors in the vignette using the cooperation questionnaire items. The Alchemer software randomized the order of questionnaire items for each participant. After completing the questionnaire, participants were thanked for their participation and they received a payment code to redeem for their compensation.
Results
Questionnaire Psychometrics
Correlations between the four subscales and Cronbach’s alpha for each are presented in Table 2. Correlations between subscales, and the internal consistency of each, were acceptable (all α > .70).
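For readers unfamiliar with the internal consistency statistic, Cronbach’s alpha for a subscale of k items is k / (k - 1) × (1 - sum of item variances / variance of the summed score). The sketch below implements that standard formula; the function name and the ratings are made up for illustration and are not study data, and this is not necessarily the software used in the analysis.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for one subscale.

    item_scores: one list per item, each holding that item's ratings
    from the same respondents in the same order.
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("need at least two items")
    # Summed subscale score for each respondent.
    totals = [sum(ratings) for ratings in zip(*item_scores)]
    sum_item_var = sum(variance(item) for item in item_scores)
    return (k / (k - 1)) * (1 - sum_item_var / variance(totals))


# Made-up ratings on a 1-7 scale: 3 items, 5 respondents (not study data).
toy_items = [
    [7, 6, 2, 5, 3],
    [6, 6, 1, 5, 4],
    [7, 5, 2, 4, 3],
]
alpha = cronbach_alpha(toy_items)
print(round(alpha, 3))
```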
Manipulation Effects
To test the effects of our manipulations on participants’ ratings of cooperation, we conducted separate 3 (actor) × 5 (vignette) ANOVAs for each subscale of the cooperation questionnaire. The number of participants, means, and standard errors of the mean for each subscale and condition are presented in Table 3.
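The degrees of freedom for these analyses follow directly from the design and the final sample size. A minimal sketch, assuming every cell of the design is populated (the function name is hypothetical):

```python
def factorial_anova_df(levels_actor, levels_vignette, n_total):
    """Degrees of freedom for a two-way between-groups ANOVA
    in which every cell contains at least one participant."""
    df_actor = levels_actor - 1
    df_vignette = levels_vignette - 1
    return {
        "actor": df_actor,
        "vignette": df_vignette,
        "actor x vignette": df_actor * df_vignette,
        # Error df: total N minus the number of cells.
        "error": n_total - levels_actor * levels_vignette,
    }


df = factorial_anova_df(levels_actor=3, levels_vignette=5, n_total=428)
print(df)  # {'actor': 2, 'vignette': 4, 'actor x vignette': 8, 'error': 413}
```

The vignette and error values correspond to the F(4, 413) tests of the vignette main effect.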
Number of participants, means, and standard errors of the mean (in parentheses) for each subscale and condition.
Note. Assist. = assistance and competence subscale; Comm. = effective communication subscale; Att. = attitudes toward cooperating parties subscale; Non-Int. = task-goal non-interference subscale.
Assistance and competence
The results of the ANOVA indicated a statistically significant main effect of vignette condition, F(4, 413) = 42.01, p < .01, ηp² = .289. No other main effects or interactions were statistically significant (all p > .05). Statistically significant differences between vignette conditions for this subscale are presented in Figure 1.
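In a between-groups ANOVA, partial eta squared is SS_effect / (SS_effect + SS_error), which can be rewritten in terms of the F ratio and its degrees of freedom as (F × df_effect) / (F × df_effect + df_error). The following sketch, with a hypothetical function name, applies that identity to the values reported for this subscale:

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    weighted_f = f_value * df_effect
    return weighted_f / (weighted_f + df_error)


# Vignette main effect on the assistance and competence subscale.
eta_p2 = partial_eta_squared(42.01, df_effect=4, df_error=413)
print(round(eta_p2, 3))  # 0.289
```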

Mean ratings for each subscale and vignette condition.
Effective communication
Analysis of this subscale indicated a statistically significant main effect of vignette condition, F(4, 413) = 31.04, p < .01, ηp² = .180. No other main effects or interactions were statistically significant (all p > .05). Statistically significant differences between vignette conditions for this subscale are presented in Figure 1.
Attitudes toward cooperating parties
The results of the ANOVA revealed a statistically significant main effect of vignette condition, F(4, 413) = 109.49, p < .01, ηp² = .515. No other main effects or interactions were statistically significant (all p > .05). Statistically significant differences between vignette conditions for this subscale are presented in Figure 1.
Task-goal non-interference
Analysis of this subscale indicated a statistically significant main effect of vignette condition, F(4, 413) = 53.65, p < .01, ηp² = .237. No other main effects or interactions were statistically significant (all p > .05). Statistically significant differences between vignette conditions for this subscale are presented in Figure 1.
Discussion
The current study was a continuation of the research initiated by Funke et al. (2022) exploring factors affecting beliefs about cooperation in HMT using a measure based on Deutsch’s (1980) social interdependence theory. We hypothesized that participants would rate vignette actors as less cooperative if the story included violated cooperation expectations, but that these ratings may be moderated by the identity (human, robot) of the requestor. This hypothesis was not supported: the identity of the requestor as a human or a robot did not significantly influence participant ratings of cooperation in this experiment.
We also hypothesized that participants would rate vignette actors as less cooperative on the associated subscales of the cooperation questionnaire (i.e., assistance and competence, effective communication, attitudes toward cooperating parties, and task-goal non-interference) if the story included violations of these expectations. This hypothesis was supported by participant ratings. Surprisingly, however, participants consistently rated the vignette actors in the no assistance and bumbling assistance conditions as less cooperative than the paragon condition across all subscales of the questionnaire.
Actor Identities
Interestingly, the identities of the actors in our vignettes, presented as human-human, human-robot, and robot-human pairs, did not influence participant ratings of cooperation. This is more consistent with Funke et al. (2022), who found that actor identity did not strongly influence ratings of cooperation, than with Haring et al. (2021), who found that compliance was strongly influenced by the identity of the requestor. However, the difference may lie in the methodologies applied in each experiment: Funke et al. (2022) utilized vignettes similar to those of the current experiment, while Haring et al. (2021) included human interaction with several different embodied robots. It may be that judgments of cooperation quality made by unaffiliated third parties, such as those the participants in this experiment were asked to make, are not influenced by the identities of cooperating actors, but additional research is required to assess whether this is also the case for judgments made by the cooperating parties themselves during execution of a task.
Vignette and Cooperation Expectation Violations
As mentioned previously, vignettes in this experiment that included violations of cooperation expectations consistent with those described by social interdependence theory (Deutsch, 1980) resulted in reduced ratings of cooperation on the associated subscales of the cooperation questionnaire. However, participant ratings were also consistently lower in the no assistance and bumbling assistance conditions than in the paragon condition across all questionnaire subscales. This is consistent with Funke et al. (2022), who reported similar pervasive reductions in participant ratings of cooperation quality when the parties in their vignettes failed to engage in joint action, and it extends that finding by showing that joint action must also be effective.
As asserted by Deutsch (1980), cooperative actions must productively advance the interests of cooperating parties toward goal attainment; inept actions that impede goal progress (what Deutsch termed “bumbling” cooperation) are to be avoided. The results of the current experiment strongly support this viewpoint, in that the vignettes that described one actor providing clumsy assistance to the other resulted in ratings of cooperation that were as low as vignettes where the actors provided no assistance at all.
These results strongly suggest that, to be viewed as cooperative by their human teammates, future autonomous agents will need to be responsive to requests for assistance, and that the assistance they provide must be perceived as effective by their human teammates. In fact, it may be desirable for future agents to refrain from assisting in situations where they are unlikely to provide competent assistance, and instead to provide an earnest estimate of their limitations (and perhaps an apology; Esterwood & Robert, 2022). Not providing assistance, or providing ineffectual assistance, will negatively influence beliefs about the cooperativeness of an agent teammate.
Limitations and Future Research
Although they largely align with those of Funke et al. (2022), we consider our results somewhat tentative because the description of the robot teammate in this experiment was extremely sparse. Previous research indicates that factors such as physical embodiment (e.g., Wainer et al., 2006) and anthropomorphic features may influence beliefs about a robot in complex ways (e.g., Roesler et al., 2021). As such, research employing additional verbal description, images, or even embodied robots to supplement vignette descriptions could influence participants’ judgments about cooperation in HMT. In addition, as mentioned previously, judgments about cooperation are likely to be influenced by context, and judgments made by actors participating in a cooperative task with an AI agent may be different from those made by third-party observers. Further research exploring these issues is warranted.
Overall, we believe the results of the current experiment successfully expand on the research presented by Funke et al. (2022) and support the use of Deutsch’s (1980) social interdependence theory as a framework for investigating cooperation in human-human and human-machine teams.
Appendix A: Vignettes
Participants were presented with the following text before they completed the cooperation questionnaire items. Bracketed text in the passage was only presented to participants assigned to specific conditions (please see below for further explanation of the letter designations for each); participants not assigned to those conditions were not presented with that information.
There is a nighttime fire in a two-story apartment building and an emergency call is made to a local fire department. Firefighter 1 and Firefighter 2 are on duty and ordered to respond to the call.
[Firefighter 1 is a human. Firefighter 2 is a firefighting robot.]a [Firefighter 1 is a firefighting robot. Firefighter 2 is a human.]b They are told that the fire has spread very quickly and that there are people trapped in the building. Firefighter 1 and Firefighter 2 are ordered to search the building for survivors and help any they find to safety. Since they are entering a hazardous situation, they will both wear full personal protective equipment, including facemasks that will make talking difficult without radio communication. Firefighter 1 and Firefighter 2 are experienced and have worked together fighting several previous fires. [Both firefighters like each other and both think highly of the other’s firefighting knowledge, skills, and abilities.]c [Both firefighters dislike each other and both think poorly of the other’s firefighting knowledge, skills, and abilities.]d [While traveling to the burning building, Firefighters 1 and 2 talk about how they will coordinate their search. When they reach the fire, they have a plan for searching the apartment building that they both have agreed to.]e [While traveling to the burning building, Firefighters 1 and 2 sit quietly and do not talk about how they will coordinate their search. When they reach the fire, they do not have a plan for searching the apartment building that they both have agreed to.]f Firefighters 1 and 2 then enter the apartment building and begin to search each apartment for trapped people. As they search an apartment, Firefighter 1 sees a blocked door leading to a child’s bedroom and thinks it is important to check the room for trapped survivors because there may be a child asleep in the room. Firefighter 2 sees a pile of large debris and thinks that it is important to search the debris for trapped survivors because the pile moved as they entered the room. They must move quickly because the fire is intensifying. 
Firefighter 1 goes to the door and uses a radio to ask Firefighter 2 to break the door open with the axe. Firefighter 2 responds over the radio that the debris pile moved and may be on top of a survivor. [Firefighter 1 goes to the debris pile to assist Firefighter 2.]g [Firefighter 1 watches Firefighter 2 search the debris pile but does not help.]h [Firefighter 1’s efforts are very skillful and help Firefighter 2’s search. Firefighter 1’s effort speeds the search of the pile.]i [Firefighter 1’s behavior does not affect the search of the pile.]h [However, Firefighter 1’s efforts are very clumsy and hinder Firefighter 2’s search. Firefighter 1’s effort slows the search of the pile.]j
Notes. a = presented to participants in the human-robot actor condition; b = presented to participants in the robot-human actor condition; c = presented to participants NOT assigned to the dislike vignette condition; d = presented to participants assigned to the dislike vignette condition; e = presented to participants NOT assigned to the poor communication vignette condition; f = presented to participants assigned to the poor communication vignette condition; g = presented to participants NOT assigned to the no assistance vignette condition; h = presented to participants assigned to the no assistance vignette condition; i = presented to participants NOT assigned to the bumbling assistance vignette condition; j = presented to participants assigned to the bumbling assistance vignette condition.
Appendix B: Cooperation Questionnaire (V2) Items
Instructions: This questionnaire is concerned with your thoughts about cooperation between people. Please read the short scenario description and then rate each question according to the scale provided.
Please answer the following questions as honestly as you can with what you think is true. Do not choose a reply just because it seems like the 'right thing to say.' So you can describe your feelings and thoughts honestly, your answers will be kept entirely confidential. You should try and work quite quickly; there is no need to think very hard about the answers. The first answer you think of is usually the best.
In the story, Firefighter 1 and Firefighter 2…
Participants rated each item on a 7-point scale; response choices were labeled as “Strongly Disagree,” “Mostly Disagree,” “Somewhat Disagree,” “Neither Disagree nor Agree,” “Somewhat Agree,” “Mostly Agree,” and “Strongly Agree.”
Acknowledgements
This research was supported by the Air Force Office of Scientific Research (program manager Dr. Laura Steckman, Grant Num. 20RHCOR048). The contents are those of the authors and do not necessarily represent the official views of, nor an endorsement by, the USAF or the U.S. Government.
