Abstract
The aim of this study was to examine team functioning within the context of the AMADEE-18 Mars analog project, which took place in Oman in the winter of 2018. Five “Analog Astronauts” participated in this study. Each completed measures of individual-level variables, including demographics and personality, before the simulated Mars mission began. At several time points during the mission, and once at the end, participants completed measures of individual stress reactions and teamwork-related variables, including several types of team conflict, citizenship behavior, in-role behavior, counterproductive behavior, and social loafing. Each participant also reported how well he or she felt the team performed. The results indicate an overall positive, successful teamwork experience. Factors including measurement issues, psychological simulation fidelity, and qualities of the team likely influenced these results. Measuring important team- and individual-level variables during additional space analog events, while considering factors related to psychological fidelity, will allow for the compilation of data to better understand the factors affecting teams in these unusual contexts.
1. Introduction
In preparation for future human long-duration missions to the Moon, Mars, asteroids, or other Solar System destinations, studies of teams are needed to improve our understanding of team function and crew selection, in order to increase the likelihood of efficient and effective completion of mission goals, and overall mission success. NASA's Human Research Roadmap (Hanson, 2017) has identified several “Gaps” within this research area. Some examples of threats and challenges to teams engaged in long-duration missions in space include (1) shifts in motivation/morale over time, (2) difficulties associated with communication delay between the crew and Mission Control, (3) the need to switch between independent and interdependent tasks, (4) cultural differences between crew members, and (5) degradation of team cohesion over time (Hanson, 2017). To address these gaps in our knowledge, each of which is relevant to the formation and functioning of crews engaged in autonomous, long-duration and/or distance exploration missions, we need (1) a better understanding of the key threats and challenges that these teams might face throughout their life cycles and that shape team functioning; (2) a set of validated measures, based on the key indicators of team functioning, to effectively monitor and measure their fluctuations in performance and team health; (3) psychological measures that can be used to select individuals who are most likely to function well on these teams; and (4) psychological and psychosocial factors, measures, and combinations thereof that can be used to compose teams that will be highly effective under such conditions.
This study builds on findings concerning the relations among team composition (e.g., psychological characteristics and competencies of team members; O'Neill and Allen, 2011), team processes (e.g., conflict, conflict resolution; O'Neill et al., 2013; O'Neill and Allen, 2014), and team outcomes (e.g., team performance, team resilience) as they apply to teams in general. It does so by examining a team engaged in a simulation of a mission to Mars, the AMADEE-18 mission conducted in Oman in February 2018.
Earth analogs of spaceflight, such as this AMADEE mission, offer several advantages over collecting data from astronauts during actual spaceflight. They afford researchers the potential to collect data from larger numbers of participants, they allow for more time participating in research than is typically available during spaceflight, and they are less expensive and logistically challenging (Sandal et al., 2017).
1.1. Personality in teams
The personality characteristics of team members influence how well teams perform, and personality aggregated to the team level has been shown to affect team performance. A meta-analysis examining the research on deep-level team member characteristics (personality factors, values, abilities) on team performance, for example, showed that the impact of personality on team performance varied depending on study context (whether data were collected in a lab or field setting) and the way in which personality factors were operationalized (Bell, 2007). Although relations between personality factors and team performance were negligible in lab studies, conscientiousness, agreeableness, and openness to experience were found in the meta-analysis to strongly predict team performance in field studies (Bell, 2007).
Conscientiousness can be thought of as an “additive” team resource; that is, the more conscientiousness that a team's members collectively have, the better the team will accomplish its tasks (O'Neill and Allen, 2011). Agreeableness is also thought to work in an additive way to increase the likelihood of positive interactions between team members, which, in turn, contributes to team functioning. Studies have shown the personality traits of Honesty/Humility, Emotionality, Extraversion, Agreeableness, and Openness to have a weaker impact on team performance compared with the impact of conscientiousness (O'Neill and Allen, 2011). Teams high in the trait of Emotionality might be expected to encounter disruptions to the workflow related to interpersonal problems and distress. It has been proposed that a lack of variance in Extraversion is a potential detriment to teams, especially when all members are extroverts or all members are introverts (Barry and Stewart, 1997; Mohammad and Angell, 2003). The personality facet of Openness has been shown to have a modest negative relationship with team performance. The personality factor of Honesty/Humility has not been researched as extensively as the other personality factors but might contribute to smooth interactions and healthy levels of trust among team members. For a more detailed rationale for the interpretation of individual and team-level personality in teams, see O'Neill and Allen (2011).
1.2. Stress in teams
Research has shown that acute stress has a negative impact on team performance, especially for teams in extreme contexts, such as space exploration or emergency medicine (Driskell et al., 2018). Stress can impact teams by creating distraction, contributing to task overload, increasing negative emotions or feelings of anxiety and worry, and making it difficult for team members to coordinate their work (see Driskell et al., 2018).
1.3. Team conflict
Teams experience several different types of conflict. Relationship conflict refers to animosity and tension between members of the team and has been shown to negatively impact team performance and satisfaction. Task conflict refers to differences in views and opinions between team members and has variously been shown to have a negative, neutral, or even positive impact on team performance. Process conflict refers to disagreements about how to go about accomplishing group tasks, including the allocation of tasks to team members, and strategies for completing tasks, and has been shown to have a negative impact on team performance and satisfaction (Behfar et al., 2011). Researchers have recently identified team conflict profiles, based on distinct patterns in levels of task conflict, relationship conflict, and process conflict. Teams high in task conflict, and low in relationship and process conflict, tend to have better communication quality and higher performance (O'Neill et al., 2018). Task conflict is thought to be beneficial for teams when tackling complex tasks, but only when interpersonal issues and disagreements about team processes (how to go about accomplishing tasks) are at a minimum.
1.4. Teamwork behavior
Organizational citizenship behaviors refer to those tasks performed by employees that go beyond their prescribed job duties and that benefit the organization or individuals within it in some way. In the context of teams, we can think of citizenship behavior as benefiting the team as a whole and/or other individuals within the team. In-role behavior, on the other hand, refers to those behaviors which are expected from someone in fulfilling one's role within an organization and/or team. Counterproductive work behaviors are those behaviors engaged in voluntarily by individuals which have a negative impact on the organization or team. Social loafing refers to the tendency for people to expend less effort when working as part of a group than when working individually (Latane et al., 1979). Within a teamwork context, citizenship behaviors and in-role behaviors tend to relate positively with team performance. Counterproductive behaviors and social loafing tend to relate negatively to team performance.
1.5. Familiarity in teams
Team familiarity—that is, the degree to which members of a team are familiar with one another—has been shown to have a positive influence on team performance in various settings (e.g., Kanki and Foushee, 1989; Crowston and Kammerer, 1998; Reagans et al., 2005). It is thought that team familiarity streamlines team processes by providing members with necessary knowledge about one another and by improving the effectiveness and efficiency of team communication (Harrison et al., 2003).
2. Method
2.1. AMADEE-18 Mars analog mission
A “Mars analog” is a place or situation on Earth that approximates the geological, environmental, psychological, or other parameters of Mars. There is no “perfect” analog (e.g., no natural environment on Earth can approximate martian gravity and martian atmospheric conditions); therefore, a number of different analog sites on Earth are used to conduct Mars analog missions of different types. A “Mars analog mission” takes place at a Mars analog site (natural or in a laboratory) and can simulate a human and/or robotic mission, or certain elements of a mission, which may be focused on science, engineering, human factors, operational parameters, or some combination of these.
For four weeks during February 2018, the Austrian Space Forum, in partnership with the Oman National Steering Committee for AMADEE-18, conducted a high-fidelity Mars analog mission in the Dhofar region of Oman, which is a geological Mars analog environment featuring moderately isolated and extreme conditions. A crew of five analog astronauts was directed by a Mission Support Center in Austria and indirectly supported by an out-of-simulation field team. The analog astronauts gathered data and conducted experiments for 15 studies (including the study described here), in the fields of engineering, planetary surface operations, astrobiology, geophysics/geology, life sciences, and other fields. See the introduction to this special collection of articles in Astrobiology for a more detailed description of the AMADEE-18 analog mission.
2.2. Participants
The 5 AMADEE-18 Analog Astronauts (4 men and 1 woman) participated in the study. They ranged in age from 28 to 38 years, with a mean age of 33. Each was educated at or above the master's level. The team included 3 Analog Astronauts, 1 Analog Astronaut/Principal Investigator, and 1 Analog Astronaut/Deputy Field Commander.
2.3. Procedure
Participants were provided with information about the study and invited to participate; after agreeing to participate, each signed a consent form. Given the small number of participants in this study, participants were assured that their individual data would only be published in aggregate form. Several weeks before the mission simulation began, each participant completed an online survey in which information was collected about participant age, sex, mission role, education level, and personality. Participants were instructed to produce a code, which they entered into the online survey and onto each subsequent paper survey, to enable the compilation and linkage of data for each participant. During the mission simulation, each participant completed an additional questionnaire every 5 days. Each survey included measures assessing the individual's perception of team conflict (relational, task, logistical, contribution), team performance, and individual stress. On the last of these surveys, participants were also asked to rate each of their teammates, and themselves, on citizenship behavior, in-role behavior, counterproductive work behavior, and social loafing. Participants were also invited to write in any comments, concerns, or insights they would like to add, on the final survey. These procedures were approved by the Office of Human Research Ethics at the University of Western Ontario, Canada.
2.4. Measures
Personality was measured via the HEXACO-60 Personality Inventory (Ashton and Lee, 2009). The HEXACO model conceptualizes personality as consisting of six dimensions: Honesty-Humility (H), Emotionality (E), Extraversion (X), Agreeableness (A), Conscientiousness (C), and Openness to Experience (O). The questionnaire has a 5-point Likert-type response scale, with 10 items for each dimension. A validation study of the HEXACO-60 Personality Inventory showed internal consistency reliabilities above .70 and scale intercorrelations below .30, with all items for each scale showing primary loadings onto the appropriate factors (Ashton and Lee, 2009). These psychometric properties, coupled with brevity in comparison to longer personality measures, made the HEXACO-60 Personality Inventory ideal for this study.
Team conflict was measured via a 13-item questionnaire with a 5-point Likert-type response scale (Behfar et al., 2011). The questionnaire includes subscales with items measuring different types of conflict: relationship conflict, task conflict, process conflict related to logistical issues, and process conflict related to contribution issues. Measures based on a typology of team conflict which differentiates these three types (relationship, task, and process) have been shown to predict outcome variables in teams more accurately than less-specific measures of team conflict (Behfar et al., 2011). Such measures do so by better distinguishing process conflict—which appears to have a negative impact on group viability (group performance, satisfaction, and effective group processes)—from task conflict, which appears to have a weaker negative impact, or even a positive impact, on group viability (Behfar et al., 2011). Each of the subscales has been shown in a validation study to have suitable internal consistency, with item-total correlations for each subscale above .62, and scale items showed primary loadings onto their respective factors (Behfar et al., 2011).
Team performance was measured via a 5-item self-report measure adapted from work conducted by Ancona and Caldwell (1992). This measure assesses individual team members' perceptions of their team's performance. Each team member was asked to rate the team on 5 performance dimensions (efficiency, quality of innovation, goal attainment, adherence to schedules, and overall performance) using a 7-point scale (far below average to far above average). In their study examining predictors of new product team performance, Ancona and Caldwell (1992) found that a principal component analysis of scores on the 5 performance dimensions yielded a single factor, so items were averaged to create a single performance score, with an alpha of 0.83. We chose to measure performance in this way given the difficulties in assessing team performance more objectively in a simulated mission context in which many different goals were apparent (i.e., public engagement, scientific collaboration across disciplines and nations, scientific investigations).
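The averaging and reliability steps described above can be illustrated with a minimal Python sketch. It assumes that each respondent supplies one rating per performance dimension and that internal consistency is summarized with Cronbach's alpha computed from sample variances. The function names and the example ratings are hypothetical and are not study data.

```python
from statistics import mean, variance


def composite_score(item_ratings):
    """Average one respondent's item ratings (here, 5 items on a 1-7 scale)."""
    return mean(item_ratings)


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([r[i] for r in responses]) for i in range(k)]
    # Sample variance of the respondents' summed scores.
    total_var = variance([sum(r) for r in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings for illustration only (NOT study data):
# 4 respondents x 5 performance dimensions.
example = [
    [5, 5, 6, 5, 5],
    [4, 5, 5, 4, 5],
    [6, 6, 6, 5, 6],
    [5, 4, 5, 5, 5],
]
scores = [composite_score(r) for r in example]
alpha = cronbach_alpha(example)
```

With as few respondents as in the present study, an alpha estimated in this way would be highly unstable, which is one reason to rely on reliability estimates from larger validation samples.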
Stress was measured via the 7-item Stress Subscale of the 21-item version of the Depression Anxiety Stress Scale (DASS-21; Lovibond and Lovibond, 1995), which has been shown to adequately measure features of hyperarousal and tension in both clinical and nonclinical groups (Antony et al., 1998). Participants indicated how well each statement applied to them, with respect to their experiences over the previous week; the response to each item was made on a 4-point response scale ranging from did not apply to me to applied to me very much or most of the time.
On the final survey, each participant rated each of their teammates, one by one, and then themselves, on the following: (1) citizenship behavior (2-item scale adapted from Lee and Allen, 2002); (2) in-role behavior (3-item scale adapted from Williams and Anderson, 1991); (3) counterproductive work behaviors (3-item scale adapted from Spector et al., 2010); and (4) social loafing (3-item scale adapted from George, 1992). Items from all scales were rated on a 7-point response scale, ranging from never to always.
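A round-robin design of this kind yields a rater-by-target matrix for each scale. The following minimal Python sketch shows one way to separate self-ratings (each person rating themselves) from aggregated peer ratings (the mean of all other raters' scores for that person). The data layout, the function name, and the example values are assumptions made for illustration and are not study data.

```python
from statistics import mean


def split_self_and_peer(ratings):
    """Map each target to (self_rating, mean_peer_rating).

    `ratings[rater][target]` holds the scale score that `rater`
    assigned to `target` (hypothetical layout).
    """
    members = list(ratings)
    result = {}
    for target in members:
        self_rating = ratings[target][target]
        peer_ratings = [ratings[rater][target] for rater in members if rater != target]
        result[target] = (self_rating, mean(peer_ratings))
    return result


# Hypothetical citizenship-behavior scores on a 1-7 scale (NOT study data).
citizenship = {
    "A": {"A": 5, "B": 6, "C": 6},
    "B": {"A": 6, "B": 5, "C": 7},
    "C": {"A": 7, "B": 6, "C": 6},
}
summary = split_self_and_peer(citizenship)
```

Keeping the self-rating out of the peer mean prevents a member's own view from inflating or deflating the aggregated peer score.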
On the final survey, each participant also rated how well they knew each of their teammates at the start of the simulated Mars mission on a single-item familiarity scale (created for this study), with a 4-point response scale, ranging from not at all to very well.
3. Results
The small number of participants in this study makes the statistical analyses typically used to interpret survey data inappropriate; instead, the nature of space-analog simulation studies has led some researchers to employ a qualitative interpretation of quantitative data. In essence, this involves taking a case-study approach to the reporting and interpretation of small sets of quantitative data (such as scores on a measure of team conflict) that, in the context of a study including large numbers of participants, would be analyzed in aggregate form to reveal broader trends and patterns (for a detailed discussion of this type of analysis, see Bell, Fisher, Brown, and Mann, 2016). In addition to this type of interpretation, in our program of research, we are continuously employing the same measures with other “extreme teams,” with the goal of accumulating a larger pool of data for statistical analysis.
Team personality traits can be operationalized in different ways (mean, variance, or minimum/maximum scores), based on a consideration of the personality trait, the task performed by the team, and how these variables are expected to interact (O'Neill and Allen, 2011). In addition to aggregating to the team level by way of averaging individual scores, team-level Honesty/Humility could also be operationalized by using the lowest (minimum) individual score, team-level Extraversion could be operationalized by using within-group variance, and team-level Openness could be operationalized as the highest (maximum) individual score within each team (see O'Neill and Allen, 2011, for an explanation of the operationalization process). In the present paper, we report only the mean scores, however, given the need to protect the privacy of such a small sample of participants. Personality trait scores, aggregated to the team level by way of averaging scores across participants (see Table 1), were similar to those of the normative sample reported by the measure's developers (see
Table 1. Team-Level Personality Trait Scores
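As an illustration of the operationalizations described above (mean, minimum, maximum, and within-group variance), the following minimal Python sketch computes all four team-level summaries from a list of individual trait scores. The function name and the example scores are hypothetical and are not study data, and the use of the sample (n - 1) variance is an assumption.

```python
from statistics import mean, variance


def team_level(scores):
    """Return candidate team-level operationalizations of one trait."""
    return {
        "mean": mean(scores),          # additive traits, e.g., Conscientiousness
        "minimum": min(scores),        # e.g., lowest Honesty/Humility score
        "maximum": max(scores),        # e.g., highest Openness score
        "variance": variance(scores),  # e.g., spread in Extraversion (sample variance)
    }


# Hypothetical 1-5 scale scores for a five-person team (NOT study data).
hypothetical_team = {
    "Honesty/Humility": [3.4, 3.8, 4.0, 3.6, 3.2],
    "Extraversion": [2.9, 3.5, 4.1, 3.0, 3.5],
    "Openness": [3.7, 4.2, 3.9, 4.4, 3.8],
}
summaries = {trait: team_level(vals) for trait, vals in hypothetical_team.items()}
```

Which summary is appropriate depends on the trait and the team's task; reporting only the mean, as in the present paper, also avoids exposing any single member's score.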
Throughout the mission simulation, relationship conflict (M = 1.58, SD = .71), logistical conflict (M = 1.58, SD = .52), and contribution conflict (M = 1.25, SD = .45) were all quite low, and task conflict was relatively high (M = 4.27, SD = .32) (see Fig. 1 for a comparison of conflict types over time). Team members also consistently rated their own team's performance as being slightly above average throughout the mission (see Fig. 2).

Fig. 1. Team-level conflict over time: Relationship Conflict (Time 1 SD = .84, Time 2 SD = .64, Time 3 SD = .69, Time 4 SD = .80); Task Conflict (Time 1 SD = .50, Time 2 SD = .44, Time 3 SD = .34, Time 4 SD = .23); Logistical Conflict (Time 1 SD = .69, Time 2 SD = .37, Time 3 SD = .67, Time 4 SD = .38); and Contribution Conflict (Time 1 SD = .15, Time 2 SD = .15, Time 3 SD = .44, Time 4 SD = .51).

Fig. 2. Aggregated self-reported team performance ratings.
Self-reported levels of stress (aggregated to the team level) were quite low and even fell slightly over the course of the mission simulation (see Fig. 3). Scores were highest and the most variable among individuals at the first measurement time-point, at the beginning of the simulation.

Fig. 3. Individual self-reported stress over time (aggregated to the team level).
Participants indicated that they knew their teammates “somewhat well” to “very well” at the start of the mission, with a mean familiarity score of 3.75 out of 4 (SD = .35). Participants rated each of their teammates quite positively, with high ratings on citizenship behavior (M = 5.98, SD = .66) and in-role behavior (M = 6.37, SD = .53), and low ratings for both counterproductive work behavior (M = 1.65, SD = .19) and social loafing (M = 1.64, SD = .47). They also rated themselves in a similarly positive fashion, although these self-ratings were slightly lower for positive behaviors and slightly higher for negative behaviors than were the ratings they provided for their peers. See Fig. 4 for a comparison of peer- and self-ratings of behavior.

Fig. 4. Self-ratings and aggregated peer-ratings for citizenship behavior (CB), in-role behavior (IRB), counterproductive work behavior (CWB), and social loafing (SL).
4. Discussion
Overall, the data reveal a well-functioning, cohesive team that did not seem to face a great deal of difficulty in dealing with the impositions of the simulation context. It is likely that several factors influenced our findings. Some relate to the way in which the variables of interest were measured. Some relate to the conditions, or “psychological fidelity,” of the mission simulation. Others relate to how successfully this team was assembled. These factors are addressed in the following paragraphs.
4.1. Measurement issues
In a general sense, since they are based entirely on survey responses, our findings must be considered in light of the limitations of self-report data. There is always the possibility that study participants will be motivated to respond to survey items in one way or another. This is especially the case when the content and/or context of the study are such that social desirability is salient (Crowne and Marlowe, 1960; Paulhus, 1991). Participants in the present study were likely highly motivated to participate in the AMADEE-18 simulation and highly invested in the success of the mission and future AMADEE missions. As a result, the data might be biased, at least to some degree, such that ratings of aspects of both the mission and the team are somewhat inflated.
An additional limitation involves the timing of our stress measures. At the individual level, there was some indication of stress, especially in some individuals at the first measurement time-point. We did not, however, take baseline, or pretest, measures to determine the typical stress level for each individual. Consequently, we are unable to determine from the data the degree to which this pattern of individual differences relates to differences in general temperament or to differences in reactions to the simulation context.
Further, the scope of our study might be considered overly narrow in that it involved only the Analog Astronauts. Researchers studying teams have become increasingly focused on the idea that teams do not necessarily operate as distinct, separate units within organizations but rather as components of complex webs of interconnected groups of individuals, referred to as multiteam systems (see Zaccaro et al., 2012). Indeed, Caldwell (2005) outlined the way in which the human component of spaceflight missions has developed into complex multiteam systems, with multiple mission control teams and flight crews interacting with one another. It is not surprising, then, that several participants in the present study indicated that, although their “team” comprised the 5 Analog Astronauts, the fact that they were interacting quite often with both the field crew on site in Oman and with Mission Control in Austria meant that they would have liked to provide comments on additional personnel whom they considered to be part of the mission team. An additional issue arising during this study involved a change to the membership of the analog astronaut team. Participants noted that a 6th Analog Astronaut entered the simulation partway into the mission, but this person was not included in their ratings of the team or its members, as per our survey instructions, since we had not anticipated this event. It is possible that scores on measures relating to the team would differ if these missing data were included, especially given the small size of the team.
4.2. Simulation fidelity
Earth analogs of spaceflight offer several advantages over collecting data from astronauts during actual spaceflight. These analogs, however, come in different forms, with no one analog mimicking every characteristic of spaceflight (Sandal et al., 2017). Each analog study might offer higher fidelity for addressing some types of research questions and lower fidelity for others. An important variable impacting psychological fidelity is duration. In their 2016 research review, Bell, Brown, and Mitchell emphasize the need for long-duration simulations, spanning 1 year or longer. This recommendation is based on mounting evidence that detrimental psychological effects and interpersonal problems tend not to begin until months, or even over a year, into spaceflight analog simulations. The AMADEE-18 mission simulation, being 4 weeks long, had a relatively short duration that might have limited the exploration of psychological factors. The pattern in the data for self-reported stress (highest at the start of the simulation, decreasing slightly thereafter), for example, might reflect a reaction not so much to conditions within the simulation but, rather, the adjustment to these conditions, and/or to beginning a new role. This response, arguably, might not be highly informative to those trying to better understand the potential for stress reactions to life on Mars.
Space analog studies also vary on the degree to which their conditions are extreme, isolated, and/or remote. Space analog studies tend to include some element of isolation from friends and family, physical confinement, and extreme conditions (Golden et al., 2018). These conditions have the potential to influence participants' psychological experiences and reactions. The AMADEE-18 Analog Astronauts were indeed isolated from their friends and family. They also experienced a form of confinement, in that they remained within the simulated Mars station and were required to wear space suits when venturing outdoors. Although the environment might not fully present the treacherous and physically dangerous conditions said to be characteristic of extreme team settings (Golden et al., 2018), the AMADEE-18 mission did take place in an isolated, remote location, with many challenges presented by a desert climate and terrain. On the other hand, the Analog Astronauts were not entirely alone. For the duration of the mission simulation, a sizeable field crew was housed on location. If necessary, Analog Astronauts could contact members of the field crew, and emergency assistance could be accessed relatively easily. This is in contrast, for example, to analog studies taking place in an Antarctic research station, in which the crew itself is physically isolated from civilization and might not easily access assistance when needed (Sandal et al., 2017). It is possible that the participants did not experience feelings of isolation or danger that were strong enough to trigger measurable stress responses or teamwork deficiencies.
4.3. Qualities of the team and its members
Conscientiousness scores for the Analog Astronaut team, aggregated to the team level, were slightly higher than the norm provided by the developers of the HEXACO measure (see
The AMADEE-18 Analog Astronaut team members indicated that they were very familiar with one another, having worked together before the start of the mission. It is likely, then, that they had established good rapport before the simulation began. Indeed, more than one team member indicated in their written comments at study completion that they were very satisfied with the way in which the analog astronaut team worked together. These positive impressions of their teammates were reflected in the highly positive ratings of behavior they provided for one another.
The results for the measures of team conflict are also illustrative of a highly functioning team. The AMADEE-18 Analog Astronaut team reported low levels of relationship and process (logistical and contribution) conflict and high levels of task conflict, reflecting the ideal conflict profile as proposed by O'Neill et al. (2018) and described earlier. This pattern of conflict might have contributed to their high levels of performance, by facilitating effective problem solving, while minimizing the process losses associated with unproductive disagreements.
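The profile comparison above can be made concrete with a minimal Python sketch that checks the mission-wide conflict means reported in the Results against a simple threshold rule. The scale midpoint (3.0 on the 5-point scale) as a cutoff and the function name are assumptions made for illustration; this rule is not the profile-identification method used by O'Neill et al. (2018).

```python
def matches_task_dominant_profile(means, midpoint=3.0):
    """True if task conflict is above the cutoff and all other types are below it.

    The midpoint cutoff is a hypothetical simplification for illustration.
    """
    task_high = means["task"] > midpoint
    others_low = all(
        means[kind] < midpoint
        for kind in ("relationship", "logistical", "contribution")
    )
    return task_high and others_low


# Mission-wide means as reported in the Results section (5-point scale).
reported_means = {
    "relationship": 1.58,
    "task": 4.27,
    "logistical": 1.58,
    "contribution": 1.25,
}
is_task_dominant = matches_task_dominant_profile(reported_means)
```

A fixed cutoff of this kind is only a descriptive convenience; with data pooled across many teams, profiles would more appropriately be derived from the data themselves.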
5. Conclusion and Implications
The main goal of this study was to examine several variables known to affect team functioning within the context of the AMADEE-18 Mars analog project. In the process, we also gained a great deal of insight into the unique considerations, both practical and theoretical, of conducting research with teams engaged in space analog missions. Research design and analytical practices used routinely in teamwork research, and psychological research in general, tend to require large groups of participants to ensure the robustness of findings. Results from most measures used in the present study are quantitative in that they apply numerical scores to psychological processes; however, they are interpreted here in a more qualitative way, since such a small number of data points does not allow for statistical analyses beyond the descriptive results provided in this paper. These data could, however, be combined with data from similar studies to create data sets large enough to justify more complex statistical investigations. Future research in the space analog context should explore important teamwork processes in a coordinated way, using research protocols and measurement tools which are as compatible as is practical, to allow for the amalgamation of data. The field might also benefit from a narrower focus on addressing knowledge gaps relevant to teams in extreme contexts, such as those identified by Bell, Brown, and Mitchell in their 2016 research review, including team level affect (as opposed to the affect in individual team members), team resilience, and conflict related to status and leadership. In addition, it is vital that researchers ensure the appropriateness of the context in which each study is conducted, by carefully considering factors such as duration, isolation, participant selection, and setting.
Footnotes
Author Disclosure Statement
All authors indicate that no competing financial interests exist.
