Abstract
This study explores the effects of group size, group composition, and group argument frequency on group cognitive complexity (GCC). We evaluated a sample of 509 students organized into 106 groups who participated in a group cognitive mapping activity. As hypothesized, we found that group argumentation has an inverted U-shaped association with GCC. Group member familiarity did not moderate this relationship. We also found that task-related arguments mediate the relationships between group size and gender diversity on one hand, and GCC, on the other. Moreover, we found that optimal group-level cognitive benefits were observed in group discussions in which the ratio between task-related and nontask-related group arguments was 3 to 1. The discussion focuses on the practical and theoretical implications of these findings.
Collaborative learning is extensively used in academic environments around the world to foster the acquisition of curricular knowledge and to help students develop teamwork skills, widely sought after by modern organizations (Curşeu, Chappin, & Jansen, 2017; Loes, Culver, & Trolian, 2018; Loes & Pascarella, 2017). Beyond helping students in their individual learning process, collaborative groups develop novel collective knowledge structures that transcend individual cognition (Curşeu & Pluut, 2013). Imagine a three-member student group that has to find feasible solutions in a case study on organizational change. As this case relies on various insights on organizational structure, culture, leadership, and other relevant organizational processes that were taught during one semester, it is very likely that not all group members will master these concepts equally well. Their success in dealing with the complex case, therefore, depends on how well they can share what they each know and integrate the insights in a comprehensive collective group knowledge structure. If group members do not share information (i.e., a low number of group arguments), it is very likely that the group will fail to take into account how the organizational change will affect all relevant organizational subsystems and processes. Excessive information sharing (i.e., a high number of group arguments), however, could prevent the members from engaging in in-depth information evaluation and effective knowledge integration. Therefore, group members have to balance their communication in such a way that enough information is pooled and carefully integrated to generate a comprehensive collective understanding and, ultimately, feasible solutions to the complex case.
In a recent comprehensive meta-analytic study, Marlow, Lacerenza, Paoletti, Burke, and Salas (2018) show that communication has an overall positive effect on group performance. Group communication is, therefore, believed to be positive for group performance. Could it be that too much communication (a very high number of group arguments) actually diminishes group performance? In their meta-analysis, Marlow and colleagues (2018) explicitly argue that “future research should investigate whether the relationship between communication frequency and team performance is curvilinear” (p. 148). We take on this call because, so far, only scattered empirical attempts have been made to explore this nonlinearity (Patrashkova & McComb, 2004; Patrashkova-Volzdoska, McComb, Green, & Compton, 2003; Peltokorpi & Hasu, 2014). We use the general too-much-of-a-good-thing (TMGT) meta-theoretical framework (Busse, Mahlendorf, & Bode, 2016; Grant & Schwartz, 2011; Pierce & Aguinis, 2013) to argue that the relationship between the number of group arguments used during discussions and group cognitive complexity (GCC) is nonlinear.
To develop specific hypotheses concerning this relationship, we use the framework of differentiation and integration in cognitive and social structures (Driver & Streufert, 1969; Gruenfeld & Hollingshead, 1993; Gruenfeld, Mannix, Williams, & Neale, 1996). Complex cognitive structures are highly differentiated (contain many different concepts) and, at the same time, highly integrated (concepts are richly interconnected; Gruenfeld & Hollingshead, 1993). In line with this framework, we posit that complex, group-level cognitive structures emerge when group communication facilitates the sharing of enough arguments to secure cognitive differentiation and, at the same time, allows the integration and in-depth analysis of these group arguments. Groups benefit little from the intelligence brought in by their members when the number of arguments shared during conversation is low, and they may experience cognitive overload when the number of arguments shared during conversation is too high. We rely on the optimal cognitive load claim (Kirschner, Paas, & Kirschner, 2009), as a theoretical mechanism, to argue that the relationship between the number of group arguments (Meyers & Brashers, 1998) and the complexity of emergent group cognition has an inverted U shape. Moreover, we build on a resource-based view on groups, and consider group size and group diversity as proxies for the pool of cognitive resources available to the group. Group communication is the process through which these cognitive resources are integrated to generate group-level cognitive structures. Therefore, we set out to test the extent to which the number of group arguments mediates the association between group size and gender diversity, on one hand, and GCC, on the other hand.
Number of Group Arguments and GCC
The quality of group communication is an important antecedent of academic performance in collaborative learning groups (Curşeu et al., 2017; Loes, An, Saichaie, & Pascarella, 2017). The emergent view on group cognition (Curşeu, 2006; Curşeu, Schruijer, & Boroş, 2007; Grand, Braun, Kuljanin, Kozlowski, & Chao, 2016) and collective learning (Zoethout, Wesselink, Runhaar, & Mulder, 2017) places the coevolution of individual cognitive structures and cognitive convergence (Ervin, Bonito, & Keyton, 2017; Staggs, Bonito, & Ervin, 2018) at the core of group-level information processing. The emergent view on group cognition states that individual cognitions are continuously changed during interpersonal interactions in groups so that novel insights that transcend individual understanding are generated (Curşeu, 2006; Curşeu, Janssen, & Raab, 2012; Curşeu & Schruijer, 2010). The conversational transactivity approach argues that during conversations, group members act on each others’ reasoning to clarify or generate novel ideas (Zoethout et al., 2017). Hence, group discussion is deemed essential for the emergence of group-level cognition, but researchers call for additional work to fully understand the relationship between social interaction and social cognition; our “current understanding of the role of social interaction in social cognition is limited” (De Jaegher, Di Paolo, & Gallagher, 2010, p. 442). We build on the idea that in collaborative learning groups, social interactions enable emergent group cognition through cognitive convergence (Ervin et al., 2017; Staggs et al., 2018), and we set out to test the effect of the number of group arguments (Meyers & Brashers, 1998) on the complexity of group-level cognitive structures.
Recent meta-analytical evidence supports the linear association between frequency of communication and group performance (Marlow et al., 2018). However, in a study on 80 cross-functional teams, Patrashkova-Volzdoska and collaborators (2003) show that the frequency of face-to-face communication has an inverted U-shaped relationship with goal achievement. In a follow-up computational study, Patrashkova and McComb (2004) build on an information pool model and show that, when communication frequency is low, groups pool insufficient information to perform effectively, whereas when communication frequency is high, information overload perturbs collective performance. The best group performance is, therefore, achieved at optimum communication frequency for both synchronous and asynchronous communication (Patrashkova & McComb, 2004). In another empirical study on 139 research groups, Peltokorpi and Hasu (2014) showed that the transactive memory system has an inverted U-shaped relationship with group innovation. As group communication is central to the emergence of transactive memory systems (Peltokorpi, 2008), this study reinforces the call for more exploration of the nonlinear association between communication frequency and the emergence of complex cognitive structures in groups. As group cognition is the linking pin between communication frequency and group performance (Cooke, Kiekel, Salas, Stout, Bowers, & Cannon-Bowers, 2003), it is our intention to address a particular aspect of communication frequency, namely, the number of group arguments used during group debates as an antecedent of GCC.
A key assertion in group cognition literature is that groups are sociocognitive systems (Gruenfeld & Hollingshead, 1993) or information processing systems (Hinsz, Tindale, & Vollrath, 1997) that integrate the knowledge and expertise of their members through communication. Cognitive emergence is, therefore, rooted in group communication, and novel cognitive structures result from the interplay and convergence of intraindividual and interindividual cognitive processes (Ervin et al., 2017; Kozlowski, Chao, Grand, Braun, & Kuljanin, 2016; Staggs et al., 2018). In the student group described earlier, individual insights on particular organizational subsystems and processes are integrated through group discussions to generate collective insights that will allow the group as a whole to tackle the systemic intricacies of organizational change. In this cognitive emergence process, individual knowledge relative to the case (individual cognitive structures) will not only converge but also change and coevolve as group arguments are discussed, supported, or criticized. In our study, we focus on the complexity of the collective knowledge structures that emerge from the cognitive convergence in groups. A complex collective structure eventually includes insights on how the organizational change will unfold in a diverse range of organizational subsystems and processes (cognitive differentiation) and, at the same time, it will capture the systemic interdependence related to organizational change or the way in which changing a particular subsystem will eventually affect the dynamics of related organizational processes or subsystems (cognitive integration).
Previous research showed that the complexity of group-level cognitive structures is influenced by the quality of within-group interactions (Curşeu et al., 2012; Curşeu et al., 2007) and, in turn, has a positive impact on group performance (Curşeu & Schruijer, 2010). Moreover, minority dissent or the open disagreement with the opinions expressed by the majority seems to enrich the cognitive repertoire of the group (Curşeu et al., 2012; Curşeu, Schruijer, & Fodor, 2017; Van Swol, Carlson-Hill, & Acosta Lewis, 2018). Normative interventions or interaction rules that stimulate the expression of diverse opinions and help the groups to achieve consensus foster the emergence of complex collective cognitive structures (Curşeu & Schruijer, 2012). These studies indirectly suggest that the number of group arguments used in group discussions is a key element for the emergence of GCC.
In line with the cognitive load theory (Kirschner et al., 2009), communication processes facilitate knowledge integration in collaborative learning groups; yet, they also add to the cognitive load experienced by the individual group members as they have to process a variety of conversational views. The cognitive load theory implies, therefore, that collaborative learning groups perform worse when cognitive load is either too low (the information pool is insufficient for cognitive convergence) or too high (an excessive information pool prevents convergence from occurring).
In a study that explored the interplay between the quality and quantity of group reflection on group performance, Otte, Konradt, and Oldeweme (2018) identified a complementary interplay between these two indicators of reflection, in that, group performance was found to be high when at least one of them was high. Moreover, the dominance of reflection quality over quantity seemed to have a positive influence on group performance improvement. As group reflection is an important form of group communication, it follows that, at very high levels, reflection quantity may come at the expense of reflection quality and reduce cognitive convergence and the development of complex collective cognitive structures.
As previous research pointed out (Ancona & Caldwell, 1992; Patrashkova-Volzdoska et al., 2003; Smith et al., 1994), the frequency of task-related communication entails both benefits and costs. These mixed results can be explained by the TMGT effect (Busse et al., 2016; Grant & Schwartz, 2011; Pierce & Aguinis, 2013). As the antecedent increases to a moderate level (i.e., up to an inflection point), the benefits exceed the costs, but at higher levels of the antecedent, the costs exceed the benefits. Specifically, in our study, we claim that the number of task-related group arguments (TRGAs) is beneficial for GCC as it increases from low to moderate levels, and it becomes costly for GCC as it exceeds the moderate level. For instance, when the number of group arguments is low, the exchange of task-related information suffers, group members consider a limited number of ideas and perspectives, and they do not get the chance to develop connections among them (cognitive convergence is impaired). The group will, thus, be unable to achieve an optimal level of both differentiation and integration of the conceptual domain (Gruenfeld & Hollingshead, 1993; Gruenfeld et al., 1996), and group performance is likely to be impaired. Conversely, when the number of group arguments is high, group members exchange too much (and possibly diverse) information, and they end up in a situation of cognitive overload. Thus, the group will be unable to integrate the large number of perspectives to achieve an optimal level of both differentiation and integration, and group performance will suffer again. In line with these arguments, we claim that the number of group arguments used during group debates should facilitate both cognitive differentiation and integration of the conceptual domain.
Therefore, we hypothesize the following:
Next to task-related communication, groups engage in conversations that serve mostly socialization purposes and are not directly related to the task at hand. Chatting about various aspects that are not related to the task may serve as a social lubricant and, as a consequence, it is expected to have a positive influence on group atmosphere and ultimately on collective performance. However, if groups spend too much time discussing issues that are not related to the task, these discussions will ultimately distract the team from the task, generate lock-in effects, and, therefore, decrease performance. Previous research (Oh, Chung, & Labianca, 2004) supports this claim and shows that group closure in informal ties has an inverted U-shaped relationship with group performance. In line with these arguments and previous empirical evidence, we expect that some nontask-related communication is conducive to GCC up to a point, but as the team discusses too many distracting issues, the association between the frequency of nontask-related communication and GCC becomes negative. We, therefore, hypothesize the following:
Team Familiarity, Group Arguments, and GCC
Team familiarity refers to the previous experience team members have in working with each other (Huckman, Staats, & Upton, 2009), and a number of studies showed that it is an important predictor of team performance (e.g., Goodman & Leyden, 1991; Gruenfeld et al., 1996; Shah & Jehn, 1993). Gruenfeld and her collaborators (1996) found that familiar groups performed significantly better than unfamiliar groups in a decision-making task, when task-related information was partially shared among group members. For instance, familiar groups were more likely to share unique information held by each member than unfamiliar groups. Also, familiar groups have a better understanding of the means through which the group will achieve its objectives (Bechky & Okhuysen, 2011). Member familiarity stimulates the development of shared mental models and transactive memory systems that ultimately facilitate information coordination and group discussions (Mathieu, Heffner, Goodwin, Salas, & Cannon-Bowers, 2000; Stasser & Stewart, 1992; Wittenbaum & Stasser, 1995). When group members are familiar with each other, they feel more comfortable in expressing opposing views compared with groups composed of unfamiliar members (Gruenfeld et al., 1996). Shah and Jehn (1993) found that in a decision-making task, groups of friends experienced a higher level of beneficial task conflict and analyzed more task-relevant assumptions than groups of acquaintances.
Another stream of research showed that team member familiarity can negatively affect team performance. In an aircrew simulation study, Barker, Clothier, Woody, McKinney, and Brown (1996) showed that crew members who worked together for an indefinite period committed significantly more minor errors in every mission than newly formed crews. Katz (1982) found an inverted U-shaped relationship between group longevity and project performance in such a way that teams that stayed together for less than 1.5 years and teams that stayed together for more than 5 years had a lower project performance. Sieweke and Zhao (2015) found a U-shaped relationship between team familiarity and team coordination errors in professional basketball teams—initially, when team familiarity increased, team coordination errors decreased, but when team familiarity exceeded an optimal point, team coordination errors increased again. These studies suggest that overfamiliarity among team members impairs team coordination and performance due to the lack of critical examination of important information (Hensley & Griffin, 1986).
Both streams of research summarized above show that teams composed of familiar members may need less communication to achieve consensus than teams composed of unfamiliar members. If group members know each other and have collaborated previously on similar tasks, they already have the coordination processes in place that will ultimately allow them to converge rather quickly on how to deal with the organizational change case. Members will be willing to express their divergent views on the case study early on in the process and will more likely know how to effectively resolve disagreements when they emerge. We, therefore, argue that familiar team members will communicate more efficiently due to the shared mental models that allow them to have a common understanding about members’ responsibilities, expectations, and patterns of interaction (Mathieu et al., 2000). Once the team’s transactive memory system is developed, fewer group arguments need to be shared because team members know who knows what (Kanawattanachai & Yoo, 2007). Thus, familiar teams have more interpersonal knowledge that allows them to achieve faster cognitive convergence through better coordination and faster conflict resolution. Therefore, familiar team members will achieve an optimal level of both differentiation and integration faster, which means that the inflection point in the relationship between the number of TRGAs and GCC will be lower for teams with high rather than low familiarity. We further hypothesize the following:
Group Size, Group Arguments, and GCC
We build on the input–process–output model (McGrath, Arrow, & Berdahl, 2000) to consider group discussion as a process that converts specific group inputs (i.e., team and task features) into group outputs. Group size and group diversity are variables often used as proxies for the cognitive resources available in groups. As a consequence, research to date has explored group size and gender diversity as important antecedents for the emergence of GCC. Although most of this literature theorized linear associations between group size and performance (either positive, in a resource-based view, or negative, in a coordination costs perspective; see Stewart, 2006), scholars also acknowledge that group size may be nonlinearly related to performance (Oliver & Marwell, 1988). A recent study (Curşeu et al., 2017) shows that there is a nonlinear relationship between group size and GCC. As group size increases from small (three members) to average (five or six members), GCC also increases due to the knowledge pool added by each additional member to the group, whereas, when adding more team members (over seven members), the relationship between group size and GCC becomes negative as coordination costs increase for larger group sizes. We add to this finding by arguing that group size has an indirect effect on GCC via the number of group arguments used during group debates. Competition for speaking time is lower in small groups than in large ones (Nijstad & Stroebe, 2006; Stasser & Taylor, 1991); therefore, members of a small group are likely to engage in a reduced number of longer communication episodes. Small groups will exchange less diverse information related to the task at hand; therefore, they will be unable to achieve an optimal level of both differentiation and integration of the conceptual domain.
Moreover, the brainstorming literature showed that productivity loss (i.e., quality and quantity of ideas) increases with group size in real (interacting) groups compared with nominal groups (i.e., individuals who perform alone and whose ideas were then pooled together) (Bouchard, Barsaloux, & Drauden, 1974; Bouchard & Hare, 1970; Mullen, Johnson, & Salas, 1991). We argue that, as group size increases, groups discuss more, but at larger group sizes, group discussions may become redundant, and so, the differentiation and integration of the conceptual domain will be impaired. Also, as competition for speaking time increases in larger size groups (Nijstad & Stroebe, 2006; Stasser & Taylor, 1991), group members may experience cognitive overload and fail to integrate the ample number of ideas generated during group discussions. Thus, we hypothesize the following:
Gender Diversity, Group Arguments, and GCC
According to Harrison and Klein (2007), group diversity can be conceptualized as separation (when group members have different beliefs, attitudes, or values), variety (when group members differ in terms of their knowledge and type of expertise), and disparity (when there are inequalities in resources, status, or power among group members). Variety reflects the horizontal differentiation in terms of cognitive resources brought in by group members. Minimum variety occurs when group members have similar knowledge, information sources, or type of expertise, whereas maximum variety reflects a high knowledge differentiation within the group. In our study, we focus on gender diversity conceptualized as variety. Previous research showed a small positive effect of gender diversity on GCC (Curşeu & Pluut, 2013; Curşeu et al., 2007). Gender heterogeneous groups seem to be more cognitively complex than gender homogeneous groups (Curşeu & Schruijer, 2010), and they generate a higher number of alternatives and ideas (Curşeu et al., 2007). Also, Rogelberg and Rumery (1996) showed that decision quality is higher in a mixed gender group because men and women have different experiences and skills (i.e., women are more sensitive to the need of reconciling different points of view, whereas men are more task oriented).
We argue that as gender variety increases, team members will exchange more group arguments because the amount of different knowledge will also increase. For instance, in groups with low gender variety, members will share fewer perspectives or have a unidimensional perspective related to the task and they will engage in fewer communication episodes. Therefore, they will be unable to achieve an optimum level of differentiation and integration. Conversely, in groups with high gender variety, members will share too many different perspectives related to the task and, as a result, they will share a substantial number of group arguments. Because coordination will suffer due to the large amount of information they have to process, heterogeneous groups will also be unable to achieve an optimum level of differentiation and integration. In groups with a moderate gender variety, members will use a moderate number of group arguments (will discuss a moderate number of alternatives related to the task), and so, they are more likely to achieve an optimum level of differentiation and integration. We further hypothesize the following:
Figure 1 presents the overall model with the four hypothesized relations.

Figure 1. Summary of H1 to H4.
Method
Participants
The study was approved by the Institutional Review Board (IRB) of Babeș-Bolyai University, Cluj-Napoca, Romania. The sample consists of 509 students (377 women) distributed in 106 groups (with an average group size of 4.8 members). As the gender distribution was strongly skewed toward women, we have tried to form as many gender heterogeneous groups as possible (40 groups were women only, one group contained men only, and the remaining groups were heterogeneous). Students were enrolled in several courses at a Romanian university. All participants were required to work throughout the semester on a complex team project and complete a cognitive mapping task (as a measure of the dependent variable described below) at the end of the semester.
Measures and Procedure
We used a cognitive mapping technique (Curşeu & Schruijer, 2010) to explore the emergence of collective cognitive structures and to derive the cognitive complexity index (i.e., the dependent variable). Each group received an envelope with 20 different concepts derived from the course topics that were studied throughout the semester. Teams were asked to render their understanding of the conceptual domain by organizing the 20 concepts on a map, drawing connections among them, and specifying the nature of the relations between them. The cognitive map resulting from these group interactions reflects the cognitive convergence processes (Ervin et al., 2017; Staggs et al., 2018) and, as such, the cognitive maps indicate the collective knowledge structures the groups developed in relation to a particular knowledge domain.
The collective cognitive complexity of each map was assessed using the following formula (Curşeu et al., 2007): CMCO = (CMC × CMD)/NoC, where NoC refers to the total number of concepts used in the map, CMC represents the total number of connections established between the concepts, and CMD stands for the number of distinct types of relations established between the concepts. According to Gómez, Moreno, Pazos, and Sierra-Alonso (2000), there are seven such distinct relations: causal, association, equivalence, topological, structural, chronological, and hierarchical relations. All three indicators (NoC, CMC, CMD) were coded by external raters after the completion of the task.
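As a minimal illustration of the complexity index above, the following sketch computes CMCO from the three coded indicators. The function name and the example values (30 connections, 4 relation types, 20 concepts) are hypothetical and are not taken from the study data.

```python
def cognitive_map_complexity(n_connections: int,
                             n_relation_types: int,
                             n_concepts: int) -> float:
    """Return CMCO = (CMC x CMD) / NoC for one cognitive map.

    n_connections    -- CMC, total number of connections between concepts
    n_relation_types -- CMD, number of distinct relation types used
    n_concepts       -- NoC, total number of concepts placed on the map
    """
    if n_concepts <= 0:
        raise ValueError("a map must contain at least one concept")
    return (n_connections * n_relation_types) / n_concepts


# Hypothetical map: all 20 concepts used, 30 connections, 4 relation types.
example_score = cognitive_map_complexity(30, 4, 20)
print(example_score)  # 6.0
```

Because the index divides by the number of concepts, a map gains complexity by connecting concepts densely and with varied relation types rather than by merely including more concepts.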
While the groups were engaged in solving this task, an external observer recorded the number of group arguments on two dimensions, TRGAs and nontask-related group arguments (NTRGAs), as well as the time needed for the group to complete the cognitive map. The authors acted as coders for group communication events and although they were not blind to the hypotheses of the study, their observations were separated from the cognitive map coding to reduce the contamination effects. In coding both task and nontask-related group communication, we have used a group argument perspective (Meyers & Brashers, 1998). A group argument was defined as a communication sequence in which one statement or claim made by a particular group member is supported or disconfirmed by the initiator or other group members (Meyers & Brashers, 1998). Observers discussed and agreed on coding these communication sequences in which a group argument can be identified as a distinguishable event, and they recorded the frequency of these group arguments during the cognitive mapping exercise. As observation took place in real time (no recordings of the cognitive mapping sessions were possible), observers focused on the content of the group argument to distinguish the sequencing of two subsequent group arguments and did not exclude the repetition of particular group arguments during the debate. Therefore, the score for TRGA and NTRGA reflects the absolute number of group arguments (including repetitions) used during the debate. A group argument was coded as TRGA if the discussion content was related to the concepts used in the cognitive map and their relations, whereas a group argument was coded as an NTRGA if the discussion content was not related to the concepts used in the cognitive map (a nonarguable statement as labeled in Meyers & Brashers, 1998).
After task completion, the students were asked to report demographics and the number of team meetings they participated in while working on their team project, as well as the average duration of a regular meeting.
Team familiarity was computed by multiplying the average number of minutes that each group had spent in a meeting by the number of meetings that each group attended throughout the semester. As the current task was performed in a face-to-face context and requires substantial interdependence and coordination, only the face-to-face meetings were included in the familiarity measure, and we did not ask the participants to report the number of online group meetings.
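Gender variety was computed using Teachman’s (1980) formula, $H = -\sum_{i} P_i \ln(P_i)$, where $P_i$ is the proportion of group members in gender category $i$.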
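The sketch below shows how a variety score of this kind can be computed. It assumes the standard entropy form of Teachman's index, H = −Σ P_i ln(P_i), with P_i the proportion of members in category i. The function name and the example groups are hypothetical.

```python
import math
from collections import Counter


def teachman_entropy(categories):
    """Entropy-based variety index: H = -sum(P_i * ln(P_i)).

    categories -- one label per group member, e.g. ["F", "F", "M"].
    Categories with no members contribute nothing to the sum.
    """
    n = len(categories)
    if n == 0:
        raise ValueError("group must have at least one member")
    proportions = [count / n for count in Counter(categories).values()]
    return sum(-p * math.log(p) for p in proportions)


# Hypothetical groups: a homogeneous one and a perfectly balanced one.
homogeneous = teachman_entropy(["F", "F", "F", "F"])
balanced = teachman_entropy(["F", "F", "M", "M"])
print(homogeneous, balanced)
```

Under this form the index is 0 for a single-gender group and reaches its maximum for two categories, ln(2), when the group is evenly split.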
Results
The descriptive statistics and bivariate correlations are presented in Table 1. Because groups were involved in different courses, the assumption of independent observations was checked. An analysis of variance (ANOVA) was performed with the specific courses as factor and GCC as the dependent variable, and the results reveal that the between-course variance is not significantly different from the within-course variance, F(2, 103) = 2.92, p = .06, and as the ICC(1) = .07, we can conclude that the observations across different courses are independent. Therefore, the use of ordinary least squares (OLS) regression analysis is appropriate. Because we intend to test nonlinear associations and because outliers can lead to biased results in regressions with polynomials (Nikolaeva, Bhatnagar, & Ghose, 2015), outliers identified using the centered leverage values were removed before performing further analyses.
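The independence check described above can be sketched as a one-way ANOVA followed by ICC(1). The code assumes the common ANOVA-based estimator, ICC(1) = (MSB − MSW) / (MSB + (k − 1) MSW), with k an average cluster size adjusted for unequal clusters. The text does not state which estimator was used, and the toy scores are hypothetical.

```python
from statistics import mean


def anova_icc1(clusters):
    """One-way ANOVA F statistic and ICC(1) for scores nested in clusters.

    clusters -- list of lists; each inner list holds the scores (e.g. GCC)
                of the groups belonging to one cluster (e.g. one course).
    """
    n_clusters = len(clusters)
    n_total = sum(len(c) for c in clusters)
    grand_mean = sum(sum(c) for c in clusters) / n_total

    # Between- and within-cluster sums of squares.
    ss_between = sum(len(c) * (mean(c) - grand_mean) ** 2 for c in clusters)
    ss_within = sum((x - mean(c)) ** 2 for c in clusters for x in c)

    ms_between = ss_between / (n_clusters - 1)
    ms_within = ss_within / (n_total - n_clusters)

    # Average cluster size, adjusted for unequal cluster sizes.
    k = (n_total - sum(len(c) ** 2 for c in clusters) / n_total) / (n_clusters - 1)

    f_stat = ms_between / ms_within
    icc1 = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
    return f_stat, icc1


# Hypothetical toy data: three clusters of three scores each.
f_stat, icc1 = anova_icc1([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f_stat, icc1)
```

A small ICC(1) means that little of the variance in the outcome is attributable to cluster membership, which is the argument the text uses for treating the groups as independent observations.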
Table 1. Means, Standard Deviations, and Correlations.
Note. CM time = time to finalize the cognitive map; Familiarity = total time spent together as a group prior to the cognitive mapping exercise; TRGA = task-related group argument; NTRGA = nontask-related group argument.
*p < .05. **p < .01.
To test H1 and H2, we have conducted an OLS regression analysis. In the first step, we entered group size, gender variety, NTRGA, time spent for completing the cognitive map, TRGA, and familiarity; in the second step, we entered the squared TRGA and NTRGA, respectively; in the third step, we entered the cross-product term between familiarity and TRGA; and in the fourth step, we entered the cross-product term between familiarity and squared TRGA. Before entering the variables in the regression models, they were grand mean centered (Aiken & West, 1991) to reduce the nonessential collinearity, and the interactions were computed with the centered variables. Centering the predictors also helps to interpret the results of the regression analysis by introducing meaningful zero points in the score range (Dalal & Zickar, 2012).
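The predictor construction described above (grand mean centering first, then squared and cross-product terms built from the centered scores) can be sketched as follows. The function names, dictionary keys, and toy values are hypothetical. The sketch covers only the construction of the terms, not the model fitting.

```python
from statistics import mean


def center(values):
    """Grand mean center a list of scores."""
    m = mean(values)
    return [v - m for v in values]


def build_polynomial_terms(trga, ntrga, familiarity):
    """Build the centered, squared, and cross-product terms.

    All higher-order terms are computed from the centered variables,
    which reduces nonessential collinearity with the linear terms.
    """
    trga_c, ntrga_c, fa_c = center(trga), center(ntrga), center(familiarity)
    trga_sq = [x * x for x in trga_c]
    ntrga_sq = [x * x for x in ntrga_c]
    return {
        "TRGA": trga_c,
        "NTRGA": ntrga_c,
        "FA": fa_c,
        "TRGA_sq": trga_sq,                                        # step 2
        "NTRGA_sq": ntrga_sq,                                      # step 2
        "FA_x_TRGA": [f * t for f, t in zip(fa_c, trga_c)],        # step 3
        "FA_x_TRGA_sq": [f * t for f, t in zip(fa_c, trga_sq)],    # step 4
    }


# Hypothetical scores for three groups.
terms = build_polynomial_terms(trga=[2, 4, 6], ntrga=[1, 1, 4],
                               familiarity=[10, 20, 30])
print(terms["FA_x_TRGA_sq"])  # [-40, 0, 40]
```

After centering, a score of zero corresponds to a group at the sample mean, which is why coefficients and turning points on the centered scale are read relative to the average group.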
H1a states that the number of TRGAs has a nonlinear association with GCC. As illustrated by the results presented in Model 4 in Table 2, this hypothesis was fully supported as the beta coefficient for the quadratic term is negative and significant (β = −.36, p = .008), whereas the beta coefficient for the linear effect of TRGA is positive and not significant (β = .20, p = .12). Figure 2 presents the inverted U-shaped association between TRGA and GCC and depicts the inflection point at around the sample mean (the computed inflection point for the centered variable is .0000556).
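For readers who wish to verify such a value, the location of the peak of a fitted quadratic follows from setting the first derivative to zero. The symbols below are assumptions introduced for illustration: x is the grand mean centered predictor, and b_1 and b_2 are the linear and quadratic coefficients expressed in the units of x (not the standardized betas reported in the table).

```latex
\hat{y} = b_0 + b_1 x + b_2 x^2, \qquad
\frac{d\hat{y}}{dx} = b_1 + 2 b_2 x = 0
\;\Longrightarrow\;
x^{*} = -\frac{b_1}{2 b_2}, \qquad
\frac{d^{2}\hat{y}}{dx^{2}} = 2 b_2 < 0 \ \text{(maximum)}.
```

Because x is centered, the corresponding raw score is the sample mean plus x*. Strictly speaking, this point is the turning point (vertex) of the parabola, which is what the text refers to as the inflection point.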
Table 2. Results of the OLS Regression Analyses for GCC and Frequency of TRGAs (N = 106).
Note. Standardized regression coefficients are presented in the table. OLS = ordinary least squares; GCC = group cognitive complexity; CM time = time to finalize the cognitive map; FA = familiarity or the total time spent together as a group prior to the cognitive mapping exercise; TRGA = task-related group argument; NTRGA = nontask-related group argument.
*p < .05. **p < .01. ***p < .001.

Figure 2. The inverted U-shaped association between the frequency of TRGAs and group cognitive complexity.
H1b states that NTRGA has an inverted U-shaped association with GCC and, as illustrated by the results presented in Model 4 in Table 2, this hypothesis was also supported. As the linear effect of NTRGA is negative but not significant (β = −.17, p = .16), whereas the quadratic effect is negative and significant (β = −.23, p = .04), we can conclude that the association between NTRGA and GCC is likely to be increasingly negative. Figure 3 depicts the nonlinear association between NTRGA and GCC, shows the inflection point at around the sample mean (the computed inflection point for the centered variable is .000053), and supports this interpretation of the regression results.

Figure 3. The nonlinear association between the frequency of NTRGAs and group cognitive complexity.
Additional exploratory analyses were computed to examine the proportion of TRGAs in the group communication and whether it displays a nonlinear association with GCC. This additional analysis explores the optimal balance between TRGAs and NTRGAs in collaborative learning groups. The inflection point computed for the grand mean centered variable is at .002; it is therefore located around the sample average (the average proportion of TRGA in our study was .747), as depicted in Figure 4. We can, therefore, conclude that, according to our analyses, the optimal balance between TRGA and NTRGA is approximately 3 to 1 (a proportion of about .75 TRGAs to .25 NTRGAs).
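The step from a proportion of TRGAs to a TRGA-to-NTRGA ratio is simple arithmetic, sketched below. The function name `ratio_from_proportion` is hypothetical; the inputs are the sample mean proportion (.747) and the centered peak (.002) reported above.

```python
def ratio_from_proportion(p):
    """Convert a proportion of task-related arguments into a TRGA:NTRGA ratio."""
    if not 0 < p < 1:
        raise ValueError("proportion must lie strictly between 0 and 1")
    return p / (1 - p)


mean_proportion = 0.747   # average proportion of TRGAs reported in the study
centered_peak = 0.002     # peak of the curve for the grand mean centered variable

# Ratio at the sample mean and at the peak expressed on the raw scale.
ratio_at_mean = ratio_from_proportion(mean_proportion)
ratio_at_peak = ratio_from_proportion(mean_proportion + centered_peak)

print(f"ratio at mean = {ratio_at_mean:.2f}, ratio at peak = {ratio_at_peak:.2f}")
```

Both values round to 3, which is the basis for describing the optimal balance as three task-related arguments for each nontask-related one.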

Figure 4. The nonlinear association between the proportion of TRGAs and group cognitive complexity.
The second hypothesis that states that familiarity moderates the nonlinear association between TRGA and GCC was not supported as the interaction between familiarity and the quadratic term for TRGA is nonsignificant. To check the robustness of the findings, in Model 4 (Table 2), we tested the nonlinear associations and the interaction effects without the nonsignificant effects of squared group size and squared gender diversity. The results concerning the nonlinear association remained significant, whereas none of the interaction effects was significant. We can, therefore, conclude that our results are robust and support H1a and H1b, whereas the moderation hypothesis is not supported. The results of the regression models are presented in Table 2.
Furthermore, to test H3 and H4, we conducted two mediation analyses using MEDCURVE in SPSS (Hayes & Preacher, 2010). This technique allows us to estimate the instantaneous indirect effect (θ) of X (group size/gender diversity) on Y (GCC) through M (TRGA or NTRGA) at low (one standard deviation below the mean), moderate (sample mean), and high (one standard deviation above the mean) values of X. The MEDCURVE procedure is based on bootstrapping and has various advantages over other methods: in particular, it is suited for testing mediation with rather small sample sizes and, in comparison with parametric methods, it does not require distributional assumptions to be met (Hayes & Preacher, 2010). In the mediation analyses, we used as covariates the variables with a significant effect on GCC as illustrated in the regression analyses (team familiarity and the time needed to realize the cognitive map) as well as NTRGA because it correlates positively with TRGA.
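The instantaneous indirect effect can be illustrated with a small sketch. Following Hayes and Preacher (2010), θ at a given value of X is the product of the rate of change of M with respect to X and the rate of change of Y with respect to M. The functional forms (M quadratic in X, Y quadratic in M), the function name, and all coefficient values below are assumptions made for illustration; they are not the models or estimates of the present study, and the bootstrap confidence intervals that MEDCURVE reports are not reproduced here.

```python
def instantaneous_indirect_effect(x, a0, a1, a2, b1, b2):
    """theta(x) = dM/dX * dY/dM, evaluated at X = x.

    Assumed models (illustrative only):
        M = a0 + a1*X + a2*X**2
        Y = ... + b1*M + b2*M**2
    """
    m_hat = a0 + a1 * x + a2 * x ** 2   # predicted mediator at X = x
    dm_dx = a1 + 2 * a2 * x             # slope of M with respect to X
    dy_dm = b1 + 2 * b2 * m_hat         # slope of Y with respect to M at m_hat
    return dm_dx * dy_dm


# Hypothetical coefficients: X raises M linearly, M has an inverted U-shaped effect on Y.
coefficients = dict(a0=0.0, a1=0.5, a2=0.0, b1=0.2, b2=-0.3)

# Evaluate at low (-1 SD), moderate (mean), and high (+1 SD) values of a standardized X.
for label, x in (("low", -1.0), ("moderate", 0.0), ("high", 1.0)):
    theta = instantaneous_indirect_effect(x, **coefficients)
    print(f"{label:>8}: theta = {theta:.3f}")
```

With these made-up coefficients the indirect effect is positive at low values of X, smaller at the mean, and slightly negative at high values, which is the kind of pattern a nonlinear mediation analysis is designed to detect.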
The third hypothesis, which posits that the number of TRGAs mediates the association between group size and GCC, was partially supported. The results show that at low values of group size, the instantaneous indirect effect is significant (
The fourth hypothesis stating that the number of TRGAs mediates the association between gender variety (X) and GCC was also partially supported. The results indicate that, in groups with low gender diversity, the instantaneous indirect effect is significant (
In line with the recommendations presented in Hayes and Preacher (2010), we explored whether a simple linear mediation model fits the data well and, as such, provides a more parsimonious explanation for the mediation effect of TRGAs. The results of the resampling mediation procedure (using the PROCESS macro for SPSS) show that the indirect effect of group size on GCC (mediated by the TRGAs) is not significant, effect size = .01, SE = .06, CI = [−.12, .14], as the CI includes zero. Moreover, the linear indirect effect of gender diversity on GCC is also not significant, effect size = .04, SE = .18, CI = [−.34, .42]. Given that linear mediation is not a plausible explanation, we can conclude that our results on the nonlinear mediation models are robust.
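The logic of a resampling test for a linear indirect effect can be sketched as follows: the indirect effect is the product of the X to M slope (a) and the M to Y slope controlling for X (b), and a confidence interval is obtained by recomputing that product in bootstrap resamples. This is a plain percentile bootstrap without covariates, written under the assumption that it approximates what such macros do; the interval type, the covariates, and the data below are not those of the study, and all values are hypothetical.

```python
import random


def slope(x, y):
    """OLS slope of y on x (with intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def partial_slope(y, m, x):
    """Coefficient of m in the OLS regression of y on m and x (with intercept)."""
    my, mm, mx = sum(y) / len(y), sum(m) / len(m), sum(x) / len(x)
    smm = sum((a - mm) ** 2 for a in m)
    sxx = sum((a - mx) ** 2 for a in x)
    smx = sum((a - mm) * (b - mx) for a, b in zip(m, x))
    smy = sum((a - mm) * (b - my) for a, b in zip(m, y))
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return (smy * sxx - smx * sxy) / (smm * sxx - smx ** 2)


def indirect_effect(x, m, y):
    """Linear indirect effect a*b of x on y through m."""
    return slope(x, m) * partial_slope(y, m, x)


def bootstrap_ci(x, m, y, draws=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(draws):
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            estimates.append(indirect_effect([x[i] for i in idx],
                                             [m[i] for i in idx],
                                             [y[i] for i in idx]))
        except ZeroDivisionError:
            continue  # skip degenerate resamples (no variance or perfect collinearity)
    estimates.sort()
    lower = estimates[int((alpha / 2) * len(estimates))]
    upper = estimates[int((1 - alpha / 2) * len(estimates)) - 1]
    return lower, upper


# Hypothetical data for 12 groups: group size, number of TRGAs, and GCC.
size = [3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 7, 7]
trga = [85, 92, 101, 97, 110, 118, 109, 125, 131, 122, 140, 135]
gcc = [1.1, 1.3, 1.6, 1.4, 1.8, 1.9, 1.7, 2.0, 1.9, 2.1, 1.8, 2.0]

point = indirect_effect(size, trga, gcc)
lower, upper = bootstrap_ci(size, trga, gcc)
print(f"indirect effect = {point:.3f}, 95% CI = [{lower:.3f}, {upper:.3f}]")
```

As in the text, the indirect effect is treated as not significant when the interval includes zero.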
Discussion
In this study, we argued that the relationship between the number of group arguments used during group debates and GCC has an inverted U shape (H1), and it is moderated by team familiarity in such a way that familiar teams will achieve an optimum level of both differentiation and integration faster than unfamiliar teams (H2). Moreover, we expected that the number of group arguments used during debates mediates the curvilinear association between group size and GCC (H3) and between gender diversity and GCC (H4).
Our results are in line with previous studies (e.g., Curşeu et al., 2012; Curşeu et al., 2007; Grand et al., 2016; Meslec & Curşeu, 2015) emphasizing the critical role of communication for the emergence of group cognition. However, we also answer the call for more research into the TMGT effect in psychology (Grant & Schwartz, 2011) and management (Busse et al., 2016; Pierce & Aguinis, 2013), as ways to extend our understanding of various elements of group dynamics. Specifically, we found that GCC increases as the number of exchanged group arguments increases from low to average, whereas a further increase of the number of TRGAs, from average to high, leads to a decline in GCC. Therefore, the benefits of task-related group communication exceed its costs as the values increase from low to a moderate level (i.e., up to an inflection point), but at higher levels of task-related group communication, the costs exceed the benefits. In a similar fashion, frequency of NTRGAs has a nonlinear association with GCC, in such a way that the negative association between NTRGA and GCC becomes stronger for higher levels of NTRGA.
Communication is, therefore, a key process for the emergence of group-level cognitive structures; yet, too many group arguments decrease the integrative complexity of collaborative learning groups. According to our results, the inflection point for the number of TRGAs is around 120 group arguments. Therefore, we can state that up to around 120 group arguments per group debate, the relationship between TRGAs and GCC is positive, whereas beyond 120 group arguments, the relationship becomes negative. This observation is in line with the growing body of empirical evidence suggesting that an optimal level of debate is most conducive for team performance and innovation (Chang, 2017; De Dreu, 2006). We add to these insights, and we show that the emergence of group cognition is one plausible mechanism that explains the nonlinear association between the frequency of group communication and group performance. Further research could directly explore these claims in field studies.
The second hypothesis stating that interpersonal familiarity moderates the inverted U-shaped relationship between TRGAs and GCC was not supported. A plausible explanation for these results could be the way in which we have evaluated familiarity. Our measure was based on self-reports regarding the average time that the groups had spent together in meetings for project completion and, thus, it could be subject to recollection biases. As a future research direction, we recommend using a more objective measure of interpersonal familiarity. However, previous studies that investigated the impact of familiarity on team performance showed mixed results. Familiarity influences team performance positively (e.g., Gruenfeld et al., 1996; Huckman, Staats, & Upton, 2009), negatively (e.g., Barker et al., 1996), or in a curvilinear way (e.g., Katz, 1982; Sieweke & Zhao, 2015). Further studies could consider other factors or mechanisms that can explain the role of familiarity in the relationship between task-related group communication and GCC (e.g., the nature of the task). Finally, a potential explanation for not finding a moderation effect of familiarity could be the gender-skewed distribution within groups. As mentioned earlier, the majority of our respondents are women and 40 groups were composed exclusively of women, whereas the majority of the gender-heterogeneous groups had a female majority. As illustrated in previous research, the percentage of women in groups is positively associated with collective intelligence (Woolley, Chabris, Pentland, Hashmi, & Malone, 2010) and with collective emotional intelligence of groups (Curşeu, Pluut, Boroş, & Meslec, 2015). Women’s higher social sensitivity explains the positive effect of the percentage of women in groups on their cognitive and affective dynamics. We could, therefore, argue that the high percentage of women in our sample could have overshadowed the potential moderation of familiarity. High social sensitivity may, in principle, overrule the potential moderating role of interpersonal familiarity. Future research could disentangle these differential effects.
In addition, the results of the mediation analyses partially confirm the third and fourth hypotheses, indicating that the number of TRGAs shared during group debates mediates the impact of group size (at low and moderate values of group size) and gender diversity (at low and moderate values of gender diversity) on GCC. At high values of both group size and gender diversity, the mediation through TRGAs was not significant. The results suggest that both relationships (between group size and GCC, and between gender diversity and GCC) could be explained by other mechanisms besides the frequency of TRGAs. As a further research direction, one could investigate which other mechanisms mediate the relationship between group size and GCC.
The literature to date (Curşeu & Pluut, 2013; Curşeu et al., 2007) has shown that gender diversity (conceptualized as gender variety) has a small positive impact on GCC. We contribute to the literature by showing that the relationship between gender diversity and GCC is mediated by the number of group arguments shared during debates. Considering that the curvilinear relationship between gender diversity and GCC was only partially mediated by the number of TRGAs, we can conclude that, besides TRGAs, there are other mechanisms that could explain the relationship. Therefore, another research direction could be to investigate those mechanisms.
A limitation of the current study is the sample used (sample of students); therefore, the results should be generalized with caution, especially because the gender distribution is strongly skewed toward women. A second limitation of our study is the index used to measure familiarity, which was self-reported and also prone to memory biases because we asked members to recall the average time spent together in a meeting for the project achievement task. Also, the same measure excluded online group meetings, and this could have biased our evaluation of team familiarity. A third limitation refers to the cognitive mapping task and the formula used to compute GCC, as these are boundary conditions of our study. For example, while coding the types of relations, we have assigned equal weights to all observed relations because the initial taxonomy (Gómez et al., 2000) did not discuss such weights, and introducing them in our coding procedure would have been rather arbitrary. Then, we have used an integrative score for GCC that reflects the relative differentiation and integration per concept used in the cognitive map; therefore, it is a global indicator of integrative complexity. Future research could explore using different tasks and different coding procedures for the differentiation and integration, and separate these collective cognitive processes in time. One could imagine a design in which groups first engage in collective brainstorming, with different idea generation categories receiving different weights in the coding of ideas depending on their sample frequency (Lucas, van der Wijst, Curşeu, & Looman, 2013), and then engage in integration as a separate task. Such a task separation could disentangle the differentiation and integration processes and shed more light on the role of communication processes in shaping cognitive differentiation and integration in groups.
Finally, another limitation of the current study was the use of single coders to capture the frequency of communication events. Ideally, we should have used two coders for each group, blinded to the hypotheses of the study and we should have videotaped the cognitive mapping sessions for further reference and analysis. The fact that we have carried out this study as part of the regular curricular activities made these ideal choices rather impractical.
Practical Implications
The present findings have important practical implications for designing effective collaborative learning groups. Being aware of the relevance of group communication processes, educators using collaborative learning groups and consultants can explore and implement interventions that have the potential to prevent the negative effects of TRGAs and to increase the beneficial ones. As our results show, an optimal level of task-related group communication during the cognitive mapping task comprises between 100 and 120 group arguments that can be effectively integrated. In the sample used in our research, cognitive convergence (Ervin et al., 2017; Staggs et al., 2018) seems to be optimal at this number of group arguments. Student groups could be trained to optimize their group discussions by striving for a number of group arguments that fits this range. Moreover, our additional exploratory analyses have identified that the optimal balance between TRGA and NTRGA in groups is when TRGAs account for 75% of the group arguments. Therefore, the optimal ratio of TRGAs and NTRGAs identified in our analyses is 3 to 1; in other words, for each nontask-related argument, groups have to share and discuss three TRGAs. As we have argued in our theoretical framework, NTRGA is important for socialization and securing a pleasant work atmosphere in groups, yet the incidence of NTRGA should not exceed 25% of the arguments discussed during the debates. Group training could emphasize the important role of NTRGAs as well as the fact that student groups should strive for an optimal balance between TRGA and NTRGA.
Theoretical Implications
Our study answers the recent call for more exploration of the nonlinear association between communication frequency and group outcomes (Marlow et al., 2018) and provides support for a TMGT effect of TRGAs on the complexity of emergent group cognition. Our results reveal the need to explore distinct mechanisms that explain the link between group communication, on the one hand, and integrative cognitive complexity (Driver & Streufert, 1969; Gruenfeld & Hollingshead, 1993; Gruenfeld et al., 1996) and cognitive convergence (Ervin et al., 2017; Staggs et al., 2018), on the other. In line with the emergent view on group cognition (Curşeu, 2006; Kozlowski et al., 2016), cognitive emergence requires differentiation both in cognitive terms (Curşeu et al., 2007; Grand et al., 2016) and in the structure of social interactions (Curşeu et al., 2012). In other words, group members have to actively share diversified information to generate a significant group knowledge pool (Grand et al., 2016) that will ultimately generate group-level knowledge structures through cognitive convergence. When the number of group arguments shared during debates is too high, groups may experience significant cognitive load (Kirschner et al., 2009) that prevents them from effectively integrating the knowledge pooled via group discussions (Grand et al., 2016). Our results therefore point to the relevance of exploring group communication using a too-much-of-a-good-thing framework. Field and experimental studies could tackle the mechanisms that explain the role of communication for cognitive differentiation and integration, treated as separate cognitive processes. Moreover, as already indicated by Grand and collaborators (2016), computational studies could further explore the interplay between differentiation and integration in groups and shed more light on how differential mechanisms associated with too little or too much communication impact on emergent group cognition.
Conclusion
In our study, we have explored the effect of group arguments (number of TRGAs and NTRGAs) on the emergence of collective cognition in collaborative learning groups. Building on insights from the too-much-of-a-good-thing framework, we show that the numbers of TRGAs and NTRGAs used during group discussion have an inverted U-shaped association with GCC. Our additional analyses show that the optimal proportion of TRGAs during group debates is .747. Therefore, although useful (up to a point) for the emergence of group cognition, nontask-related argumentation should not exceed the frequency of task-related argumentation, as the optimal cognitive results are achieved in group discussions in which TRGAs outnumber NTRGAs by a ratio of 3 to 1. Moreover, based on the team cognition literature, we have hypothesized that group member familiarity moderates this nonlinear association, yet this moderation received no empirical support. Finally, building on a resource-based view on groups, we tested whether the number of TRGAs mediates the effect of group size and gender diversity on GCC. Our results partially supported the mediation claims and showed that the effect of group size and gender diversity on GCC was mediated by the number of TRGAs at low and moderate values of these predictors.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Petru L. Curșeu and Oana C. Fodor were supported by a grant of the Romanian National Authority for Scientific Research, CNCS—UEFISCDI, project number PN-II-ID-PCE-2011-3-0482. The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.
