Abstract
Collaboration is a construct comprising diverse definitions and frameworks. Additionally, being a latent variable and because of its complexity and interactive nature, collaboration is difficult to measure. Therefore, this systematic literature review was guided by two fundamental questions: what to measure and how to measure. Through the review and synthesis of 28 carefully selected studies we derived an integrative framework displaying indicators for peer collaboration in higher education and beyond. Moreover, the results give insights into measurement approaches of collaboration comprising information on data collection and analysis as well as contextual factors (e.g., task type, time).
In a digital era, humans not only need better technical preparation but also appropriate skills to adapt to the changing requirements at the workplace (Carnevale & Smith, 2013). The knowledge and skills that prepare individuals for lifelong learning in a complex and fast-changing world have been labeled 21st century competencies, including communication, collaboration, critical thinking, creativity, problem solving, and digital skills (Buckingham Shum & Crick, 2016; van Laar et al., 2017). Communication and collaboration skills in particular seem to play a crucial role, as a large body of literature highlights the importance of social interactions during knowledge building and skill acquisition (e.g., Ghazal et al., 2019; Martin et al., 2016). In fact, collaboration can contribute positively to the development of other skills like problem solving, creativity, and critical thinking (Ghitulescu, 2018; van Laar et al., 2019). Students emerging from schools and universities into the workforce are expected to collaborate with others in order to solve complex problems and to create innovations (Y. Rosen & Mosharraf, 2016; Sartori et al., 2018). This is also why the Program for International Student Assessment (PISA) integrated collaborative problem solving as an additional assessment domain besides science, reading, mathematics, and financial literacy in 2015 (OECD, 2017). Beyond that, universities in particular are encouraged not only to transmit discipline-specific skills (often referred to as hard skills) but also the so-called soft skills like interpersonal skills, as they are particularly sought after and valued by employers (Andrews & Higson, 2008; McMurray et al., 2016). The development of soft skills like collaboration is often embedded within discipline-specific learning, as it would be hard to find room for specific courses in crowded curricula (Kember et al., 2007).
Accordingly, collaborative learning is often referred to as an educational approach to learning (Laal & Laal, 2012; Van Aalst, 2013). However, Barron (2003) calls for shifting the view of collaboration from an instrumental orientation to one that highlights that learning to collaborate is an essential human competence. Deiglmayr and Spada (2010, p. 104) agree by pointing out that collaboration is not only seen as a vehicle for solving problems or acquiring knowledge but “learning how to collaborate effectively is now considered a goal in itself.” The importance of collaborative working and learning has long been recognized, and research on collaboration has a long tradition. Nevertheless, some fundamental questions and challenges remain. Collaboration is a complex socio-cognitive phenomenon consisting of multiple, interacting factors like individual skills and attitudes, team roles, relationships, and trust, as well as tools, tasks, and context (Andrews-Todd & Forsyth, 2020; Patel et al., 2012). Due to the ongoing development of information and communication technologies, collaboration is becoming increasingly complex (Stahl, 2006). Given its complexity and interactive nature, collaboration is a construct that is difficult to measure. Further, it is frequently treated as a latent variable (meaning it is not directly observable), which adds to the difficulty of comprehending it fully.
Using a systematic literature review approach, the current study focuses on collaborative processes in adult peer learning, with two main goals. Firstly, it seeks to contribute to the conceptualization of collaboration as a complex, multifaceted, and context-dependent construct. Secondly, it provides an overview of best practices in measuring collaborative working and learning. Unlike previous reviews, which had a more generalized focus across various contexts and either considered conceptualization or measurement approaches, our review will adopt a context-specific approach and encompass both aspects simultaneously. Based on this synthesis, we present an integrative framework, as well as a roadmap (including what to measure, how to measure, and under which circumstances), for researchers and practitioners interested in collaborative processes. Although our findings are mainly tailored to the higher education context, they hold the potential to inspire researchers and practitioners of any other context in which collaboration is prevalent. In the following, we provide a summary of the theoretical foundation and the present research landscape, which encompasses prior reviews. This summary will then lead us to formulate our research questions and establish the significance of this study. Afterwards, we explain the systematic review procedure and present our findings. Building on this, we discuss strengths and limitations of this study and conclude with implications for research and practice.
Collaboration: Definition and Differentiation
“For a concept so widely used in everyday language there is a surprising lack of a clear understanding of what it is to collaborate” (Patel et al., 2012, p. 1). Indeed, Patel et al. (2012) emphasize that collaboration is a construct comprising diverse definitions and frameworks. Etymologically, the word collaboration stems from the Latin verb collaborare, meaning to work together (Merriam-Webster, n.d.). That might be the reason why in everyday life the term is often used interchangeably with group work, although they are not synonymous (Summers & Volet, 2010). For example, imagine a group of students doing project work. If they split the task so that each of them works on a subtask individually and afterward they put their results back together, they solved the task as a group but they did not work and learn collaboratively (Summers & Volet, 2010). That is why it is important to differentiate collaboration from related constructs like cooperation and coordination (Bedwell et al., 2012; Jeong et al., 2017). The most widely used definition of collaboration is the one by Roschelle and Teasley (1995). Research in both learning and work contexts refers to their work (e.g., Bedwell et al., 2012; Schoor & Bannert, 2012). They define collaboration as “a coordinated, synchronous activity that is the result of a continued attempt to construct and maintain a shared conception of a problem” (Roschelle & Teasley, 1995, p. 70). Whereas collaboration focuses on shared conceptions, cooperation refers to the division of labor, meaning that the workload is shared (as demonstrated in the aforementioned example; Dillenbourg, 1999). Sharing the workload is expected to result in faster and more efficient task resolution (Schmidt, 1991), as coordination means organizing the work and actions to be done in time (Baker, 2015; Salas et al., 2000), aiming for harmonious actions (Shah & Leeder, 2016).
Collaborative learning and working, on the other hand, usually takes place in order to solve (specific) problems, as highlighted by Roschelle and Teasley (1995). Thus, the term collaborative problem solving is becoming increasingly popular. Having a shared goal (to solve a problem) is an important characteristic of collaboration and often the reason for working together (Bedwell et al., 2012). Other definitions of collaboration further include references to the number of actors (“two or more people learn”, Dillenbourg, 1999, p. 1) as well as temporal aspects (“within a single episode or series of episodes”, Patel et al., 2012, p. 1). Additionally, some authors describe collaboration as being an interaction between equals, meaning that the learners have the same rights and statuses to intervene during the group work (e.g., Baker, 2015). Considering this, the focus of the present study lies on peer collaboration.
Collaboration can take place at different levels, such as within teams, across teams, or encompassing entire organizations. Moreover, depending on the discipline, the understanding of collaboration varies. Bedwell et al. (2012) conducted a multidisciplinary literature review, which highlights that, whereas collaboration is seen as a process in some fields of research (e.g., education), it is seen as a structure in others (e.g., management). In the majority of their reviewed literature, however, collaboration is conceptualized as a process (Bedwell et al., 2012). In school and university settings, the terms collaborative learning and cooperative learning are primarily used (e.g., Laal & Laal, 2012; Slavin, 1996; Van Aalst, 2013). Although some researchers underline the difference between those two terms, others use the terms interchangeably (Olivares, 2008). Whereas cooperative learning is the more traditional term (cf. O’Donnell & Hmelo-Silver, 2013), recently, the term collaborative learning has been used more frequently. This is especially due to the growing field of computer-supported collaborative learning (CSCL), where the distinction between cooperative and collaborative learning is explicitly made (Stahl et al., 2006). In contrast, research related to workplace settings employs the term computer-supported cooperative work (CSCW) more commonly. The traditional use of the term cooperative instead of collaborative might be due to the fact that work teams often make use of their distributed expertise to complete a task (Fransen et al., 2013) and strive for efficient task resolution. In work contexts, the focus often lies on outcome measures, including team effectiveness criteria like performance quality and efficiency (Marks et al., 2001; Mathieu et al., 2008) as well as team viability (Sundstrom et al., 1990). In contrast, the primary goal of teams in educational contexts is individual learning and shared knowledge construction (Fransen et al., 2013).
As mentioned above, terms like coordination, cooperation, teamwork, and collaboration are often used interchangeably. However, collaboration is a higher-level process which is increasingly used in practice; “therefore, science needs to thoroughly understand what it is and what it is not in order to help practitioners maximize its effectiveness and usefulness” (Bedwell et al., 2012, p. 142). To this end, it is important to consider potential differences with respect to disciplines/domains, educational levels, and technologies used to support collaboration (Bedwell et al., 2012; Jeong et al., 2019). Moreover, especially in educational contexts, the question arises whether the focus lies on collaborating to learn (educational approach) or learning to collaborate (essential human competence). Yet, they are not mutually exclusive. Indeed, the former can be seen as a means to encourage the latter (Salmons, 2019). Child and Shaw (2019) argue that learning to collaborate is becoming more important at the end of a key stage of schooling, and by referring to reports of the Organisation for Economic Co-operation and Development (OECD), they claim that collaborative activities can be assessed from around 15 years of age onward. However, it is important to note that the complexity of tasks and materials increases with the progression to higher educational levels (Jeong et al., 2019). In addition, adult learners are more self-directed (Fanning & Gaba, 2007), embedded in different socio-cultural environments (Jeong et al., 2019), and bring diverse expertise and experiences together. Compared to school settings where the teachers are encouraged to monitor and support interaction and participation in collaborative groups (e.g., Rasmussen & Ludvigsen, 2012; Webb et al., 2021), students at higher educational levels are often less supported and expected to self-regulate their (collaborative) learning.
Keeping these differences in mind, this study seeks to understand and conceptualize peer collaboration in higher education and beyond. As the primary goals of group work in higher education are learning and shared knowledge construction, the processes are of particular importance. In line with this, the current work’s focus lies on collaboration as a process. These insights are important for suitable assessment approaches to collaboration, as the conceptualization of a construct affects the way in which it is appropriately measured.
Measurement of Collaboration
The measurement of collaborative learning processes is important because it helps researchers and practitioners to assess the knowledge and skill level of individuals and groups. This, in turn, allows them to distinguish between high and low performers (e.g., Schneider et al., 2020) and helps to understand which mechanisms lead to successful learning. Subsequently, they can guide, adapt, and improve learning experiences. Furthermore, assessing and reflecting on learning processes allows students or workers to monitor their own personal learning and skill development (Buckingham Shum & Crick, 2016). Additionally, measurement findings can be used for the development of group awareness tools, which help learners become more aware of their interaction partners and thus catalyze advantageous learning processes (Bodemer et al., 2018).
There are several issues associated with the assessment of collaboration. For example, a major challenge of measuring collaboration is resolving the issue of what aspects are being assessed (Child & Shaw, 2019). Assessment criteria are dependent on the purpose of the assessment (Strijbos, 2011), which can be either summative (focus on the learning outcome) or formative (focus on the learning process). Therefore, it is crucial to identify appropriate operationalizations of collaboration processes and outcomes as well as the corresponding measurement methods (Strijbos, 2016). Besides process and outcome variables, input variables like participant background (e.g., personality) and tasks (e.g., well- vs. ill-defined tasks) should be considered when assessing collaboration (Kyllonen et al., 2017). A further question arising is whether the assessment should focus on the group or individual level. With respect to summative assessments in educational settings, for example, it is common that a group product is evaluated or that the individual level of knowledge following the group work is assessed.
Another difficulty in measuring collaboration lies in the high number of interactions that take place during collaborative learning (Caballero-Hernández et al., 2020). The learning patterns that take place in collaborations are much more complex than those in individual learning settings (Cen et al., 2016), which is especially true for bigger groups and groups that work together over several episodes of time. Furthermore, it is crucial to recognize that significance lies not only in the interactions among learners, but also in their utilization of the provided resources to address tasks, along with their engagement with the task environment, such as through different kinds of technologies (cf. Lämsä et al., 2021). Especially when learning groups can decide independently when and how to work together (e.g., when they work together in a hybrid setting over the course of a semester), measuring their collaboration becomes extremely challenging. Consequently, studying collaboration requires considering the circumstances in which it takes place.
Prior research reflects a variety of methods for measuring collaboration, such as questionnaires (e.g., Caniëls et al., 2019), diary studies (e.g., Shah & Leeder, 2016), social network analysis (e.g., Kent & Cukurova, 2020), and lag sequential analysis (e.g., Cheng & Chu, 2019). Whereas some rely on self-reports, others are observations of learners’ interactions and behaviors in the digital environment. Digital environments, learning analytics, and educational data mining are becoming prevalent, as they establish an ecosystem where collecting data continuously to assess learning processes is feasible and manageable (Aldowah et al., 2019). For example, researchers have presented innovative technological frameworks for automating collaboration analysis (e.g., Anaya & Boticario, 2013; Duque et al., 2011; Noel et al., 2018). However, these techniques are mostly algorithmic and rarely based on theories (Baker et al., 2021), hence lacking a pedagogical framework. In turn, a missing theoretical foundation leads to an inadequate operationalization of collaboration. For example, in some research, it is unclear how collaboration is operationalized, while in other studies, collaboration is simply operationalized as interaction and participation rates. Yet, this does not reflect the main focus of collaboration, which lies in creating shared conceptions. To better understand learning processes, research utilizing learning analytics and educational data mining should establish meaningful links between the recorded data and the targeted construct (Wise et al., 2021). To build this bridge from “clicks to constructs” (Buckingham Shum & Crick, 2016, p. 16), a set of theoretically based descriptions is needed. As the aim of the current review is to shed light on collaborative processes more precisely, the results can serve as a solid theoretical basis for learning analytics and educational data mining approaches.
Prior Reviews and the Present One
Several review studies (e.g., Bedwell et al., 2012; Child & Shaw, 2019; D’Amour et al., 2005; Lai, 2011) have already been conducted to give an overview of collaboration processes and to propose theoretical and operational frameworks. Whereas reviews by Lai (2011) and Child and Shaw (2019) refer to school and university contexts, Bedwell et al. (2012) focused on collaboration at work and D’Amour et al. (2005) reviewed interprofessional collaboration in health fields. These reviews primarily contribute to the conceptualization of collaboration in terms of construct definition and differentiation from other constructs. Yet, they do not give detailed descriptions of the sub-facets of collaboration, provide a list of concrete behavioral indicators, or discuss appropriate measurement methods. Moreover, none of them was set up or executed as a systematic literature review. While there are systematically conducted literature reviews focusing on the measurement techniques used in CSCL (e.g., Gress et al., 2010; Lämsä et al., 2021), the included studies often lack well-established theoretical underpinnings or precise definitions of collaboration (e.g., “the included studies did not always have strong theoretical groundings,” Lämsä et al., 2021, p. 4; “the constructs of interest were at times less defined,” Gress et al., 2010, p. 811), making it difficult to classify and interpret the results. Accordingly, prior reviews on collaboration focused either on construct definition and differentiation from other constructs or on measurement approaches. A systematic review considering both conceptualization and measurement of collaboration is missing.
To extend the existing body of research, the current review uses a selective and purposeful approach (cf. Xiao & Watson, 2019) by identifying best practice studies and analyzing them in depth. Hence, this study is expected to extend previous reviews by (a) giving a holistic overview with detailed descriptions of sub-facets of collaboration, while focusing on a specific target group, namely adult peer learners, (b) depicting best practice approaches to measuring collaboration, and (c) linking the conceptualization and measurement of collaboration. Since measures need to be selected and formulated to effectively capture the theoretical content (M. A. Rosen et al., 2018), it is important to establish a connection between the conceptual and measurement method domains. Considering the challenges in analytics (e.g., to map digital traces to collaborative learning constructs, Wise et al., 2021), reviewing best practices can enrich future measurement approaches. Taken together, this systematic literature review was guided by the following research questions: What aspects are important to consider when conceptualizing collaboration (RQ 1)? Further, how can they be measured (RQ 2)? Answering these research questions enables us to make a valuable contribution to the conceptualization of collaboration as well as to derive recommendations for measurement approaches.
Method
The systematic literature review was conducted based on the guidelines by Siddaway et al. (2019) as well as Gough et al. (2017). These guidelines provide recommendations for planning (e.g., selection of databases, formulation of search terms, and inclusion/exclusion criteria), conducting (e.g., considering interrater reliability), and organizing the review process (e.g., references to helpful tools and applications), as well as the presentation of results (e.g., description of included studies). Accordingly, the review process consisted of the following main steps: (1) identification, (2) screening, and (3) analysis.
Identification: Literature Search Strategy
The literature search was conducted in September 2020. We consulted two important databases in the field of educational research (i.e., Web of Science, PsycInfo) to identify relevant studies. Our initial search that encompassed terms like teamwork and cooperation yielded over 20,000 results. To enhance precision, we fine-tuned the search criteria through consultations with colleagues. We further narrowed down the results using the Social Sciences Citation Index, concentrating on pertinent research domains such as Education, Psychology, Computer Science, and Business Economics. To maintain a manageable scope of literature, the decision was made to focus the search on articles with the concept of interest included in the title. The final search string used was: TITLE: (collaborat*) AND TOPIC: (skill* OR abilit* OR competenc* OR behav* OR learning) AND TOPIC: (concept* OR framework OR theor* OR assess* OR measur*). No restrictions with respect to the publication year were made. Further studies were gathered by employing literature snowballing techniques and reviewing the literature referenced in the authors’ previous research projects. After setting a language filter (English) and removing duplicates, the literature search strategy resulted in 5,107 publications.
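To make the logic of the search string concrete, the following is a minimal sketch of how its three AND-connected term groups and right-hand truncation (*) could be approximated when filtering exported records locally. It is an illustration only and was not part of the review procedure. The function names and the idea of treating a record as a title plus a block of topic text (e.g., title, abstract, and keywords) are assumptions, and real databases may additionally apply stemming or lemmatization that this sketch does not reproduce.

```python
import re

# Term groups copied from the search string; "*" marks right-hand truncation.
TITLE_TERMS = ["collaborat*"]
TOPIC_TERMS_SKILL = ["skill*", "abilit*", "competenc*", "behav*", "learning"]
TOPIC_TERMS_CONCEPT = ["concept*", "framework", "theor*", "assess*", "measur*"]


def wildcard_pattern(terms):
    """Compile an OR-pattern in which a trailing * matches any word ending."""
    parts = []
    for term in terms:
        if term.endswith("*"):
            parts.append(re.escape(term[:-1]) + r"\w*")
        else:
            # Terms without * must match as whole words.
            parts.append(re.escape(term))
    return re.compile(r"\b(?:" + "|".join(parts) + r")\b", re.IGNORECASE)


TITLE_RE = wildcard_pattern(TITLE_TERMS)
SKILL_RE = wildcard_pattern(TOPIC_TERMS_SKILL)
CONCEPT_RE = wildcard_pattern(TOPIC_TERMS_CONCEPT)


def matches_search(title, topic_text):
    """Return True if a record satisfies all three AND-connected groups.

    `topic_text` is a hypothetical stand-in for the fields a TOPIC search
    covers (assumed here to be title, abstract, and keywords).
    """
    return (
        TITLE_RE.search(title) is not None
        and SKILL_RE.search(topic_text) is not None
        and CONCEPT_RE.search(topic_text) is not None
    )


# Hypothetical record, invented for illustration.
example_title = "Measuring collaborative problem solving"
example_topic = "We assess the collaboration skills of students."
example_hit = matches_search(example_title, example_topic)
```

Restricting the concept of interest to the title is what keeps the result set manageable: a record about teamwork that mentions collaboration only in its abstract would fail the first condition.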
Screening Process
Taking into account the theoretical foundation outlined in the present study, the studies from our search results had to fulfill the inclusion criteria specified in Table 1 to be eligible for further analysis. The screening took place in two phases: an initial screening of titles and abstracts, and a screening for eligibility of the full-texts. The screening procedure is summarized in Figure 1. In the initial phase, two coders independently rated the first 100 abstracts. The calculation of Cohen’s Kappa resulted in κ = 0.71, reflecting a substantial agreement. Publications on which the coders did not agree were discussed until consensus was reached. Afterwards, the first author screened the remaining titles and abstracts using Rayyan, a tool for systematic reviews (Ouzzani et al., 2016). As recommended by Siddaway et al. (2019), the abstracts and titles were screened sensitively, resulting in 651 potentially relevant publications.
Table 1. Study Inclusion Criteria.

Figure 1. Flow chart summarizing the screening procedure.
Next, we reviewed the publications using the inclusion criteria. At this stage, the focus shifted from sensitivity to specificity (in accordance with Siddaway et al., 2019), resulting in the inclusion of 28 publications for the final analysis. In certain instances, a case could have been made for either inclusion or exclusion. These borderline cases were discussed with colleagues. Additionally, approximately 10% of the studies, constituting a sample of 65, were independently assessed by a second coder. This evaluation aimed to ascertain the clarity of the inclusion criteria and the potential reproducibility of the decision-making process. The calculated Kappa value of κ = 0.74 reflects a substantial agreement indicating interrater reliability. Publications on which the coders did not agree were discussed until consensus was reached.
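For readers who wish to reproduce this kind of interrater check, the following minimal sketch computes Cohen’s Kappa for two coders who each assign one label per publication. The ten include/exclude decisions are invented for illustration and are not the review’s data; the function name is likewise an assumption.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's Kappa for two raters who labeled the same items once each."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("Both raters must label the same non-empty item set.")
    n = len(ratings_a)

    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: product of the raters' marginal label proportions,
    # summed over all labels.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    labels = set(counts_a) | set(counts_b)
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label; Kappa is undefined,
        # so perfect agreement is returned by convention (an assumption).
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical screening decisions for ten abstracts (not the review's data).
coder_a = ["include"] * 4 + ["exclude"] * 6
coder_b = ["include"] * 3 + ["exclude"] * 6 + ["include"]

kappa = cohens_kappa(coder_a, coder_b)  # observed 0.80, chance 0.52
```

In this invented example the coders agree on 8 of 10 abstracts, yet Kappa is only about 0.58 because more than half of that agreement would be expected by chance. This is why chance-corrected values such as those reported above are more informative than raw agreement.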
Analysis of Included Literature
To analyze the literature, qualitative content analysis was conducted using MAXQDA 2020 (VERBI, n.d.-a). An inductive-deductive approach was employed and the analysis was led by the two fundamental questions of what to measure and how to measure it. Each study was examined with respect to important characteristics and traces of collaborative processes, as well as the measurement methods used. Moreover, information on participants (who) and setting (under which circumstances) was collected. The final coding scheme can be found in the Appendix (Tables A1–A3). A sample of 21.4% of the studies was used to test interrater reliability. The coding scheme was introduced to a second coder who coded n = 6 of the included publications using MAXQDA. Then, we used MAXQDA to check intercoder agreement by examining the occurrence or absence of the codes in the documents, such that if both coders had assigned the same code to one document, it was counted as an agreement. The percentage of agreement was approximately 90%. For the percentage of agreement corrected by chance, MAXQDA calculates a Kappa value (Rädiker & Kuckartz; for details, see VERBI, n.d.-b), which was >0.85 for each of the documents. Additionally, we reviewed each of the six publications for any discrepancies, which were then deliberated upon. Following these discussions, minor adjustments were made to the coding scheme.
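The document-level agreement check described above can be sketched as follows. This is not MAXQDA’s implementation; it only illustrates the idea of comparing the occurrence or absence of each code per document. The code names and coder assignments are hypothetical, and the sketch assumes that a code left unassigned by both coders also counts as an agreement.

```python
def presence_agreement(codes_a, codes_b, coding_scheme):
    """Share of codes on which two coders agree for one document.

    A code counts as an agreement if both coders assigned it or if
    neither did (the second case is an assumption of this sketch).
    """
    if not coding_scheme:
        raise ValueError("The coding scheme must contain at least one code.")
    agreements = sum(
        (code in codes_a) == (code in codes_b) for code in coding_scheme
    )
    return agreements / len(coding_scheme)


# Hypothetical coding scheme and assignments for a single document.
scheme = ["planning", "coordinating", "monitoring", "participation", "questionnaire"]
coder_a_codes = {"planning", "coordinating", "monitoring"}
coder_b_codes = {"planning", "monitoring", "participation"}

agreement = presence_agreement(coder_a_codes, coder_b_codes, scheme)
```

Here the coders agree on three of five codes (two assigned by both, one assigned by neither), giving 60% agreement for this invented document. Repeating the calculation per document and correcting for chance yields values comparable in kind to those reported above.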
Results
The final set of included studies (N = 28) consisted of n = 26 journal articles, n = 1 conference proceeding, and n = 1 book chapter (see Table A4 for an overview of included studies). Most of the studies (n = 21) were published within the last 10 years, and only a few were published before 2011. Most studies were published in high-impact journals. The majority of studies included students as participants (n = 24). In the remaining publications, participant status was not explicitly specified. Half of the studies did not report participant ages. In the other half, the ages of the participants ranged from 19 to 51 years, whereby most of the studies reported their participants being in their 20s. The reported sample size varied from very small (e.g., analysis of one group with six participants; Arvaja, 2007) to very large (e.g., analysis of communities with over 1,000 participants; Gašević et al., 2019). Overall, sample sizes were mostly smaller than 60, with many studies having a sample size of fewer than 20 participants (Range N = 5–1,214 participants). Almost all studies investigated collaborative processes either in small groups consisting of three to six members or in dyads. Two of the included studies measured collaborative processes in learning communities, explaining the broad range of sample sizes.
Besides information on participants, we considered information about context and setting of the studies. In order for collaboration to occur and be measured, it is essential that there is an environment that enables participants to exhibit pertinent behaviors. Therefore, we categorized how the interaction between learners took place, what kind of problem-solving tasks were present, and whether they had to be solved in a short or long period of time. Furthermore, we investigated what kind of tools were used (for an overview, see Appendix A1). The included studies investigated collaborative problem-solving activities in face-to-face, online, or hybrid settings, whereby online settings were the most frequent (n = 15). Almost all of the studies (n = 24) explicitly stated that the collaboration was supported by technology, such as by using computers (e.g., Wiltshire et al., 2018) or tabletops (Martinez-Maldonado et al., 2015). In the majority of the included studies (n = 21), the groups worked on ill-defined tasks characterized by ambiguity and an unknown solution (cf. Care et al., 2015), as compared to well-defined tasks (n = 7) with clear steps and a clear solution. The tasks to be solved were either planned for a fixed session (e.g., 30 min, n = 16) or for an extended period (e.g., several days or weeks, n = 11). In one study, this information was missing. Tasks tackled over an extended period predominantly consisted of ill-defined tasks (e.g., Gašević et al., 2019; Khosa & Volet, 2014). Based on Jeong et al. (2019) and Lämsä et al. (2021), we differentiated types of technologies used in the included studies, namely communication tools (e.g., chats), sharing and co-construction tools (e.g., Wikis), and dynamic tools (e.g., programming tools). Not surprisingly, communication tools were only necessary in hybrid or online settings. 
Whereas sharing and co-construction tools appear to be commonly employed for tasks characterized as ill-defined or those demanding an extended duration of work (e.g., Cullen et al., 2013; Kent & Cukurova, 2020), dynamic technologies were primarily used for well-defined tasks (e.g., Andrews-Todd & Forsyth, 2020; Sun et al., 2020).
What to Measure? Conceptualization of Peer Collaboration
The inductive-deductive analysis approach led to an iterative formation of four broad categories that reflect the core facets of collaborative processes, including (1) managing task and progress, (2) constructing shared knowledge and solutions, (3) maintaining a functional team, and (4) individual and joint participation. These core facets correspond to the well-established categories and processes of learning and working together; namely (1) metacognitive (regulative), (2) cognitive, (3) affective, and (4) behavioral (e.g., Järvelä & Hadwin, 2013; Kozlowski & Ilgen, 2006). We examined the operationalization of collaboration in the studies included in our analysis to classify them by identifying differences and commonalities. Although the studies used different terms and classifications, commonalities with regard to the descriptions of collaborative acts and their meanings could be synthesized across the included studies (see also Finfgeld-Connett, 2014; Rousseau et al., 2006). We clustered and subsumed similar indicators of collaboration to create an integrative framework reflecting the metacognitive (regulative), cognitive, affective, and behavioral aspects, while further including differentiated sub-facets and collaboration-specific labels.
During analysis, it became clear that in most of the studies the frameworks presented were based on previous research and were therefore applied deductively (e.g., Sun et al., 2020; Wiltshire et al., 2018), while in some cases they were formed inductively from the study material (e.g., Cullen et al., 2013). Notably, two well-established frameworks were often referred to, namely the rating scheme by Meier et al. (2007, e.g., in Schneider et al., 2020) and the PISA framework (e.g., in Hao et al., 2017). Whereas the rating scheme by Meier et al. (2007) focuses on collaborative learning (CSCL), the PISA framework (e.g., OECD, 2017) focuses on collaborative problem solving (skills). Additionally, some studies (e.g., Seeber et al., 2013) referred to taxonomies related to collaboration in work contexts (Fiore et al., 2010). With the aim of providing an overview of best practice approaches, the current framework combines different foci and extends prior taxonomies by including behavioral indicators across multiple contexts (e.g., different kinds of computer-support, tasks, group sizes). We note that our systematic review and analysis of various approaches resulted in different labels, classifications, and extensions. With regard to the labels, we decided to use predominantly active formulations that describe the socio-cognitive, socio-affective, and regulative processes and behaviors through which learners collaborate, which reflect what they are able to do (cf. Anderson et al., 2001). In the following, we elucidate the integrative framework, which encompasses four central facets, each composed of corresponding sub-facets. We will provide examples from the included articles to illustrate these aspects (for an overview, see Appendix A2).
The facet constructing shared knowledge and solutions was found in n = 25 studies, managing task and progress as well as individual and joint participation were present in n = 19 studies, and maintaining a functional team in n = 14 studies.
Managing Task and Progress
This facet refers to actions and communications used to develop a plan and distribute roles. Furthermore, it includes observing and reflecting on the collaborative progress in order to make adjustments as well as making use of tools and resources to solve the task.
Planning Activities
Planning activities encompasses a group’s involvement in gaining a comprehensive understanding of the task or issue at hand. It includes task clarification (Schoor & Bannert, 2012), managing the scope and content (Cullen et al., 2013; Shah & Leeder, 2016), determining goals and sub-goals (Andrews-Todd & Forsyth, 2020; Wiltshire et al., 2018), and developing a strategy of how the task will be approached (Shah & Leeder, 2016). This might be especially important in the beginning but remains relevant throughout the subsequent stages of collaboration, such as when planning the next steps to take (Hao et al., 2017) or deciding when to step forward (Seeber et al., 2013). Khosa and Volet (2014) further distinguish between high and low levels of planning, whereby a high level is not only proposing an approach on how to proceed but also involves conceptual justification for doing so.
Coordinating
Coordination encompasses a group’s action in dividing the work and arranging the task at hand. It includes task division (Meier et al., 2007; Schneider et al., 2020) in order to allocate subtasks, roles, and duties (Cullen et al., 2013; Schoor & Bannert, 2012). Moreover, the sub-facet also covers behaviors like accepting or refusing role distribution (Avry et al., 2020).
Monitoring and Reflecting
This sub-facet encompasses behaviors aimed at monitoring and assessing progress toward the goal, as well as reflecting on and adapting activities accordingly (e.g., Hmelo-Silver et al., 2008). This includes checking “what has been done, and what is still to be done” (Avry et al., 2020, p. 6) and therefore reflecting on what is required to solve the task (Khosa & Volet, 2014). To do so, the investigated groups in the included studies often summarized their current state (Hmelo-Silver et al., 2008; Vuopala et al., 2019). In addition, this sub-facet includes monitoring the remaining time for solving the task (Meier et al., 2007; Schneider et al., 2020). Some of the studies differentiated between individual and group monitoring (Hmelo-Silver et al., 2008; Schoor & Bannert, 2012).
Handling Tools and Resources
Handling tools and resources comprises behaviors related to technical skills as well as making use of resources to solve the problem, including how groups managed collaborative tool usage (Avry et al., 2020), whether groups used tools beneficially (Meier et al., 2007), and if they made use of helpful resources like course material (Arvaja, 2007).
Constructing Shared Knowledge and Solutions
Constructing shared knowledge and solutions refers to actions and communications used to pool all relevant information to solve the problem at hand and to ensure that all group members have a mutual understanding. Furthermore, it encompasses elaborating each other’s ideas as well as challenging them through discussion.
Gathering and Sharing Information
This sub-facet entails behaviors aimed at acquiring all pertinent information necessary to address the task. It comprises sharing one’s own information and insights (e.g., Cullen et al., 2013; Hao et al., 2017; Soller, 2001) as well as requesting relevant information from others (e.g., Arvaja, 2007; Wiltshire et al., 2018; Zhao et al., 2014).
Creating a Common Ground
Creating a common ground refers to communicative actions that ensure “what has been said is understood” (Andrews-Todd & Forsyth, 2020, p. 6) which reflects a shared understanding. In line with Cukurova et al. (2018), the basis for having a common ground is the ability to understand the cognitions, behaviors, and attitudes of others. For example, in many of the included studies, this entails asking clarifying questions (e.g., Chanel et al., 2013; Hmelo-Silver et al., 2008) to verify the adequate understanding of others’ ideas (Andrews-Todd & Forsyth, 2020; Sun et al., 2020). This process is supported by explaining ideas in one’s own words (Volet et al., 2009) and using examples (Arvaja, 2007).
Elaborating and Negotiating to Reach Consensus
Elaborating and negotiating to reach consensus reflects behaviors aimed at finding a solution for the problem-solving task. Hence, it includes developing and proposing solutions (e.g., Seeber et al., 2013; Sun et al., 2020) as well as elaborating and negotiating. Elaboration involves the expansion of ideas and behaviors that signify co-construction, such as further developing previously offered information (Arvaja, 2007), deepening and broadening ideas (Chanel et al., 2013), and engaging with knowledge objects produced by others (Martinez-Maldonado et al., 2015). Negotiating becomes evident during an episode of incompatibility of ideas and opinions (Oliveira & Sadler, 2008). It is reflected by communications expressing agreement or disagreement with ideas (e.g., Andrews-Todd & Forsyth, 2020; Hao et al., 2017) and by arguing about them (e.g., Chanel et al., 2013; Soller, 2001), which often involves using justifications derived from one’s own conceptualizations or grounded beliefs (Arvaja, 2007; Hmelo-Silver et al., 2008). Elaborating and negotiating typically result in reaching consensus and making decisions (Khosa & Volet, 2014; Meier et al., 2007; Shah & Leeder, 2016).
Maintaining a Functional Team
Maintaining a functional team involves actions and communications reflecting a positive and respectful atmosphere during collaborative problem solving. Moreover, this facet encompasses behaviors related to managing motivation and emotion.
Establishing a Positive Atmosphere and Cohesion
In a positive atmosphere, “collaborative behaviour can flourish” (Cullen et al., 2013, p. 428). A positive atmosphere is established through socially appropriate language like greeting or apologizing for interruptions (e.g., Hao et al., 2017), listening actively (e.g., Isohätälä et al., 2018), showing awareness of the other group members (Cullen et al., 2013), acknowledgment (e.g., Oliveira & Sadler, 2008), expressing appreciation (e.g., Soller, 2001), and humor (e.g., Avry et al., 2020). In some of the included studies, different levels of acknowledgment were differentiated (Oliveira & Sadler, 2008), or reverse-coded indicators were proposed (“Makes fun of, criticizes, or is rude to others,” Sun et al., 2020, p. 4). Beyond that, this sub-facet includes showing and regulating motivation (Schoor & Bannert, 2012), which is related to group cohesion (Kent & Cukurova, 2020). In Zhao et al. (2014), cohesion was identified through the use of inclusive pronouns, for example.
Managing Emotions
This sub-facet reflects communications and behaviors related to one’s own emotions as well as to the emotions of others. Emotion management comprises sharing one’s own emotions, including positive ones like enjoyment or satisfaction and negative ones like frustration (Hao et al., 2017; Shah & Leeder, 2016). It also encompasses the ability to perceive emotions in others and engage in communication about them (e.g., Chanel et al., 2013). In written communications like forum discussions, the use of emoticons can serve as an indicator for expressing emotions (Zhao et al., 2014).
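To illustrate how such text-based indicators might be operationalized for written communication, the following minimal sketch counts inclusive pronouns and emoticons in a chat message. The word list and the emoticon pattern are illustrative assumptions of ours and do not reproduce the coding rules of Zhao et al. (2014) or any other included study.

```python
import re

# Assumed, deliberately small word list; a real coding scheme would define its own.
INCLUSIVE_PRONOUNS = {"we", "us", "our", "ours", "ourselves"}

# Assumed pattern covering a few common Western-style emoticons such as :) ;-) :D
EMOTICON_PATTERN = re.compile(r"[:;=]-?[)(DPp]")


def text_indicators(message):
    """Count inclusive pronouns and emoticons in one written message."""
    # Tokenize into lowercase words so that "We" and "we" are treated alike.
    words = re.findall(r"[a-z']+", message.lower())
    return {
        "inclusive_pronouns": sum(word in INCLUSIVE_PRONOUNS for word in words),
        "emoticons": len(EMOTICON_PATTERN.findall(message)),
    }


# Hypothetical forum post
indicators = text_indicators("We should merge our notes :) great job ;-)")
```

Counts of this kind only flag candidate segments; whether a given pronoun or emoticon actually signals cohesion or emotion sharing would still need to be judged in context.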
Individual and Joint Participation
This facet reflects participation on the individual level (individual engagement) as well as the group level (equality) during the collaborative process. In collaborative problem solving, each group member should engage actively (Isohätälä et al., 2018; Meier et al., 2007), “making sure that they undertake their share of the work and feel personally responsible for the group’s success while others are also undertaking their share in completing the task” (Cukurova et al., 2018, p. 96). This, in turn, leads to joint participation, meaning the whole group is taking part in on-task behaviors (Isohätälä et al., 2018). It pertains to the presence of verbal contributions from multiple group members, contrasting with a predominant single speaker approach (Summers & Volet, 2010), or the absence of active involvement from a single learner (Martinez-Maldonado et al., 2015). Ideally, every group member should contribute equally, which is manifested in the symmetry of their engagement (Martinez-Maldonado et al., 2015; Meier et al., 2007).
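As a rough illustration of how the symmetry of engagement could be quantified, the sketch below counts contributions per group member and summarizes their distribution with a Gini coefficient. The Gini coefficient is our illustrative choice and not a measure reported in the included studies; the member labels and turn data are hypothetical.

```python
from collections import Counter


def contribution_counts(speakers, members):
    """Count contributions per member, keeping members who never contributed."""
    counts = Counter(speakers)
    return {member: counts.get(member, 0) for member in members}


def gini(values):
    """Gini coefficient of non-negative counts.

    0 means all members contributed equally; values close to 1 mean
    that a single member dominates.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard closed-form expression on ascending-sorted values.
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n


# Hypothetical sequence of coded speaking turns in a four-person group
turns = ["A", "B", "A", "C", "A", "B"]
counts = contribution_counts(turns, ["A", "B", "C", "D"])
inequality = gini(counts.values())
```

Passing the full member list matters here: a learner without any active involvement would otherwise disappear from the counts and make the group look more balanced than it was.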
Further Insights Into the Collaborative Process
Some information on what was measured could not be assigned to the previous facets and was present in only a few studies. For example, Schoor and Bannert (2012) used the category appraisal of partner’s cognition, as demonstrated by attempting to discern a group member’s thought process. Shah and Leeder (2016) addressed personal matters that could impact group work (e.g., p. 628, “A group member expresses having difficulty or fails to contribute because they are balancing other responsibilities or schedule conflicts”).
How to Measure? Measurement Approaches
With regard to the measurement approaches, we were interested in both how collaboration data were collected and how they were analyzed. We also checked what kind of additional data were collected in order to triangulate the results. For an overview of the whole category system created to answer RQ 2 (How to measure), see Table A3 in the Appendix. In the following, we sum up the most salient results. Afterwards, we synthesize the findings of RQ 2 with those of RQ 1 (What to measure).
Data Collection
With the exception of self-report data (diary study, questionnaire) in the studies by Shah and Leeder (2016) and Chanel et al. (2013), all measures of collaboration were based on observational data (visual, verbal, and log data). Eleven of the included studies captured visual data, n = 25 captured verbal data, of which n = 15 were spoken (e.g., communication through microphone headsets in Avry et al., 2020) and n = 10 were written (e.g., chat conversations in Dascalu et al., 2015). We assumed that all studies that captured visual data (n = 11) captured verbal information as well because they used video data, which typically include sound and images (e.g., video recordings in Isohätälä et al., 2018). We note that not all of them made use of verbal data (e.g., Cukurova et al., 2018 focused on nonverbal indexes of students’ physical interactivity). Twelve of the publications captured digital traces in the form of log data, including information on who did what at which time (e.g., Seeber et al., 2013). In addition to the main data collected to measure collaboration, n = 21 studies collected additional or relational data in order to triangulate the results, including ratings based on observation by others (e.g., Martinez-Maldonado et al., 2015), data on prior knowledge and experiences (e.g., electronics knowledge in Andrews-Todd & Forsyth, 2020), as well as outcome data (e.g., task performance in Wiltshire et al., 2018). Triangulation data were included primarily when log data were collected.
Data Analysis
Since most studies employed observational data, it is not surprising that the data have been analyzed primarily by content coding (n = 21) and rating procedures (n = 2). While most coding schemes were based on an extensive literature review (e.g., Andrews-Todd & Forsyth, 2020) or adopted from prior research (e.g., Seeber et al., 2013), Cullen et al. (2013) used an inductive approach. Whereas the coding schemes reflect which collaborative behaviors are present, allowing for counting frequencies (Hmelo-Silver et al., 2008; Isohätälä et al., 2018), rating schemes like the one by Meier et al. (2007) include ratings in the form of numbers, such as “−2 (very bad) to +2 (very good)” (Meier et al., 2007, p. 73) to reflect the depth/quality of collaborative behaviors.
Several studies (n = 19) conducted frequency analyses to display and compare the occurrence of codes. For example, Andrews-Todd and Forsyth (2020) rank-ordered the frequencies in order to create skill profiles. Other data analysis methods include network and discourse analysis to depict relations between actors (social network analysis in Kent & Cukurova, 2020) or elements (cohesion network analysis in Dascalu et al., 2015) as well as data mining analyses (e.g., machine learning in Viswanathan & VanLehn, 2018). Whereas network analyses were primarily conducted in online settings, discourse analysis was applied on data from face-to-face settings (Isohätälä et al., 2018). Besides visualizations (e.g., control-flow view in Seeber et al., 2013) and exploratory analyses (e.g., Andrews-Todd & Forsyth, 2020), some studies conducted further analysis in order to triangulate and to detect relations. For example, Schneider et al. (2020) detected correlations between measures of physiological synchrony and human ratings on collaborations, and Avry et al. (2020) investigated relationships between collaborative acts and emotion sharing. Considering the level of investigation, almost all included studies comprised group-level information, such as how many collaborative acts were coded for each group (Avry et al., 2020). Thirteen studies also included individual-level information (e.g., displaying the physical activity for each learner in Martinez-Maldonado et al., 2015). One study (Sun et al., 2020) focused on the individual level. Several studies took a closer look at specific groups, such as by extracting high-performing and low-performing groups to compare them (e.g., Schneider et al., 2020) or by conducting a micro-level case analysis for a group that showed favorable processes (Isohätälä et al., 2018).
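A frequency analysis of this kind can be sketched in a few lines. The example below counts how often each code was assigned to each student and rank-orders the codes per student. It is a generic illustration under our own assumptions (hypothetical student identifiers and codes, alphabetical tie-breaking) and does not reproduce the profile-building procedure of Andrews-Todd and Forsyth (2020).

```python
from collections import Counter, defaultdict


def code_frequencies(coded_segments):
    """Count how often each code was assigned to each student.

    coded_segments: iterable of (student, code) pairs produced by human coding.
    """
    freq = defaultdict(Counter)
    for student, code in coded_segments:
        freq[student][code] += 1
    return freq


def rank_profile(code_counts):
    """Rank-order codes from most to least frequent (ties broken alphabetically)."""
    ordered = sorted(code_counts.items(), key=lambda item: (-item[1], item[0]))
    return [code for code, _ in ordered]


# Hypothetical coded segments
segments = [
    ("s1", "sharing information"),
    ("s1", "planning"),
    ("s1", "sharing information"),
    ("s2", "planning"),
]
freq = code_frequencies(segments)
profile_s1 = rank_profile(freq["s1"])
```

Summing the per-student counters within a group would give the corresponding group-level frequencies.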
Synthesis: What, How, and Under Which Circumstances?
Based on the findings, previous frameworks (e.g., Meier et al., 2007; OECD, 2017) can be integrated into one. Given the review’s focus and our consideration of the studies’ context and settings, the definition of collaboration can be customized to align with the higher education context. Integrating previous definitions of collaborative learning, working, and problem solving (e.g., Bedwell et al., 2012; Patel et al., 2012; Roschelle & Teasley, 1995), we define collaboration in higher education as an evolving process whereby two or more students interact with each other as well as with tools and resources, within a single episode or series of episodes, working on a fixed task toward a shared goal. This process of co-elaboration is characterized by mutuality and equality of the students engaging in socio-cognitive (e.g., coordinating, creating a common ground) as well as socio-affective processes (e.g., establishing a positive atmosphere and cohesion). Ideally, the student group does not only find a joint solution for solving a problem, but each student profits through new knowledge and improved collaborative skills.
We used MAXQDA (VERBI, n.d.-a) to explore patterns and relations between RQ 1 (what) and RQ 2 (how). We also considered contexts and settings (e.g., task type, time). Notably, socio-affective aspects like managing emotions were found mainly in studies investigating collaboration during continuous work (e.g., Cullen et al., 2013; Isohätälä et al., 2018). Studying collaboration in continuous work is also a theme in the studies that investigated managing task and progress (e.g., planning, coordinating, handling tools) as well as gathering and sharing information. When investigating a series of episodes, written data as well as sharing and co-construction tools seem to be of particular importance, whereas visual data and dynamic tools were more common in studies investigating collaborative processes in fixed sessions. In both cases, log data can support data analysis. Remarkably, technologically advanced studies, such as ones using network and discourse analyses (e.g., Dascalu et al., 2015; Viswanathan & VanLehn, 2018), focused on specific facets of collaboration like individual and joint participation or constructing shared knowledge and solutions. Fewer (sub-)facets of collaboration were identified and coded in the studies that used technologically driven analyses with larger sample sizes (e.g., Gašević et al., 2019; Wiltshire et al., 2018). Furthermore, studies relying on log data and technologically driven analyses tend to require triangulation data and analysis (e.g., in Martinez-Maldonado et al., 2015, p. 73, coding and rating procedures were used to “establish a ground truth assessment of collaboration” as a complement to automatic assessments based on digital footprints). Studies in which all four core facets were found used coding and rating procedures conducted by humans (e.g., Andrews-Todd & Forsyth, 2020; Meier et al., 2007; Zhao et al., 2014).
Using further (relation) analyses helped to detect relations, such as showing that measures of physiological synchrony (e.g., directional agreement) correlate with behaviors of constructing shared knowledge and solutions (Schneider et al., 2020). Additionally, the study by Avry et al. (2020) revealed that emotion sharing is related to other collaborative acts (e.g., being focused while managing the task).
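For readers less familiar with such relation analyses, the following sketch computes a Pearson correlation between one automatically derived measure per group and the corresponding human rating. It is a generic illustration; the variable names and values are hypothetical and are not data or procedures from Schneider et al. (2020) or Avry et al. (2020).

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equally long sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return covariance / (spread_x * spread_y)


# Hypothetical values: one synchrony score and one human rating per group
synchrony = [0.2, 0.4, 0.5, 0.7, 0.9]
rating = [-1, 0, 0, 1, 2]
r = pearson_r(synchrony, rating)
```

With the small group counts typical of process studies, a coefficient like this is best read as exploratory and reported alongside the number of groups.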
Taken together, Figure 2 serves as a compact overview of the derived integrative framework including the proposed facets of collaboration as well as potential measurement. Being a latent variable, collaboration and its facets are not directly observable (see Figure 2, illustrated by dashed lines) but must be made measurable by observable indicators (solid lines). Based on the reviewed literature, we came up with examples for potential data sources for observing them. The figure is inspired by Wise et al. (2021) discussing the main challenge of mapping (digital) traces to latent constructs in collaborative learning analytics. Since our included studies stemmed from various communities (such as CSCL and learning analytics) the integrative framework combines their respective strengths by being theoretically robust and presenting methodologically strong tools and techniques (cf. Wise et al., 2021). Given the selective and purposeful approach of our conducted review, this framework represents best practice operationalizations for measuring collaboration.

Figure 2. Overview of the proposed integrative framework for measuring collaboration (inspired by Wise et al., 2021).
Additionally, detailed suggestions can be found in Table 2 depicting a roadmap (including what to measure, how to measure, and under which circumstances) for researchers and practitioners interested in investigating collaborative processes. It includes recommendations on suitable tasks, tools, and techniques for each of the sub-facets of collaboration in higher education. This way, researchers and practitioners can design learning environments or study designs that fit the dimensions (metacognitive, cognitive, affective, behavioral) of collaboration in which they are interested. Conversely, the table also shows which dimensions of collaboration can be appropriately addressed and measured when the context is predefined and cannot be modified. For example, if interested in affective dimensions of collaboration, choosing continuous work as well as ill-defined tasks would be appropriate choices. First, having several work episodes allows monitoring the development of groups’ atmosphere and tone over time. Second, ill-defined tasks are characterized by ambiguity, creating a challenge, which might lead to emotional arousal. This way, researchers and practitioners could not only observe how groups’ atmosphere is established over time but also how emotions are handled. A peculiarity of the affective facets (e.g., managing emotions) is that they are conceptualized to be bipolar: The importance lies not only in the prevalence and intensity of the behavior but also in verifying the presence of its opposite (e.g., counterproductive behavior). When collaboration takes place where computer-supported and technological resources are available, like tools able to track log data, Table 2 shows that these circumstances might support a fruitful evaluation of (meta)cognitive dimensions.
For example, log data can be used to learn about the distribution of information within the group (gathering and sharing information) as well as how the group interacts with and elaborates given materials (elaborating and negotiating to reach consensus). Moreover, having a computer-supported setting is especially suited to the investigation of groups’ handling of tools and resources.
Table 2. Roadmap for Researchers and Practitioners Interested in Collaborative Processes.
Discussion
The results of this in-depth analysis of 28 systematically and carefully selected publications depict a detailed picture of both the conceptualization (RQ 1) and the measurement approaches (RQ 2) of collaboration. With regard to RQ 1, we iteratively formed four broad categories derived from the included publications that reflect facets of collaboration: (1) Managing task and progress, (2) Constructing shared knowledge and solutions, (3) Maintaining a functional team, (4) Individual and joint participation. Moreover, sub-facets, explanations, and examples for each of the facets were presented. Being a synthesis of best practice approaches, the derived facets and sub-facets of collaboration are not new. This research rather confirms a common ground on relevant dimensions of collaboration and bundles several taxonomies into one. It is therefore not surprising that this framework shows overlaps with well-established frameworks such as that of Meier et al. (2007). However, our systematic review and analysis led to different labels, classifications, and extensions. Additionally, given the rapid development in technology leading to a multitude of new learning opportunities (Yeung et al., 2021), this study adds descriptions and behavioral markers that previous frameworks did not include. For example, considering that the dimensions in Meier et al. (2007) focus on collaboration via a desktop conferencing system, the current study additionally offers examples for written collaboration (e.g., use of emoticons, Zhao et al., 2014) as well as collaboration using dynamic tools (e.g., interacting with other students’ objects at the tabletop, Martinez-Maldonado et al., 2015). Thus, this review includes useful examples of indicators pertinent to the field of learning analytics and educational data mining.
Compared to other frameworks, the goal of this review was not to provide an analysis of a specific form of interaction (like debates as in the rainbow framework by Baker et al., 2007) but to serve as a general framework for analyzing various collaborative activities. Although collaboration has been studied extensively, there is a lack of consensus concerning its conceptualization, which hampers comparability and consistency among studies (cf. Introduction). With respect to RQ 1, this article provides an integrative framework combining insights of collaborative learning, working, and problem solving by giving detailed descriptions of collaborative processes. The results regarding RQ 2 suggest that, for measuring collaborative processes, observational data, specifically verbal, visual, and log data, would be particularly suitable. Notably, most studies investigated collaboration using ill-defined tasks and during fixed learning sessions. In contrast to the findings by Gress et al. (2010), the majority of measures were not self-reports but observations. This could be due to the fact that our review focused on collaborative processes, whereas the review by Gress et al. (2010) mostly included studies focused on collaboration outcomes. The measurements conducted after collaborative activities often consisted of questionnaires (Gress et al., 2010). Moreover, it should be noted that we differentiated between the main data collected and data collected in order to triangulate the results. The included studies in our review encompassed qualitative, quantitative, and mixed-methods approaches. Our findings on RQ 1 and RQ 2 were synthesized, resulting in an integrative framework for measuring collaboration. It comprises best practice operationalizations allowing for rigorous mapping of (digital) traces and dimensions of collaboration. After discussing the strengths and limitations of our study, we will draw further implications for research and practice in the following.
Strengths and Limitations
This systematic literature review extends previous reviews by investigating and integrating two important domains of collaboration: the conceptualization and measurement domain. Therefore, it sheds light on the two fundamental questions for researchers who are interested in collaborative processes (Meier et al., 2007). A notable strength of the present study is that the synthesis of several best practice approaches in measuring collaboration resulted in an integrative framework displaying descriptions of collaborative behaviors. To the best of our knowledge, this framework of collaborative indicators is the first that is based on a systematic literature review. It gives a holistic overview with detailed descriptions of sub-facets of collaboration, while focusing on a specific target group, namely adult peer learners. Additionally, it enriches prior work by selecting and depicting best practice approaches in measuring collaboration while linking the conceptualization and measurement of collaboration. Thus, it offers helpful recommendations for researchers and practitioners interested in collaborative processes on what to measure, how to measure, and under which circumstances (cf. Table 2). With regard to the methodology, we considered the guidelines by Siddaway et al. (2019) as well as Gough et al. (2017). Particularly, it is noteworthy that interrater reliability was assessed at each step of the screening process including subsequent refinements.
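The statistic used to assess interrater reliability is not specified in this section. Purely as an illustration of how agreement between two raters on include/exclude screening decisions can be quantified, the sketch below computes Cohen's kappa, one common choice for two raters; the decisions shown are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who judged the same items.

    Kappa corrects the observed agreement for the agreement expected
    by chance given each rater's marginal category frequencies.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must judge the same non-empty set of items")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1:
        # Both raters used one identical category throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical screening decisions for four abstracts
rater_1 = ["include", "include", "exclude", "exclude"]
rater_2 = ["include", "exclude", "exclude", "exclude"]
kappa = cohens_kappa(rater_1, rater_2)
```

In this toy example the raters agree on three of four abstracts, but half of that agreement would be expected by chance, which is why kappa is noticeably lower than the raw agreement rate.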
Besides these strengths, there are also some noteworthy limitations of the study. First, the number of included studies (N = 28) is relatively low. However, this is due to the narrow inclusion criteria leading to the exclusion of several studies, such as those focusing on collaboration in primary or secondary education contexts (e.g., Harding et al., 2017; Yuan et al., 2019). Moreover, several studies presented interesting and innovative technological approaches for automating collaboration analysis but were excluded due to their vague use of theory or lacking construct clarification (cf. Table 1). A considerable number of these studies are centered around technology, aligning with the observations of Hew et al. (2019), who identified a lack of explicit theoretical engagement in the majority of their reviewed articles concerning educational technology. Baker et al. (2021) also summarize that algorithms developed to identify recurring patterns are rarely linked to theories of cognition and dialog. In our review, many of the included articles stem from CSCL research, which is characterized by its theoretical robustness (Wise et al., 2021). Thus, we argue that our narrow inclusion criteria led to a careful and specific study selection reflecting relevance and rigor. Giving an overview of selected best practice approaches will be helpful for researchers and practitioners in the field. Next, it has to be noted that, although our aim was to focus on collaboration in higher education and beyond, most of the included studies focused on collaboration in university contexts. This is probably due to the search strategy. As we tried to come up with a manageable number of search results, we removed closely related terms of collaboration like cooperation and teamwork, which are often used in workplace settings (e.g., Fransen et al., 2013; Salas et al., 2005).
Thus, on the one hand, there is a possibility that further relevant studies might not have been taken into account. On the other hand, as highlighted in the introduction, not all group work interactions are collaborative and it is important to differentiate collaboration from related constructs. In addition, some of the studies focusing on employees were excluded because of the chosen inclusion/exclusion criteria (e.g., due to the significant involvement of a facilitator, as in Schaefer et al., 2019). Nevertheless, our framework shows similarities to well-established frameworks of team processes in work settings (Marks et al., 2001; Rousseau et al., 2006). Importantly, students coming together to work on an assignment or project are hardly comparable to well-functioning teams, which often have a team-leader and strive toward goals aligned with larger organizational visions (Marks et al., 2001; Oakley et al., 2004). This is why our framework considers the aspects of equality and mutuality (e.g., through the facet individual and joint participation), which seem to be of particular importance in education settings, where uneven workload is widespread and is one of the predominant reasons for negative perceptions of group assignments (Tucker & Abbasi, 2016; Wilson et al., 2018). Consequently, our framework is context-specific (i.e., higher education) and particularly suited for researchers and practitioners interested in collaborative processes of learning/student teams.
Implications for Research and Practice
The integrative framework developed (see Figure 2) provides a solid basis but also has the potential to grow or be adapted through further research. As some of the categories show overlaps (e.g., using humor can be classified as establishing a positive atmosphere and cohesion or managing emotions), it is important to formulate distinctive descriptions including helpful examples. Due to the combination of both conceptualization and measurement, this review is unique and innovative. The roadmap depicted in Table 2 will enrich future research on collaboration by giving recommendations on how each facet can be measured. Furthermore, helpful suggestions on context and setting are given. Our findings suggest that technology-driven approaches (e.g., data mining techniques) are not advanced enough at present to automatically and reliably measure the construct of collaboration in its entirety. Nevertheless, their advantages can be exploited by combining different approaches. For example, network analysis could be applied to measure individual and joint participation as well as to examine coordination costs (i.e., managing task and progress; cf. Kent & Cukurova, 2020), discourse analysis could support the measurement of constructing shared knowledge and solutions, whereas sentiment analysis (which was not used in the included studies but seems to be a promising approach for emotion mining, cf. Ollesch et al., 2022) could support the measurement of maintaining a functional team. The detailed and theoretically sound descriptions of facets and sub-facets of collaboration will help researchers in the field of educational data mining and learning analytics to elaborate digital traces (cf. Wise et al., 2021), allowing them to build a bridge “from clicks to constructs” (Buckingham Shum & Crick, 2016, p. 16).
As it is not sufficient to solely focus on quantitative data in the form of interaction or participation rates (e.g., who talks to whom and how often), we suggest adopting a multi-layered approach to data collection that permits the synergy of human strengths and computationally observable collaborative behavior. This would enable the comprehensive measurement of the intricate collaboration construct, both in its entirety and over extended time frames (Baker et al., 2021), while human coding could contribute essential contextual insights to enhance data understanding (M. A. Rosen et al., 2018).
In line with Baker et al. (2013), we call for greater consideration of socio-affective processes during group work. Research suggests that affective support (e.g., appreciation of behaviors, appeals of cheering) and affective tone (e.g., shared feelings of excitement) within teams have the potential to raise motivation (Hüffmeier & Hertel, 2011; Hüffmeier et al., 2014) and performance (Paulsen et al., 2016). Conversely, a negative climate might lead to negative emotions and lower satisfaction among group members (Bakhtiar et al., 2018). These findings also underline the importance of considering the bipolar conceptualization of the affective dimensions of collaboration (cf. Table 2). Furthermore, the study by Ollesch et al. (2020) investigating group awareness attributes showed that students weighted the visualization of friendliness higher than that of participation. This, in turn, raises the question of how group familiarity influences collaboration. As this aspect was not covered in our study, it should be considered in future research. Beyond that, further potential influences on collaborative processes, such as collective task value (S. L. Wang & Hong, 2018) or learner readiness (Xiong et al., 2015), could also be investigated. Since our findings suggest that some sub-facets are of particular importance in certain group phases (e.g., planning activities especially in the beginning), future research is further encouraged to investigate the occurrence of collaborative behaviors over time.
Our framework is context-specific (i.e., higher education) and particularly suited for researchers and practitioners interested in collaborative processes of learning/student teams. How can our findings assist higher education professionals (e.g., professors, teaching assistants) in fostering collaboration among students? Since students often do not feel well prepared to work collaboratively (Wilson et al., 2018), higher education professionals are encouraged to integrate the detailed descriptions and context-related example indicators into their teaching to help students understand what collaboration means. For students, a deeper insight into team dynamics and into the individual contributions that drive a team's achievements is important not only because of the heightened demand for 21st-century skills in the job market, but also in light of the challenges and transformations in student teamwork brought about by the COVID-19 pandemic, as outlined by Wildman et al. (2021). A well-differentiated rating scheme that allows a clear classification and interpretation of collaborative behavior would help students to self-assess and reflect upon their collaboration. We believe that Behaviorally Anchored Rating Scales (BARS) might be a promising approach in this regard, as they provide numerous advantages and focus on the quality instead of the frequency of behavior (for an overview, see Georganta & Brodbeck, 2020). Because they include behavioral anchors representing different quality levels, BARS have been found beneficial in similar areas of application (e.g., self- and peer-evaluation of team member effectiveness; Ohland et al., 2012). Considering that students at higher educational levels often receive less support and are expected to self-regulate their (collaborative) learning to a greater degree, integrating self- and peer-evaluation as well as reflections based on BARS could hold promise in fostering their collaboration.
Furthermore, we aim to heighten the awareness of higher education professionals regarding the selection of suitable tasks, tools, and time constraints during a collaborative process. Additionally, we seek to sensitize them to the value of providing feedback on students’ collaboration, wherever feasible. Multiple sources can be used to observe student collaboration; however, our findings underline the limitations of solely relying on quantitative data and the importance of using multi-layered measurement approaches.
Considering the trend toward self-organized, agile teams characterized by minimal or absent hierarchical structures, alongside the growing organizational emphasis on cultivating a culture of lifelong learning, our framework could prove valuable within the organizational context as well. Researchers and practitioners are encouraged to explore the suitability of these indicators for collaboration within professional working environments. Here, it is important to note that practitioners often need simpler and less obtrusive measures than researchers (Salas et al., 2017) and that implementation often depends on corporate policy and strategy. Considering the drive for higher education institutions to cultivate the socio-relational skills of the forthcoming workforce, the framework could hold particular significance for practitioners within human resources departments. It could potentially be integrated into processes such as personnel selection. Additionally, it could be used as a basis for feedback and team training, as well as for implementing joint reflections in the field of personnel development.
Conclusion
This systematic literature review synthesized research focusing on best practices in measuring collaborative processes in higher education. An integrative framework containing diverse indicators of collaboration was derived from the included articles. The insights of our study aim to enrich multidisciplinary research and therefore have significance for several communities, including CSCL, learning analytics, and educational data mining. Moreover, the findings support researchers and practitioners (e.g., higher education professionals) who want to understand, measure, and foster student collaboration. This review can serve as a guideline when defining what to measure and how to measure collaboration. In addition, it provides information on past research and practices, such as common group sizes and potential tasks.
Supplemental Material
Supplemental material, sj-docx-1-sgr-10.1177_10464964231200191 for Conceptualization and Measurement of Peer Collaboration in Higher Education: A Systematic Review by Verena Schürmann, Nicki Marquardt and Daniel Bodemer in Small Group Research
Appendix
Overview of Included Studies.
| Study number | Authors and date | Source | Short description | Descriptive data |
|---|---|---|---|---|
| 1 | Andrews-Todd and Forsyth (2020) | Computers in Human Behavior | Investigated CPS skills in a digital environment by using a CPS ontology which was based on an extensive literature review of prior frameworks (e.g., Meier et al., 2007). | N = 129 students; small groups (3) |
| 2 | Arvaja (2007) | Computer-Supported Collaborative Learning | Explored similarities and differences in groups’ collaborative knowledge construction activity primarily based on communicative functions and contextual resources. | N = 6 students; small groups (3) |
| 3 | Avry et al. (2020) | Frontiers in Psychology | Investigated CPS using a coding scheme based on prior research (e.g., Baker et al., 2007). Moreover, they collected data on emotion sharing to discover relations between emotion sharing and collaborative acts. | N = 22 participants; dyads |
| 4 | Chanel et al. (2013) | Proceedings of the Affective Computing and Intelligent Interaction (ACII) | Collected self-reports using a questionnaire as well as physiological data on groups’ collaboration in order to investigate if physiological signals can predict collaborative processes. | N = 60 participants; dyads |
| 5 | Cukurova et al. (2018) | Computers & Education | Collected physical interaction data as well as human ratings on groups’ collaboration (based on Meier et al., 2007) in order to investigate if CPS competence levels can be derived from learners’ non-verbal behavior data. | N = 36 students; small groups (3) |
| 6 | Cullen et al. (2013) | English Language Teaching Journal | Examined the processes of online collaborative learning through an (inductive) qualitative analysis of the postings made in a Wiki. | n = 5 studentsᵃ; small groups (4–5) |
| 7 | Dascalu et al. (2015) | Computer-Supported Collaborative Learning | Present an approach to measure collaboration based on dialogical and social knowledge-building models. The results of an automated analysis based on learning analytics were triangulated with human rater judgements on collaboration. | N = 110 studentsᵇ; small groups (4–5) |
| 8 | Gašević et al. (2019) | Computers in Human Behavior | Combined social network analysis and epistemic network analysis to analyze collaborative learning in a massive open online course. | N = 1214 students; learning community |
| 9 | Hao et al. (2017) | Innovative Assessment of Collaboration (Book) | Present the Collaborative Science Assessment Prototype (CSAP). The approach was developed to assess collaborative problem-solving skills in the domain of science. | N = 966 participants; dyads |
| 10 | Hmelo-Silver et al. (2008) | Instructional Sciences | Used Chronologically-Ordered Representations of Discourse and Tool-Related Activity (CORDTRA) diagrams to understand collaborative learning processes in a problem-based class. | n = 11 studentsᶜ; small groups (5–6) |
| 11 | Isohätälä et al. (2018) | Learning, Culture and Social Interaction | Investigated socio-emotional processes during argumentation in collaborative learning interaction and conducted an exemplary micro-level case analysis of one group. | N = 19 students; small groups (3–4) |
| 12 | Kent and Cukurova (2020) | Journal of Learning Analytics | Present the Collaborative Learning as a Process (CLaP) approach which uses social network analysis to investigate collaboration (and especially the balance between interactivity gains and coordination costs) in learning communities. | Two communities of students (N = 42 and N = 32); learning community |
| 13 | Khosa and Volet (2014) | Metacognition Learning | Explored similarities and differences in groups’ collaborative learning primarily based on cognitive and metacognitive regulation processes (based on Volet et al., 2009). | n = 11 studentsᶜ; small groups (5–6) |
| 14 | Martinez-Maldonado et al. (2015) | International Journal of Human-Computer Studies | Present the Tabletop-Supported Collaborative Learning (TSCL) model which can classify group work, based on a Best-First decision tree that considers verbal and physical activity. Results were triangulated with human ratings based on Meier et al. (2007). | N = 60 students; small groups (3) |
| 15 | Meier et al. (2007) | Computer-Supported Collaborative Learning | Developed and evaluated a rating scheme for assessing the quality of computer-supported collaboration processes. | N = 80 studentsᵈ; dyads |
| 16 | Oliveira and Sadler (2008) | Journal of Research in Science Teaching | Examined cognitive and social processes in group interactions that shape collaborative learning in a face-to-face science class. | N = 10 students; small groups (3–4) |
| 17 | Schneider et al. (2020) | Computer-Supported Collaborative Learning | Investigated the relationship between measures of physiological synchrony and collaborative learning whereby collaboration was measured using the rating scheme by Meier et al. (2007). | N = 84 participants; dyads |
| 18 | Schoor and Bannert (2012) | Computers in Human Behavior | Explored collaborative activities focusing on social regulation during collaborative learning and their relationship to group performance. | n = 42 studentsᶜ; dyads |
| 19 | Seeber et al. (2013) | Group Decision and Negotiation | Applied the COllaboration PRocess Analysis technique (CoPrA) for measuring team knowledge building. The underlying framework for analysis was the one by Fiore et al. (2010). | N = 18 students; small groups (3) |
| 20 | Shah and Leeder (2016) | Journal of Information Science | Used a diary study approach to investigate the collaborative work among students based on the C5 model of collaboration by Shah. | N = 54 students; dyads/small groups (3) |
| 21 | Soller (2001) | International Journal of Artificial Intelligence in Education | Presents a model of collaborative learning which describes potential indicators of effective collaborative learning and therefore can aid the analysis of collaborative learning conversation and activity. | N = 15 students; small groups (4–5) |
| 22 | Summers and Volet (2010) | European Journal of Psychology of Education | Examined the engagement in high-level collaborative learning and its relationship with individuals’ cognitions based on the framework by Volet et al. (2009). | N = 53 students; small groups (5–6) |
| 23 | Sun et al. (2020) | Computers & Education | Present a generalized CPS competency model with multiple verbal and non-verbal indicators. The authors used principal component analysis to investigate whether their empirical data aligned with the theorized model. | N = 108 studentsᵉ; small groups (3) |
| 24 | Viswanathan and VanLehn (2018) | IEEE Transactions on Learning Technologies | Used machine learning to induce a detector for classifying collaborations. The results were compared with human codings. | N = 28 students; dyads |
| 25 | Volet et al. (2009) | Learning and Instruction | Investigated collaborative learning with the help of a situative framework combining the constructs of social regulation and content processing. | N = 18 students; small groups (6) |
| 26 | Vuopala et al. (2019) | Learning, Culture and Social Interaction | Investigated how students engaged in knowledge co-construction activities and task-related monitoring in collaborative learning. | N = 19 students; small groups (3–4) |
| 27 | Wiltshire et al. (2018) | Cognitive Science | Applied a sliding window entropy technique to detect problem-solving phase transitions during collaborative processes which were coded based on the framework by Rosen. | N = 86 students; dyads |
| 28 | Zhao et al. (2014) | British Journal of Educational Technology | Examined collaboration within an asynchronous computer conferencing by conducting content analyses of discussion protocols. | N = 18 students; small groups (3) |
ᵃ Twelve groups of four to five students completed the task but the article focuses on the analysis of the interactions posted by one of the groups.
ᵇ N = 110 students took part in the validation experiment which focused on the assessment of 10 chat conversations, selected from a corpus of more than 100 chats in a course. No information on how many students took part in the course was given.
ᶜ The context of the study was a university course but the article focuses on the analysis and comparison of selected groups.
ᵈ The article consists of two studies (development of a rating scheme and its evaluation). We focused on the evaluation study.
ᵉ The article consists of two studies (middle school students and university students). We focused on the study having university students as participants.
Acknowledgements
We would like to thank Kira Wolff, Louise Küry, Simon Heintzen and Ace Lat for their valuable support, especially when discussing study selection and assessing interrater reliability.
Author’s Note
Verena Schürmann is also affiliated with Research Methods in Psychology – Media Based Knowledge Construction, University of Duisburg-Essen, Duisburg, Germany.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Supplemental Material
Supplemental material for this article is available online.
