Abstract
Situated in a creative crowdsourcing challenge organizational community, this research examines how different types of homophily affect network formation among ideators, the challenge innovators. Longitudinal analysis of comment network events among ideators was conducted with relational event models. Status homophily was examined based on ideators’ self-reported public profiles. Findings show that ideators’ location by country significantly influenced relational events, but occupational homophily did not. Value homophily was examined based on the content similarity of ideators’ idea submissions. Results indicate that ideators’ choices of platform content categories, which are designed by the challenge management, significantly influenced ideators’ communication patterns. Specifically, long-term needs were a salient homophily factor among platform categories, whereas short-term needs were heterophilous factors. Finally, a multidimensional homophily measurement using latent class analysis on idea-category-submission patterns predicted the occurrences of communication relational events and achieved an improved model fit compared to modeling all platform categories together.
Introduction
Crowdsourcing challenges are gaining relevance among organizations as open and cost-effective tools for problem-solving and innovation (Greengard, 2023) that bypass traditional internal research and development setups. The crowdsourcing mode of innovation creates a unique communication phenomenon that encourages the wisdom of crowds (Surowiecki, 2005) to emerge through open communication. Recent successful applications of crowdsourcing challenges include the virtual habitat 3D printing challenge supported by NASA (NASA, 2025), which involved an emergent community of human idea generators from outside NASA itself; a competition hosted by XPRIZE to generate practical quantum algorithms (Sundin, 2024); and a global initiative to solve climate change using artificial intelligence and machine learning modeling (International Atomic Energy Agency, 2023). These examples signal the broad adoption of crowdsourcing challenges and their impact on innovation across disciplines.
On crowdsourcing challenge platforms, ideators—the innovative contestants who enter by submitting ideas to solve problems—form temporary communities around shared issues and problems that need addressing. This study focuses on the homophily factors that influence dynamic ideator-driven communication networks in emergent crowdsourcing challenge communities (CCC). In online settings, when people have limited information about one another, homophily becomes a significant social factor driving interactions because similarity is a heuristic people use to cope with uncertainty (Hu et al., 2023).
This study is motivated by a few important yet underexplored questions. First, innovation on CCC is driven by co-opetition (a combination of cooperation and competition; Brandenburger & Nalebuff, 1996). Yet the factors underlying the emergent and temporary communication networks that evolve in relation to the challenge, and that serve as important channels for improving ideation quality, remain underexplored. Ideators often provide comments to help each other improve ideas, but ultimately, they are also competing for contest prizes. When these two potentially paradoxical motivations coexist, it is important to ask which force dominates their communication patterns. Second, homophily has been found to be a significant mechanism for communication network structures (e.g., Bond & Sweitzer, 2022; Huang et al., 2013; Siciliano et al., 2021), but to what extent does it play a role in complex crowd-driven CCC? Driven by the contest competition and also by the structure of the platform, do people seek diverse communication, or do they still seek to connect with similar people? Relatedly, which platform architectures, such as platform-stipulated categories to which ideas may be assigned, guide the development of communication relational events (i.e., a discrete and directed communication action, here a comment, between a pair of ideators at a specific point in time; Butts, 2008)? This research explores which attributes and categories are salient in initiating communication. Third, while homophily is recognized as multidimensional (Block & Grund, 2014), most studies examine dimensions separately (Bu et al., 2022; Klepsch, 2023), ignoring the interdependencies across attributes. Yet online communities feature complex, overlapping human-generated categorization systems (Y. Xu et al., 2022) with redundancies, creating challenges for homophily analysis. This third question is tied to a deeper problem in homophily research: how to conceptualize the key independent dimensions of homophily that affect communication when multiple overlapping categories are present.
Fourth, CCC present a distinct contemporary organizational form for innovation. While CCC are temporary and distributed in structure, they exhibit key organizational properties: (1) governance structures (i.e., challenge rules and the presence of management teams), (2) designed member roles (i.e., ideators as innovators, management, and audiences), and (3) formal communication affordances (i.e., comment functions). Yet antecedents to the communication network patterns under such a complex organizing structure are seldom examined (Pilny & Contractor, 2024). Organizational communication networks are central organizing logics in today’s online communities (Castells, 1996; Pilny & Contractor, 2024). Two aspects of CCC are particularly relevant to these networks. First, as discussed, CCC are characterized by co-opetitive processes. In CCC, innovation is fueled by both knowledge sharing and competition (Kohler & Chesbrough, 2019; Levin et al., 2012). Second, CCC are built on a hybrid platform architecture that combines self-organization and top-down management. Challenge management, usually the sponsoring organizations and the crowdsourcing platforms, inspires and guides ideation by designing community architectures in areas such as problem statements, contest processes, platform categories, and evaluation criteria. At the same time, CCC are also self-organized, in that crowds enjoy free and voluntary participation and information exchange. There is a lack of research examining what drives the structural dynamics of CCC communication networks (except Safadi et al., 2021) while considering their unique organizational structure. Thus, this research aims to demystify the complex communication dynamics of CCC and, as indicated below, does so from the perspective of how homophily across potential communication partners influences choices to interact in the CCC context.
Homophily describes a social phenomenon in which similar people tend to be attracted to one another in a variety of contexts (McPherson et al., 2001), including, importantly, the context of communication networks (Monge & Contractor, 2003). Homophily is often approached from two angles, status homophily and value homophily, reflecting the multifaceted identities that influence communication (Fu & Cooper, 2022; Lazarsfeld & Merton, 1954). Status homophily refers to similarity in social positions such as demographics, occupation, and location. Value homophily describes affinity in people’s inner states, such as thinking, opinions, and interests. In the context of CCC, where geographically dispersed crowds of strangers are attracted to interact (Dissanayake et al., 2019), profiles and idea contents become the venues that ideators use to facilitate social interactions. Consequently, in the current research, profile information is used to examine status homophily, and idea content is used to analyze value homophily. Driven by the inquiry into communication dynamics, this research examines how status and value homophily produce communication patterns over time.
Situated in OpenIDEO, a dynamic organizational challenge community, this research applies longitudinal network analysis to comment communication among ideators, which can be shaped in part by status and value homophily. Comments are an important communication infrastructure (Cooper & Shumate, 2012) and an observable form of interaction on CCC sites. Longitudinal analysis based on comments is especially suitable because it not only reveals relational dynamics as the challenge progresses but also offers a robust observation of people’s social behavior after specific idea submissions. Results indicate that profile-based status homophily in terms of location in the same country significantly predicted communication network structures. Based on an analysis of platform categories to assess value homophily, this research finds that homophily exists among ideators sharing similar long-term resource needs. However, heterophily, the opposite of homophily, exists among ideators sharing similar short-term challenge impacts and partners. Impacts (see Table 1; all related to accessing resources) and partners (framed as partner needs within the next 3 years) concern relatively short-term needs in idea design, potentially reflecting urgent and diverse informational needs that lead ideators to seek useful information from dissimilar others.
Table 1. Platform Categories.
This study offers several significant contributions. Primarily, this study is among the few studies that adopt a multidimensional inquiry of homophily in shaping communication dynamics (Block & Grund, 2014; Wang & Shin, 2023). The variables used for describing multidimensional space that captures ideators’ similarities are generated via latent class analysis (LCA; Hagenaars & McCutcheon, 2002), a robust dimension-reduction method. Key and independent dimensions are identified from many platform categories that are potentially messy and overlapping in meaning. Multidimensional value homophily can be calculated by ideators’ distance to one another in the multidimensional space. Findings indicate that when ideators have more multidimensional similarities, they are more likely to form communication network events with one another.
Additionally, this research contributes to the development of homophily theory in several ways. First, this research highlights salient status and value attributes that shape online communication patterns in the context of CCC, which has not been well researched in the current communication literature. Second, this research informs our understanding of the complex communication dynamics and organizing structures of CCC. The finding that homophily and heterophily coexist in CCC suggests that social interactions might be driven both by the need to bond with others and by the need to gain access to valuable information. Moreover, platform categories, which are designed by community management to guide ideation, can be regarded as an important top-down force shaping community interactions. Multidimensional value homophily signals that ideators have similar orientations toward the platform categories, and findings suggest that multidimensional value homophily influences ideators’ communication network formation. This finding also reflects the bottom-up community-innovation patterns inspired and constrained by the platform architecture of categories.
Literature Review
Guided by homophily theory, this research examines how the communication dynamics of CCC are shaped by similarity in different attributes of the ideators. Homophily is a central and effective mechanism that influences communication and is a salient heuristic when people have limited information about others in online communities (Hu et al., 2023; McPherson et al., 2001; Siciliano et al., 2021). Homophily is especially relevant for online CCC, where strangers choose who to communicate with based on limited profile information of one another. In co-opetitive contexts, homophily further helps explain how communication unfolds under dual organizing forces of competition and cooperation. The literature review will first introduce CCC communication dynamics and then provide a detailed review of homophily theory, leading to specific hypotheses.
Communication Dynamics of Crowdsourcing Challenge Communities
Crowdsourcing challenges have become increasingly prevalent over the last decade as an alternative solution for organizations to access innovative ideas from a mass crowd of Internet users (Brabham, 2013; Kietzmann, 2017). Through third-party platforms such as InnoCentive, Kaggle, OpenIDEO, and Wazoku Crowd, organizations publish problems and issue incentives (e.g., monetary awards, platform rankings, invitations to educational/social events) to attract innovators to compete (Liang et al., 2018; Sun & Majchrzak, 2020). Through open innovation on crowdsourcing challenges, traditional organizations (e.g., businesses, nonprofit organizations, and government agencies; Brabham, 2012; Liu, 2017) can solve challenging issues in cost-effective ways that utilize external resources from organized online crowdsourcing spaces beyond traditional organizational boundaries (Greengard, 2023). Ideators, the innovators participating in the crowdsourcing challenges, form challenge-based communities to practice social innovation activities (Mulgan, 2006, as cited in R. Wang & Chen, 2023). In CCC, ideators exchange expertise and knowledge through communication, aiming for viable solutions to challenging issues. Paradoxically, in addition to being sites of social and collaborative innovation, CCC are also competitions in which selected ideas are rewarded. Idea innovation is powered both by cooperative forces, such as a socially distributed and self-organized process of knowledge sharing and accumulation for collaborative problem-solving (Lettice & Parekh, 2010; Levin et al., 2012), and by competitive forces, such as competing for higher rankings or rewards (e.g., medals or money; Twyman et al., 2023).
In addition to their co-opetitiveness, CCC feature unique hybrid organizing structures because they are both self-organized and managed by organizational professionals. Ideators voluntarily contribute to each other’s idea improvement through communicative interaction and form “a community of contributors with self-organizing social structures” (Blohm et al., 2013, p. 208). At the same time, innovation in CCC is also guided by top-down management forces. The problem scope, intended goals, and directions of innovation are defined by the sponsoring organizations and are communicated, coordinated, and managed by the intermediary platforms. For example, in OpenIDEO, challenge organizers design platform categories to guide ideation and require ideators to label their ideas with platform categories upon submission.
In knowledge innovation communities such as CCC, the importance of communication has been emphasized, as it is closely linked to idea generation (Guth & Brabham, 2017). Communication dynamics on crowdsourcing platforms also offer critical information about the organizing structure, hierarchy, mobilization, and patterns of knowledge production and exchange (Shaw, 2012).
Although CCC can be a key mechanism driving knowledge creation and innovation, longitudinal communication structural analysis of co-opetitive CCC communication is rare. Few studies have investigated the communication dynamics of crowdsourcing sites. Renard and Davis (2019) suggested that the combination of competition and cooperation can lead to beneficial creative outcomes for the community. Similarly, Fuger et al. (2017) studied communication networks on OpenIDEO and identified types of users based on idea contribution and communication activities. They found that collaborators, who contribute a limited number of ideas, but actively communicate with peers, are most likely to author good quality ideas. An additional study on OpenIDEO conducted by Fuge et al. (2014) identified significant network assortative patterns, which showed that highly connected users prefer communicating with less connected users compared to other highly connected users. Research by Twyman et al. (2023) discovered that strategic positioning in CCC collaboration networks predicted positive increases in performance outcomes. Yan et al. (2021) analyzed longitudinal behavioral-trace data on a crowdsourcing site for evaluating graphic designs and found that shared task experience tended to promote crowd performance, but network centralization was inversely related to crowd performance.
Few studies have examined the antecedents of CCC communication network structures longitudinally (for an exception in a cooperative crowdsourcing context, see Safadi et al., 2021), leaving the communication dynamics and community organizing of CCC understudied. Examining the antecedents of networks centers communication networks as the focus of research, which leads to critical insights that inform theories of communication networks (Monge & Contractor, 2003). Such an approach places communication as the dependent variable and explores the driving forces of communication partner selection, rather than exploring what outcomes communication actions lead to. This research aims to examine the communication patterns of CCC through longitudinal network analysis.
Profile-Based Status Homophily and Content-Based Value Homophily in CCC
Homophily describes the tendency for people to form social connections with similar others (McPherson et al., 2001) with respect to attributes such as demographic information, values, beliefs, and attitudes (Lazarsfeld & Merton, 1954). The motivation behind homophily can be understood through two theoretical mechanisms (Monge & Contractor, 2003). First, people tend to interact with similar others to avoid psychological discomfort and cognitive dissonance (Festinger, 1954; Trope, 1975; cf. Shi et al., 2022), and this mechanism is often referred to as “similarity-attraction” (Heider, 1958). The second mechanism is informed by self-categorization theory, which holds that people tend to classify themselves and others into social categories (e.g., race, ethnicity, job positions) and prefer connecting to others they perceive as falling within the same categories as themselves (Turner & Oakes, 1986). Homophily theory is especially relevant in explaining co-opetitive communication patterns because homophily is a universal social mechanism (Rivera et al., 2010) among the many diverse motivations that drive interactions among a distributed crowd of strangers (Benkler, 2017). Homophily is also strongly connected to affect and trust among community members (Rivera et al., 2010).
The effect of homophily on relationship-building in online communities in general has been well-researched. For example, gender homophily (Xiong et al., 2020) and semantic homophily (Šćepanović et al., 2017) significantly affect the formation of communication networks in social media. Age homophily and location homophily were also found to affect relationship building in online games (Huang et al., 2013). However, homophily mechanisms in co-opetitive communities like CCC have not been explored.
It is crucial to situate the inquiry of how homophily affects communication patterns (Monge & Contractor, 2003) in the context of CCC because the effect changes when social contexts shift. Bond and Sweitzer (2022) found that political communication is influenced by ideological homophily, but this effect is often context-dependent. During major political events with heightened discussion, the impact of ideological homophily on conversation networks tends to decrease. Kim and Ihm (2020) found that homophily affects individuals’ news-sharing communication, and this effect is intensified in asymmetrical social media that emphasize visibility affordances and content persistence (Leonardi, 2011; Treem & Leonardi, 2013) rather than interpersonal relationship-building messaging platforms.
The homophily inquiry should capture the complex and manifold characteristics that can potentially affect social interaction patterns (Klepsch, 2023). Typically, two types of homophily exist—status and value homophily (Lazarsfeld & Merton, 1954). Status homophily describes people’s social positions, such as demographic information, socioeconomic status, job titles, etc. Value homophily, in comparison, emphasizes people’s thinking, opinions, beliefs, etc.
CCC are open innovation platforms featuring strangers from geographically distributed locations around the world (Dissanayake et al., 2019). Consequently, ideators use profiles and idea content to get to know one another and socialize. To start, self-disclosed profiles contain essential status-related information about the ideators and are employed to examine status homophily in the current research. In addition to status homophily, value homophily is presented in a unique way in the context of CCC. Ideators’ thinking is substantially communicated in the ideas they create, and therefore, ideas can also serve as informative instruments facilitating ideators’ social interactions. This makes sense because ideation is an anchor activity in a knowledge innovation community, and ideas are the subject of competition in CCC. Therefore, the role of content homophily that reflects ideators’ challenge-related value homophily should be included in the examination of communication patterns in CCC.
Profile Homophily as Status Homophily
Status homophily usually varies across communities because the platforms’ norms and affordances (Leonardi, 2011; Treem & Leonardi, 2013) differ. This research examines two types of profile homophily—location and occupation—because these are the only status information present on ideators’ profiles, and thus these two signals likely will be the only information available to ideators to assess status.
Location Homophily
Location homophily, also referred to as proximity, positively influences communication (Kossinets & Watts, 2009). Previous research has found location homophily to significantly influence online communication in different contexts. For example, Bastos et al. (2018) found that location is associated with echo chamber political communication on Twitter. Zhang et al. (2017) examined the diffusion of health information on social media and found that co-located users participate in information diffusion more quickly. In offline communication contexts, geographical proximity increases exposure and the chances of interaction (Goodfriend & Hack, 2022; Monge et al., 1985). In online space, in comparison, location homophily functions more as “identity signs of a particular nation and its own personality relative to the global tendencies of uniformity” (Goyanes, 2015, p. 1506). In CCC, where status information is limited to user profiles, location information could serve as the main cue for assessing similarities among users, as has been found in health and political contexts. For instance, co-located users on CCC may work online in the same time zones (Huang et al., 2013), and they may have similar cultural and economic status and a similar understanding of local problems and social issues, which prompts them to communicate on the same topics. For example, people from the same country are likely to share similar agricultural challenges, food cultures, and regulation systems, making them more likely to engage in conversations about these shared challenges. The first hypothesis posits that location homophily based on country would be a significant predictor of communication relational events in the context of CCC.
Occupational Homophily
Occupational homophily was found to be related to friendship building in the classical research by McPherson et al. (2001). In organizational settings, occupational homophily is a driving force behind the founding of entrepreneurial teams (Aldrich & Ruef, 2006). In this current research, two types of occupational homophily are considered—homophily of job titles and that of industrial backgrounds. Job positions and industrial backgrounds are often associated with people’s interests and expertise. Individuals generally find it easier to communicate with others who have comparable professional backgrounds (Rogers & Bhowmik, 1970). In CCC, when ideators have similar job positions or industrial backgrounds in their profiles, it is likely that they would have similar language and think in similar manners, which may lead to higher chances of developing communication network events.
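As an illustration only, the profile-based status homophily described above can be expressed as dyadic indicators that equal 1 when a sender and a receiver share a profile attribute. The sketch below is a minimal example under stated assumptions: the field names (country, job_title, industry), the case-insensitive exact matching, and the choice to score missing values as 0 are hypothetical and are not taken from the study's coding procedure.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Profile:
    """Self-reported profile fields (field names are hypothetical)."""
    country: Optional[str] = None
    job_title: Optional[str] = None
    industry: Optional[str] = None


def same_value(a: Optional[str], b: Optional[str]) -> int:
    """Return 1 if both values are present and equal (case-insensitive), else 0."""
    if not a or not b:
        # Assumption: a missing field cannot signal similarity to other ideators.
        return 0
    return int(a.strip().lower() == b.strip().lower())


def status_homophily(sender: Profile, receiver: Profile) -> Dict[str, int]:
    """Dyadic status-homophily indicators for one sender-receiver pair."""
    return {
        "same_country": same_value(sender.country, receiver.country),
        "same_job_title": same_value(sender.job_title, receiver.job_title),
        "same_industry": same_value(sender.industry, receiver.industry),
    }


if __name__ == "__main__":
    a = Profile(country="Kenya", job_title="Farmer", industry="Agriculture")
    b = Profile(country="kenya", job_title="Engineer", industry=None)
    print(status_homophily(a, b))
```

Indicators of this kind could then enter a relational event model as dyad-level covariates; how job titles and industries are grouped before matching is a substantive coding decision that this sketch does not address.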
Content Homophily as Challenge-Specific Value Homophily
Value homophily describes thinking, beliefs, opinions, interests, tastes, and attitudes (Dahlin et al., 2019; Gu et al., 2014; Lazarsfeld & Merton, 1954; Lo & Lin, 2017), and is an internal status influencing individuals’ future orientations (McPherson et al., 2001). People holding similar values like to connect because they feel good and confident when others agree with them or recognize their viewpoints (Gu et al., 2014). In the context of CCC, challenge-specific values that are specifically attributed to problem-solving in crowdsourcing challenges are adopted because they reflect ideators’ ways of thinking and approaches that directly relate to problem-solving tasks. The reason for this focus is that ideators’ internal values are not visible and, therefore, are not directly relevant to the social dynamics in CCC. Instead, their choices of categories in labeling their ideas are a visible and self-expressed proxy of value homophily that possibly drives communication patterns. Platform categorization is the material platform design created by the challenge management. These categories afford visibility (Treem & Leonardi, 2013), and thus make ideators’ ideas discoverable to others through category-based filtering and searching. The perceived opportunities these categories afford depend on their meanings (Leonardi, 2011) and, in turn, influence social action outcomes. For example, ideators may perceive the long-term needs category as a feature that affords opportunities to identify like-minded others with similar idea visions. When challenge-specific values overlap among ideators, confirmation and shared understanding may generate further communication activities (Gu et al., 2014).
Previous research has used features of content to detect value homophily. For example, Gu et al. (2014) analyzed investors’ opinion homophily by examining sentiment in the messages they sent. They found that investors like to consume others’ messages when they have opinion homophily. Similarly, Andersson (2021) utilized word clustering methods to detect clusters of values and found that YouTube discussions around Greta Thunberg, the Swedish climate activist, exhibit homophilous patterns.
Built on the value homophily literature, a categorical perspective of value homophily is logical. After all, “testing for homophily essentially implies testing whether a category is salient, or meaningful, in a particular context” (Roman, 2016; p. 20). Moreover, “the basis of attraction between two individuals is the perceived value of one’s action based on the cultural (and categorical) membership of the other” (Sabzehzar et al., 2020; p. 4). Similarly, Aiello et al. (2012) used topical similarity among users to proxy interest homophily of friends and found that interest homophily contributes to friendship formation. Value homophily has been found to significantly influence communication. For example, Lo and Lin (2017) found that when Facebook friends have similar values and interests, they have a higher intention to share a piece of e-commerce information with friends.
However, value is a multidimensional construct. Categories are, by definition, “subsets of a multidimensional space of domain-relevant features” (Goldberg et al., 2016; p. 223). As a result, there should also be multiple types of categories that represent the categorization of the value space. However, existing research has not adequately examined multiple dimensions of value homophily. This research explores multiple categories simultaneously and also proposes a process to identify meaningful value dimensions.
Value Homophily Based on Platform Categories
Platform categories are labels created and managed by the platform management to classify content. For example, on Threadless, a crowdsourcing T-shirt design site, platform categories are called themes. Labels such as science, vintage, funny, and more are used to describe art designs. The platform categories on CCC are an important part of their social infrastructures, which are designed by platform management to recruit and motivate participants (Roth et al., 2015) and match participants with preferred tasks (Yuen et al., 2011). Platform categories touch upon different aspects of ideation, such as topics, demands for the idea solutions (short-term and long-term), methods, and status of the idea. These aspects reflect values including topical interests, problem-solving approaches, design, and opinions. In addition, platform categories communicate clear tasks and aspects of participation for easy navigation (Golumbic et al., 2020). Diverse platform categories specify different aspects of the problem calling for solutions (D. J. Wang & Soule, 2016). For example, in OpenIDEO, ideators are prompted to label their ideas using platform categories upon idea submission. Ideas can be filtered based on different platform categories. Platform categories are also institutionalized features signaling different “taken-for-granted” aspects of the idea content (Durand et al., 2017). When ideators submit to the same platform categories, their idea designs share similar features. For example, ideas submitted to the “sustainability” category should contain solutions for environmental challenges and promote eco-friendly practices. Platform categories are visible labels of idea-specific values and, therefore, can have a significant connecting force on like-minded ideators. The next hypothesis underscores the potential influence of value homophily based on platform categories on communication patterns.
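To make category-based value homophily concrete, one possible operationalization is a per-category indicator that equals 1 when both ideators in a dyad have labeled at least one of their ideas with that category. The sketch below is illustrative only; the category labels and the "both selected" matching rule are assumptions rather than the coding scheme used in this study.

```python
from typing import Dict, Iterable, Set


def shared_category_indicators(
    sender_labels: Set[str],
    receiver_labels: Set[str],
    categories: Iterable[str],
) -> Dict[str, int]:
    """For each platform category, return 1 if both ideators used it, else 0."""
    return {
        category: int(category in sender_labels and category in receiver_labels)
        for category in categories
    }


if __name__ == "__main__":
    # Hypothetical category labels, used only for illustration.
    categories = ["long_term_need", "impact", "partner"]
    sender = {"long_term_need", "impact"}
    receiver = {"long_term_need", "partner"}
    print(shared_category_indicators(sender, receiver, categories))
```

With one indicator per category, a positive coefficient in a relational event model would indicate homophily on that category and a negative coefficient would indicate heterophily.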
Capturing the Multidimensionality of Content Homophily
Although it is widely recognized that homophily is a multidimensional concept, most research studies different dimensions separately (e.g., Bu et al., 2022; Klepsch, 2023) rather than examining the underlying structure of multiple dimensions. This study employs a multidimensional definition of homophily. Multidimensional homophily examines how actors are similarly or differently positioned in a multidimensional space, defined by multiple characteristics examined simultaneously. The potential overlaps, interdependencies, and distinctiveness of these characteristics are also considered through adopting an appropriate clustering method—LCA, which captures the overlapping interrelations among the various homophily variables rather than ignoring interdependencies. An appropriate multidimensional measurement of homophily helps capture the core attributes among many overlapping variables (Hagenaars & McCutcheon, 2002; Sinha et al., 2021), representing homophily holistically rather than redundantly (Block & Grund, 2014), and reducing the multicollinearity and overfitting risks that can arise when interdependent attributes are entered as separate predictors (Weller et al., 2020).
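The idea of positioning actors in a multidimensional space can be sketched as follows, assuming a latent class model has already been fitted and has returned, for each ideator, a vector of class-membership probabilities. The Euclidean metric, the rescaling to a 0 to 1 similarity score, and the example values are assumptions chosen for illustration; the source does not specify the distance measure in this section.

```python
import math
from typing import Sequence


def euclidean_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Euclidean distance between two class-membership probability vectors."""
    if len(p) != len(q):
        raise ValueError("Vectors must have the same number of latent classes.")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def multidimensional_similarity(p: Sequence[float], q: Sequence[float]) -> float:
    """Rescale distance to a similarity in [0, 1].

    For two probability vectors, the largest possible Euclidean distance is
    sqrt(2), reached when each ideator belongs entirely to a different class.
    """
    return 1.0 - euclidean_distance(p, q) / math.sqrt(2)


if __name__ == "__main__":
    # Hypothetical posterior probabilities over three latent classes.
    ideator_a = [0.8, 0.1, 0.1]
    ideator_b = [0.7, 0.2, 0.1]
    ideator_c = [0.0, 0.1, 0.9]
    print(multidimensional_similarity(ideator_a, ideator_b))
    print(multidimensional_similarity(ideator_a, ideator_c))
```

A single similarity score of this kind replaces many overlapping category indicators with one dyadic covariate, which is the motivation for the dimension-reduction step.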
People are “multidimensional beings,” and their social relationship-building can be affected by a “multitude of sociologically relevant dimensions” (Block & Grund, 2014, p. 189; 192). However, not all variables are equally important, or capture unique meaningful aspects, which poses challenges for modeling and interpreting the core dimensions of various constructs contributing to the homophily effect.
The limited previous research on multidimensional homophily, conducted primarily in non-CCC contexts, found complex dependencies among variables. For example, Block and Grund (2014) studied offline friendship formation and found that when friends are similar in many dimensions, they are likely to cease being friends in the long run. They explain that being too similar may create undesired redundancy or that not all attributes matter the same way in friendship formation. Although homophily theory was not utilized, Wang and Shin (2023) explored the interaction of identity overlap and category overlap among users of Kickstarter.com and found that when users share identity overlap (i.e., being creative commons creators) and category overlap, they tend to form funding network ties, as compared to when they merely have category overlap. No research has studied how to capture the core aspects of multidimensional homophily in an effective way, nor has any research been situated in communication patterns in online communities. Capturing the core dimensions of multidimensional homophily is an especially important task because most online communities feature complex human-generated categorization systems (e.g., Y. Xu et al., 2022) that might feature a multitude of characteristics and significant redundancies.
In the context of CCC, we argue that being similar in core dimensions increases ideators’ visibility to one another in the community and, therefore, increases their chances of interaction. Multidimensional homophily captures shared submission patterns at an abstracted level, reflecting ideators’ similarities across core dimensions simultaneously rather than independently. People submitting to similar platform categories tend to occupy similar community niches, making them more visible to each other and more likely to develop relational events (Hannan et al., 2019). This visibility effect on social behaviors is further amplified when ideators use platform categories to filter and search for ideas from other ideators. Therefore,
Method
Data Collection
The research site is OpenIDEO, a crowdsourcing challenge platform that aims at solving social issues such as education, human rights, poverty, and health concerns. On this site, sponsoring organizations (e.g., nonprofit and for-profit organizations) and the OpenIDEO organization craft social issue-based challenges by raising a problem that needs solving. They also design problem statements, platform categories, and challenge procedures to guide idea innovation. Through comments, ideators can provide feedback to each other. Ideators are often given opportunities to revise and improve their ideas. Finally, based on the organizers’ choices, the top solutions are selected as winning ideas.
Researchers collected data on an OpenIDEO challenge in April 2022. A Python-based package, “Selenium,” was used to collect the data. Selenium provides a web driver that mimics users’ web-scrolling actions and allows automatic extraction of long web materials spanning several pages on the research site.
The chosen challenge, “The Food Systems Game Lab Challenge,” was aimed at solving the issue of “How might we build a better food future for everyone, everywhere?” It was organized to address global food issues, with the goal of a safer, more sustainable food future. The challenge ran from March 24 to June 7, 2021, with the open call-for-ideas phase closing May 25, 2021. In total, 540 ideators submitted 442 unique ideas before the challenge concluded. Among the 540 ideators, 157 were included as network nodes because they participated in at least one comment activity, and comments from their 442 unique ideas were included as network edges. A comment-specific communication network was built, featuring 157 people as nodes and 291 comment events as edges over time. A fuller data procedure is available in Supplemental Appendix A.
Measures
Profile Homophily as Status Homophily
Location Homophily
All ideators’ locations by country were collected. Ideators self-identified that they came from 24 countries. A symmetric inter-ideator matrix was created. If a pair of ideators were co-located, the matrix cell was one, and otherwise, it was zero. Country affiliation is provided in Supplemental Appendix C.
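As a minimal sketch of how such a co-location matrix could be built, the Python below codes a cell as one when two distinct ideators report the same country. The function name and the toy country list are hypothetical. Leaving the diagonal at zero is an assumption. Undisclosed locations are represented as None and never match.

```python
def same_attribute_matrix(values):
    """Return a symmetric 0/1 matrix: 1 when two distinct actors share a known value."""
    n = len(values)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Undisclosed (None) values never match, so those dyads stay 0.
            if values[i] is not None and values[i] == values[j]:
                matrix[i][j] = matrix[j][i] = 1
    return matrix


# Hypothetical toy data: five ideators, two of whom did not disclose a country.
countries = ["Kenya", "India", "Kenya", None, None]
location = same_attribute_matrix(countries)
```

The same helper could in principle serve any categorical profile attribute that is coded as a single label per ideator.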
Job Position Homophily
The researchers manually identified keywords that indicated ideators’ job titles from their profiles. An inter-ideator matrix was created indicating how ideators’ identities overlap with one another. If a pair of ideators had the same job position, the matrix cell was one, and otherwise, zero. This matrix was used as an event covariate in the model. Detailed job titles and related keywords can be found in Supplemental Appendix D.
Industry Homophily
Due to page limits, industry categorization details are included in Supplemental Appendix E. Like the inter-ideator matrices mentioned above, if a pair of ideators had an overlapping industrial background, the matrix cell was one, and otherwise, it was zero.
Content Homophily as Value Homophily
Content homophily is measured by the similarity of ideators’ content categories. Categories are labels of idea content; ideators’ category affiliations were defined based on their authorship of idea content. An ideator was affiliated with a category only after an idea submission, because their ideas and associated categorization became visible only then, when community members could use specific platform categories in search filtering to find these associated ideas. Therefore, all measures of content homophily are time-varying. At each time point when a comment happened, only those who had already completed the idea submission were considered.
Platform-Category-Based Value Homophily
Platform categories are designed by the challenge management team. As discussed, not all platform categories are equally important in shaping ideators’ social behaviors, and the large number of platform categories and their potentially overlapping meanings pose challenges for modeling and result interpretation. Each platform category has multiple options; there are 62 options¹ across all eight platform categories (See Table 1). Ideators may select more than one option across categories to label their ideas. To holistically capture the impact of platform categories on communication networks, the first approach is to feed all platform categories into the model, assuming that they are independent constructs. This model is explored by adding all eight platform categories as binary matrices, computed at each time point among those who had submitted ideas. If a pair of ideators had at least one common category submission, a value of one was inserted, and if they had none in common, a value of zero. This approach serves three purposes: (1) directly testing H2 by identifying which specific platform categories significantly predict comment events, (2) testing whether shared submissions to each platform category predict comment events, and (3) providing a benchmark model against the second approach, a more complex multidimensional measurement, to compare model parsimony and fit.
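The time-varying binary matrices described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' code: the record layout, the option labels, and the function name are hypothetical, "a common category submission" is read as sharing at least one option within a given platform category, and only ideas submitted strictly before the comment event are counted.

```python
from collections import defaultdict


def shared_option_matrix(ideas, actors, category, event_time):
    """0/1 matrix for one platform category at the time of one comment event."""
    # Pool the options each actor selected in ideas submitted before event_time.
    chosen = defaultdict(set)
    for idea in ideas:
        if idea["time"] < event_time:
            chosen[idea["author"]] |= set(idea["options"].get(category, ()))
    n = len(actors)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Actors without a prior submission have an empty set and never match.
            if chosen[actors[i]] & chosen[actors[j]]:
                matrix[i][j] = matrix[j][i] = 1
    return matrix


# Hypothetical toy data: three ideators, one category, illustrative option labels.
ideas = [
    {"author": "a", "time": 1, "options": {"long_term_needs": ["funding"]}},
    {"author": "b", "time": 2, "options": {"long_term_needs": ["funding", "mentorship"]}},
    {"author": "c", "time": 5, "options": {"long_term_needs": ["funding"]}},
]
actors = ["a", "b", "c"]
early = shared_option_matrix(ideas, actors, "long_term_needs", event_time=3)
late = shared_option_matrix(ideas, actors, "long_term_needs", event_time=6)
```

In the toy data, ideator "c" only becomes comparable to the others once the event time passes their submission, which mirrors the time-varying logic of the measure.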
This research also explores a multidimensional approach because of the potential limitations of the unidimensional approach discussed earlier. First, dependencies might exist among the platform categories. For example, ideators who submit their ideas under the theme of “access to funding” for a long-term plan might also tend to indicate “investors” as intended partners. Treating all platform categories as independent might cause multicollinearity and make it difficult to differentiate the effects of the variables. Fitting too many categories may also result in model overfitting or difficulty interpreting meaningful patterns among the categories. To address these limitations, another approach was explored for comparison with the model that includes all platform categories. Latent Class Analysis (LCA) was selected to detect the latent data structure of the platform informed by idea-category-submission patterns of the community. Importantly, LCA was fit on all 442 ideas from the full set of 540 ideators rather than the 157 network nodes, so that the latent classes represent the platform’s underlying content structure. Method description and analytical procedures are reported in the analytical procedures section below.
Informed by the variables with the highest associated probabilities in the LCA model, the three core classes are (1) Incubation and stakeholder leverage, (2) Systematic, institutional, and cross-domain innovation, and (3) Capital-driven growth. For each time point, the assumption is that only those who have submitted have a clear content identity based on their submitted ideas. Therefore, ideators’ identities shift if they submit multiple ideas². At each given time point, if an ideator has submitted one idea, their probabilities of belonging to the three classes equal those of this idea. However, if they have submitted more than one idea, their probabilities are calculated as the average of the probabilities across all submitted ideas. Pairwise Euclidean distance among ideators is calculated as a reverse measure of homophily. Those who have not yet submitted an idea are assigned the maximum distance in the matrix plus 1 so that they have the largest distance in comparison to others in the community. In this way, the LCA multidimensional measurement captures the core features of all platform categories. By computing each ideator’s probability of belonging to each of the core classes, it also yields the multidimensional distance among ideators.
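The distance construction described above can be sketched in a few lines of Python. The function names and toy probabilities are hypothetical. The sketch also assumes that any dyad involving an ideator without a submission receives the maximum observed distance plus 1, which is one reading of the rule.

```python
import math


def class_profile(idea_probs):
    """Average the class-probability vectors of an ideator's submitted ideas."""
    k = len(idea_probs[0])
    return [sum(p[c] for p in idea_probs) / len(idea_probs) for c in range(k)]


def distance_matrix(profiles):
    """Pairwise Euclidean distance; None marks an ideator with no submission yet."""
    n = len(profiles)
    dist = [[0.0] * n for _ in range(n)]
    observed = []
    for i in range(n):
        for j in range(i + 1, n):
            if profiles[i] is not None and profiles[j] is not None:
                d = math.dist(profiles[i], profiles[j])
                dist[i][j] = dist[j][i] = d
                observed.append(d)
    # Dyads involving a non-submitter get the largest observed distance plus 1.
    fill = (max(observed) if observed else 0.0) + 1.0
    for i in range(n):
        for j in range(i + 1, n):
            if profiles[i] is None or profiles[j] is None:
                dist[i][j] = dist[j][i] = fill
    return dist


# Hypothetical toy data: one ideator with two ideas, one with one idea, one without any.
profiles = [
    class_profile([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]),
    [1.0, 0.0, 0.0],
    None,
]
dist = distance_matrix(profiles)
```

Because the covariate is a distance, a negative coefficient in the event model corresponds to homophily.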
Analytical Procedures
Detecting Multidimensional Value Homophily
LCA was chosen for multidimensional value homophily measurement because it is informed by the actual observations rather than being variable-driven. LCA does not assume balanced group sizes and therefore maps well onto the real category distributions (Sinha et al., 2021). LCA also provides probabilistic class membership, allowing a nuanced multidimensional homophily measure through a probability matrix. To prepare for model fitting, time-varying similarities among ideators are calculated based on the probability matrix between ideas and the three classes. Full model selection justification and procedures are reported in Appendix F and Appendix G. Detailed LCA results are reported in the results section.
Longitudinal Network Modeling
Hypotheses were tested using ordinal relational event modeling (REM; Butts, 2008). REM is a powerful and flexible longitudinal network modeling tool capable of recording each relational event between every pair of people and the specific time when each event happened, and thus, the order of the events (Brandes et al., 2009; Butts, 2008). The model is based on a rate function, in which “the rate of an event represents its pace over time; more frequent events have a higher likelihood of occurring, relative to events with a lower rate” (Schecter & Contractor, 2019, p. 4). Individuals’ exogenous traits, time-varying or static, can also be added to the model to consider whether a sender effect or a receiver effect exists that influences the odds of the relational event. Each relational event was both the dependent variable of all previous events and a part of the independent variable predicting all future events (Welles et al., 2014). In this research, comments were the relational events and were ordered by the sequence of the event based on the specific timestamp of the comments. The R package relevent was used to analyze the data (Butts, 2021). Goodness of fit (GoF) statistics for the overall models (e.g., AIC, BIC, AICC, and comparison of the model deviances with the null model) were evaluated first. Subsequent tests are provided to report the statistical significance of the proposed hypotheses in each model at the .05, .01, and .001 probability levels.
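For readers less familiar with the ordinal form of the model, the likelihood in the general framework of Butts (2008) can be written as below. The notation is introduced here for illustration only and does not come from this paper: M is the number of ordered events, (s_i, r_i) are the sender and receiver of event i, A_{i-1} is the event history before event i, X denotes covariates, u is the vector of statistics, θ the parameter vector, and R_i the set of sender-receiver pairs at risk at event i.

```latex
\lambda_{sr}(i) = \exp\!\big(\theta^{\top} u(s, r, A_{i-1}, X)\big),
\qquad
L(\theta) = \prod_{i=1}^{M}
  \frac{\lambda_{s_i r_i}(i)}
       {\sum_{(s', r') \in \mathcal{R}_i} \lambda_{s' r'}(i)}
```

Only the order of events enters this likelihood. When θ = 0, every pair at risk is equally likely, which gives the null model against which the fitted models are compared.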
The reasons for using REM are twofold. First, REM can be used to analyze communication events that are short-lived events instead of relationship states such as friendship or collaboration ties, which violates the assumptions of an alternative longitudinal network model—specifically, the Stochastic Actor-Oriented Model (SAOM; Snijders et al., 2010). Second, unlike other longitudinal network models from the exponential random graph modeling (ERGM) family of analytics (e.g., SAOM and temporal ERGM), REM allows examination of the sequence or specific time points of the events without the risk of losing nuanced information by aggregating relational events into arbitrary time ranges (Quintane et al., 2014).
Results
Analysis of LCA Measurement Data
The poLCA R package was used to conduct the LCA modeling. The best-fitting LCA model identified three classes with an excellent entropy of 0.94 (Weller et al., 2020). All ideas (N_idea = 442) submitted to participate in the challenge were included in the LCA model to capture the community’s underlying content structure, as mentioned above in the Method section. To determine the optimal number of latent classes, models with 1 to 30 latent classes were run to identify the best-fitting model with the lowest BIC (See Figure 1 for the elbow plot for LCA model fit). See Figure 2 for the heatmap of the category-option-loading to the three classes and Table 2 for the three classes.
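The text does not state which entropy statistic was computed. Assuming it is the relative (normalized) entropy commonly reported for latent class models, where values near 1 indicate clean class separation, a minimal sketch of the calculation from a matrix of posterior class probabilities looks like this. The function name and both toy matrices are hypothetical.

```python
import math


def relative_entropy(posteriors):
    """Normalized entropy: 1 - sum_i sum_k(-p_ik * ln p_ik) / (N * ln K)."""
    n = len(posteriors)
    k = len(posteriors[0])
    # Terms with p = 0 contribute nothing (0 * ln 0 is treated as 0).
    total = -sum(p * math.log(p) for row in posteriors for p in row if p > 0)
    return 1.0 - total / (n * math.log(k))


# Hypothetical toy posteriors: perfectly separated vs. completely ambiguous ideas.
crisp = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
fuzzy = [[1 / 3, 1 / 3, 1 / 3]] * 4
crisp_entropy = relative_entropy(crisp)
fuzzy_entropy = relative_entropy(fuzzy)
```

The two toy cases mark the ends of the scale: fully separated posteriors give a value of 1, and uniform posteriors give a value of 0.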

Figure 1. Elbow plot for the AIC and BIC for deciding the number of latent classes.

Figure 2. Heatmap for category-option-loading to the three latent classes.
Table 2. Description of the Three Latent Classes Informed by the Best-Performing LCA Model.
Note. a This reflects that when ideators are still seeking mentorship and partnership, their ideas tend to still be in the initial development stage. b Class 2 aligns with research on institutional entrepreneurs and boundary-spanning actors in innovative ecosystems, connecting education, policy, and technical domains to drive coordinated change (O’Mahony & Bechky, 2008). It is a multistakeholder, institutional, and systematic approach. Additionally, although Class 2 appears to be a generalist category, it is not the dominating class, and the proportion is not skewed. The mean posterior probability for the three classes is .403, .320, and .277, respectively.
REM Results
All models show significantly better model fit than the null model, which means that the variables added contribute significantly to explaining the occurrence of communication network events in the community. First, a model with only endogenous network signatures was fit to capture the baseline network structure (See Model 1 in Tables 3 and 4). Different known baseline network structures were examined by adding them to the model (Butts, 2008). The analysis shows that 11 out of the 30 network signatures contributed to improving the baseline GoF (i.e., AIC, BIC, AICC as suggested in Pilny et al., 2016) of the model from the null model (See Figure 3). The 11 retained signatures are reported as Model 1 in Tables 3 and 4. These endogenous mechanisms, such as prior network activity and recency, shaped comment formation independently of ideator attributes and are controlled in all subsequent models.
Table 3. Relational Event Modeling Results.
Note. The variable added to test the multidimensional value homophily measure is the Euclidean distance among ideators and is therefore a reverse measure of homophily. The negative and significant parameter supports the homophily hypothesis.
*p < .05. **p < .01. ***p < .001.
Table 4. Model Fit Statistics.
Note. McFadden’s pseudo-R2 = 1 – residual deviance/null deviance. Values around .20 or higher are commonly interpreted as indicating strong model fit for McFadden’s pseudo-R2 (McFadden, 1972).
***p < .001.

Figure 3. Illustration of the network structures.
The Baseline Model
This section provides the results of developing the baseline model. Indegree receive describes the normalized indegree of each node, which predicts the future rate of communications received (β = −14.91, SE = 3.94, p < .001; See Model 1). Outdegree send (β = 15.50, SE = 1.31, p < .001) refers to the normalized outdegree of each node, which predicts future comment event rates (Butts & Marcum, 2017). These structures represent a baseline network inertia pattern in which existing social behaviors tend to repeat themselves in the future, a frequently observed phenomenon for longitudinal social networks (Butts & Marcum, 2017). In other words, inertia is “loosely analogous to the role played by a positive AR(1) term in an autoregressive time series process” (Butts, 2008, p. 169). Outdegree receive is the normalized outdegree, which predicts the future event-receiving rate (β = 10.04, SE = 1.42, p < .001). Indegree send (β = −12.82, SE = 3.80, p < .001) is the normalized indegree, which predicts the future event-sending rate. These two network structures describe whether ideators’ outgoing and incoming actions are returned (Li, 2026). Recent receive send is the recency of event receiving, which predicts the future sending rate (β = 4.33, SE = 0.18, p < .001), and recent send send is the recency of event sending, which predicts the future event-sending rate (β = .63, SE = 0.22, p < .01). OTPSnd (β = −1.59, SE = 0.53, p < .01) refers to the number of outgoing two-paths predicting future event sending, and ITPSnd (β = .63, SE = 0.25, p < .05) refers to the number of incoming two-paths driving future event sending. ISPSnd (β = .86, SE = 0.24, p < .001) refers to the number of “inbound shared partners” (Butts & Marcum, 2017, p. 17), which predicts the future sending rate. Reciprocity captures the tendency for events sent to be reciprocated by the receiver. Generalized reciprocity captures the tendency for events received to be paid forward to a third person.
Test of Hypotheses
This section provides the results of the tests of the variables that predicted the outcome of the proposed models for homophily (See Model 2 in Tables 3 and 4). Hypotheses 1a, b, and c inquired about the effect of the homophily of ideators’ location, job position, and industrial background on communication event occurrences. According to Model 2, location homophily (β = 2.01, SE = 0.29, p < .001) significantly predicted the occurrence of relational events. However, job position homophily (β = .24, SE = 0.15, p > .05) and industrial homophily were not significant (β = −.12, SE = 0.16, p > .05). H1a was supported, but H1b and c were not supported.
H2 posited that ideators’ content homophily based on platform categories significantly influences communication events. Among all platform categories tested (See Model 3), only long-term needs (β = 3.17, SE = 0.49, p < .001) showed a significant homophily effect. In contrast, challenge impact (β = −1.36, SE = 0.46, p < .001) and partners (β = −1.19, SE = 0.39, p < .01) showed significant heterophily, the opposite of homophily (Aksoy, 2015). The homophily effect for the rest of the platform categories was not significant, indicating that H2 was only partially supported.
H3 predicted that multidimensional homophily affects communication patterns, and it was supported by a negative and significant measure of the ideator distance covariate added to test the hypothesis (β = −.60, SE = 0.08, p < .001; See Model 4).
Preliminary Overall Test of the Cumulative Models
Before testing the hypotheses and comparing how the two measures of value homophily contribute to model fitting, a preliminary nested-model test probed the validity of the overall model to see whether each step of variable addition improved model fitting. The model architecture builds cumulatively from the null to controls only (Model 1), then either status homophily (Model 2) or value homophily (Models 3 and 4) as parallel intermediate models, and finally the two full models that combine status and value homophily (Models 5 and 6, differing only in which value homophily measure is used). Model comparison was done using Chi-Square deviance comparison tests. A detailed model comparison diagram is presented in Supplemental Appendix H. The test confirmed that each step contributed significantly to improving the models’ explanatory power.
Model Comparison
Consistent with the cumulative tests above, each of the six models was also significantly better than the null model (Pilny et al., 2016). Model 1, χ2(11) = 1107.98; Model 2, χ2(14) = 1143.72; Model 3, χ2(19) = 1152.02; Model 4, χ2(12) = 1162.10; Model 5, χ2(22) = 1184.10; Model 6, χ2(15) = 1192.75; all p < .001.
This study adopted two measures of value homophily: platform-category-based (Models 3 & 5) and multidimensional homophily measures informed by LCA dimension reduction analysis (Models 4 & 6). To assess how the two measures compared, the GoF measures between models were examined. Model 4 (AIC = 4743.65, AICC = 4744.78, BIC = 4787.73, χ2(12) = 1162.10) had a significantly better fit than Model 3 (AIC = 4767.73, AICC = 4770.53, BIC = 4837.52, χ2(19) = 1152.02). Similarly, Model 6 (AIC = 4719.00, AICC = 4720.75, BIC = 4774.10, χ2(15) = 1192.75) had significantly better model fit compared to Model 5 (AIC = 4741.36, AICC = 4745.14, BIC = 4822.18, χ2(22) = 1184.10). In both cases, the multidimensional homophily measures achieved better model fit than the unidimensional homophily measures.
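The reported fit statistics can be cross-checked with a short calculation. The sketch below rests on three assumptions that are not stated in the text: the null model treats every ordered sender-receiver pair among the 157 nodes as equally likely at each of the 291 events, the sample size used for BIC and AICC is the number of events, and the number of parameters k equals the chi-square degrees of freedom. The function and variable names are hypothetical.

```python
import math

N_EVENTS = 291  # comment events (from Data Collection)
N_ACTORS = 157  # network nodes (from Data Collection)

# Assumption: under the null model every ordered sender-receiver pair
# is equally likely at each event, so each event contributes log(N(N-1)).
NULL_DEVIANCE = 2 * N_EVENTS * math.log(N_ACTORS * (N_ACTORS - 1))


def fit_stats(chi2, k, n=N_EVENTS):
    """Recover fit statistics from a model's chi-square improvement over the null."""
    residual = NULL_DEVIANCE - chi2
    aic = residual + 2 * k
    return {
        "AIC": aic,
        "AICC": aic + 2 * k * (k + 1) / (n - k - 1),
        "BIC": residual + k * math.log(n),
        "McFadden": 1 - residual / NULL_DEVIANCE,
    }


model4 = fit_stats(chi2=1162.10, k=12)
model6 = fit_stats(chi2=1192.75, k=15)
for name, stats in (("Model 4", model4), ("Model 6", model6)):
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

Under these assumptions, the computed AIC, AICC, and BIC for Models 4 and 6 agree with the reported values to within rounding. The BIC penalty grows with k, which is why the gap between the category-based and LCA-based models is wider on BIC than on AIC.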
Discussion
Through a longitudinal analysis of communication networks among ideators, this study examines comments as core communicative actions that ideators engage in for knowledge exchange, innovation, and constructing a temporary community on co-opetitive CCC (Monge & Contractor, 2003; Shumate & Contractor, 2013). Homophily of profiles and idea content was probed to evaluate the role of status and challenge-specific value homophily in communication patterns among ideators. In the context of CCC, we view comments as communication interactions that signal endorsement/critique, facilitate information exchange, and construct public-facing relational communication, consistent with Shumate and Contractor’s (2013) classification of communication relationships of flow, affinity, and representational communication. Homophily/heterophily is a critical heuristic for communication events on CCC because of the limited information available to ideators (Hu et al., 2023).
Profile-Based Status Homophily and Communication Dynamics
Results indicate that homophily of location has a positive impact on the occurrences of communication network events. Although CCC are open communities with geographically dispersed ideators, people still tend to cluster in commenting by countries of origin. Consistent with patterns in other online communities such as social media (Bastos et al., 2018; Zhang et al., 2017) and games (Huang et al., 2013), location could serve as an identity sign to attract others with similar types of cultural background and national identities (Goyanes, 2015). This finding demonstrates the persistence of geography as an organizing principle for communication even in virtual and globally accessible platforms. In addition, co-located users are also likely to work in similar time zones, which improves the convenience of communication (Huang et al., 2013). Context-wise, most challenges on OpenIDEO center on social issues that might be locally oriented. For example, the challenge we investigated is a food challenge, and food-related challenges may vary from location to location when agricultural, cultural, and economic status differ. Therefore, it is likely that when faced with similar concerns, ideators from similar geolocations tend to gather, albeit, in this case, digitally, for discussions.
One caveat about location homophily is that it is based on voluntary disclosure by 54 out of 157 ideators (34.4%) from 24 unique countries (See Appendix C for details). The missing location information aligns with user behavior on digital platforms (e.g., on Twitter, only about 21% of users provide city-level information on their profiles; Cheng et al., 2013). Actors who did not disclose their country affiliation were coded as zero for the location-homophily matrix, so their dyads did not contribute to this covariate’s computation. In the REM framework, missing data reduces the effective weight of the variable without inflating its statistical significance or biasing other predictors. Our findings on location homophily should therefore be interpreted conservatively, as patterns among a subset of users who chose to reveal their country affiliations. More importantly, location homophily can only operate as a social mechanism when location information is visible to potential communicators. Although only a limited number of users provided location information, these conservative results still confirmed a meaningful effect of location homophily. Future research should also explore content-driven location information to address missing data and provide a more holistic account of the digital location homophily effect.
For occupational homophily, neither jobs nor industrial backgrounds influenced communication networks. It is likely that people look for complementary expertise, experience, and background rather than those sharing occupations or industries (Horwitz & Horwitz, 2007; Su et al., 2010). Additionally, we annotated job positions by job titles, and the same job titles may involve different tasks depending on the types of organizations or industries they work in. For example, a business developer in finance may face different tasks than one in agriculture. Future research could delve deeper into the homophily of ideators’ professional skills and work scopes instead of examining mere labels for job positions.
Content-Based Value Homophily and Communication
Homophily and Heterophily Effects of Platform Categories
Interestingly, different dimensions of the platform categories displayed different influences on communication patterns. Homophily was only observed in long-term needs, indicating that ideators prefer cooperating with others sharing the same long-term needs. In comparison, heterophily, the reverse of homophily (Aksoy, 2015), was observed in short-term resources such as challenge impacts and partners. Ideators tend to avoid communicating with others who have the same short-term needs. The challenge impact was described by the platform management as “what could being part of [the challenge] unlock [for] you to advance your solution toward impact?,” and partners outlined “who are the partners that you most need to help take the action [in the short-term],” and therefore signal more emphasis on short-term needs. The results for the methods (e.g., tools and evidence) were not significant, indicating that ideators show no particular tendency to talk with others applying similar methods. Lee et al. (2019) noted that homophily is often driven by a need for social bonding, support, a sense of community, and belongingness, whereas heterophily could potentially be motivated by informational demands to exchange knowledge, create, and innovate. Short-term needs drive ideators to connect with diverse people. Similarly, ideators’ communication is not restricted to those using similar methods. This reflects the need for diverse information access and exposure because heterophily and diversity are linked with creativity (Han et al., 2014). Connecting with similar others, in this scenario, may lead to information redundancy (Norbutas & Corten, 2018). Short-term needs are also more urgent resource needs that may motivate competition among similar actors (Dobrev et al., 2001; Singh & Singh, 1994).
This observation reflects the view that CCC are co-opetitive knowledge communities where mixed bonding and informational demands for ideas of better quality are shaping communication dynamics simultaneously and that different dimensions of value homophily function in different ways. Lastly, consistent with previous research, not all dimensions of value homophily are salient (Hooijsma et al., 2020; Roman, 2016). Findings suggest that general topics (e.g., themes and impacts) and status (i.e., idea readiness) are not significant factors influencing communication dynamics in CCC, suggesting that these aspects are not the main motivating factors for people’s social interaction. Future research could further explore whether the challenge design in aspects of problem scope or evaluation criteria is related to what categories are valued more in social interactions.
Multidimensional Value Homophily Measurement
This study proposes an innovative multidimensional homophily measurement that goes beyond capturing multiple characteristics/dimensions separately and also considers relationships among characteristics. This measurement is especially useful in contexts where certain characteristics overlap or are correlated. Treating them all as independent and equally weighted dimensions ignores their interrelationships and may cause multicollinearity issues in modeling or misinterpretation of which factors are truly significant. For example, people’s political ideology and moral foundations tend to be correlated (Graham et al., 2009), and it would be misleading to treat ideology and morality as separate dimensions when measuring homophily among individuals. This research adopts a measurement of multidimensional homophily by first fitting an LCA model to identify the latent content features of the community. However, we treat LCA as a clustering tool for identifying latent profiles that reflect multidimensional homophily, recognizing it is not the only method toward that goal.
The multidimensional approach offers several theoretical and methodological advantages. First, when a large number of platform categories exist and the categories are potentially overlapping, multidimensional measurement detects core dimensions among the myriad overlapping categories. This method is also useful in making sense of other human-defined categories, such as hashtags and user-generated labels. This approach is valuable because multidimensional homophily should not be a mere addition of homophily across multiple types of attributes, given that some attributes play a greater role than others in the occurrence of communication network events (Hooijsma et al., 2020). Second, this measure identifies the core features and helps simplify the interpretation of the models. In the current research, three classes represent the overall submission patterns of the ideas. In this food systems challenge, ideators tend to seek food solutions from three angles: (1) Incubation and stakeholder leverage, (2) Systematic, institutional, and cross-domain innovation, and (3) Capital-driven growth. Their value identities can also be more accurately represented by the probabilistic membership scores obtained through LCA, which indicate the likelihood of each person aligning with each of the three distinct value-based categories. Content value homophily is also manifested in these three latent dimensions of the platform categories. This measure also allows us to observe the community’s reaction to the complex platform categories raised by the challenge management. Third, the fit of the model with the multidimensional measure is significantly better than that of the model that includes all platform categories and assumes that they are independent measurements of ideators’ value homophily. This means that the multidimensional measure captures value homophily in a more nuanced and parsimonious way.
Moreover, by retaining individuals’ probability of affiliation with each of the three classes identified by the LCA model, a probability-based measure of homophily offers more nuanced insights compared to binary categorization. In addition, the multidimensional homophily measure reveals that people with similar category submission patterns are more likely to communicate, which means that value homophily can be inferred from people’s relationships with core dimensions of the platform categories. In the food systems challenge, communication events tend to be shaped by status homophily (i.e., location), content value homophily (heterophily in some contexts) based on platform categories, and similar alignment or relationship with the platform category structures. Lastly, in this crowdsourcing context, the multidimensional identification of latent categories also supports the idea that people’s perceived categorization within an online community (Turner & Oakes, 1986) is not always visible. In fact, it might also be inferred from existing and visible categorization that stimulates homophilous communication tie-building behaviors.
The use of LCA in this paper was made possible by a successive methodological evolution in analyzing digital content. Dictionary-based approaches, such as LIWC, provide transparent, keyword-based feature extraction but are rigid in handling varied linguistic contexts. More recent natural language processing (NLP)/AI approaches are more flexible, more sensitive to context, and more accurate, but they also need careful validity and reliability checks (Ziems et al., 2024) because their mechanisms are hidden. The current study explores a different analytical layer: the organizing patterns based on pre-existing categories using LCA. However, there are opportunities to analyze the idea texts and further examine how ideators design their ideas and whether any additional hidden themes emerge from data clustering. This would allow us to better understand how ideators make sense of the platform category affordances. We believe that theme-extraction methods in NLP/AI and LCA could further complement each other in identifying signals/themes of ideation as well as how these signals are organized into structures. We encourage future communication scholars to explore combined inductive theme detection via NLP/AI and latent structural identification through data-informed modeling, while verifying mechanisms and accuracy.
In summary, this research contributes to our understanding of homophily theory from a multidimensional perspective and informs salient homophily theory attributes in an increasingly important and widely growing milieu of online knowledge challenge communities. In co-opetitive knowledge communities, both bonding and informational needs shape the way ideators form connections. In addition, platform categories that signal challenge organizers’ intended innovation directions still matter for communication dynamics. Lastly, the use of longitudinal modeling and data ensures a more robust research design (Y. Xu, 2025) that confirms homophily theory’s proposition that similarity effects drive social interaction in the increasingly popular CCC context. This research also confirms that different types of homophily/heterophily patterns remain consistent throughout the capture contest period as temporary event structures, where the nuanced role of homophily among temporary communicators may differ from that of individuals engaged in continuing social interaction.
Co-Opetition and Homophily
Findings suggest the potential existence of co-opetition in online CCC. One form of status homophily (i.e., co-location) predicts the likelihood of comment events, which aligns with the potential of location being a driver for cooperation. Existing organizational science literature suggests that firms that are geographically co-located, but with some distance, tend to form knowledge collaboration relationships (Chetty & Michailova, 2011). The geographical granularity for this research is at the country level, which may be the reason why ideators tend to collaborate. Another possible explanation is that in online communities, physical location drives collaboration because geographically restricted boundaries no longer bind participants, consistent with the finding of Breitmar et al. (2024) that people still form online networks with others in geographical proximity at the country level. The presence of heterophily when ideators have urgent (short-term) informational needs shows that ideators tend to cooperate with people who have complementary resources/skills, reflecting people’s strategic social purposes in CCC. This also speaks to Brandenburger and Nalebuff’s (1997) concept of complementors, whose resources and skills add value. On the other side of the coin, this reflects that people with the same short-term needs avoid communicating. When ideators are too similar in short-term needs, they might be directly competing for resources, consistent with niche overlap theory, which states that similarity in two entities’ niches/resource needs tends to intensify their competition (Singh & Singh, 1994). These findings also suggest that homophily theory can benefit from considering a somewhat larger role for indirect measures of value homophily, such as those found in platform categories.
Issues for Organizational Communication
By examining a special organizing form of CCC, this research also speaks to organizational communication in two ways. First, it advances understanding of how homophily and heterophily coexist as driving forces for communication dynamics in this co-opetitive and hybrid community. Second, it illustrates how communication infrastructure (i.e., platform categories) shapes organizing processes through both top-down design and bottom-up communication dynamics among community members (Majchrzak et al., 2013; Shumate & Contractor, 2013).
Our findings also demonstrate that platform design, specifically the CCC category design, shapes communication patterns, which aligns closely with organizational communication scholarship on affordance theory (Leonardi, 2011; Treem & Leonardi, 2013). Platform categories are technological and organizational features designed by challenge management to structure ideation and guide community participation (Chen et al., 2022). Platform categories are also a governance mechanism that encodes ideation priorities and directions and shapes communication from the top down by making certain value orientations visible and searchable for ideators to pursue their communicative goals (Treem & Leonardi, 2013). Critically, following Leonardi (2011), the same technology (i.e., the categorization feature in our context) does not produce a fixed affordance. Instead, the same categorization feature affords different perceived possibilities depending on the category’s meaning. Our findings suggest that categories oriented toward long-term needs are associated with homophilous communication tie formation, suggesting that ideators may perceive long-term categories as affording opportunities to identify others who share similar long-term visions. Speaking to the affordance literature (Leonardi, 2011; Treem & Leonardi, 2013), we describe this phenomenon as a visibility affordance enacted through bonding social connections. In contrast, categories oriented toward short-term needs are associated with heterophilous tie formation, suggesting that ideators may perceive these categories as an opportunity to locate complementary resources and expertise, an informational enactment of the category feature. This mechanism echoes the relational nature of affordances: when the perceived meaning of a platform feature differs, the possibilities for social action also differ accordingly (Leonardi, 2011; see Appendix I for the affordance conceptualization flow).
Practical Implications
This research found that although not all platform categories influence communication dynamics in the same ways, they still function as visible social cues affecting social interactions and reflect multidimensional similarity among ideators. Platform management and sponsors should pay close attention to category design, as categories significantly shape the social structure of communities. The management team should also be aware that people are more likely to cluster under certain types of platform categories and be cautious about whether this could create social walls that discourage interactions across categories, especially when interaction across diverse domains is encouraged for the best innovation outcomes. Ideators should also be aware of how status and value homophily function in co-opetitive knowledge creation communities. How they label or design ideas and what they choose to disclose in profiles matter because these choices affect who is attracted to interact with them. This research also has important implications for platform affordances and user interface design for digital knowledge innovation platforms. The homophily findings point to ways of motivating communication that improve community bonding and cohesion, which is especially important for short-term knowledge innovation communities that depend on community-driven interactions for positive innovation outcomes. Platform design experts could experiment with making certain elements (e.g., profile information, certain category affiliations) more visible or salient through search functions to improve community cohesion. The heterophily in resource-seeking needs also suggests that platform design should surface information/resource needs and their complementarity to encourage more effective knowledge exchange.
Limitations
First, the comment content was not examined. Future research efforts should explore how different dimensions of homophily influence different layers of comment networks (e.g., flow, affinity, and representational communication; Shumate & Contractor, 2013) using a multiplex and longitudinal approach. Second, this research is grounded in one CCC, and future research efforts should explore whether the homophily findings generalize to other platforms, such as open-source software communities (e.g., GitHub) and online professional communities (e.g., LinkedIn groups). Third, we acknowledge that platform categories are an imperfect proxy for value homophily, as they capture only one dimension and ignore ideators’ underlying values and beliefs. Future work should use surveys or interviews to surface whether ideators’ hidden values influence communication actions. Fourth, this research assumes that ideators are aware of each other’s values and status, although direct data supporting this assumption are lacking. However, in online communities, homophily is a significant factor that reduces the uncertainty users face due to limited access to personal information about each other (Hu et al., 2023). Ideas and profile information are also the primary sources of information available to guide users’ social decisions. Nevertheless, future research efforts should examine to what extent users actually view and interpret others’ profiles and ideas through methods such as surveys, experiments, and interviews.
Conclusion
The field of communication has developed important theory and has accumulated a rich array of research findings on the critical role of homophily in building communication network ties. Theory advancement and innovative research on homophily’s manifestations and effects in the increasingly popular venue of CCC, however, are still in development. The CCC venue, which features temporary networks, co-opetitive forces, and global communities, offers tremendous potential for fascinating new insights, and this study has provided a model and motivation for future progress.
Supplemental Material
sj-docx-1-crx-10.1177_00936502261458941 – Supplemental material for Co-Opetition in Crowdsourcing Challenges: Comment Network Dynamics and Relational Homophily Among Contestants in Crowdsourcing Challenge Communities
Supplemental material, sj-docx-1-crx-10.1177_00936502261458941 for Co-Opetition in Crowdsourcing Challenges: Comment Network Dynamics and Relational Homophily Among Contestants in Crowdsourcing Challenge Communities by Yiqi Li, Peter Monge and Janet Fulk in Communication Research
Footnotes
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The related project of this article was supported by the National Science Foundation of the United States (Cyber-Human Systems, Grants: #1514505 “CHS: Medium: Collaborative Research: Understanding Online Creative Collaboration over Multidimensional Networks”), the Annenberg School of Communication at the University of Southern California, and the School of Information Studies at Syracuse University.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Supplemental Material
Supplemental material for this article is available online.
