Abstract
This study aims to identify an appropriate conceptual framework to evaluate crowdsourcing platforms from an open innovation perspective employing a combination of qualitative and quantitative methods. The initial indices of the performance evaluation framework in the crowdsourcing platforms are obtained through the Delphi method and interviews with experts. Then, using these factors, a statistical questionnaire is designed and distributed among users of crowdsourcing platforms to confirm or reject the factors. Finally, the aspects of the performance evaluation framework of crowdsourcing platforms are specified from the perspective of open innovation. Using fuzzy hierarchical analysis, these aspects are prioritized in order of importance: Collaboration, Project design, Moderation, Terms and conditions, UI/UX (user interface and user experience), and Key statistics. Concerning the principle of crowdsourcing, which is based on crowd participation and crowd intelligence of users, Collaboration and Project design turned out to be the significant factors in evaluating a crowdsourcing platform.
Keywords
Introduction
One of the fundamental problems facing governments during the COVID-19 pandemic is the lack of employment opportunities. In developing countries, finding a full-time job is challenging even apart from the pandemic. For some professionals worldwide, full-time employment is not financially viable, so they prefer to work as freelancers and problem solvers using only a computer and an internet connection. With this approach, business owners and managers can minimize potential costs and maximize efficiency at the same time. For example, the Ponisha platform in Iran currently has over 270,000 crowdworkers and has successfully conducted over 60,000 crowdsourced projects (Ponisha, 2022). On such platforms, organizations can outsource their innovation-related problems to many competent or even inexperienced individuals. A type of outsourcing in which a requester's projects are open to a variety of people, whether experienced or inexperienced, is known as crowdsourcing.
Howe (2006), a pioneer in this field, first defined crowdsourcing as the act whereby an organization or institution takes a function once performed by employees and outsources it to an undefined network of people, generally in the form of an open call. The economical use of crowdsourcing is becoming increasingly popular among company managers. In addition, recent years have seen rapid growth in the number of organizations paying considerable attention to crowdsourcing (Howe, 2006; Hossain, 2012; Gatautis and Vitkauskaite, 2014; Huberman et al., 2009; Sutherlin, 2013; Bassi et al., 2020; Estellés-Arolas and González-Ladrón-de-Guevara, 2012), and this development has increased the number of active platforms.
These platforms serve as intermediaries between employers/requesters and crowdworkers. Each platform is designed for a specific purpose: for example, InnoCentive for research and development, and Clickworker for producing text content, data classification, web search, and surveys. Platforms such as HeroX facilitate relationships between organizations and individuals by allowing organizations to tap into the pool of crowdworkers to develop innovative solutions. Crowdsourcing can also benefit organizations by bringing innovation and creativity together.
For example, organizations use crowdsourcing to evaluate a novel product before development (Vukovic, 2009). Researchers have reported that crowdsourced innovations have many advantages over solutions developed by research and development personnel in organizations (Henkel and Von Hippel, 2004; Bogers et al., 2010; Baldwin and Von Hippel, 2011). Several studies have found that well-designed incentive mechanisms motivate crowdworkers and keep crowdsourcing platforms performing well (Yang et al., 2008; Mason and Watts, 2009; Chen et al., 2010; Horton and Chilton, 2010); however, several other factors, particularly those related to innovation, also contribute to a platform's performance quality.
Today, companies are seeking to embrace innovations to improve their business and become more competitive against their counterparts (HashemiDehaghi, 2019; Grimaldi et al., 2013; Huizingh, 2011). Organizations often lack the enabling environment for innovative activities (Loku and Loku, 2022), so they must draw on ideas from outside the organization.
In Al-Zagheer and Barakat's (2021) view, performance evaluation is crucial to an organization's programmatic ability and to predicting its strengths and weaknesses, and neglecting it or ignoring its indicators can jeopardize the organization's success. Performance evaluation also helps identify the pros and cons of the organization and distinguish reasonable from unreasonable cases (Zhang and Tan, 2012). A successful performance evaluation system includes a set of performance factors that deliver useful information and assist managers in controlling, planning, managing, and performing various activities (Appelbaum, 2011). Performance evaluation is a systematic and organized tool for examining the processes that direct the organization toward its predetermined targets (Zhang and Tan, 2012). The performance evaluation system is used to take advantage of effective and efficient methods for achieving these goals. In this system, a set of indices evaluates the effectiveness and efficiency of processes in an organization (Kazan, 2012).
Given the remarkable advantages of performance evaluation, the need for experimental and comparative evaluation techniques is clear. Performance evaluation tasks can be performed by directly measuring existing systems or by modeling systems under design. The importance of the performance evaluation system is related to the methods employed to analyze and optimize the existing systems (Pooley and Abdullatif, 2010). Cricelli et al. (2022) found that crowdsourcing is an effective and powerful practice due to its adaptability and that it can enhance the implementation of open innovation strategies.
Therefore, the present study aims to develop a conceptual framework for assessing crowdsourcing platforms from an open innovation perspective. While there are many approaches to evaluating the performance of crowdsourcing platforms, we have used an exploratory methodology to better understand one aspect that has been neglected so far. A combination of qualitative and quantitative methods is used in this study. We developed the initial framework indices by interviewing experts using the Delphi method. Users of crowdsourcing platforms were then given a statistical questionnaire to confirm or reject the identified factors. This study aims to achieve the following primary objectives:
Identifying different indices for a conceptual framework to evaluate the performance of crowdsourcing platforms from the perspective of open innovation.
Developing an appropriate conceptual framework to evaluate the performance of crowdsourcing platforms from the perspective of open innovation.
Here is how the rest of the paper is organized. A description of the methodology is provided in section three of the paper. Section four illustrates the research results and the data used in detail. Lastly, the fifth section summarizes this study's subject and potential areas of future research.
Theoretical framework
Open innovation means that valuable ideas can come from inside or outside the company and can go to market from inside or outside the company as well (Chesbrough, 2003). It is a paradigm that assumes that firms should use both external and internal ideas and paths to market in order to make technological progress (Chesbrough et al., 2006). There are many approaches to innovation, such as closed and open approaches (Trott and Hartmann, 2009). Vanhaverbeke et al. (2008) argue that open innovation is impossible without absorptive capacity as an internal capability of the companies seeking innovation.
Traditional open innovation models source knowledge from external stakeholders and funnel it through a single flow into the organization. Crowdsourcing-based idea platforms have therefore emerged to maximize the efficiency and flexibility of open innovation. When an organization seeks scientific advancement, clarity and knowledge sharing among scientists outside the organization through the crowdsourcing process offer desirable benefits (Lee, 2016; Boudreau and Lakhani, 2013).
Aitamurto et al. (2011) defined crowdsourcing as an open innovation mechanism based on information and communication technology. They also noted that the difference between crowdsourcing and other concepts, such as co-creation and user innovation, is still being discussed. Through crowdsourcing, organizations can access external capabilities to create innovation. To have a problem solved in this way, firms must provide details of the issue and ask for help from qualified individuals (Majchrzak and Malhotra, 2013; Christensen and Karlsson, 2019). Johnson (2010) stated that the community-based crowdsourcing approach is appropriate when innovations build on past developments. According to some studies, market-based competitive approaches, in which individuals compete for the best solution, perform better than community-based approaches (Johnson, 2010; Aitamurto et al., 2011).
There has been widespread interest in crowdsourcing, especially in the past few years (Guo et al., 2020), so research in this area is of great importance, and many studies have examined it in the broader literature (Whitla, 2009; Behrend et al., 2011; Saxton et al., 2013; Zhao and Zhu, 2014). Geiger et al. investigated crowdsourcing from a socio-technical perspective that provides a deeper understanding of the components and relationships in crowdsourcing systems (Geiger et al., 2012). The authors divided these systems into four categories based on the performance of different crowdsourcing systems and according to their type of participation (homogeneous/heterogeneous) and the value obtained from crowd participation. Then, the fundamental requirements were presented to design them by analyzing the characteristics existing in each type of these systems. This study provided a good starting point for discussion and further research regarding the design of other crowdsourcing systems.
One of the main challenges of the platforms is providing appropriate incentives to the crowdworkers so that their performance and participation in crowdsourcing tasks can be increased (Hirth et al., 2013; Gadiraju et al., 2015; Ye and Kankanhalli, 2017).
Lee et al. (2014) proposed a novel framework to improve the quality of work in the field of crowdsourcing. This framework is also advantageous to enhance the quality of the results in an environment where problems are tackled using crowdsourcing. This framework includes task management, worker management, task distribution, and quality analysis. The characteristics of workers are analyzed in this framework, and the appropriate tasks are assigned to the specialized people in line with improving the work quality. It also proposes collective voting instead of the more commonly used majority representation method and properly evaluates work results. The results obtained in this study show that this framework facilitates the effective allocation of work.
Given the similarity between how online communities and crowdsourcing platforms operate, à Campo et al. (2019) examined studies of online communities and collected the design instructions (explorations) of these communities from the perspective of twenty-one users’ interfaces. The authors then applied these explorations to twenty crowdsourcing platforms to assess how well the platforms matched them. According to this study, a framework for designing crowdsourcing platforms from the perspective of an online community makes it possible to tackle existing problems related to motivation, collaboration, creativity, and reliability. These explorations can also be used to design, evaluate, and compare crowdsourcing platforms with their competitors. Finally, the study specified the challenges that crowdsourcing platforms currently face in achieving the positive aspects of online communities.
Paik et al. (2020) stated that in spite of extensive scientific and theoretical research on crowdsourcing, innovation managers rarely employ it as a tool for innovation. Their study's conclusions imply that the crowdsourcing model is strikingly costly for organizations in the step of problem identification. Also, it was revealed that this model could be affordable in problem-solving while the traditional purchase process in the downstream step is costly.
Methodology
The authors are unaware of any research examining crowdsourcing platforms from an open innovation perspective for performance evaluation. For this reason, the study employs an exploratory mixed method (qualitative, then quantitative) to determine the appropriate indices for the performance evaluation framework from an open innovation standpoint. To accomplish this goal, the authors followed a two-step process. The initial evaluation framework was developed based on unstructured interviews with experts using the Delphi method (Hsu and Sandford, 2007).
Using the extracted indices, a quantitative questionnaire was created and distributed among users of crowdsourcing platforms. The Cochran formula is used to determine sample size in this part of the study. As the population size (more than 100 thousand people) is unknown, the equation below is used to specify the number of statistical samples:
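For a very large or unknown population, Cochran's formula is commonly written as n = z²·p·(1 − p) / e². The sketch below assumes that standard form. The function name and the inputs shown (a 95% confidence level, p = 0.5, and a 5% margin of error) are illustrative assumptions, not values reported in this study.

```python
import math


def cochran_sample_size(z: float, p: float, e: float) -> int:
    """Cochran's sample size for a large or unknown population.

    z: z-score for the chosen confidence level (assumed input)
    p: expected proportion with the attribute (0.5 is the most conservative)
    e: tolerated margin of error, expressed as a proportion
    """
    if not 0 < p < 1 or e <= 0:
        raise ValueError("p must be in (0, 1) and e must be positive")
    n = (z ** 2) * p * (1 - p) / (e ** 2)
    # Round up: a fractional respondent cannot be sampled.
    return math.ceil(n)


# Illustrative call only; these inputs are assumptions, not the study's.
n_required = cochran_sample_size(z=1.96, p=0.5, e=0.05)
```

Choosing p = 0.5 maximizes p·(1 − p), so it gives the largest (most cautious) sample size when the true proportion is unknown.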
It is noteworthy that the statistical population of this study includes professors and experts in open innovation, managers of crowdsourcing platforms, and all people who have completed at least one task or project as a crowd worker or employer in any of the crowdsourcing platforms during the past year.
The Delphi technique
Initially, we conducted unstructured interviews with the Delphi panel members, who were asked the following questions:
Given the role of crowdsourcing in open innovation, what should be considered in designing a crowdsourcing platform?
In order to evaluate the performance of a crowdsourcing platform from an open innovation point of view, what indicators should be considered?
Twenty-one indices were extracted from the experts in the first round of interviews. In the second round, these factors were used to prepare a questionnaire consisting of 21 indices rated on a five-point Likert scale, which was sent back to the experts for validation or addition of new indices they deemed relevant. All the initial indices were confirmed by the experts, so there was no need to gather opinions from Delphi panel members in a third round. Indices that received an average score of more than three were regarded as confirmed parts of the performance evaluation framework for crowdsourcing platforms. Delphi panel participants were selected among crowdsourcing platform managers, innovation managers, academic researchers in the field of open innovation, and managers of organizations that used crowdsourcing platforms for open innovation. Table 1 provides information about the members of the Delphi panel.
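The confirmation rule described above (an index is kept when its average Likert score exceeds three) can be sketched as follows. The function name, index names, and ratings are hypothetical and serve only to illustrate the rule; they are not the panel's data.

```python
from statistics import mean


def confirm_indices(ratings: dict, threshold: float = 3.0) -> dict:
    """Map each index name to True when its mean Likert score exceeds the threshold."""
    for name, scores in ratings.items():
        if not scores or any(s < 1 or s > 5 for s in scores):
            raise ValueError(f"{name}: scores must be non-empty and on a 1-5 scale")
    return {name: mean(scores) > threshold for name, scores in ratings.items()}


# Hypothetical panel ratings for two made-up indices (not the study's data).
example_ratings = {
    "hypothetical_index_a": [4, 5, 4, 3, 5],
    "hypothetical_index_b": [3, 2, 3, 4, 3],
}
confirmed = confirm_indices(example_ratings)
```

Note that the comparison is strict: an index whose mean is exactly three is not confirmed, which matches the wording "more than three".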
Profiles of Delphi panel members.
The information regarding the indices and their average scores are presented in Table 2. It can be seen from the table that all twenty-one indices have been confirmed and categorized into six categories. Experts who participated in the Delphi method contributed to developing these aspects. The final aspects are Terms and conditions, Collaboration, Moderation, Project design, Key statistics, and User interface and User experience (UI/UX). Considering the crowdsourcing platforms as open innovation platforms, these factors are considered the main aspects of performance evaluation.
Selection of indices using the Delphi method.
Since experts confirmed all the indices, a quantitative questionnaire was prepared based on those indices and distributed among the users of crowdsourcing platforms. This step aimed to have crowdsourcing platform users verify all the indices obtained from the experts. We reached these users by contacting the followers of social media pages belonging to some well-known crowdsourcing platforms.
Confirmatory factor analysis of performance evaluation framework for crowdsourcing platforms
Since the performance evaluation framework of crowdsourcing platforms in this study has six components (classes) that can be considered as aspects of this conceptual framework, the first-order confirmatory factor analysis is examined in line with testing the measurement model and validating the components of the performance evaluation framework for crowdsourcing platforms. According to Figure 1, a factor analysis is conducted for the performance evaluation framework of crowdsourcing platforms.

Factor analysis output for the structure of the performance evaluation framework of crowdsourcing platforms (factor load coefficients).
Using confirmatory factor analysis, it is possible to identify hidden concepts and variables in the questionnaire. Since hidden variables cannot be easily measured, an operational definition must be made using explicit variables. When evaluating crowdsourcing platforms’ performance, for instance, an abstract concept is the central issue, which is also called a hidden variable in factor analysis. It is advantageous to use the operational definition to clarify the ambiguity of these kinds of variables and make them more understandable.
To give an operational definition for a hidden concept or variable, it is represented by observable or explicit variables that can be measured on a scale. In this study, the variables include Terms and conditions, Collaboration, Moderation, UI/UX (user interface and user experience), Key statistics, and Project design, which are analyzed according to the questionnaire. The results show that the factor load for all questionnaire items is more than 0.4. Hence, the questionnaire items are appropriate.
Based on the results obtained through the LISREL software (Table 3), the ratio of chi-square to degrees of freedom (χ2/df) equals 1.896. This ratio provides information on the relative efficiency of competing models in accounting for the data. Researchers have recommended using ratios as low as 2 or as high as 5 to indicate a reasonable fit (Marsh and Hocevar, 1985). Furthermore, the root mean square error of approximation (RMSEA) ranges from 0 to 1, with smaller values indicating a better model fit. A value of .08 or less indicates acceptable model fit (Xia and Yang, 2019); in the proposed model, RMSEA equals 0.065.
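The two cutoffs cited above can be checked mechanically. The following is a minimal sketch, assuming the lenient χ2/df cutoff of 5 and the RMSEA cutoff of .08 as defaults; the function name is hypothetical, and only the two input values (1.896 and 0.065) come from the reported results.

```python
def absolute_fit_ok(chi2_df: float, rmsea: float,
                    chi2_df_max: float = 5.0, rmsea_max: float = 0.08) -> dict:
    """Compare chi-square/df and RMSEA against the chosen cutoffs."""
    if chi2_df < 0 or not 0.0 <= rmsea <= 1.0:
        raise ValueError("chi2_df must be non-negative and rmsea must lie in [0, 1]")
    return {
        "chi2_df": chi2_df <= chi2_df_max,
        "rmsea": rmsea <= rmsea_max,
    }


# Values reported for the first-order model.
reported = absolute_fit_ok(chi2_df=1.896, rmsea=0.065)

# The same values checked against the stricter chi-square/df cutoff of 2.
reported_strict = absolute_fit_ok(chi2_df=1.896, rmsea=0.065, chi2_df_max=2.0)
```

Keeping the cutoffs as parameters makes it explicit that they are conventions chosen by the analyst rather than fixed rules.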
Fit indices.
Marsh et al. (2020) interpret goodness of fit index (GFI), Adjusted goodness of fit index (AGFI), Comparative Fit Index (CFI) and Normed Fit Index (NFI) scores in the .80 to .89 range as representing reasonable fit; scores of .90 or higher are considered evidence of good fit. In this study, all the indices are calculated to be over 0.95. As a result, the data of this study are well suited to the factor structure. This means that the model can be approved.
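The interpretation bands described above for GFI, AGFI, CFI, and NFI can be written as a small classifier. This is a sketch under the stated bands only; the function name is hypothetical, and the example values are placeholders, since the text reports only that all indices exceed 0.95.

```python
def classify_incremental_fit(value: float) -> str:
    """Classify a GFI/AGFI/CFI/NFI score using the bands cited in the text."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("fit index must lie in [0, 1]")
    if value >= 0.90:
        return "good"
    if value >= 0.80:
        return "reasonable"
    return "below the reasonable range"


# Placeholder scores for illustration (not the study's exact values).
example_scores = {"GFI": 0.96, "AGFI": 0.85, "NFI": 0.70}
labels = {name: classify_incremental_fit(score) for name, score in example_scores.items()}
```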
As stated earlier, the performance evaluation framework of crowdsourcing platforms has six components (classes) that can be considered indices of this framework. Thus, the second-order confirmatory factor analysis is conducted to test the measurement model and validate the components of the performance evaluation framework of crowdsourcing platforms.
The results presented in Table 4 demonstrate that the factor loads of the indices (questions) related to each component are appropriate to predict the performance evaluation factors of crowdsourcing platforms. In addition, each component's factor load can be considered an index of the performance evaluation of crowdsourcing platforms and is suitable for predicting this variable. Moreover, Table 5 provides the fit indices related to the measurement model shown in Figure 2; analogous to the previous calculations, all indices indicate that the model can be approved.

Second-order factor analysis output for the structure of the performance evaluation framework in the crowdsourcing platforms (factor load values).
General fit indices for the tested model in this research (Ghasemi, 2013).
The results of the Bartlett test for all aspects of the framework.
Here we analyze and explain the data collected in this study and the results we obtained. The main aspects of the performance evaluation framework for crowdsourcing platforms are as follows: Terms and conditions, Collaboration, Moderation, UI/UX (user interface and user experience), Key statistics, and Project design. The Kaiser-Meyer-Olkin-Bartlett's (KMO-Bartlett) Test was used to determine whether the data could be used for Confirmatory Factor Analysis. The results are presented in Table 5.
Marshall et al. (2007) consider a KMO value above .50 sufficient for factor analysis. Hence, the KMO values are acceptable for all aspects of the framework. Furthermore, in Bartlett's test, the significance values reported as 0.000 (that is, p < .001) indicate an appropriate factor model.
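To make the KMO statistic concrete, the following sketch computes the overall KMO measure from a correlation matrix using its usual definition: the sum of squared correlations divided by the sum of squared correlations plus squared partial correlations. It is a pure-Python illustration with hypothetical function names and a made-up correlation matrix, not the procedure or data used in this study, and it covers KMO only, not Bartlett's test.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity on the right.
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def kmo(corr):
    """Overall Kaiser-Meyer-Olkin measure for a correlation matrix (list of lists)."""
    n = len(corr)
    inv = invert(corr)
    r2 = a2 = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                # Partial correlation between items i and j given all others.
                partial = -inv[i][j] / (inv[i][i] * inv[j][j]) ** 0.5
                r2 += corr[i][j] ** 2
                a2 += partial ** 2
    if r2 + a2 == 0.0:
        raise ValueError("KMO is undefined when all off-diagonal correlations are zero")
    return r2 / (r2 + a2)


# Made-up 3x3 correlation matrix with every pairwise correlation equal to 0.5.
example_corr = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
example_kmo = kmo(example_corr)
```

Under the cutoff cited above, a value above .50 would be treated as sufficient for factor analysis.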
Ranking and determining the importance of the framework aspects using the FAHP method
A questionnaire covering all the approved indices was prepared for the Fuzzy Analytic Hierarchy Process (FAHP) and sent to the experts who participated in the qualitative round. The fuzzy analytic hierarchy calculations were performed using FAHP software. Among the aspects used to evaluate crowdsourcing platforms, experts consider “collaboration” to be the most significant criterion (importance factor 0.173). It is followed, in order, by “project design” (0.171), “moderation” (0.166), “terms and conditions” (0.165), “user interface and user experience” (0.164), and “key statistics” (0.161). As Table 6 shows, collaboration has the highest importance, while key statistics has the lowest rank and the least importance compared to the others.
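The text does not state which FAHP variant the software implements. As one common possibility, the sketch below uses the geometric-mean method with triangular fuzzy numbers and centroid defuzzification; this choice, the function name, and the 3×3 comparison matrix are assumptions made for illustration and are not the experts' judgments.

```python
from math import prod


def fahp_weights(matrix):
    """Crisp criterion weights from a matrix of triangular fuzzy comparisons.

    matrix[i][j] is a tuple (l, m, u) comparing criterion i with criterion j.
    """
    n = len(matrix)
    # Fuzzy geometric mean of each row, taken component-wise.
    geo = [tuple(prod(cell[k] for cell in row) ** (1.0 / n) for k in range(3))
           for row in matrix]
    total = tuple(sum(g[k] for g in geo) for k in range(3))
    # Fuzzy weight: divide by the total in reverse order (l/U, m/M, u/L).
    fuzzy = [(g[0] / total[2], g[1] / total[1], g[2] / total[0]) for g in geo]
    # Defuzzify by the centroid, then normalize so the weights sum to one.
    crisp = [sum(f) / 3.0 for f in fuzzy]
    norm = sum(crisp)
    return [c / norm for c in crisp]


# Hypothetical pairwise judgments for three criteria A, B, C (illustration only).
one = (1.0, 1.0, 1.0)
example_matrix = [
    [one, (1.0, 2.0, 3.0), (2.0, 3.0, 4.0)],            # A compared with A, B, C
    [(1 / 3, 1 / 2, 1.0), one, (1.0, 2.0, 3.0)],        # B compared with A, B, C
    [(1 / 4, 1 / 3, 1 / 2), (1 / 3, 1 / 2, 1.0), one],  # C compared with A, B, C
]
example_weights = fahp_weights(example_matrix)
```

The normalized weights play the same role as the importance factors reported above: they sum to one and can be sorted to rank the criteria.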
The values determined for ranking and determining the importance of the framework aspects.
A conceptual framework for evaluating the aspects of crowdsourcing platforms from an open innovation perspective is illustrated in Figure 3 based on the results of this study.

The conceptual framework of aspects of crowdsourcing platforms evaluation from the perspective of open innovation.
From the perspective of open innovation, this paper sought to identify an appropriate conceptual framework for evaluating crowdsourcing platforms. The framework includes Terms and conditions, Collaboration, Moderation, Project design, Key statistics, and UI/UX (user interface and user experience). The analysis involved two steps: a qualitative step and a quantitative step. In the first step, experts were interviewed about the study's subject matter using the Delphi method. The quantitative step involved distributing a statistical questionnaire to crowdsourcing platform users. The answers were then validated using the LISREL software.
There are numerous guidelines available for the “acceptable” model fit. Brown (2015) suggests RMSEA close to 0.06 or less and CFI close to 0.95 or greater. It should be considered that these are not rigid, and he comments that his use of “close to” is meaningful. According to Kline (2015), “RMSEA ≤ .05 indicates close approximate fit, values between .05 and .08 suggest reasonable error of approximation, and RMSEA ≥ .10 suggests poor fit”. CFI “greater than roughly .90 may indicate a reasonably good fit of the researcher's model”, and SRMR values “less than .10 are generally considered favorable”. Brown (2015) and Kline (2015) recommend reporting several of the same fit indices; however, their criteria for a good fit are different, and Brown (2015) is a little more conservative.
In this study, first-order and second-order confirmatory factor analyses were conducted to validate the components of the framework for evaluating the performance of crowdsourcing platforms. According to the results of the first-order confirmatory factor analysis, the data of this study fit well with the scale's factor structure. In addition, the results of the second-order confirmatory factor analysis indicate that the factor loads of the indices (questions) associated with each component are appropriate for predicting the performance evaluation factors of crowdsourcing platforms. The results of the Fuzzy Analytic Hierarchical Process (FAHP) revealed that “Collaboration” is the most important aspect, while “Key Statistics” is the least important.
In light of the results obtained in this paper, the following major recommendations for future study can be made:
Research on the different roles of crowdsourcing in open innovation for organizations.
Research on approaches to increase interaction and collaboration between users of a crowdsourcing platform.
Providing a conceptual framework for project design on crowdsourcing platforms from an open innovation perspective.
Using a larger sample of respondents to obtain more reliable results.
In future studies, the authors plan to widen the population of interest and add elements of artificial intelligence and programming to enhance the acceptability of the model fit.
