Abstract
True experimental design and quasi-experimental design are considered to be rigorous research designs appropriate for assessing the impact of pedagogical interventions. This study explores the extent and application of experimental design in impact research on entrepreneurship education (EE) based on a systematic literature review. The findings reveal a substantial lack of methodologically rigorous studies on EE impact, which has severe implications for the accumulated knowledge on the subject. Furthermore, the article summarizes the findings from the body of experimental impact studies with a strong research design and concludes by indicating fruitful avenues for future research.
Introduction
Entrepreneurship is recognized as an important driver of economic growth (Audretsch et al., 2006). There has, consequently, been an increasing propensity in government policy to promote entrepreneurship education (EE) as a means of stimulating economic growth (Martinez et al., 2010; O’Connor, 2013). The introduction and development of EE courses demand substantial investments, in terms of both time and money, from faculty, educational institutions, sponsors, policymakers and other stakeholders. It is accordingly important to understand the impact that EE can have on students: for example, whether they develop an entrepreneurial mindset through such courses or whether EE actually contributes to increased start-up rates after graduation.
There has been substantial growth in impact research on EE as stakeholders seek to understand its consequences for students and society as a whole (Bae et al., 2014; Blenker et al., 2014; Martin et al., 2013; Nabi et al., 2016a). However, empirical research has produced rather mixed results on the impact of EE using various measures of entrepreneurial outcome (Bae et al., 2014; Lorz et al., 2013; Martin et al., 2013). While some scholars have found a positive impact on, for instance, entrepreneurial attitudes and intentions (Fayolle et al., 2006; Kolvereid and Moen, 1997; Wilson et al., 2007) and entrepreneurial behaviour (Elert et al., 2015; Kolvereid and Moen, 1997; Lange et al., 2011), others have obtained mixed results (Oosterbeek et al., 2010; Souitaris et al., 2007). Some have even found indications of a negative impact on entrepreneurial orientation (Mentoor and Friedrich, 2007) and entrepreneurial intention (Oosterbeek et al., 2010; Von Graevenitz et al., 2010). Therefore, how EE affects students, and via which mechanisms, remains unexplained.
The growing body of impact studies on EE has, therefore, received considerable criticism. A major concern has been the lack of empirical studies that are methodologically robust (Bae et al., 2014; Fayolle and Linan, 2014; Martin et al., 2013), a weakness that has also been highlighted in research on management education in general (Köhler et al., 2017; Rynes and Brown, 2011). Köhler et al. (2017) argue that, to gain legitimacy for a field and publish impactful research, impact studies need to be designed in a way that provides strong evidence for such effects. Rigorous experimental design is, according to Slavin (2002: 18), ‘the design of choice for studies that seek to make causal conclusions, and particularly evaluation of education innovations’ and ought to be the preferred choice when addressing educational impact (Johnson and Christensen, 2012). In this study, we define rigorous or strong experimental design as true experiments or quasi-experiments that make use of a longitudinal design (as opposed to a cross-sectional design) and have control groups for comparison (Cook and Campbell, 1979; Shadish et al., 2002). Accordingly, these would be suitable research designs for studying the impact of EE as a pedagogical intervention. The degree to which strong experimental design is actually applied in EE impact research is, however, not known, although EE impact research has been criticized for reporting impact without the necessary level of methodological rigour. This can have severe implications for the accumulated knowledge about impact in EE research, on which educators and policymakers have to base their actions. Thus, it is critical to establish a strong experimental design for EE impact research when providing stakeholders with empirical evidence about the relationship between EE and entrepreneurial learning outcomes.
Based on the above, we believe that the use of experimental research design in EE impact research requires further investigation. The twin objectives of this systematic literature review (SLR) are, therefore, (1) to explore the diffusion of experimental impact studies in EE research and the extent to which those studies have a strong experimental research design (i.e. apply a true experimental design (TED) or a quasi-experimental design (QED)) and (2) to synthesize the findings on entrepreneurial outcome measures in studies with a strong experimental research design.
To address these objectives, we use an SLR approach to explore published research reported in 65 journals listed by the Association of Business Schools (ABS). By applying established categories of experimental research design, we are able to classify quantitative EE impact studies according to the robustness of their research design and to provide an overview of the status quo in EE impact research. While our review highlights examples in which experimental research design has been applied successfully, it also sheds light on the scarcity of strong experimental design in EE impact studies and the threat this poses for the reliability of previous empirical findings. Furthermore, we provide a synthesis of empirical studies with strong experimental research design in order to establish the cumulative knowledge in EE that can be traced back to methodologically robust quantitative studies. Our study contributes to EE scholarship from both methodological and theoretical perspectives by furthering our understanding of the use of experimental research design in EE impact studies. We propose key avenues for further research that hold the potential to strengthen and build legitimacy for the field of EE research, and the findings from the study should be of value to scholars applying experimental design in their empirical work, as well as practitioners and policymakers who are seeking to better understand the impact of EE as a pedagogical intervention.
The content of the rest of the article is as follows. The next section addresses the use of EE outcome measures and outlines findings in earlier reviews and meta-analyses of EE. Thereafter, the methodological approach is presented along with a recap of seminal contributions on experimental research design to draw up experiment classifications. Next, the descriptive and qualitative findings of the SLR are reported, and then the article concludes with a discussion of the findings, our conclusions and the implications of our work for future research on EE.
Research context: Measuring the impact of EE
Impact studies on EE aim to establish whether a pedagogical intervention has caused any change in specific outcome variables. The outcomes measured need to be carefully aligned with the intended learning outcomes for the EE course (Kamovich and Foss, 2017) and may address changes in students’ hearts, minds and behaviour (Souitaris et al., 2007). The importance of evaluating the outcomes of EE has been widely acknowledged (Mets et al., 2017), and different frameworks have been suggested for categorizing entrepreneurship learning outcomes. Fisher et al. (2008) developed a tripartite framework drawing on seminal contributions in the education literature (Bloom et al., 1956; Kraiger et al., 1993), which categorizes entrepreneurial learning outcomes as cognitive, skill-based or affective. Cognitive outcomes refer to knowledge, comprehension and critical thinking about entrepreneurship; skill-based outcomes are linked to the skills necessary to start a business; and affective outcomes comprise entrepreneurial attitudes, volition and behavioural preferences.
An alternative framework for teaching and learning entrepreneurship was suggested by Kyrö (2008). The framework consists of three constructs: cognition, affection and conation. Compared with the framework of Fisher et al. (2008), skill-based learning outcomes do not comprise a separate category, but rather are included in cognitive learning outcomes. Furthermore, affective learning outcomes are divided into affection and conation. While affection refers to emotions and perceptions, conation takes the mind one step closer to behaviour, as it describes how one acts on thoughts and feelings via impulse or directed effort (Ajzen, 1989).
Four EE outcomes drawn from the above sources are shown in Table 1, along with behavioural outcomes as a fifth category. After all, developing cognitive, skill-based, affective and conative entrepreneurial outcomes should ultimately lead to entrepreneurial behaviour and socio-economic outcomes in real life; for example, through employability, business creation, intrapreneurship or social entrepreneurship (Kozlinska, 2016; Mets et al., 2017). Hence, it is essential to establish an understanding of the impact of EE in all five outcome categories of EE impact research.
Table 1. Categories of outcome measures in EE impact studies.
EE: entrepreneurship education.
There have been several previous attempts to summarize findings on EE impact through SLRs and meta-analyses. In 2007, Pittaway and Cope reviewed 184 papers published between 1970 and 2004 in an SLR and concluded that EE appeared to have an impact on student propensity and intentionality towards entrepreneurship. They emphasized that there was a lack of research on whether EE actually led to entrepreneurial behaviour and, more specifically, on the link between different forms of pedagogy and student entrepreneurial outcomes. Their findings are supported by Mwasalwiba (2010), who in his literature review also highlights the substantial focus on attitudes and intentions and the failure to link these to actions. He further calls for broader outcome definitions.
A positive impact on skills and knowledge, attitudes, intentions and nascent entrepreneurship is also acknowledged in SLRs by Rideout and Gray (2013) and Lorz et al. (2013). Both reviews draw attention to the methodological weaknesses and deficiencies found in most EE impact studies. This tendency is further confirmed in two meta-analyses on EE by Martin et al. (2013) and Bae et al. (2014). Using human capital as a theoretical lens in a meta-analysis of 42 studies, Martin et al. (2013) found a significant positive association between EE/training and entrepreneurial human capital as well as between EE/training and entrepreneurship outcomes. Closer examination of the findings did, however, reveal that studies without a strong experimental design tended to overestimate the positive association. When studies with pre- and post-measurement and control groups were isolated, the effect size was substantially reduced.
Bae et al. (2014) report similar findings on how methodological rigour influences empirical findings on EE. Their meta-analysis of 73 studies found a small significant correlation between EE and entrepreneurial intention. However, after controlling for the intentions that students had before EE, the association was no longer significant. Hence, when controlling for self-selection bias by introducing pre-intervention measurement, the actual impact of EE becomes unclear. Bae et al. (2014) further established the role of cultural values as moderators in the relationship between EE and entrepreneurial intention.
A recent SLR by Nabi et al. (2016a) of 159 impact studies of EE in higher education also recognizes that there are substantial methodological weaknesses in those studies. However, their main critique concerns the outcome measures and the lack of detail on the pedagogical interventions. The authors argue that there is too much focus on short-term subjective impact measures as opposed to long-term behavioural measures such as venture creation and performance. They also lobby for novel impact indicators related to, for example, affective measures such as emotion and mindset. Furthermore, in line with Martin et al. (2013), they call for more research to explain the contradictory findings of impact studies, for instance, by including person-, context- and model-specific moderators.
Thus, despite the increasing body of impact studies on EE, it appears that we still have scant knowledge on this matter. While there are several insightful indications about impact and outcomes in existing empirical studies, there are also rather ambiguous findings that require further investigation. Hence, in the remainder of this article, we first set out to explore the application of experimental research design in EE impact research. Subsequently, empirical studies with a strong experimental design are examined to establish what can actually be considered reliable knowledge about the impact of EE as a pedagogical intervention.
An SLR approach
This study is based on an SLR approach, which aims to make the literature search and review process transparent and replicable. According to Pittaway and Cope (2007) and Nabi et al. (2016a), SLRs have become a well-established methodological approach in the fields of both entrepreneurship and EE and are especially valuable when attempting to sum up evidence over long periods. Figure 1 documents the different stages of our SLR process, for which the starting point was our research objectives: first, to identify experimental impact studies on EE and, subsequently, to review extant knowledge on EE impact produced by rigorous studies with a strong experimental design.

Figure 1. Stages in the SLR process.
Our SLR is based on a journal-led search in selected peer-reviewed journals. While admittedly this approach may have certain limitations in terms of potentially excluding relevant articles outside the selected journals, it was necessary to ensure the feasibility of the SLR by generating hundreds rather than thousands of hits. It was also essential to target high-quality and impactful EE research; hence we followed Blenker et al. (2014) and Wang and Chugh (2014) in applying the ABS Academic Journal Quality Guide to identify journals, as the Guide provides an indication of the quality and impact of the scientific contribution of articles included in the listed journals. As EE is a research field at the interface between entrepreneurship and business and management education, the literature search included all journals in the ABS subject areas ‘entrepreneurship’ and ‘management development and education’. The journal searches were conducted using the databases Science Direct, Elsevier Scopus, ABI Inform and Business Source Complete for articles published up to and including December 2017. Journals that were not accessible through the databases were searched manually. Titles, abstracts and keywords were searched using the primary Boolean search term (‘entrepreneurship education’ OR ‘enterprise education’), and the secondary search term (‘impact’ OR ‘effect’ OR ‘outcome’ OR ‘learning’) was used for a full-text search to identify quantitative impact studies on EE. The first database search, after the removal of duplicates, resulted in 613 articles.
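The two-step search logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record fields (`title`, `abstract`, `keywords`, `full_text`) and the function names are hypothetical, and simple substring matching stands in for whatever query syntax each database actually used.

```python
# Search terms as reported in the text.
PRIMARY_TERMS = ("entrepreneurship education", "enterprise education")
SECONDARY_TERMS = ("impact", "effect", "outcome", "learning")


def matches_search(record: dict) -> bool:
    """Return True if a record satisfies both search steps.

    `record` is a hypothetical dict with the keys 'title', 'abstract',
    'keywords' and 'full_text' (all strings).
    """
    header = " ".join(
        record.get(field, "") for field in ("title", "abstract", "keywords")
    ).lower()
    body = record.get("full_text", "").lower()

    # Step 1: primary term in title, abstract or keywords.
    primary_hit = any(term in header for term in PRIMARY_TERMS)
    # Step 2: secondary term anywhere in the full text.
    secondary_hit = any(term in body for term in SECONDARY_TERMS)
    return primary_hit and secondary_hit


def remove_duplicates(records: list) -> list:
    """Keep the first occurrence of each title (case-insensitive)."""
    seen, unique = set(), []
    for record in records:
        key = record.get("title", "").strip().lower()
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique
```

Note that substring matching is deliberately loose (for example, 'effect' also matches 'effective'), which is in keeping with a search step that is followed by manual screening.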
Subsequently, we reviewed titles, abstracts and the methodology sections of the articles and excluded those that did not meet the inclusion criteria for quantitative impact studies described in Figure 1. This process left 132 articles. While SLRs have advantages over traditional ad hoc narrative reviews in that they provide a set of clear steps to systematically generate evidence (Tranfield et al., 2003), a potential drawback is the risk of excluding relevant articles. Hence, as an additional measure to validate the search results and ensure that relevant publications had not been overlooked, we conducted independent literature searches. We also applied snowballing to identify other relevant ABS journals by searching the reference lists of the identified articles. Through this process we expanded our search to additionally include the European Economic Review and the Journal of Economic Behavior & Organization, which are included in the subject area ‘economics, econometrics and statistics’ in the ABS list.
After validation of the SLR search results, the final sample consisted of 145 articles that met the inclusion criteria for quantitative impact studies. These were coded according to the experimental research design category as described in the following section, and a subgroup of 17 articles that could be classified as rigorous experimental studies with a strong research design were accordingly subjected to a full-text analysis.
Analysis
Drawing on Blenker et al. (2014) and Wang and Chugh (2014), among others, we constructed a thematic reading guide for reviewing and coding the articles (see Appendix 1). The 145 quantitative studies were coded according to general information (author(s), year, title and journal) and the type of experimental design. If a study was classified as being either a true experiment or a quasi-experiment, it was further coded in accordance with the remainder of the reading guide by focusing on the outcome variables utilized and recording contextual variables stated in the studies, such as the characteristics of pedagogical intervention, sample characteristics and time frame.
The SLR applies three categories of experimental design following Cook and Campbell (1979) and Shadish et al. (2002): TED, QED and pre-experimental design (PED). Within these three categories, there are various types of experimental design. The ones that were used for coding impact studies in this SLR are depicted in Figure 2.

Figure 2. Types of experimental research design as described by Cook and Campbell (1979).
Experimental designs differ with respect to three characteristics: (1) whether the experiment makes use of control groups; (2) whether randomization into treatment and control groups is applied; and (3) whether the research design is longitudinal as opposed to cross-sectional. The upper half of Figure 2 illustrates the classic true experiment – the randomized pre-test–post-test control group design, in which all three of the above characteristics are present. Here, participants are randomly assigned to either a control group, C, or a treatment group, T, and thereafter are given a pre-test OT1 or OC1 to ensure that the groups do not differ from the outset. Then group T undergoes treatment X (e.g. in the form of an EE course), while group C does not take part in the course. Afterwards, a post-test OT2 or OC2 is completed, and any difference between group T and C is assumed to be due to the treatment X. The lower half of Figure 2 shows the randomized pre-test–post-test control group design in schematic form, together with other experimental designs relevant to EE impact research. 1
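One common way to summarize the comparison in a pre-test–post-test control group design is to contrast the average gain in the treatment group with the average gain in the control group, that is, (OT2 − OT1) − (OC2 − OC1). The Python sketch below illustrates this calculation only; the function name and the scores are hypothetical, and it is not claimed to be the analysis used in any of the reviewed studies.

```python
from statistics import mean


def gain_score_difference(t_pre, t_post, c_pre, c_post):
    """Return (mean gain in treatment group) - (mean gain in control group).

    Arguments are sequences of scores corresponding to the observations
    OT1, OT2, OC1 and OC2 in the text.
    """
    treatment_gain = mean(t_post) - mean(t_pre)
    control_gain = mean(c_post) - mean(c_pre)
    return treatment_gain - control_gain


# Hypothetical scores, for illustration only (not data from any study).
t_pre, t_post = [3.0, 4.0, 5.0], [4.0, 5.0, 6.0]  # treatment group T
c_pre, c_post = [3.0, 4.0, 5.0], [3.5, 4.5, 5.5]  # control group C

estimate = gain_score_difference(t_pre, t_post, c_pre, c_post)
print(estimate)  # treatment gain of 1.0 minus control gain of 0.5
```

Subtracting the control group's gain is what removes changes that would have occurred without the course, which is why a one-group design cannot support the same inference.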
The reason for making use of control groups, randomization of participants and longitudinal design is to control for confounding variables that threaten internal validity. As the key objective of an impact study of education is to find evidence of a causal link between the education intervention and the observed outcomes, it is advisable to apply strong experimental research that controls for confounding variables and, thereby, to exclude alternative explanations and rival hypotheses for observed effects (Johnson and Christensen, 2012; Mertens, 2010). According to Johnson and Christensen (2012), TED and QED could consequently be considered strong experimental designs, while a PED is characterized as a weak experimental design. The presence of randomization, control groups and longitudinal design in TED controls for confounding variables such as history (when environmental events during an experiment influence the dependent variable), maturation (biological or psychological changes during an experiment due to the passage of time), testing (participants becoming test-wise post-test due to earlier pre-tests), mortality (participant drop-out during an experiment), statistical regression (when diverging scores of extreme groups regress towards the mean when testing is repeated) and selection (systematic differences between treatment and control groups due to self-selection) (Campbell and Stanley, 1963; Cook and Campbell, 1979). The randomized pre-test–post-test control group design and the randomized Solomon four-group design 2 shown in Figure 2 are accordingly considered to be strong experimental designs as they apply randomization, control groups and longitudinal design (Shadish et al., 2002), and findings based on a TED would consequently provide strong evidence of causal links between EE courses and entrepreneurial learning outcomes.
In many educational real-life settings, random assignment is not a realistic option. Following Cook and Campbell (1979), the quasi-experiment would then be the recommended design. The non-equivalent pre-test–post-test control group design is the most relevant QED in EE impact studies, as it enables comparison of EE and non-EE students. In this case, students attending an EE course would constitute the treatment group. The control group would comprise students not attending an EE course, but otherwise would be as similar to the student treatment group as possible. Without randomization, the internal validity of the design faces challenges in terms of selection, maturation, history and statistical regression (Shadish et al., 2002). Nonetheless, with the presence of both control groups and a longitudinal design, it can still be considered a strong experimental design with which it is reasonable to claim causality between an EE course and observed outcomes.
PEDs are considered to be weaker experimental research designs due to their limited control of potentially confounding variables (Johnson and Christensen, 2012; Shadish et al., 2002). The one-group post-test only design is considered to be the weakest among these alternatives. With this research design, students attending an EE course would take a post-test after finishing it. The design poses many threats to internal validity and has been referred to by Campbell and Stanley (1963: 5) as having ‘…such a total absence of control as to be of almost no scientific value’. The design is subject to threats of history, maturation and mortality as it applies neither a control group nor a pre-test. The non-equivalent post-test only design introduces comparison groups, and the one-group pre-test–post-test design makes use of measurements before and after EE interventions. However, both research designs still face basic threats to internal validity. Thus, relying on a PED when attempting to address the impact of EE courses can be problematic in terms of claiming causality. Therefore, a TED or a QED should be the preferred alternative in quantitative impact studies on EE, and the following section presents the degree to which these rigorous experimental designs are being applied in EE impact studies.
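The coding logic implied by the three design characteristics can be expressed as a small decision rule. The following Python sketch is an illustration under stated assumptions, not the coding instrument used in this review: the class and function names are hypothetical, and any design lacking either a control group or a pre-test is treated as a PED, in line with the definition of strong design given earlier. Multi-group variants such as the Solomon four-group design are simplified to a single set of flags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StudyDesign:
    """Hypothetical coding record for one impact study."""
    has_control_group: bool
    randomized: bool
    longitudinal: bool  # pre- and post-measurement


def classify(design: StudyDesign) -> str:
    """Map the three characteristics to TED, QED or PED.

    Assumption of this sketch: a strong design needs both a control
    group and a longitudinal (pre-test/post-test) structure.
    """
    if design.has_control_group and design.longitudinal:
        return "TED" if design.randomized else "QED"
    return "PED"


def is_strong(design: StudyDesign) -> bool:
    """True for the designs the article treats as strong (TED or QED)."""
    return classify(design) in {"TED", "QED"}


# The designs named in the text, expressed as flags.
randomized_pre_post = StudyDesign(True, True, True)         # TED
non_equivalent_pre_post = StudyDesign(True, False, True)    # QED
one_group_post_only = StudyDesign(False, False, False)      # PED
non_equivalent_post_only = StudyDesign(True, False, False)  # PED
one_group_pre_post = StudyDesign(False, False, True)        # PED
```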
Findings
Descriptive analysis
As noted above, the systematic search in ABS-listed journals resulted in 145 identified quantitative impact studies on EE. Figure 3 shows the journals in which these were published. The figure identifies two major outlets for quantitative impact studies on EE: Education and Training, which has published 38 articles, and Industry and Higher Education, with 20 published quantitative impact studies on EE.

Figure 3. Overview of ABS-listed journals that have published EE impact studies (n = 145).
The coding of the 145 quantitative impact studies revealed that only 17 articles were experimental studies with a strong design; that is, a TED or a QED. The remaining 128 quantitative impact studies were classified as having a weak PED (see Figure 4). Of the 145 studies, 28% had the weakest of the PEDs, the one-group post-test only design, while 28% had the non-equivalent post-test only design and 32% had a one-group pre-test–post-test design. Among the 17 experimental studies, four had a TED, while 13 were quasi-experimental studies with a non-equivalent pre-test–post-test control group design. Hence, the analysis showed that only 11.7% of the quantitative impact studies met the standards for a strong experimental design.

Figure 4. Types of experimental design in EE impact studies (n = 145).
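The headline shares follow directly from the reported counts, as the short Python check below shows. Only the counts stated in the text are used (4 TED, 13 QED and 128 PED studies); the variable names are illustrative.

```python
# Counts of studies per design category, as reported in the text.
counts = {"TED": 4, "QED": 13, "PED": 128}

total = sum(counts.values())            # all quantitative impact studies
strong = counts["TED"] + counts["QED"]  # studies with a strong design

strong_share = round(100 * strong / total, 1)
weak_share = round(100 * counts["PED"] / total, 1)

print(f"strong design: {strong}/{total} = {strong_share}%")
print(f"weak design:   {counts['PED']}/{total} = {weak_share}%")
```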
Figure 5 illustrates the increased number of quantitative impact studies in recent decades and depicts the rather limited application of experimental design in comparison. Especially in the last 10 years, there has been a considerable yearly increase in the number of impact studies. There has, however, not been corresponding growth in impact studies with a strong experimental design.

Figure 5. Twenty-one years of quantitative impact studies (1997–2017; n = 145).
The descriptive findings therefore point towards considerable challenges for impact research on EE. On a positive note, the number of EE impact studies is increasing and there are high-quality journals in which this discussion is taking place. Nevertheless, the rigour of the research design is a substantial issue when building accumulated knowledge in the field. When only 11.7% of quantitative impact studies apply a strong experimental design, this has severe implications in terms of making inferences about EE impact.
Qualitative analysis
Entrepreneurial outcome measures
The findings from the analysis of the 17 identified studies applying a strong experimental design illustrate how conative outcomes in terms of entrepreneurial self-efficacy/feasibility and entrepreneurial intention are the most frequently applied outcome measures (Table 2). Of the 17 studies, 12 use either one or both of these as outcome variables. Cognitive outcomes such as knowledge and traits (six studies), as well as skill-based outcomes (seven studies), have also received attention. In terms of affective outcomes, seven studies apply attitude to entrepreneurship as an outcome variable, while subjective norm and passion/inspiration have received less attention. In fact, only two studies (Souitaris et al., 2007; Varamäki et al., 2015) make use of subjective norm to measure EE effect, while only Nabi et al. (2016b) and Gielnik et al. (2017) have recently addressed impact on entrepreneurial inspiration and entrepreneurial passion, respectively. With regard to actual entrepreneurial behaviour, the impact on nascency has been examined in only three studies (Gielnik et al., 2015; Karlsson and Moberg, 2013; Rauch and Hulsink, 2015), while actual venture creation remains almost unaddressed, with two honourable exceptions (Gielnik et al., 2015, 2017).
Table 2. Overview of the 17 rigorous experimental impact studies on entrepreneurship education.
n.s.: non-significant; TED: true experimental design; QED: quasi-experimental design.
Although the majority of the 17 studies report a positive impact on the various outcome measures, the findings are still mixed – see Table 3 for a summary. 3 In terms of entrepreneurial knowledge, Volery et al. (2013), Gielnik et al. (2015) and Nabi et al. (2016b) find a positive impact of EE, while Huber et al. (2014) find no significant relationship. The findings are also mixed with regard to entrepreneurial traits. While Huber et al. (2014) report a positive impact on traits such as need for achievement, social orientation and proactivity, studies by Mentoor and Friedrich (2007), Oosterbeek et al. (2010) and Volery et al. (2013) all report non-significant impacts on traits such as the need for achievement, the need for autonomy, the need for power, endurance, risk propensity and innovation propensity.
Table 3. Findings on outcome measures.
The impact on skills is, however, mainly positive, and EE is reported to have a positive impact on opportunity identification and exploitation (DeTienne and Chandler, 2004; Thursby et al., 2009; Volery et al., 2013); proactiveness and risk-taking (Huber et al., 2014; Sanchez, 2011, 2013); and analysing, motivating and creativity (Huber et al., 2014). However, Oosterbeek et al. (2010) report non-significant results on entrepreneurial skills.
The studies on entrepreneurial attitude are, with two exceptions (Souitaris et al., 2007; Varamäki et al., 2015), overwhelmingly positive regarding the impact of EE. Studies on other affective outcome measures, however, remain scarce. Nevertheless, two recent studies report a positive impact on entrepreneurial passion (Gielnik et al., 2017) and entrepreneurial inspiration (Nabi et al., 2016b), while Souitaris et al. (2007) establish a positive impact on subjective norm, in contrast to the non-significant and negative findings of Varamäki et al. (2015).
With regard to conative outcomes, nine studies report a positive impact on feasibility/perceived behavioural control/entrepreneurial self-efficacy. Souitaris et al. (2007) and Varamäki et al. (2015) are the only studies that present non-significant findings. The most equivocal results derive from the studies that address entrepreneurial intention: five report a positive impact, two find no significant difference (Nabi et al., 2016b; Volery et al., 2013), one finds both non-significant and negative impacts depending on the pedagogical approach (Varamäki et al., 2015) and two even find a purely negative impact (Huber et al., 2014; Oosterbeek et al., 2010). By far the largest sample size is to be found in the study by Huber et al. (2014). Therefore, when summarizing the samples and results, we find the following distribution of EE impact on entrepreneurial intention: positive impact (n = 1099), non-significant impact (n = 446) and negative impact (n = 1897). Accordingly, although it is the most frequently applied outcome measure in impact studies, evidence of the actual impact of EE on entrepreneurial intention remains highly inconclusive.
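To make the distribution easier to read, the pooled sample sizes can be converted into shares of the total. The Python sketch below uses only the three figures reported above; it assumes the three categories do not overlap, which the text does not state explicitly, so the shares should be read as indicative.

```python
# Pooled sample sizes by direction of reported impact on entrepreneurial
# intention, as given in the text.
sample_by_direction = {
    "positive": 1099,
    "non-significant": 446,
    "negative": 1897,
}

pooled_total = sum(sample_by_direction.values())

# Share of the pooled sample in each category, in percent.
shares = {
    direction: round(100 * n / pooled_total, 1)
    for direction, n in sample_by_direction.items()
}

for direction, share in shares.items():
    print(f"{direction}: {share}% of {pooled_total} participants")
```

Under that assumption, more than half of the pooled participants come from studies reporting a negative impact, which reflects the weight of the single large sample noted above.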
Studies on actual entrepreneurial behaviour signal positive findings about entrepreneurial nascency (Gielnik et al., 2015; Rauch and Hulsink, 2015) and new venture creation (Gielnik et al., 2015, 2017). There is, however, a sample size issue here as the studies on nascency had a total sample size of only 224 and the studies on venture creation had a total sample size of 287.
Therefore, although the majority of studies report positive impacts, there are also several with non-significant findings and some even with a negative impact. Consequently, it becomes difficult to conclude anything on the basis of such equivocal findings, and this is a matter that is further complicated by the variety of contextual factors in the studies.
Contextual factor: Pedagogical interventions
The characteristics of the pedagogical interventions are diverse and indicate many gaps for further examination. The duration of the courses ranges from 2 weeks to 2 years. While the majority of studies examine EE interventions lasting between 3 months and 10 months, only one investigates the impact of a short intervention of 2–4 weeks (Huber et al., 2014). Moreover, only two studies look at EE lasting for more than an academic year – Thursby et al. (2009) study a 2-year programme, and Varamäki et al. (2015) follow a cohort through its first 2 years of a Bachelor’s degree course.
Furthermore, when separating the studies into the traditional categories of learning about, for and through entrepreneurship (Jamieson, 1984), it becomes evident that none of the pedagogical interventions can be categorized as learning about entrepreneurship. The 17 impact studies are evenly distributed between learning for entrepreneurship (nine studies) and through it (nine studies), 4 and no particular differences in terms of positive or negative impact can be observed between these in the SLR sample.
Contextual factor: Sample characteristics
Different sample characteristics could have a major impact on how a course is experienced by the participants and the effect of the EE intervention. The educational level of the EE participants is, for instance, a topic for further exploration. One example is EE impact on primary school students, as only one study addresses this (Huber et al., 2014). Of the remaining 16 experimental studies, 4 are about secondary school students, 3 concern postgraduate students and 9 examine the impact on undergraduate students. Whether or not a course is compulsory could also have an impact on its effect, and both categories are covered equally in the experimental impact studies.
Bae et al. (2014) show in their meta-analysis that cultural values serve as a moderator of the relationship between EE and entrepreneurial intentions. Hence, the cultural context is another important characteristic that can impact the effect of an EE course. Based on the 17 experimental impact studies, it appears that EE impact studies have predominantly been a Western European exercise (11 studies). There are, however, also three from Africa, two from the United States and one from Australia.
Contextual factor: Time frame
In the majority of the 17 experimental impact studies, the post-measurement time is immediately after the completion of the pedagogical intervention. Recent contributions by Volery et al. (2013), Rauch and Hulsink (2015), Gielnik et al. (2015) and Gielnik et al. (2017) have, however, also collected data several months after the intervention. Gielnik et al. (2017), for instance, combine measurement right after an EE course with measurements 12 and 28 months after course completion, thereby enabling longitudinal follow-up of development after an EE programme.
Discussion
The findings of this study show that the number of experimental impact studies has increased considerably in recent decades. Nevertheless, 88.3% of the studies can be classified as having a weak experimental design that does not allow causal claims to be made. This is a major concern in a field that is rapidly expanding and is in search of legitimacy among stakeholders such as policymakers, sponsors and educational institutions (Fayolle et al., 2016). In fact, our SLR reveals that only 17 impact studies up to and including 2017 apply a strong experimental design through either TED or QED. Hence, policymakers and educators have few rigorous studies to draw on when making decisions regarding investments and the future development of EE. Several insightful qualitative studies on outcomes, as well as PED studies, do provide a valuable understanding of relationships between variables. However, in a fast-moving field in which action and intervention are developing quickly, it is critical that the theory and research needed to justify and explain EE develop simultaneously. Our findings indicate that this has not been the case for strong experimental impact studies on EE. While this is also a challenge for both general and management education (Köhler et al., 2017), the issue is even more pronounced for EE as a young and emerging field. EE scholars are researching new and innovative education initiatives (often with small samples), while established education fields provide more stable conditions in which to undertake research.
In fact, the qualitative analysis indicates that there is still scant knowledge about the effects of EE as a pedagogical intervention. In general, the majority of the strong experimental impact studies point towards a positive relationship between participation in EE and cognitive, skill-based, affective, conative and behavioural outcomes. However, the SLR also identifies studies that report non-significant and even negative relationships between EE and the impact indicators. The few studies and the small sample sizes of the single studies further complicate the equivocal findings. For example, only 4 of the 17 studies have a treatment group of more than 200 students. This complicates the application of, for example, meta-analysis, which is a well-recognized approach to summarize effect by combining empirical studies on interventions. For instance, two recent meta-analyses by Bae et al. (2014) and Martin et al. (2013) had to include studies with a weak experimental design in order to draw conclusions. Hence, it is hard to draw categorical conclusions based on the sample of 17 articles, since their findings appear to point in several different directions, even when the same outcome variables are studied.
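To illustrate what combining studies involves, the following is a minimal sketch of a fixed-effect, inverse-variance pooling of effect sizes, one common meta-analytic approach. The effect sizes and variances are hypothetical values chosen purely for illustration; they are not taken from the 17 reviewed studies, and the function name `pool_fixed_effect` is an assumption made for this example.

```python
import math


def pool_fixed_effect(effects, variances):
    """Return the inverse-variance pooled effect and its standard error."""
    # Each study is weighted by the inverse of its sampling variance,
    # so more precise (typically larger) studies count for more.
    weights = [1.0 / v for v in variances]
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return pooled, se


# Hypothetical standardized effects and variances for three small studies
# (illustrative only; not data from the review).
effects = [0.40, -0.10, 0.25]
variances = [0.04, 0.02, 0.08]

pooled, se = pool_fixed_effect(effects, variances)

# Approximate 95% confidence interval using the normal quantile 1.96.
ci = (pooled - 1.96 * se, pooled + 1.96 * se)
print(f"pooled effect = {pooled:.3f}, SE = {se:.3f}, "
      f"95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

In this hypothetical example, the effects differ in sign and the pooled confidence interval spans zero, which shows how a small number of small studies can leave a summary estimate inconclusive.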
Furthermore, with mixed findings, low numbers of experimental studies and small sample sizes, we question whether findings are valid for other populations in different contexts. EE cannot be treated as a black box, and it is necessary to acknowledge the nuances of EE offered across the world, at different education levels and with quite diverse pedagogics. We agree with Rideout and Gray (2013: 348), who call for a larger pool of methodologically adequate EE studies in order to answer questions such as ‘What type of EE, delivered by whom, within which type of university, is the most effective for this type of student, with this kind of goals, and under these sets of circumstances?’. It is essential to acknowledge the diversity of EE interventions. A compulsory course about entrepreneurship theories offered to first-year Bachelor’s students in general business would obviously have a different impact than an elective course in an entrepreneurship Master’s in which students start their own companies. There is great variance in EE pedagogics and their impacts will most likely be quite different. By not treating EE as a black box, it will be possible to draw nearer to a more complex understanding of the actual impact of EE interventions.
Thus, the summary of experimental research findings in Table 2 defines important research gaps and points towards future research opportunities. For example, two Spanish impact studies by Sanchez (2011, 2013) concern compulsory courses for secondary and undergraduate students who learn for entrepreneurship throughout an 8-month pedagogical intervention. His findings show significant increases in intention, self-efficacy, proactiveness and risk-taking by EE students. However, when applying Table 2 to identify gaps, there is still much to be explored. Little is known about how Spanish students or those in neighbouring countries will develop during a self-selected elective course or through EE courses for primary education. Furthermore, we do not know anything about the potential long-term impact, affective outcome measures or whether EE actually results in entrepreneurial behaviour.
Numerous research gaps could be identified by applying Table 2 in this way. However, we especially want to draw attention to some particularly under-researched issues. For instance, there are no experimental impact studies on courses about entrepreneurship. All the identified studies concern learning for or through entrepreneurship. It is often claimed that learning about entrepreneurship does not impact on students, as opposed to the two other approaches (Honig, 2004; Neck and Greene, 2011). However, due to the absence of experimental impact studies on this pedagogical approach, there is no robustly researched knowledge to support this view. Moreover, only one study (Huber et al., 2014), from the Netherlands, reports on EE in primary education, which also remains a major research gap. There is also an over-representation of impact studies from Western European countries. Bae et al. (2014) found that the impact of EE is moderated by cultural values, and methodologically rigorous studies from, for example, Eastern Europe or Asia could provide interesting insights into how EE impacts students in other cultural settings.
Accordingly, our findings could serve as an overview of where rigorous EE impact studies are still needed. Furthermore, in line with Nabi et al. (2016a), we find that the majority of impact indicators are short-term, subjective impact measures. As the proof of the pudding is said to be in the eating, there is still major potential for examining long-term impacts such as venture creation and performance. Furthermore, the objective of EE is not necessarily only to increase start-up rates but also to develop the entrepreneurial mindset of students, which can then be used in, for example, existing companies and to enhance students’ employability. Thus, novel outcome measures such as intrapreneurial intentions, personal development, social entrepreneurship, employability and career decision-making could be fruitful indicators to advance our understanding of EE impact.
The mixed results from impact research also provide an interesting opportunity for further research in order to offer explanations for the equivocal findings. The scenario design by Nabi et al. (2016b) is, for example, an important contribution that sheds light on how the same EE intervention can have different impacts on different students. The suggestion by Von Graevenitz et al. (2010) of a sorting effect, where students become more confident about whether entrepreneurship is a suitable career path for them, also has potential for further exploration. Thus, a decrease in entrepreneurial intentions after EE is not necessarily negative if it is due to enhanced career maturity among participants.
Conclusion
The two objectives of this article are (1) to review the use of experimental research design in EE impact research and (2) to offer insights into the findings of impact studies that apply a strong experimental design through either TED or QED. In doing so, we hope to shed light on the value of applying a strong experimental design in EE impact research and to lay the foundation for a future research agenda. When it comes to the first research objective, the main finding from the study is that there is a substantial lack of strong experimental design in EE impact studies. Of 145 quantitative impact studies identified in ABS-listed ranked journals, only 17 have a TED or QED, accounting for 11.7% of the studies. Hence, 88.3% of quantitative impact studies can be characterized as having a weak experimental design. This lack of rigour has severe consequences for the possibility of making inferences and for the generalizability of existing research findings.
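The shares reported above follow directly from the two counts. As a minimal sketch, the arithmetic can be reproduced as follows (the variable names are chosen for this example only):

```python
# Counts reported in the review.
total_studies = 145   # quantitative impact studies identified
strong_design = 17    # studies applying a TED or QED

# Percentage shares, rounded to one decimal place.
strong_share = round(100 * strong_design / total_studies, 1)
weak_share = round(100 * (total_studies - strong_design) / total_studies, 1)

print(f"strong design: {strong_share}%  weak design: {weak_share}%")
```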
Furthermore, with regard to the second research objective, it is evident in the synthesizing of findings from the 17 rigorous impact studies that we still know rather little about the causal link between EE and entrepreneurial outcome measures. While the majority of impact studies indicate a positive impact, there are also studies with non-significant and even negative impacts on EE outcomes. Hence, based on the findings from the SLR, we call for more true and quasi-experimental studies that can provide robust findings on EE impact. There is a need for more research on the outcome measures identified in the SLR, but there is also potential for exploring novel impact indicators. Intrapreneurship, social entrepreneurship and employability are, for example, outcome measures that remain unexplored in rigorous experimental studies.
An expanding body of rigorous impact studies would also contribute to the development of a more fine-grained understanding of EE and the influence of contextual factors. Context matters in education and EE cannot be treated as a black box. More strong experimental impact studies on the variety of pedagogics, course durations and student samples would accordingly enhance understanding of the nuances of EE impact. As a result, one could get closer to answering important questions such as which pedagogics to apply for a certain group of students if the ambition, for example, is to increase nascent entrepreneurship.
Therefore, although there have been many important research contributions towards an understanding of the complex phenomenon of EE in recent decades, EE impact research has not yet delivered the required empirical findings to EE stakeholders. Teachers and educational institutions need robust evidence on which to base decisions as to when they introduce, execute and develop EE courses. Correspondingly, governments across the world are including EE in educational policies and investing heavily in the implementation of EE. They cannot be expected to continue to do so if EE research does not provide robust evidence of its impact. Hence, the EE research community should take a critical look at the research being conducted and strive to provide EE stakeholders with empirical evidence acquired through methodologically rigorous studies.
Like any methodology, the SLR has its limitations. We acknowledge that the decision to do a journal-led search will deliver different results to those of an open database search, as would the selection of other search strings. However, by searching impactful journals within EE research, our review highlights a fundamental problem in EE impact research: knowledge about the impact of EE as a pedagogical intervention is scarce. The quality of the research on EE impact is currently lagging behind the thriving development of EE at educational institutions worldwide. As EE continues to spread, it becomes increasingly important for research to justify and explain what is taking place during and after EE courses. For the future, the challenge for EE scholars is to do this with methodologically rigorous studies that can help EE to gain legitimacy both as an educational element and as a research field.
Acknowledgements
The authors express their gratitude to Professor Øivind Strand, Professor Åsa Lindholm Dahlstrand and Professor Henry Colette for insightful comments on earlier versions of this article. The authors also thank the anonymous referees for their valuable comments.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Appendix 1
| Reading guide | ||||||||
|---|---|---|---|---|---|---|---|---|
| 1. General information | ||||||||
| 1a. Author(s) | ||||||||
| 1b. Year of publication | ||||||||
| 1c. Article title | ||||||||
| 1d. Journal | ||||||||
| 2. Theoretical positioning | ||||||||
| 2a. Theoretical framework | ||||||||
| 3. Impact | ||||||||
| 3a. Impact measures | Cognitive | Skill-based | Affective | Conative | Behavioural | |||
| 3b. Measurement items | ||||||||
| 3c. Reported impact | ||||||||
| 3d. CV | ||||||||
| 3e. Reported CV effect | ||||||||
| 4. Methodology | ||||||||
| 4a. Research design | TED | QED | PED | |||||
| 4b. Data collection method | Questionnaire | Other: | ||||||
| | □ | |||||||
| 4c. Follow-up length | ||||||||
| 5. Sample characteristics | ||||||||
| 5a. Sample size | ||||||||
| 5b. Education field | Business | Science | Humanities | Social | Health | Education | ||
| | □ | □ | □ | □ | □ | □ | ||
| 5c. Education level | Primary | Secondary | Tertiary | |||||
| | □ | □ | □ | |||||
| 5d. Country | ||||||||
| 6. Intervention characteristics | ||||||||
| 6a. Course description | About | For | Through | JA-YE | Other | |||
| | □ | □ | □ | □ | ||||
| 6b. Compulsory | Yes | No | ||||||
| | □ | □ | ||||||
| 6c. Duration | ||||||||
| 6d. Total work hours | ||||||||
| 6e. Team work | Yes | No | ||||||
| | □ | □ | ||||||
| 7. Analysis | ||||||||
| 7a. Data analysis method | ||||||||
| 7b. Key findings | ||||||||
CV: control variable; TED: true experimental design; QED: quasi-experimental design; PED: pre-experimental design.
