Abstract
When previous research is cited incorrectly, misinformation can infiltrate scientific discourse and undermine scholarly knowledge. One of the more damaging citation issues involves incorrectly citing article content (called quotation errors); therefore, investigating quotation accuracy is an important research endeavor. One field where quotation accuracy is needed is in the learning sciences given its impact on pedagogy. An integral article in pedagogical discussions surrounding how to teach at the college level is the meta-analysis on active learning by Freeman et al. The Freeman et al. meta-analysis compared active learning to traditional lecture in terms of its effects on student learning and has been important in national initiatives on STEM (science, technology, engineering, and mathematics) reform. Given its influence coupled with the impact quotation errors could have in scientific discourse, we used citation context analysis to analyze whether assertions in the citing text that related to the efficacy of lecture and active learning were supported by what was explicitly stated in the cited meta-analysis. Assertions were classified as supported, unsupported, or irrelevant for the purposes of this study. The most prevalent supported category related to active learning being more effective than lecture; the most prevalent unsupported category related to the effectiveness of specific activities/approaches other than the general approach of active learning. Overall, the percentage of supported assertions was 47.67%, and the percentage of unsupported assertions was 26.01%. Furthermore, the percentage of articles containing at least one unsupported assertion was 34.77%. Proactive measures are needed to reduce the incidence of quotation errors to ensure robust scientific integrity.
Scholarly knowledge advances through the publication of scholarly research. Published articles are said to be derivative in nature in that they are connected to prior research investigations (Ziman, 1969). Authors support their own investigations by strategically citing those researchers who influenced their work (American Psychological Association, 2020). However, if current or prior research contains inaccuracies that are not corrected, scholarly knowledge can be undermined. For example, errors can be propagated unintentionally (Awrey et al., 2011; Parse, 1996) which may cause subsequent readers to become misinformed (Mogull, 2017; Schulmeister, 1998).
Citation Accuracy
One way to support scientific communication and integrity is through accurately citing the scholarly work of others (Gasparyan et al., 2015). If, however, cited articles are the subject of errors made through the citation process, this misinformation can infiltrate scientific discourse and the dissemination of incorrect or incomplete information can occur. Errors can spread quickly in the digital era, and those that persist can come to be accepted as true and have serious ramifications that could affect the integrity of scholarship and investigation (Wilson, 2015). One could also imagine how errors that lead to misinformation could find their way into policy decisions, financial investments, funding opportunities, and other large-scale initiatives. Unfortunately, citation accuracy has been and continues to be an issue in journal publishing (e.g., Mogull, 2017; Onwuegbuzie et al., 2011; Santini, 2018).
Several types of errors can affect citation accuracy. 1 These errors may include issues with (a) bibliographic information (e.g., misspelling author names or incorrectly listing publication year), (b) reported data (e.g., misquoting effect size or sample size), (c) secondary versus primary sources (e.g., citing a review paper rather than the primary source), and (d) claims, paraphrasing, interpretations, or directly quoted statements (e.g., making unsubstantiated claims, changing important information through paraphrasing, misinterpreting the original results, or omitting/adding/changing words in a direct quote). Citation errors that relate to claims, paraphrasing, interpretations, and directly quoted statements are termed quotation errors and have a negative impact on quotation accuracy.
Quotation Accuracy
Quotation errors are serious because they can disseminate misinformation (Davids et al., 2010). The rate of these troubling errors tends to vary in the research literature. 2 For example, the quotation error rate was 6.7% in nursing (Schulmeister, 1998); 11.3% (Drake et al., 2013), 15.0% (Teixeira et al., 2013), and 18.3% (Todd et al., 2007) in ecology; 12.4% in manual therapy (Gosling et al., 2004); 14.5% (Mogull, 2017) to 15.0% in general medicine (de Lacey et al., 1985); 15.6% in veterinary medicine (Hinchcliff et al., 1993); 17% in otolaryngology/head and neck surgery (Fenton et al., 2000); 19.0% in anatomy (Lukić et al., 2004); 19.2% in physical geography (Haussmann et al., 2013); 20.0% in foot and ankle surgery (Luo et al., 2013); 24.0% in drug therapy (Neihouse & Priske, 1989); 24.2% in marine biology (Todd et al., 2010); and 38.0% in pediatric orthopedics (Davids et al., 2010). Additionally, in a systematic literature review and meta-analysis of the quotation accuracy in medical journal articles, Jergas and Baethge (2015) found the error rate to be approximately 25.0%. The variation in error rates across these disciplines could be due to the types of quotation errors examined (e.g., issues with claims, paraphrasing, interpretations, or directly quoted statements), the discipline of the articles being analyzed, or other methodological considerations (e.g., random sample vs. specific articles/journals targeted). Despite these varying error rates, the common thread is that quotation errors can and do occur when scholarly works are cited. Even in the scientific enterprise, error is a human phenomenon; therefore, pinpointing what may contribute to errors is key to improving the transmission of sound scientific knowledge (Brown et al., 2018). 
We should not take quotation accuracy for granted, as quotation errors can lead readers to have doubt about what they are reading (Jergas & Baethge, 2015), in addition to propagating misinformation that may be cited in future research endeavors (Bareket et al., 2020). Misrepresenting the content of a cited article has been said to be “one of the potentially most damaging violations of good academic referencing” (Harzing, 2002, p. 133).
An often-cited article in the literature on quotation accuracy is by Harzing (1995). Harzing argued that a commonly held belief among the management community of high failure rates of employees living and working outside their native countries (i.e., expatriates) was actually a myth that stemmed from three largely misquoted articles. Her deep reading of the management literature during her doctoral studies led her to question pervasive statements about expatriate failure rates being very high as she did not observe corresponding empirical data in the literature. She traced the myth of expatriate failure rates back to the main source articles by examining references in recent articles on the topic and working her way back through the literature (i.e., examining citation chains). In a subsequent article, Harzing (2002) lamented that her 1995 article had done little to correct the way that expatriate failure rates continued to be discussed, indicating that it may be difficult to change inaccurate disciplinary beliefs once they have been established. In the spirit of Harzing (1995), Sanz-Martín et al. (2016) explored the chain of citations in a body of research literature to question the “fact” that jellyfish blooms were increasing globally. When citing and cited papers were analyzed, the data did not support globally increasing blooms but did support some regional trends. Sanz-Martín et al. found that almost half the assertions made were not supported by the cited articles, and they identified one review paper that had an obvious influence over subsequent papers. Relatedly, Korpela (2010) found that even when primary articles have been publicly retracted, affirmative citations of these works could continue for years.
Quotation Accuracy in the Learning Sciences: An Examination of Active Learning
Given the seriousness of misrepresenting the content of a cited article, one important field in need of an examination of quotation accuracy is the learning sciences due to its impact on pedagogy and student academic success. How to teach (i.e., instructional approaches) is an important question of study and discussion in this interdisciplinary field. In particular, an instructional approach that has gained traction in the learning sciences is active learning. Active learning has been defined as “instructional activities involving students in doing things and thinking about what they are doing” (Bonwell & Eison, 1991, p. iii). This approach is often considered an alternative to the traditional lecture method, which may be a factor in why active learning initially gained popularity. Traditional lecture is a prominent method used in college courses (Stains et al., 2018) but has been criticized as being passive in nature with less student learning than more actively engaging methods (Deslauriers et al., 2019). Some opponents of lecture go so far as to claim that the lecture method is an unethical practice (Wieman, 2014) and to ask why lecture has refused to go away (Pickles, 2016).
When discussing active learning and/or active learning versus traditional lecture in higher education, one highly influential review article often cited is the Freeman et al. (2014) meta-analysis. The number of citations an article has is often used as a measure of how large a role it plays in scientific discourse (Lopresti, 2010). Based on citation data drawn from Web of Science: Core Collection (WOSCC) and Scopus, the Freeman et al. meta-analysis has been cited at least 2,500 times (as of August 8, 2020). In addition to being highly cited, Freeman et al. noted their paper was “the largest and most comprehensive meta-analysis of the undergraduate STEM education literature to date” (p. 8412). Their meta-analysis included 225 studies comparing student performance under active learning versus traditional lecture conditions in science, technology, engineering, and mathematics (STEM) undergraduate courses. Freeman et al. defined active learning as engaging “students in the process of learning through activities and/or discussion in class, as opposed to passively listening to an expert. It emphasizes higher order thinking and often involves group work” (pp. 8413–8414). Their results showed that student performance was, on average, almost half a standard deviation higher with active learning compared with lecture and that failure rates were about 1.5 times higher in traditional lecture courses compared with those with active learning.
When Freeman et al. (2014) is cited, it is often used as support for the adoption of active learning and the reduction or elimination of lecture in STEM courses. The meta-analysis has been cited in national and university-level initiatives and grants aimed to transform college courses to include active learning (e.g., Association of American Universities, 2017; Chasteen, 2016). The citing of Freeman et al. in these initiatives and grants is not surprising, as systematic reviews are considered a useful tool to inform policymakers and practitioners about what does and does not work especially when it comes to making decisions for public policy and practice (Gough et al., 2013). Beyond these initiatives, those involved in the institutional transformations taking place at colleges and universities across the country often cite Freeman et al. as support for their adoption of active learning.
Professional organizations and media outlets have also promoted the Freeman et al. (2014) meta-analysis and have questioned the value of lecture. For example, Bajak’s (2014) article in Science was titled, “Lectures Aren’t Just Boring, They’re Ineffective, Too, Study Finds.” In this article, physicist Eric Mazur of Harvard University discussed the importance of the Freeman et al. findings stating that “the impression I get is that it’s almost unethical to be lecturing if you have this data” and there is “an abundance of proof that lecturing is outmoded, outdated, and inefficient” (para. 4). A news release from the National Science Foundation (2014) was titled, “Enough With the Lecturing.” One comment made by Scott Freeman, lead author of the meta-analysis on active learning, in this news release was, “We’ve got to stop killing student performance and interest in science by lecturing” (para. 12). Lederman (2014) noted in “A Boost for Active Learning” in Inside Higher Education that students are more successful in their courses when instructors use methods other than lecturing. In each of the aforementioned articles, the Freeman et al. meta-analysis was cited as evidence as to why changes were needed in higher education STEM courses. Prominent researchers such as Carl Wieman (2014; Nobel Laureate in Physics) have also made hard-hitting claims against lecture, saying that lecture is “the pedagogical equivalent of bloodletting” (p. 8320). He also said that the Freeman et al. meta-analysis makes a powerful case for adopting active learning in our teaching practices as well as for redirecting our approach to further research on active learning (e.g., dropping traditional lecture as the standard for comparison).
Despite the positive press surrounding the Freeman et al. (2014) meta-analysis, several researchers have cautioned against the conclusions that have been drawn from this study. For example, Cooper and Stowe (2018) reported that because the data were not disaggregated in the meta-analysis, it is impossible to draw any conclusions about the effectiveness of specific approaches that are categorized as active learning; one can only say the umbrella category of active learning is more effective than the umbrella category of traditional lecture. These more general categories of active learning and traditional lecture often conceal which variables actually affect student learning such as the pedagogical methods used, the amount of time devoted to these methods, and the instructor and student characteristics in these courses (Bernstein, 2018). Thus, it is unwarranted to eliminate lecture and adopt active learning when it is unclear as to which specific aspects of both general approaches are effective or ineffective for student learning (Zakrajsek, 2018).
Purpose of the Present Study
Despite the fact that issues with the Freeman et al. (2014) meta-analysis have been brought to light, the prominence of the meta-analysis as support for changing courses from being lecture-based to active learning–based is quite evident throughout the literature, across institutions of higher education, and in large-scale initiatives. Further evidence of how instrumental the meta-analysis is relates to its very nature (i.e., a meta-analytic review paper)—that is, review papers have the potential to shape the field (Murphy et al., 2017) in a positive or negative direction depending on the accuracy of how they are quoted. In light of the (a) influence of the Freeman et al. meta-analysis, (b) concerns that have been raised about the meta-analysis, and (c) issues that can jeopardize quotation accuracy due to incorrect interpretations, we made the Freeman et al. meta-analysis the focus of our quotation accuracy study. The purpose of this study was to examine the quotation accuracy of those articles that cited Freeman et al. by analyzing whether the claims, paraphrasing, interpretations, and directly quoted statements (herein, these terms are collectively called assertions) made in these citing articles were supported by what was explicitly stated in the meta-analysis. With regard to direct quotes, we examined these statements based on their content rather than examining their word-to-word matching with the primary text. The specific analysis we used to analyze assertions is called a citation context analysis; this methodology allows for the examination of how citing authors have referred to the work of others by analyzing the text surrounding a citation (Small, 1982).
We focused our efforts on examining those assertions made about active learning or lecture’s efficacy (i.e., assertions related to the impact of active learning or lecture on student outcomes) given the meta-analysis’ influence on moving instruction away from lecture and toward a model of active learning. These assertions could have serious ramifications with regard to course transformations, policy decisions, initiatives, and future research endeavors and thus needed to be assessed on whether or not they could be explicitly traced back to text in Freeman et al. (2014). We did not categorize and analyze assertions in our citation context analysis that were outside of our efficacy focus beyond labeling them irrelevant for purposes of this study. Irrelevant assertions included, for example, the definition of active learning, the statistical procedures used by Freeman et al., and generic overview statements about meta-analyses in education.
We had three main research questions in this citation context analysis. First, what was the percentage of assertions made by citing articles that could or could not be directly traced back to the Freeman et al. (2014) text (i.e., assertions that were or were not supported by Freeman et al.)? Second, given that Freeman et al. is cited across various disciplines, what was the percentage of supported and unsupported assertions across these different disciplines? Third, given that Freeman et al. was published in 2014, what was the percentage of supported and unsupported assertions across time?
Method
Search Procedure
To examine quotation accuracy, we conducted a citation context analysis of the Freeman et al. (2014) meta-analysis. The second author searched the two main bibliographic databases that provide citation information, WOSCC and Scopus, on June 13, 2019, to obtain articles that cited Freeman et al. These citing documents were limited to primary articles or reviews; thus, all other document types such as editorials, book chapters, letters, meeting abstracts, news stories, and proceedings papers were excluded. Results were saved in an EndNote library, and duplicates were removed using a validated method developed for systematic reviews (Bramer et al., 2016). The resulting data set included 1,124 articles. Due to 29 articles being written in a language other than English and five articles citing the meta-analysis in the reference section and not in the main text, the final data set included 1,090 articles (thus, a total of 34 articles was excluded).
Coding Information
The first, fifth, sixth, and seventh authors divided the 1,090 articles into four sets, with two sets containing 273 articles and the other two sets containing 272 articles. Each of these authors took one of these sets and searched each article PDF for the citing text—that is, text containing a Freeman et al. (2014) citation or text following the citation that continued the discussion of the meta-analysis. This citing text was copied and pasted into a shared Word document with columns for article ID, author(s), year published, journal title, journal discipline, and citing text. Text before and after the citing text was also copied and pasted to provide contextual information. Once all of the articles were searched and the citing text was found, the seventh and eighth authors each went through half of the articles (545 articles each) and ensured no citing text was overlooked in each article and that the copied and pasted text included the correct meta-analysis citation.
To determine quotation accuracy and whether or not the citing text included assertions related to the efficacy of lecture and active learning, overarching assertion categories were developed that covered possible supported and unsupported assertions (see Table 1 for these assertion categories). Determining these categories occurred in four steps. In Step 1, the first, fifth, sixth, and seventh authors read the Freeman et al. (2014) meta-analysis and identified all text related to the efficacy of active learning and/or lecture presented in the Introduction, Method, Results, and Discussion sections that, if cited, would be deemed supported (i.e., the assertion made in the citing text could be directly traced back to the information contained in the meta-analysis). Many of the statements about active learning and lecture throughout the sections of the meta-analysis were classified under a common category. For example, the meta-analysis included many statements about how active learning affected student performance compared with lecture, so we had an early category labeled active learning’s effectiveness. If a citing article stated that active learning was more effective than lecture and cited the meta-analysis, we could identify the accompanying text from the meta-analysis to which we could refer based on the assertion category active learning’s effectiveness. If any of the assertion categories overlapped and could be integrated, we combined them into one category.
Table 1. Illustrative examples of assertion categories
Note. STEM = science, technology, engineering, and mathematics.
In Step 2, we developed assertion categories related to our efficacy focus that were unsupported (i.e., the assertion made in the citing text could not be substantiated by the information presented in the meta-analysis). To develop these unsupported assertion categories, the first, fifth, sixth, and seventh authors examined the assertions that fell under the supported assertion categories and listed the ways in which these assertions could be misconstrued. To illustrate this, Freeman et al. (2014) found that active learning was more effective than lecture, but they did not find nor did they state that lecture was ineffective. Rather, they found and stated that overall active learning was more effective than lecture. It could be the case that lecture is effective when compared with other instructional approaches, for example. If a citing article stated that lecture is an ineffective approach and cited the meta-analysis, we created a category labeled lecture is ineffective and deemed it unsupported.
In Step 3, the first, fifth, sixth, and seventh authors read a randomly chosen set of 40 articles from our data set and coded the citing text using our previously identified supported and unsupported assertion categories. If any of the citing text did not fall under one of our assertion categories but was related to instructional efficacy, additional categories were developed.
In Step 4, the third and fourth authors examined the developed assertion categories and compared them with the Freeman et al. (2014) text to ensure what was deemed supported or unsupported was in fact correct (no issues were identified; i.e., quotations were accurate). The final assertion categories for supported included (a) active learning is more effective than lecture/increases student learning (labeled with an S1 category code); (b) lecture is not the best approach to teaching/needs to be abolished or left behind in favor of active learning (labeled with an S2 category code); and (c) second-generation research should be conducted (move beyond using lecture as control condition) (labeled with an S3 category code; see Table 1 for illustrative examples of the three supported categories). The final assertion categories for unsupported included (a) specific activities/approaches (e.g., group discussions, flipped classes, inquiry-based learning) other than the general approach of active learning are effective (labeled with a US1 category code); (b) lecture is ineffective (labeled with a US2 category code); (c) active learning is beneficial for specific populations/course topics (e.g., minorities, women, genetics material, calculus; labeled with a US3 category code); and (d) active learning improves measures above and beyond learning/retention (e.g., motivation, engagement, attitude, interpersonal skills; labeled with a US4 category code; see Table 1 for illustrative examples of the four unsupported categories). Assertions made about the meta-analysis that did not fall under any of these categories were labeled irrelevant for purposes of this study.
It should be noted that unless the citing authors specified what they meant by learning and retention and their specifications were outside of the student outcomes measured in the meta-analysis, we deemed the general use of these terms as supported. The coding of assertions as unsupported was conservative; that is, we attempted to keep subjective interpretation of the citing text to a minimum if the text was too general or nonspecific for us to say that the information it contained was not found in the meta-analysis. As previously stated, our focus was on quotation accuracy—what was explicitly stated and measured in the meta-analysis and whether the citing text could be directly traced back to this information.
Interrater Agreement
With the three supported assertion categories and four unsupported assertion categories established, the first and third authors coded all citing text. The coding process entailed tagging each separate assertion with one or more of the seven category codes described above (S1, S2, S3, US1, US2, US3, and US4) or with the irrelevant label. The first author served as the primary coder and the third author served as the secondary coder. To ensure accurate coding of assertions in the citing text, several steps were taken before the primary coder’s categorization of the assertions was compared with the secondary coder’s categorization. These steps included the following.
First, the fifth (Rater 1), sixth (Rater 2), seventh (Rater 3), and eighth (Rater 4) authors were initially trained on the coding system (led by the primary coder) by reviewing citing text within 25 articles and comparing their identification and subsequent categorization of the separate assertions made in the text they reviewed. Second, disagreements were discussed, and another set of 25 articles was reviewed and discussed. Third, after this training period, the raters were assigned their official set of articles and subsequent citing text to code. Their assigned set was determined at random and counterbalanced to ensure two raters coded the assertions in the citing text. Through the process of categorizing the separate assertions made in the citing text in each article, the primary coder and raters determined there to be 1,630 separate assertions across the 1,090 articles. Raters 1 through 4 coded 48.98%, 62.36%, 39.15%, and 49.87% of these 1,630 assertions, respectively; the primary coder coded 100% of these 1,630 assertions. Thus, the assertions in the citing text had been categorized three separate times (by two raters plus the primary coder).
Fourth, the three category codes (one from the primary coder and one from each of two raters) for each assertion were then compared with each other. Interrater agreement was calculated as a percentage of agreement and as a Cohen’s kappa. Finally, after the primary coder’s and raters’ categorizations of the assertions were compared, any coding disagreements between the primary coder and at least one of the raters were resolved by the primary coder. These final decisions on the categorization for each assertion would then be compared with the secondary coder’s independent categorization decision for each assertion.
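The two pairwise agreement statistics named above can be illustrated with a minimal Python sketch. The code lists below are hypothetical toy data, not values from this study, and the sketch assumes each assertion carries a single label per coder (an assertion tagged with several categories would first need to be collapsed into one combined label). Percentage agreement is the share of assertions given the same code, and Cohen's kappa corrects that share for the agreement expected by chance from each coder's marginal code frequencies.

```python
from collections import Counter


def percent_agreement(codes_a, codes_b):
    """Proportion of assertions that two coders labeled identically."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("Both coders must label the same non-empty set of assertions.")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e) for two coders."""
    n = len(codes_a)
    p_o = percent_agreement(codes_a, codes_b)
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    # Chance agreement: product of the two coders' marginal proportions per code.
    p_e = sum(freq_a[c] * freq_b[c] for c in set(freq_a) | set(freq_b)) / (n * n)
    if p_e == 1.0:
        return 1.0  # both coders used one identical code throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical toy data: ten assertions coded by two coders.
coder_a = ["S1", "S1", "US1", "irrelevant", "S1", "US2", "US1", "S1", "irrelevant", "S3"]
coder_b = ["S1", "S1", "US1", "irrelevant", "US1", "US2", "US1", "S1", "S1", "S3"]

print(f"agreement = {percent_agreement(coder_a, coder_b):.2%}")
print(f"kappa     = {cohens_kappa(coder_a, coder_b):.2f}")
```

Because kappa discounts chance agreement, it is always at or below the raw agreement proportion when the coders agree more often than chance, which is consistent with kappa values being lower than the corresponding agreement percentages.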
As shown in Table 2, agreement percentages between raters ranged from 77.82% (Raters 2 and 4) to 91.40% (Raters 1 and 3). Agreement percentages between the primary coder and the raters ranged from 69.00% (with Rater 3) to 82.76% (with Rater 2). Cohen’s kappa levels between raters ranged from 0.61 (Raters 2 and 4) to 0.87 (Raters 1 and 3). Cohen’s kappa levels between the primary coder and the raters ranged from 0.55 (with Rater 3) to 0.59 (with Rater 2). These kappa values fell into the Landis–Koch moderate agreement, substantial agreement, and almost perfect agreement levels (Landis & Koch, 1977). Additionally, Krippendorff’s alpha was calculated for level of agreement across the primary coder and two of the raters. The range of agreement was from 0.59 (with Raters 2 and 4) to 0.67 (with Raters 1 and 4).
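As an illustration of how kappa values map onto the Landis and Koch (1977) benchmark labels, the following Python sketch encodes the conventional bands (below 0 = poor, 0.00 to 0.20 = slight, 0.21 to 0.40 = fair, 0.41 to 0.60 = moderate, 0.61 to 0.80 = substantial, 0.81 to 1.00 = almost perfect). The function name is hypothetical, and the kappa values passed to it are those reported in the text.

```python
def landis_koch_level(kappa):
    """Return the Landis-Koch (1977) benchmark label for a kappa value."""
    if kappa < 0:
        return "poor"
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper_bound, label in bands:
        if kappa <= upper_bound:
            return label
    raise ValueError("kappa cannot exceed 1")


# Kappa values reported in the text.
for kappa in (0.55, 0.59, 0.61, 0.87):
    print(kappa, landis_koch_level(kappa))
```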
Table 2. Interrater agreement percentage (%), Cohen’s kappa (k), Krippendorff’s alpha (α), and percentage of assertions mutually coded (MC)
As previously stated, once the primary coder and two raters had agreed on the categorization of the assertions in the citing text, the primary coder compared them with the secondary coder’s independently determined categorizations (see Table 2). Percentage of agreement was 95.50% with an obtained Cohen’s kappa of 0.92 (almost perfect agreement level). Any disagreements between the primary and secondary coders were discussed. Agreements were reached on 86.00% of these disagreements; the unresolved disagreements were related to what the citing authors meant by retention. After discussion, a decision was reached to use the conservative method (as mentioned previously) of not counting the use of retention as unsupported unless it was defined in such a way that could not be traced back to Freeman et al. (2014). After this discussion, there was a final agreement on 99.38% of the categorizations. The primary coder resolved any remaining disagreements.
Data Analysis
Data were organized and analyzed at two quotation accuracy levels. We conducted analyses at the (a) assertion level (n = 1,630) and (b) article level (n = 1,090). At the assertion level, the assertions under each assertion category were added together for an overall category total. At the article level, any article that had multiple assertions made about the meta-analysis was assessed for whether the assertions fell under the same category. If this was the case, the assertion category was only counted once for this analysis. For example, if an article had three assertions made about the meta-analysis and the first was coded as S3, the second was coded as US1, and the third was coded as US1, this article would be counted once under S3 and once under US1. Thus, articles could fall under more than one assertion category (under S3 and US1 in the above example). At both of these levels, we examined the frequency of the three supported assertion categories and the frequency of the four unsupported assertion categories in general and across journal disciplines and publication years.
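The difference between the two counting levels can be sketched in a few lines of Python. The data structure (a list of article ID and category code pairs) and the function names are assumptions made for illustration; the example article reproduces the S3, US1, US1 case described above.

```python
from collections import Counter


def assertion_level_counts(coded_assertions):
    """Count every coded assertion under its category."""
    return Counter(code for _, code in coded_assertions)


def article_level_counts(coded_assertions):
    """Count each category at most once per article."""
    unique_pairs = {(article_id, code) for article_id, code in coded_assertions}
    return Counter(code for _, code in unique_pairs)


# Hypothetical article "A1" with three assertions coded S3, US1, and US1.
coded = [("A1", "S3"), ("A1", "US1"), ("A1", "US1")]

print(assertion_level_counts(coded))  # US1 is counted twice here
print(article_level_counts(coded))    # US1 is counted once here
```

At the article level, the article contributes once to S3 and once to US1, so category percentages at this level describe how many articles contain a given kind of assertion rather than how often it is made.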
We also conducted an additional analysis at the article level: Each of the 1,090 articles could have assertions made about the meta-analysis where (a) all assertions in the article fell under a supported assertion category, (b) all assertions in the article fell under an unsupported assertion category, (c) all assertions in the article fell under the irrelevant category, or (d) the assertions in the article fell under multiple assertion categories. For the purpose of this analysis, we focused on articles where one or more assertions fell under only the supported assertion categories, 3 only the unsupported assertion categories, 3 or under both supported and unsupported assertion categories. 3
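A minimal Python sketch of this article-level grouping follows. It assumes, as an interpretation rather than a documented rule, that irrelevant assertions are ignored whenever an article also contains at least one efficacy-related assertion; the function and group names are hypothetical.

```python
SUPPORTED = {"S1", "S2", "S3"}
UNSUPPORTED = {"US1", "US2", "US3", "US4"}


def classify_article(codes):
    """Assign an article to one group based on the codes of its assertions."""
    codes = set(codes)
    if not codes:
        raise ValueError("An article must have at least one coded assertion.")
    has_supported = bool(codes & SUPPORTED)
    has_unsupported = bool(codes & UNSUPPORTED)
    if has_supported and has_unsupported:
        return "supported_and_unsupported"
    if has_supported:
        return "supported_only"
    if has_unsupported:
        return "unsupported_only"
    return "irrelevant_only"


print(classify_article(["S1", "irrelevant"]))  # no unsupported assertions
print(classify_article(["S3", "US1", "US1"]))  # mixed article
```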
Results
Supported and Unsupported Assertions: Assertion Level
The frequency of supported and unsupported assertion categories across the 1,630 assertions is presented in Table 3. The most prevalent supported assertion category was Active learning is more effective than lecture/increases student learning (assertions with an S1 code = 43.50%). The most prevalent unsupported assertion category was Specific activities/approaches (e.g., group discussions, flipped classes, inquiry-based learning) other than the general approach of active learning are effective (assertions with a US1 code = 16.93%). The total percentage of assertions that were supported by the meta-analysis text was 47.67% (n = 777). The total percentage of assertions that were not supported by the meta-analysis text was 26.01% (n = 424).
Table 3. Frequency of assertion category codes across all assertions and articles
Note. Irrelevant code = 429 assertions (26.32%).
An assertion could fall under more than one assertion category.
Supported and Unsupported Assertions: Article Level
The frequency of supported and unsupported assertion categories across the 1,090 articles is presented in Table 3. The most prevalent supported assertion category was Active learning is more effective than lecture/increases student learning (articles with one or more S1 code(s) = 54.22%). The most prevalent unsupported assertion category was Specific activities/approaches (e.g., group discussions, flipped classes, inquiry-based learning) other than the general approach of active learning are effective (articles with one or more US1 code(s) = 23.21%).
Out of the 1,090 articles, 531 (48.72%) had assertions that were categorized under only one or more of the supported assertion categories; thus, these articles did not contain any unsupported assertions. The number of articles that had assertions categorized under only one or more of the unsupported assertion categories was 289 (26.51%); thus, these articles did not contain any supported assertions. The number of articles that had assertions categorized under only the irrelevant assertion category was 180 (16.51%). Finally, the number of articles with at least one supported assertion and at least one unsupported assertion was 90 (8.26%). Thus, 379 articles (34.77%) contained one or more assertions related to our efficacy focus that were not supported by the meta-analysis text.
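The percentages reported at both levels follow directly from the counts given in the text. The short sketch below recomputes them; the variable and function names are hypothetical, and only the counts are taken from the results above.

```python
def pct(count, total):
    """Percentage of total, rounded to two decimals as in the reported results."""
    return round(100 * count / total, 2)

# Assertion level (counts from the text and the note to Table 3).
TOTAL_ASSERTIONS = 1630
assertion_counts = {"supported": 777, "unsupported": 424, "irrelevant": 429}

# Article level (counts from the text).
TOTAL_ARTICLES = 1090
article_counts = {
    "supported_only": 531,
    "unsupported_only": 289,
    "irrelevant_only": 180,
    "both": 90,
}

assertion_pcts = {k: pct(v, TOTAL_ASSERTIONS) for k, v in assertion_counts.items()}
article_pcts = {k: pct(v, TOTAL_ARTICLES) for k, v in article_counts.items()}

# Articles with at least one unsupported assertion:
# unsupported-only articles plus articles with both kinds of assertion.
any_unsupported = article_counts["unsupported_only"] + article_counts["both"]
any_unsupported_pct = pct(any_unsupported, TOTAL_ARTICLES)

print(assertion_pcts)
print(article_pcts)
print(any_unsupported, any_unsupported_pct)
```

The four article groups account for all 1,090 articles, and the 379 articles with at least one unsupported assertion are the sum of the unsupported-only and mixed groups.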
Supported and Unsupported Assertions: Journal Disciplines
Using the same two levels of analysis discussed previously (i.e., assertion and article levels), the frequency of the seven assertion categories was compared based on the discipline of the journal in which each article was published. The disciplines included Science, Technology, Engineering, Mathematics, STEM, non-STEM, General Education, Transdisciplinary, and Difficult to Determine (i.e., journal title in language other than English).
Assertion Level
The frequency of supported and unsupported assertion categories across journal disciplines for the 1,630 assertions is presented in Table 4. The most prevalent discipline from which assertions made about Freeman et al. (2014) came was Science (899 assertions) and the least prevalent was Transdisciplinary (three assertions). Transdisciplinary and Difficult to Determine had only three and four assertions, respectively; therefore, these two discipline categories were not discussed in the following results. The most prevalent supported assertion category across journal disciplines was Active learning is more effective than lecture/increases student learning (assertions with an S1 code ranged from 30.60% [Technology] to 51.15% [Engineering]). The most prevalent unsupported assertion category across journal disciplines was Specific activities/approaches (e.g., group discussions, flipped classes, inquiry-based learning) other than the general approach of active learning are effective (assertions with a US1 code ranged from 13.74% [Engineering] to 18.66% [Technology]). Mathematics had the highest percentage of supported assertions (57.14%), and Technology had the lowest percentage of supported assertions (35.82%); Technology had the highest percentage of unsupported assertions (29.10%), and Mathematics had the lowest percentage of unsupported assertions (22.08%).
Table 4. Frequency of assertion category codes across all assertions and articles based on journal discipline
Note. STEM = science, technology, engineering, and mathematics.
An assertion could fall under more than one assertion category.
Article Level
The frequency of supported and unsupported assertion categories across journal disciplines for the 1,090 articles is presented in Table 4. The most prevalent discipline from which articles came was Science (600 articles), and the least prevalent discipline was Transdisciplinary (two articles). Transdisciplinary and Difficult to Determine had only two and four articles, respectively; therefore, these two discipline categories were not discussed in the following results. The most prevalent supported assertion category across journal disciplines was Active learning is more effective than lecture/increases student learning (articles with at least one S1 code ranged from 40.70% [Technology] to 65.38% [Mathematics]). The most prevalent unsupported assertion category was Specific activities/approaches (e.g., group discussions, flipped classes, inquiry-based learning) other than the general approach of active learning are effective (articles with at least one US1 code ranged from 17.53% [Engineering] to 25.58% [Technology]). Similar to the assertion-level analysis, Mathematics had the highest percentage of articles with supported assertions, and Technology had the lowest percentage of articles with supported assertions; Technology had the highest percentage of articles with unsupported assertions, and Mathematics had the lowest percentage of articles with unsupported assertions.
Supported and Unsupported Assertions: Publication Years
Using the same two levels of analysis discussed previously, the frequency of the seven assertion categories was compared based on the year in which each article was published. The publication years included 2014, 2015, 2016, 2017, 2018, and half of 2019 (January 1 to June 13).
Assertion Level
The total number of assertions and the percentage of total supported and unsupported assertions are presented in Figure 1. Considering the data from the 4 complete years (i.e., 2015–2018) of analysis, the following can be concluded. There was a general increasing trend for the number of total assertions made across years (note that 2019 was only examined from January 1 to June 13). There was no consistent, clear trend across the years for the percentage of either total supported or total unsupported assertions.

Figure 1. Percentage of total supported and unsupported assertions (line graph, left scale) and total number of assertions (bar graph, right scale) across years.
The frequency of supported and unsupported assertion categories across publication years for the 1,630 assertions is presented in Table 5. The most prevalent year for assertions to be made about the meta-analysis was 2018 (468 assertions), although 2019 was only examined for part of the year and already had 276 assertions. The least prevalent year was 2014 (26 assertions); however, it is important to note that the meta-analysis was first published May 12, 2014. Although the number of assertions differed across publication years, the most prevalent supported assertion category across publication years was Active learning is more effective than lecture/increases student learning (assertions with an S1 code ranged from 40.81% [2018] to 73.08% [2014]). Overall, the highest percentage of supported assertions was in 2014 (76.92%), while the lowest percentage of supported assertions was in 2018 (44.87%). If 2014 were excluded due to only having a total of 26 assertions, the highest percentage of supported assertions would be in 2016 (50.53%). The most prevalent unsupported assertion category across publication years was Specific activities/approaches (e.g., group discussions, flipped classes, inquiry-based learning) other than the general approach of active learning are effective (assertions with a US1 code ranged from 0.00% [2014] to 20.51% [2018]). Overall, the highest percentage of unsupported assertions was in 2018 (29.27%) and the lowest percentage of unsupported assertions was in 2014 (3.85%). If 2014 were excluded due to only having a total of 26 assertions, the lowest percentage of unsupported assertions would be in 2015 (22.64%).
Table 5. Frequency of assertion category codes across all assertions and articles based on year published
Note. An assertion could fall under more than one assertion category.
Article Level
As shown in Figure 2, the total number of articles containing assertions had an increasing trend with the exception of 2019 (for the reason explained previously). Considering the data from the 4 complete years (i.e., 2015–2018) of analysis, the following can be concluded. There was an increasing trend for US1 (Specific activities/approaches other than the general approach of active learning are effective) and a slight decreasing trend for S3 (Second generation research should be conducted [move beyond using lecture as a control condition]). However, there was no clear trend for the percentage of other supported and unsupported categories across the years.

Figure 2. Percentage of supported (S1, S2, and S3) and unsupported (US1, US2, US3, and US4) assertions (line graph, left scale) and total number of articles (bar graph, right scale) with assertions across years.
The frequency of supported and unsupported assertion categories across publication years for the 1,090 articles is presented in Table 5. The most prevalent year for articles with assertions made about the meta-analysis was 2018 (327 articles), although 2019 was only examined from January 1 to June 13 and had 182 articles. The least prevalent year was 2014 (21 articles). Although the number of articles differed across publication years, the most prevalent supported assertion category across publication years was Active learning is more effective than lecture/increases student learning (articles with at least one S1 code ranged from 50.55% [2019] to 76.19% [2014]). The most prevalent unsupported assertion category across publication years was Specific activities/approaches (e.g., group discussions, flipped classes, inquiry-based learning) other than the general approach of active learning are effective (articles with at least one US1 code ranged from 0.00% [2014] to 28.13% [2018]).
Discussion
Citation accuracy has been and continues to be an issue in journal publishing (e.g., Mogull, 2017; Onwuegbuzie et al., 2011; Santini, 2018). A danger in inaccurately quoting scholarly works is the potential for misinformation to infiltrate scientific discourse. Errors can affect the integrity of scholarship and investigation (Wilson, 2015) and could potentially influence policy and practice. Thus, given how highly cited the Freeman et al. (2014) meta-analysis is in the learning sciences and how it has been used to support the adoption of active learning and the reduction or elimination of lecture in STEM courses, we investigated whether the assertions made in the citing text were or were not supported by the meta-analysis. Using citation context analysis, we examined citing text that related to the efficacy of active learning and/or lecture given the strong advocacy for the adoption and implementation of active learning in college courses. The findings from this analysis have provided insight into quotation accuracy when citing Freeman et al. as well as a more comprehensive picture of the scientific discourse surrounding active learning and instructional efficacy.
Supported Assertions Across All Assertions and Articles
The most prevalent supported assertion category was Active learning is more effective than lecture/increases student learning (assertions with an S1 category code = 43.50% [assertion level], articles with at least one S1 category code = 54.22% [article level]). Assertions in the citing text that fell under this category generally contained information from Freeman et al. (2014) regarding how much better students did under active learning than under lecture (or, conversely, how much higher failure rates were under lecture vs. active learning). Given that the primary results from the meta-analysis were about the differential effects of active learning and lecture on student learning, it is understandable that the most common supported assertions in the citing text would be related to the main findings of the meta-analysis. Furthermore, the meta-analysis was the largest and most comprehensive meta-analysis on active learning in STEM education, and thus the main findings were that much more important and interesting for those in the learning sciences and other education-related disciplines.
Though not as prevalent, articles did include assertions that fell under the supported assertion category of Lecture is not the best approach to teaching/needs to be abolished or left behind in favor of active learning (assertions with an S2 category code = 3.13% [assertion level], articles with at least one S2 category code = 4.59% [article level]). Given that lecture has been the subject of much criticism and there has been advocacy to adopt alternative approaches, the points made in Freeman et al. about questioning the continued use of lecture in light of the effects of active learning were used by a number of articles in the assertions they included about the meta-analysis. One possible reason this supported assertion category (S2) was less frequent than the supported assertion category that related to the effectiveness of active learning versus lecture (S1) could be that moving away from lecture and toward active learning is inherent in the main findings of the meta-analysis. One need not cite the specific points made in Freeman et al. about favoring active learning over lecture if the main findings can be cited that provide evidence of active learning’s impact on student learning as compared with the traditional lecture method.
Articles also included assertions that fell under the supported assertion category of Second generation research should be conducted (move beyond using lecture as control condition) (assertions with an S3 category code = 1.72% [assertion level], articles with at least one S3 category code = 2.20% [article level]). This category was not as common as the other two supported assertion categories as it has less to do with the main findings of the meta-analysis and more to do with future research directions. Thus, the recommendation in Freeman et al. about conducting second generation research may have been more relevant to researchers currently conducting or interested in conducting research in the active learning realm than to those looking to the meta-analysis for evidence of which instructional approach is more effective than the other.
Unsupported Assertions Across All Assertions and Articles
Shifting the focus to unsupported assertion categories, the overall percentage of assertions that were not supported by the Freeman et al. (2014) meta-analysis was 26.01%. This percentage is consistent with the systematic literature review and meta-analysis of the quotation accuracy in medical journal articles (Jergas & Baethge, 2015). In our additional analysis at the article level, the overall percentage of articles that included at least one unsupported assertion was 34.77%. Thus, approximately one out of every three articles contained assertions that were not supported by Freeman et al. This article-level analysis is not conducted as often as the assertion-level analysis, and as such, it is more difficult to determine if this percentage is consistent with much of the quotation accuracy literature. However, based on the article-level error rate in Todd et al. (2010) and Teixeira et al. (2013)—26% and 41%, respectively—the rate of unsupported assertions we identified at the article level appears to be within the range of these other studies.
The most common unsupported assertion category was Specific activities/approaches (e.g., group discussions, flipped classes, inquiry-based learning) other than the general approach of active learning are effective (assertions with a US1 code = 16.93% [assertion level], articles with at least one US1 code = 23.21% [article level]). In these assertions, activities as diverse as Twitter, group discussions, clicker questions, modules, and inquiry exercises were deemed effective by citing Freeman et al. (2014) as support. Furthermore, approaches such as problem-based learning, flipped learning, inquiry-based learning, and cooperative learning were identified as approaches that improve student learning by citing Freeman et al. as evidence. However, Freeman et al. contained two groupings: traditional lecture and active learning. Pulling out a specific approach or activity that could fall under active learning and deem it effective due to the results found in Freeman et al. would not be supported by those results. Different forms of active learning were not compared with one another nor were specific forms of active learning compared with lecture. Thus, the aggregation of effect sizes across a variety of active learning studies creates a generalization but not a specification. In other words, we can generally say that active learning (in general) is more effective than lecture (in general), but we cannot specifically say that cooperative learning, for example, is more effective than lecture. Misconceptions can be started if researchers are publishing papers that say, for example, “Freeman et al. (2014) showed that flipped courses led to improved student learning” or that “Twitter use in the hard sciences can impact learning, engagement, and relationships” while citing the Freeman et al. meta-analysis. Illustrative examples from articles included in our analyses are shown under US1 in Table 1.
The second most common unsupported assertion category was Active learning improves measures above and beyond learning/retention (e.g., motivation, engagement, attitude, interpersonal skills) (assertions with a US4 code = 6.69% [assertion level], articles with at least one US4 code = 9.63% [article level]). For example, Freeman et al. (2014) was cited as support for assertions that active learning improves outcomes ranging from motivation and attitudes to skills related to life-long learning to thinking like a scientist. In Freeman et al., the only measures examined related to student learning (i.e., assessments and failure rates). There was no investigation into or reporting of other measures. When assertions are being made about the types of student behaviors that can be improved through active learning and a meta-analysis is cited as evidence, people may adopt active learning with the intention of improving those named student behaviors. However, using the meta-analysis as evidence in support of those assertions is not accurate and can lead to an incorrect understanding of the meta-analysis and of what active learning may or may not be capable of doing for students.
Though not as prevalent as the other two unsupported assertion categories, articles did include assertions that fell under the unsupported assertion category of Lecture is ineffective (assertions with a US2 code = 2.21% [assertion level], articles with at least one US2 code = 3.03% [article level]). Although Freeman et al. (2014) provided evidence that active learning led to larger learning gains and reduced failure rates for students as compared with traditional lecture, these authors did not explicitly state that lecture is ineffective. Rather, lecture (in general) was shown to be less effective than active learning (in general).
Nevertheless, lecture is not necessarily ineffective in that it can be an efficient way to disseminate information (see research on explicit/direct instruction; e.g., Klahr & Nigam, 2004; Martella et al., 2020). Lectures have the potential to provide context and structure for a subject, facilitate the development of an idea, promote listening and note-taking skills, give students access to new information in the field, and allow students to learn from content experts in the field, among other possible benefits (French & Kennedy, 2017). Lecture methods are multifaceted and varied (French & Kennedy, 2017) and, thus, may not all look the same in a lecture-based course or provide the same benefits to students. For example, lecture could be used for the duration of a class period or could be used periodically during a class period. Lectures could be conducted through PowerPoint presentations, whiteboards, or instructor notes. Lecturers could have received teaching awards or could have received low student evaluations. By saying that lecture is ineffective, all forms of lecture are being lumped together and criticized as one when certain forms of lecture could be more effective than certain forms of active learning. Future research on the effects of varying forms of lecture is needed to determine if and how they can be implemented effectively, especially in relation to active learning approaches.
The third most common unsupported assertion category was Active learning is beneficial for specific populations/course topics (e.g., minorities, women, genetics material, calculus) (assertions with a US3 code = 3.31% [assertion level], articles with at least one US3 code = 4.77% [article level]). These assertions, similar to assertions about active learning improving more than just student learning, could resonate with those who wish to adopt strategies that could improve learning in certain populations of students or topic areas. The Freeman et al. (2014) meta-analysis did not contain specific data on improvements for certain populations of students, nor on improvements beyond those for STEM disciplines in general.
Explanations for Unsupported Assertions
There are several possible explanations for making unsupported assertions. The first two explanations relate to all four unsupported assertion categories. First, given the communities of teaching-oriented faculty and staff in higher education or the groups of learning sciences professionals working on STEM education initiatives, for example, incorrect readings of Freeman et al. (2014) or biases influencing how information is perceived could become ubiquitous in these groups. With this misinformation in mind, authors may make assertions and cite Freeman et al. without reading or double checking the paper they are citing. Second and relatedly, authors may read others who have cited Freeman et al. (2014) and transfer that information into their own papers without being aware that those who had cited Freeman et al. had cited them incorrectly. Accepting and copying another author’s interpretation of a cited article, rather than going back to the original source material, can cause the spread of incorrect information. This issue can be labeled as citing secondary versus primary sources.
The third explanation is specifically related to unsupported assertions about specific approaches being effective. There is a lack of consistent and clear operational definitions for active learning and lecture (see Martella & Demmig-Adams, 2018). Freeman et al. (2014) noted that “the active learning interventions varied widely in intensity and implementation, and included approaches as diverse as occasional group problem-solving, worksheets or tutorials completed during class, use of personal response systems with or without peer instruction, and studio or workshop course designs” (p. 8410).
However, deeper insight into which forms of active learning (or active learning features) are most effective was not provided in Freeman et al.—the different active learning interventions were grouped under the broad category of active learning and the effect sizes were statistically combined. Those reading Freeman et al. may have read the statement about how the active learning interventions varied (with various example features listed) and, coupled with the finding that active learning was more effective than lecture, made the assumption that all forms of active learning were more effective than lecture and thus specific active learning features could be deemed more effective than lecture.
The fourth explanation is specifically related to unsupported assertions about active learning improving measures above and beyond learning/retention. Active learning is generally thought of as a more engaging approach than the traditional lecture method. In fact, in Freeman et al. (2014), the authors discuss analyses that indicated better student performance (passing the class, getting higher grades) and increased engagement help students persist in STEM and cite Goodman et al. (2002), Seymour and Hewitt (1997), and Watkins and Mazur (2013). Those who cited Freeman et al. (2014) may not have closely read that part of the meta-analysis and subsequently made the assumption that (a) Freeman et al. (2014) found active learning to increase student engagement or (b) because students had higher performance in active learning courses, they must have been more engaged and motivated.
The fifth and sixth explanations are specifically related to unsupported assertions about lecture being ineffective. The methodological decisions made in Freeman et al. (2014) created a false dichotomy between active learning and traditional lecture (it should be noted that when Freeman et al. tried to categorize and analyze different types of active learning, the sample sizes were highly unbalanced). There is danger in creating false dichotomies in that they appear as though they are mutually exclusive or that you must choose between them (see LaBoskey, 1998). Due to the false dichotomy of active learning versus traditional lecture created in the meta-analysis, it appears that active learning is devoid of any lecture. However, active learning can and often does contain lecture components (Martella et al., in press; Zakrajsek, 2018). Freeman et al. (2014) found active learning courses could contain lecture, stating that active learning activities could range from taking 10% to 100% of the class time—this percentage range led us to assume that lecture could range from 0% to 90% of class time in active learning courses. Therefore, based on this range, it is possible that lecture courses could contain active learning components for up to 9% of the class time. However, by creating these two instructional bins (i.e., active learning bin and traditional lecture bin), readers may have seen these two approaches as mutually exclusive and made the assumption that lecture was ineffective.
Additionally, the popular press may influence the takeaways from Freeman et al. The interpretations written in these articles such as “It Puts Kids to Sleep—but Teachers Keep Lecturing. Here’s What to Do About It,” from The Washington Post (Strauss, 2017) may influence others’ reading of the meta-analysis or may lead them to cite Freeman et al. as saying something that was an exaggeration found in a popular press article. In yet another example, an article in Science was titled, “Lectures Aren’t Just Boring, They’re Ineffective, Too, Study Finds” (Bajak, 2014). These popular press and other media-coverage articles frequently mentioned lecture as an ineffective or boring approach.
The seventh and last explanation is specifically related to unsupported assertions about active learning being beneficial for specific populations or course topics. In Freeman et al. (2014), the authors mention that other research indicates active learning can have disproportionate benefits for STEM students from disadvantaged backgrounds and female students; they cite Haak et al. (2011) and Lorenzo et al. (2006). Perhaps it was the case that authors who cited Freeman et al. as saying that active learning confers greater advantages for students from diverse backgrounds pulled this information from the discussion section in the meta-analysis without directly going back to the Haak et al. or Lorenzo et al. sources to ensure the information was correct and to cite them directly. Readers could have also assumed Freeman et al. had made those findings if a careful read was not conducted.
Supported and Unsupported Assertions Across All Assertions and Articles Based on Journal Discipline
In addition to examining the prevalence of the seven assertion categories across assertions and articles, we examined the general frequency of the seven categories across journal disciplines. The most common journal discipline citing Freeman et al. (2014) was Science followed by General Education, and the least common journal discipline was STEM (excluding Transdisciplinary and Difficult to Determine discipline categories). Given that active learning is generally discussed in discipline-based education research and the learning sciences, it is not surprising that discipline-specific journals and general education–related journals cite this meta-analysis more than journals that encompass an array of disciplines. It is also not surprising that Science was the most common discipline publishing studies that cited Freeman et al. as biology, chemistry, and physics have the status of “parent disciplines” for discipline-based education research and have a long history in education research (Singer et al., 2012).
Furthermore, the most common supported category across journals and publication years was typically “Active learning is more effective than lecture/increases student learning” (labeled S1), and the most common unsupported category across years and journals was typically “Specific activities/approaches (e.g., group discussions, flipped classes, inquiry-based learning) other than the general approach of active learning are effective” (labeled US1). The most common publication year for citing Freeman et al. (2014) in our analysis was 2018. It should be noted that, based on frequency counts in WOSCC and Scopus, each year since the meta-analysis was published has experienced an increase in the number of citations, with 2019 (the full year) surpassing 2018 by almost 40%. This trend appears to be similar for 2020. It appears that discussions surrounding active learning continue to garner interest, which is not unexpected given current national conversations and initiatives surrounding student achievement and retention in STEM disciplines.
Future Directions and Implications
The Freeman et al. (2014) meta-analysis is a prominent paper that has helped fuel the discussion of STEM educational improvements and course transformations at the college level. As Willingham (2014) stated, “Perhaps the best news is that the effectiveness of college instruction is on people’s minds” (para. 9). However, ensuring the dissemination of correct information is of the utmost importance given the policy decisions, funding opportunities, institutional changes, and large-scale initiatives that involve active learning. Citation practices should not be taken for granted (Leatham, 2015), and there are many steps that can be taken to promote quotation accuracy (see recommendations by Bareket et al., 2020; Jergas & Baethge, 2015, for details). Five overarching suggestions can be made to ensure quotation accuracy.
First, mistakes are often made in citing the works of others due to a failure to closely read articles and/or reading and citing secondary sources. Authors should always go back to primary sources to further reduce the spread of overgeneralizations and inaccurate information rather than trusting secondary sources or popular press articles. They should carefully read these primary articles to make sure all pertinent information is obtained.
Second, because there is a lack of consistent and clear operational definitions for active learning and lecture, it is not possible to state with any certainty what specific components constitute active learning. As is, these are general categories rather than specific procedures. Future research should focus on clear and precise operational definitions of instructional variables. Components of active learning, including the various forms and durations of lecture and the order in which they are delivered, should be specified. In other words, more than two bins should be developed for categorizing variables. Those citing the works of others should also consider the possible range of components that may be present in a general category such as active learning. When citing the works of others, a consideration of these other components should be made and indicated.
Third, it is critical to restrict one’s own conclusions or those noted by others to the dependent variables measured and to eliminate any inferences that go beyond these conclusions. For example, Freeman et al. (2014) found that active learning improved learning and retention of students. However, they did not measure changes in student motivation. Some who cited the meta-analysis nevertheless inferred improvements in this area even though none were reported.
Fourth, researchers should eliminate false dichotomies when investigating complex areas of study. For example, despite the overlap of active learning and lecture (i.e., lecture can comprise 0% to 90% of class time in active learning classrooms), a false and overly simplistic dichotomy was created in which active learning was interpreted as inconsistent with lecture and the two were considered mutually exclusive. Such a false dichotomy also permits the interpretation that lecture is ineffective or damaging. Researchers should clearly specify the range of possible components in an instructional approach such as active learning in a manner that makes it more salient (e.g., a table or other visual display). Authors citing the works of others should likewise pay particular attention to descriptions of instructional variables as cited in the literature.
Finally, authors who cite the works of others should compare the specific populations and settings in which the studies took place with the statements made by the researchers. For example, Freeman et al. (2014) did not analyze the effects of active learning on students from disadvantaged backgrounds or on female students. However, because Freeman et al. claimed, citing other studies, that active learning affects these students, others reported this as a finding of Freeman et al. when it was not.
Overall, a crucial step in disseminating accurate information is prioritizing quotation accuracy when writing and publishing papers. Double (and triple) checking each assertion against the research paper being used as evidence can help reduce the creation of incorrect assertions; journals might even require authors to submit, along with the manuscript, a signed statement that quotation accuracy was ensured. For researchers, it is important to use operational definitions, present important information in a salient manner, and refrain from making claims that go beyond the purpose of the study and the data obtained.
Conclusion
Given the propensity for humans to make errors, even those who work in the scientific arena, it is our hope that the results of this study illuminate those errors that can easily be made in the scientific communication process so that producers and consumers of scientific writing can avoid or notice (and fix) these errors in the future. Perhaps quotation accuracy has been more frequently studied in medicine due to the serious consequences that errors can have, such as affecting a person’s health and well-being. However, errors in the learning sciences literature can also have serious consequences in that they can negatively affect how students are educated (relating to, e.g., pedagogy and educational resources or opportunities) or can hinder their success in present or future environments (relating to, e.g., STEM fatigue and attrition). Unfortunately, studies that focus on citation errors are not common in the learning sciences. Our citation context analysis served to fill this gap in the literature by providing insight into how accurately a highly influential article relating to the learning sciences has been cited. As evidenced by our citation context analysis, quotation errors were common in studies citing Freeman et al. (2014), and these errors could affect the scholarly discourse surrounding active learning and lecture. It is important to note that we did not examine the accuracy of the irrelevant assertions; thus, there could be additional errors in the citing text that could also affect the scholarly discourse surrounding Freeman et al.
Future research examining quotation accuracy could be conducted across learning sciences and education-related journals or on other highly cited articles. Furthermore, researchers could examine if and to what extent quotation errors have been propagated. For example, for those unsupported assertions we identified, did other authors read these assertions and use them in their own papers? Researchers could also investigate how frequently a citation is missing where one is actually needed (Jergas & Baethge, 2015). Through these research investigations, issues in the literature can be revealed and the continued promulgation of myths may be curtailed. “Robust science needs robust corrections” (Allison et al., 2016, p. 29).
Authors
AMEDEE MARCHAND MARTELLA is a National Science Foundation graduate research fellow and doctoral student in cognitive psychology at Purdue University, West Lafayette, IN 47907-2098, USA.
JANE KINKUS YATCILLA is an associate professor in the Purdue Libraries & School of Information Studies, West Lafayette, IN 47907-2098, USA.
RONALD C. MARTELLA is a professor of educational studies at Purdue University, West Lafayette, IN 47907-2098, USA.
NANCY E. MARCHAND-MARTELLA is the Suzi and Dale Gallagher Dean of Education and a professor in the Department of Educational Studies at Purdue University, West Lafayette, IN 47907-2098, USA.
ZAFER OZEN is a doctoral student in the Gifted Education Research and Resource Institute at Purdue University, West Lafayette, IN 47907-2098, USA.
TUGCE KARATAS is a doctoral student in Gifted, Creative, and Talented Studies at Purdue University, West Lafayette, IN 47907-2098, USA.
HELEN H. PARK is a 2018–2019 Undergraduate Research Training (URT) scholar, 2019–2020 Office of Undergraduate Research (OUR) scholar, and an undergraduate student majoring in both elementary and special education (mild P-12) with a concentration in reading as well as a minor in global studies and learning science at Purdue University, West Lafayette, IN 47907-2098, USA.
ALEXANDRA SIMPSON graduated as a psychological sciences major at Purdue University, West Lafayette, IN 47907-2098, USA.
JEFFREY D. KARPICKE is the James V. Bradley Professor and head of the Department of Psychological Sciences at Purdue University, West Lafayette, IN 47907-2098, USA.
