The Use and Misuse of Organizational Research Methods ‘Best Practice’ Articles

Abstract

This study explores how researchers in the organizational sciences use and/or cite methodological ‘best practice’ (BP) articles. Namely, are scholars adhering fully to the prescribed practices they cite, or are they cherry picking from recommended practices without disclosing? Or worse yet, are scholars inaccurately following the methodological best practices they cite? To answer these questions, we selected three seminal and highly cited best practice articles published in Organizational Research Methods (ORM) within the past ten years. These articles offer clear and specific methodological recommendations for researchers as they make decisions regarding the design, measurement, and interpretation of empirical studies. We then gathered all articles that have cited these best practice pieces. Using comprehensive coding forms, we evaluated how authors are using and citing best practice articles (e.g., if they are appropriately following the recommended practices). Our results revealed substantial variation in how authors cited best practice articles, with 17.4% appropriately citing, 47.7% citing with minor inaccuracies, and 34.5% inappropriately citing BP articles. These findings shed light on the use (and misuse) of methodological recommendations, offering insight into how we can better improve our digestion and implementation of best practices as we design and test research and theory. Key implications and recommendations for editors, reviewers, and authors are discussed.

Keywords

best practices, recommendations, guidelines, research methods, impact, organizational research methods, qualitative and quantitative methodology, organizational sciences

Being the highest impact methods journal in Management and Applied Psychology, Organizational Research Methods (ORM) is the source of some of the most influential research methods articles in the organizational sciences. Published quarterly, ORM accepts a range of articles pertaining to research methodology, some of which include “best practice” type pieces around key methodological and analytic topics. ORM pieces are clearly highly influential. As of June 2020, the top cited 35 best practice articles published in ORM between 2010 and 2020 received 8,292 total citations. Despite their popularity and utility, there is anecdotal evidence that ORM ‘best practice’ (BP) article recommendations are not always appropriately followed by organizational researchers—even when they are directly cited. Namely, in our preliminary discussion with editors and reviewers, we found surprisingly high numbers of examples of inappropriate citations and poorly followed recommendations. But how pervasive is this problem? How might we combat BP article misuse to improve the methods used in our science and enhance the impact of methodological BP articles?

To address the above research questions, we investigated the current practices of research teams citing methodological BP pieces. We first determined if and how researchers use or misuse the recommendations offered in BP articles when citing them. We then quantified the extent of the misuse, wherever present. Our review goes beyond past conversations on methodological cutoff practices (Lance et al., 2006) to consider how researchers interpret, follow, and cite methodological best practice recommendations broadly. Based upon our review, we offer steps forward to ensure appropriate usage and citations of BP articles. Hopefully, our findings raise awareness in the organizational science community on the shortcomings of how BP articles are being used and cited by researchers. Further, our review is intended to inform authors of ORM best practice articles how their recommendations are being interpreted and utilized by researchers in the field. We provide additional suggestions to journal editors and reviewers on how to evaluate and critique best practice article citations in manuscripts they are reviewing. Ultimately our recommendations will benefit the organizational science research community by preventing further misuse of BP articles, encouraging the appropriate usage of BP articles, and generally improving methodological rigor in the organizational sciences.

Citation Practices

Appropriate citation practices, both minor and substantive in nature, are fundamental to good science. Citation behaviors include correct spelling or formatting of reference lists and in-text citations, listing the correct publication date in reference lists and in-text citations, choosing to cite credible and rigorous scholarly works, and representing or describing cited works accurately and fully (APA, 2019; Harzing, 2002). Reference lists give credit to researchers for their work, connect pieces of literature to one another, and help readers identify additional literature relevant to a particular topic. Amongst other purposes, articles compiled in reference lists and cited within scholarly works can provide justification for a new scientific contribution, offer dissenting opinions to the present article, demonstrate scholarly consensus, or, in the case of many ORM pieces, provide support for methodological decisions made by the researcher. Colquitt (2013) described the selection and usage of references as a kind of craftsmanship from which readers of scholarly works can evaluate the rigor, care, and thoughtfulness of the researcher and their work product. By including certain articles in reference lists over others, the researcher is identifying the most critical ideas, theories, and recommendations that guided the decisions they made in their study. In this way, citation behaviors go well beyond appropriate italicization and indentation. They are the foundation of every publication.

Given its integral role in positioning new scientific arguments and acknowledging past contributions, standards for citation behaviors are presented in the training of new researchers and explicitly outlined by numerous sources in our field. For example, the American Psychological Association (APA, 2019) underscores the importance of appropriate citation behaviors and states, “By following the principles of proper citation, writers ensure that readers understand their contribution in the context of the existing literature—how they are building on, critically examining, or otherwise engaging the work that has come before.”

Appropriate citation behaviors and the accurate representation of past scientific work have considerable implications for the quality of our science. An improvement in citation behaviors can increase the quality of scientific arguments and, in instances where authors cite methodological articles, can increase the level of scientific and methodological rigor. One of the most impactful citation behaviors and the central focus of this study is how researchers represent and apply the research methods articles they cite in their studies. In 2002, Harzing provided 12 guidelines for good referencing behaviors in organizational behavior research. Importantly, the sixth guideline stated: “Do not misrepresent the content of the reference” (p. 128). In citing methodological articles, appropriate integration of methodological article recommendations guides research decisions and encourages empirically supported research methods.

While appropriate citation and application of methodological BP articles can improve the methodological rigor of our field, inappropriate citations and misuse of ORM BP articles have considerable implications for methodological rigor, including the continuation and perpetuation of BP article misuse. Inappropriate citations become especially concerning in the context of research methods. For instance, Lance et al. (2006) recognized this impact and ascertained the origin of commonly accepted cutoff indices or “methodological urban legends” in our field. They identified articles that authors cited when mentioning these cutoffs and found that authors were frequently misciting and misrepresenting the original contributions of these methodological pieces. One example of misrepresentation was the use of a 0.70 cutoff for coefficient alpha and citing Nunnally (1978), yet Nunnally recommended that cutoffs should only be used in “early stages of research” and never mentioned coefficient alpha in association with the 0.70 cutoff. As a result of the misrepresentation of Nunnally's (1978) work, our field is using a lenient cutoff for reliability and misappropriating this cutoff to coefficient alpha, which potentially results in inaccurate and unreliable research conclusions. The misuse of Nunnally's work even prompted researchers to publish additional methodological articles that combat these errors and provide clarifying guidance on standards for reliability (Cho & Kim, 2015; Cortina et al., 2020).

In some cases, inappropriate citations may be the product of not reading the cited article, but nevertheless citing it based on its representation in another article in which it is cited (Liang et al., 2014; Todd & Ladle, 2008). This creates a whisper-down-the-lane effect, perpetuating the misuse of the original recommendations. This can have particularly severe consequences when the article being inappropriately cited is methodological in nature. In fact, Heggestad et al. (2019) observed this phenomenon in the adaptation of scales. In what they coined the “cascading of adaptations,” Heggestad et al.'s literature review found numerous instances where authors would adapt an original scale and this adaptation would be carried forward by subsequent authors. The underlying meaning of the original construct was affected by these adaptations, yet this change was never acknowledged. This resulted in numerous studies (at least 15 for one single construct reviewed by Heggestad et al.) claiming to investigate one variable but really examining another. Like scale adaptations, author misrepresentation or misuse of methodological recommendations (best practices, cutoffs, measurement scales, etc.) carries the risk that misrepresented methodological decisions will be cited in future works, carrying forward that inappropriate practice, and will potentially become a standard for the field (e.g., Heggestad et al., 2019; Lance et al., 2006).

Consequently, citation behaviors, specifically the representation or use of article content, must be evaluated like any other methodological decision made by researchers. Lance et al. (2006) demonstrated the danger of continuous or proliferated misuse of citations of methodological pieces. Given the frequency with which ORM articles are cited and the credibility of the journal as a top outlet for methodological and organizational research, the appropriate use or misuse of its methodological recommendations can significantly impact the field.

Despite their importance, citation behaviors and their potential implications are rarely discussed or studied in our literature (Colquitt, 2013). Fields such as social work (Mitchell-Williams et al., 2017; Spivey & Wilks, 2004), marine biology (Todd et al., 2010), and experimental psychology (Faunce & Job, 2001) have published analyses of citation behaviors. Given that it has been over fifteen years since Lance et al.'s (2006) initial investigation of the proliferation of misrepresented cutoff criteria in quantitative research, there is a need to reexamine citation behaviors in our discipline and extend this investigation beyond cutoff practices.

The Current Study

The current paper investigates the representation and use of ORM best practice articles in the literature by reviewing articles that cite them. Specifically, we focus on how authors are representing the methodological recommendations of ORM BP articles, and the extent to which authors follow the methodological recommendations in their work when they cite BP articles. We focus our investigation on papers that cite ORM BP articles in their methods sections and directly compare the behaviors they report to the recommended behaviors from the original BP article they cite. We conclude by offering recommendations to alleviate BP article misuse. These recommendations are intended to inform and advise both the individuals using and interpreting BP articles (e.g., researchers, authors) and the systems that influence or contribute to the use and misuse of ORM BP pieces (e.g., editors, reviewers).

Method

Selecting BP Articles

To evaluate the use and misuse of ORM BP pieces, we first conducted exploratory literature searches and a pilot test to verify the presence (or absence) of best practice article use and misuse. Two members of our research team independently reviewed the top cited best practice articles published in ORM from 2010–2020 (N = 35; listed at the following link: https://osf.io/8bdgr/?view_only=301dd4ec48ea446aa1deaf7d51e970a3). Of these, we selected six best practice articles (all with extremely high levels of citations) for pilot coding: Aguinis et al. (2011), Becker (2005), Cho and Kim (2015), Newman (2014), Rogelberg and Stanton (2007), and Spector and Brannick (2011). Twenty-five articles citing each best practice piece were evaluated in our pilot coding (N = 150 total articles). The coding team did not use detailed coding guides in this pilot stage. Instead, they made notes on the BP recommendations that were met or not met. In the first round of pilot coding, the coders also gave ratings on the quality of the citations to quantitatively evaluate the distribution of citation practices for BP articles. Results showed that there was considerable variation; therefore, this step was not repeated in the second round of pilot coding. In both rounds of pilot coding, the coding team identified which BP articles were being consistently cited in method or results sections, since the focus of the study was to examine how researchers are citing ORM BP articles to justify their methodological decisions. Throughout this process, we noticed substantial variation in how authors were using and citing ORM BP articles (Table 1), supporting our suspicion that there are inconsistencies in how authors are implementing and citing methodological best practice articles in their works.
All six articles from pilot coding contained incidences of appropriate use or misuse of the BP articles, but additional characteristics of the BP articles ultimately informed the selection of which BP articles we would code for our study.

Table 1.
Pilot Coding Results.

| Pilot Coding Round | BP Article | Citations in Method or Results Sections | Citations Outside of Method or Results Sections | Classification |
|---|---|---|---|---|
| Round 1 | Becker (2005) | 19 | 6 | Appropriate Use: 3; Satisfactory Use: 9; Misuse: 13 |
| Round 1 | Rogelberg and Stanton (2007) | 12 | 13 | Appropriate Use: 3; Satisfactory Use: 13; Misuse: 9 |
| Round 2 | Cho and Kim (2015) | 18 | 7 | Not rated |
| Round 2 | Newman (2014) | 17 | 8 | Not rated |
| Round 2 | Aguinis et al. (2011) | 14 | 11 | Not rated |
| Round 2 | Spector and Brannick (2011) | 18 | 7 | Not rated |

Note. N = 25 articles were coded during the pilot stage for each BP article (total N = 150 articles). Two coders independently coded each article.

Moving forward, we decided to select three best practice articles to include in the current review. The three articles we selected offered particularly clear and concise guidelines for authors (e.g., included decision trees, numbered recommendations, etc.). These guidelines also were not highly dependent on the research context or reliant on the judgment of the authors citing the BP article. For example, Rogelberg and Stanton (2007) suggest three different techniques for testing nonresponse bias that a researcher must decide between based on the specific characteristics of their study. As coders, we knew it would be difficult to evaluate such decisions using only the information reported in a journal manuscript. Consequently, we intentionally selected articles with direct and specific recommendations in an attempt to avoid introducing researcher bias or subjectivity into our coding. We also wanted to ensure that we were truly capturing misrepresentation of cited articles due to author misuse rather than a fuzzy representation of an ambiguous BP article. Finally, we considered the frequency with which the BP article was being cited in method and results sections. As shown in Table 1, Rogelberg and Stanton (2007) and Aguinis et al. (2011) were not as frequently cited by researchers to justify their methodological decisions in method and results sections. As a result of our constrained sample of BP articles, our findings may offer a conservative picture of the frequency of (mis)use.

The three ORM articles that our research team decided to code for our study were Cho and Kim (2015), Newman (2014), and Spector and Brannick (2011). In their ORM best practice piece, Cho and Kim (2015) challenge six common misconceptions about coefficient alpha. Spector and Brannick (2011) question the current use (misuse) of control variables in their ORM piece, and Newman (2014) provides a set of guidelines for handling missing data. These articles are intended to provide organizational scholars with best practices that increase methodological rigor during the design and implementation of their research. Spector and Brannick (2011) was even identified as the 12th most cited ORM article by Aguinis et al. (2019).

Coding Procedure

Each author carefully read the three ORM best practice articles, focusing on the recommendations provided by the authors. We then developed initial coding guides for each best practice article based on the clear recommendations from the authors. For example, we leveraged the decision tree presented in Figure 1 (p. 374) of the Newman (2014) article to create the coding guide for the handling of missing data. Using the flow chart provided in the article, our coding guide mirrored the recommended practices presented by Newman (2014) on how to handle missing data under different circumstances and conditions.

Figure 1.
Degree of use by journal ranking.

This process resulted in three distinct coding guides (available at the following link: https://osf.io/8bdgr/?view_only=301dd4ec48ea446aa1deaf7d51e970a3), one for each best practice article. The coding forms were created via an online survey platform, Qualtrics, to reduce coding errors. To further confirm the objectivity of our coding guides, and to gain confidence in our interpretation of the recommendations offered by the BP authors, we reached out to the lead authors of the three BP pieces asking them to review (and approve) the coding guides. We received and implemented feedback from the lead author of one of the articles. We then reached out to two previous editors of ORM, who were editors of the journal when the best practice articles were originally published. We asked these two former editors to evaluate our coding guides, requesting feedback on our interpretation of the best practice articles. We adjusted our coding forms to include all advice and suggestions from these two methodological experts.

In addition to coding the content of each article, we also coded the discipline of the journal where the article was published along with the journal's rank. We used the 2019 list of Impact Factors to classify each article by discipline and identify whether the journal ranked in the top quartile of impact factors within its discipline (0 = not in the top quartile, 1 = in the top quartile). This allowed us to see if citation practices varied by discipline and/or by journal rank or prestige. Finally, we wanted to account for multiple articles in our pool being written by the same author. Articles whose lead author did not lead any other article citing the same BP piece were given a 0. Articles whose lead author led multiple articles citing the same BP piece were given a 1.
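The two binary flags described above can be sketched in Python as follows. This is a minimal illustration, not the coding procedure actually used: the field names (`bp_article`, `lead_author`), the function names, and the quartile rule (a journal counts as top quartile when its impact factor is at least that of the journal sitting at the 25% mark of its discipline's ranking) are all assumptions introduced for the example.

```python
from collections import Counter


def flag_repeat_lead_authors(articles):
    """Return one 0/1 flag per article.

    1 = the lead author also leads another article citing the same BP piece,
    0 = the lead author appears only once for that BP piece.
    `articles` is a list of dicts with hypothetical keys
    'bp_article' and 'lead_author'.
    """
    counts = Counter((a["bp_article"], a["lead_author"]) for a in articles)
    return [
        1 if counts[(a["bp_article"], a["lead_author"])] > 1 else 0
        for a in articles
    ]


def flag_top_quartile(impact_factor, discipline_impact_factors):
    """Return 1 if the journal is in the top quartile of its discipline, else 0.

    Assumed rule: sort the discipline's impact factors from high to low and
    treat the first quarter of that ranking (at least one journal) as the
    top quartile.
    """
    ranked = sorted(discipline_impact_factors, reverse=True)
    quartile_size = max(1, len(ranked) // 4)
    threshold = ranked[quartile_size - 1]
    return 1 if impact_factor >= threshold else 0
```

Keeping each flag in its own small function makes the rule behind every 0/1 code explicit, which is useful when a second coder needs to audit how a value was assigned.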

Data Collection

The two lead authors searched Web of Science to gather all articles (across various disciplines) that cited each best practice piece and were published prior to July 2020. This resulted in a total of 687 citations across all three BP pieces (Cho & Kim, 2015: N = 101; Newman, 2014: N = 164; Spector & Brannick, 2011: N = 422). Due to the methodological focus of our project, only articles that cited best practice articles in describing or supporting their methodological choices in methods and results sections (or in related tables, appendices, and footnotes) were included in data analysis. Articles that cited best practice articles for the purpose of theoretical support (e.g., citations in introduction or discussion sections) were excluded from our analyses. Additionally, we excluded articles that were not in the English language. Last, we excluded review articles, research proposals, and meta-analyses, retaining only empirical research studies in our analyses. This resulted in a final sample of 556 articles (Cho & Kim, 2015: N = 72; Newman, 2014: N = 144; Spector & Brannick, 2011: N = 340) and 9,668 codes across the entire pool of articles.
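The sample flow reported above can be checked with a few lines of Python. The retrieved and retained counts are taken directly from the text; the excluded counts and retention rates are simple derivations from those figures, and the function and variable names are hypothetical.

```python
# Counts reported in the text: articles retrieved from Web of Science and
# articles retained after applying the exclusion criteria.
RETRIEVED = {
    "Cho & Kim (2015)": 101,
    "Newman (2014)": 164,
    "Spector & Brannick (2011)": 422,
}
RETAINED = {
    "Cho & Kim (2015)": 72,
    "Newman (2014)": 144,
    "Spector & Brannick (2011)": 340,
}


def sample_flow(retrieved, retained):
    """Return per-article and overall retrieved/retained/excluded counts."""
    rows = {}
    for bp_article, n_retrieved in retrieved.items():
        n_retained = retained[bp_article]
        rows[bp_article] = {
            "retrieved": n_retrieved,
            "retained": n_retained,
            "excluded": n_retrieved - n_retained,  # derived, not reported
            "retention_rate": n_retained / n_retrieved,
        }
    total_retrieved = sum(retrieved.values())
    total_retained = sum(retained.values())
    rows["Overall"] = {
        "retrieved": total_retrieved,
        "retained": total_retained,
        "excluded": total_retrieved - total_retained,
        "retention_rate": total_retained / total_retrieved,
    }
    return rows


if __name__ == "__main__":
    for label, row in sample_flow(RETRIEVED, RETAINED).items():
        print(label, row)
```

The per-article counts sum to the reported totals of 687 retrieved and 556 retained articles, which implies that 131 citing articles were excluded across the three BP pieces.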

Reliability

To ensure accuracy in coding, we used a multi-stage coding and calibration process on random subsamples of articles citing each best practice piece. Round 1 of coding included 10% of the total sample of articles for each BP piece, round 2 was a new grouping of 10% of the total sample of articles, and the final round included ten randomly selected articles. After each round of coding, inter-rater reliability and agreement statistics were computed using Krippendorff's alpha and percent agreement, and they informed the calibration meetings where coders discussed and resolved discrepancies. Unlike percent agreement, Krippendorff's alpha is a conservative estimate of reliability as it adjusts for the likelihood of chance agreement among raters (Krippendorff, 2004). The coders resolved all discrepancies and reached acceptable agreement after the third round of coding for each article (Krippendorff's α = 93.1, 80.6, and 84.4; percent agreement = 95.7%, 90%, and 90% for Spector & Brannick, 2011; Cho & Kim, 2015; and Newman, 2014, respectively). Once the team reached acceptable agreement, the two lead authors independently coded the remaining articles. The two coders flagged any articles that warranted further discussion (e.g., unique cases; total N = 37; Spector & Brannick, 2011 = 13; Cho & Kim, 2015 = 13; Newman, 2014 = 11), and met to confer on those flagged articles until agreement was reached.
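To make the contrast between the two statistics concrete, the sketch below computes percent agreement and Krippendorff's alpha for the simplest case: two coders, nominal codes, and no missing values. It is an illustrative implementation under those assumptions, not the software used in this study, and the function names are hypothetical.

```python
from collections import Counter
from itertools import permutations


def _check(coder_a, coder_b):
    if len(coder_a) != len(coder_b) or len(coder_a) == 0:
        raise ValueError("Both coders must rate the same non-empty set of units.")


def percent_agreement(coder_a, coder_b):
    """Share of units (in percent) on which the two coders gave the same code."""
    _check(coder_a, coder_b)
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * matches / len(coder_a)


def krippendorff_alpha_nominal(coder_a, coder_b):
    """Krippendorff's alpha for two coders, nominal data, no missing values."""
    _check(coder_a, coder_b)
    # Coincidence matrix: each unit contributes both ordered pairs of codes.
    coincidences = Counter()
    for a, b in zip(coder_a, coder_b):
        coincidences[(a, b)] += 1
        coincidences[(b, a)] += 1
    # Marginal totals per code and the total number of pairable values.
    totals = Counter()
    for (code, _other), count in coincidences.items():
        totals[code] += count
    n = sum(totals.values())
    # Observed and chance-expected disagreement (common factor 1/n cancelled).
    observed = sum(cnt for (c, k), cnt in coincidences.items() if c != k)
    expected = sum(totals[c] * totals[k] for c, k in permutations(totals, 2)) / (n - 1)
    if expected == 0:
        raise ValueError("Alpha is undefined when only one code is ever used.")
    return 1.0 - observed / expected
```

For example, with codes `[1, 1, 0, 0]` from one coder and `[1, 1, 0, 1]` from the other, percent agreement is 75%, while alpha is roughly 0.53 because part of that agreement is attributed to chance.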

Results

Degrees of Use/Misuse

After conducting detailed coding of the articles citing the three BP articles, we classified the articles into one of three categories: appropriate use, satisfactory use, and misuse. The proportion of articles classified within these categories is displayed in Tables 2 and 3.

Table 2.
Classification Criteria for ORM Best Practice Article Use.

| Classification | Spector and Brannick (2011) | Newman (2014) | Cho and Kim (2015) |
|---|---|---|---|
| Appropriate Use (17.4%) | Provided appropriate justification for the inclusion of controls; Refrained from using demographic variables as controls unless theoretically relevant to the study | Reported both the response rate/s and used an appropriate method to handle missing data | Acknowledged the limitations of alpha and provided additional estimates of reliability |
| Satisfactory Use (47.7%) | Provided justification for some, but not all controls; Provided sufficient, but not comprehensive justification for the inclusion of controls (i.e., solely justified their inclusion by testing models with and without control variables) | Used appropriate method to handle missing data (e.g., FIML, multiple imputation), but did not report response rate/s; Reported response rate/s but did not use appropriate method to handle missing data | Used an alternate estimate of reliability but do not acknowledge limitations of alpha; Acknowledged limitations of alpha but do not report an alternative estimate; Reported alternative estimates for some but not all scales |
| Misuse (34.5%) | Provided no justification for control variables; Used demographic variables as controls with no justification; Only used demographic variables as controls | Used listwise deletion; Used single imputation; Deleted data; Provided no information on response rate/s; Unclear how the author/s handled the missing data | Cited a cutoff value for alpha & provided no additional estimate/s; Reported an alpha below 0.70; Reported an alpha value with no further information |

Note. N = 556 total articles (Cho & Kim, 2015: N = 72; Newman, 2014: N = 144; Spector & Brannick, 2011: N = 340). FIML = full information maximum likelihood.

Table 3.
Degrees of Use for Each BP Article in Our Sample.

| BP Article | Misuse (%) | Satisfactory Use (%) | Appropriate Use (%) |
|---|---|---|---|
| Spector and Brannick (2011) | 36.2% | 49.7% | 14.1% |
| Cho and Kim (2015) | 43.1% | 43.1% | 13.8% |
| Newman (2014) | 27.1% | 45.8% | 27.1% |
| Overall | 34.5% | 47.7% | 17.4% |

Note. N = 556 total articles (Cho & Kim, 2015: N = 72; Newman, 2014: N = 144; Spector & Brannick, 2011: N = 340).

We defined appropriate use as citations that met all or most of the BP article recommendations with no evidence of misrepresenting the content and contribution of the BP article. For example, in classifying articles citing Spector and Brannick (2011), articles that did not mention that the control variables excluded alternative explanations of the relationships amongst variables or did not test their models with and without control variables were still classified as appropriate use if they met other key recommendations such as providing sufficient justification for all control variables. Also, while fifteen articles used demographic variables as controls, they offered extensive theoretical and empirical justification for their use, thus were still included in the appropriate use category. In classifying articles citing Cho and Kim (2015), articles that reported alpha as an estimate of reliability, but also reported alternative estimates of reliability for all scales were still classified as appropriate use. Relatedly, articles that reported alpha but referred to alpha as a lower-bound estimate of reliability and acknowledged the limitations of relying on alpha as a sole estimate of reliability were classified as appropriate. In classifying articles citing Newman (2014), authors who used the appropriate method for handling missing data and provided some information regarding response rates (e.g., person-level, full or partial) were categorized as appropriate use.

The largest category, satisfactory use, represented the middle ground or grey area between appropriate use and misuse. Citations that we identified as satisfactory failed to meet more than half of the BP article recommendations but did not misrepresent or violate the main tenets of the BP article they cited. For example, in classifying articles citing Spector and Brannick (2011), articles that did not mention that the control variables excluded alternative explanations of the relationships amongst variables and did not test their models with and without control variables were classified as satisfactory use, so long as they provided sufficient justification for the use of their control variables. For classifying articles citing Cho and Kim (2015), articles that did not acknowledge the limitations of alpha as an estimate of reliability but did offer alternate estimates of reliability were considered satisfactory use. In classifying articles citing Newman (2014), articles that did not sufficiently report response rates, but used an appropriate method to handle all missing data were categorized as satisfactory in use. Further examples are described in the satisfactory use section below.

Our third and final category, misuse, embodied complete misrepresentation of the content of the BP article with frequent violations of the article's recommendations. The specific criteria the coding team used to classify the articles citing each respective BP article are listed in Table 2. The percentage of articles that fell into each of the three categories per BP article is listed in Table 3.
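As an illustration of how criteria of this kind translate into a decision rule, the sketch below encodes a simplified reading of the Table 2 criteria for articles citing Cho and Kim (2015). It is not the coding form used in the study: the function names, the two input fields, and the collapsing of the criteria into "both met, one met, neither met" are assumptions made for the example.

```python
from collections import Counter


def classify_cho_kim(acknowledges_alpha_limits, alternative_estimates):
    """Classify a citing article under a simplified reading of Table 2.

    acknowledges_alpha_limits: True if the article notes alpha's limitations.
    alternative_estimates: 'all', 'some', or 'none', i.e., for how many scales
    an alternative reliability estimate is reported.
    """
    if alternative_estimates not in {"all", "some", "none"}:
        raise ValueError("alternative_estimates must be 'all', 'some', or 'none'")
    if acknowledges_alpha_limits and alternative_estimates == "all":
        return "appropriate use"
    if acknowledges_alpha_limits or alternative_estimates != "none":
        # Only one recommendation followed, or followed for some scales only.
        return "satisfactory use"
    return "misuse"


def category_shares(labels):
    """Return the percentage of articles in each category, rounded to 0.1."""
    counts = Counter(labels)
    total = len(labels)
    return {category: round(100 * n / total, 1) for category, n in counts.items()}
```

Applying `classify_cho_kim` to every coded article and passing the resulting labels to `category_shares` yields a percentage breakdown in the same form as Table 3.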

Appropriate use of BP Article Recommendations

Overall, 17.4% of our sample (97 out of a total 556 articles) appropriately applied all or most of the recommendations of the BP article that they cited. In this grouping, the articles cited the BP article and revealed no evidence of misrepresenting or contradicting the main recommendations from the BP piece. For example, when citing Spector and Brannick (2011), exemplary articles offered appropriate justification for the use of control variables and refrained from using demographics unless theoretically relevant to the variables of interest in the study. Regarding the citation of Cho and Kim (2015), exemplary articles acknowledged the limitations of alpha and proceeded to provide alternative estimates of reliability (e.g., McDonald's omega). Similarly, when citing Newman (2014), exemplary articles clearly reported response rates and used an appropriate method to handle the missing data based on the level of missingness.

It is important to note that the classification of an article into the appropriate use category does not imply perfect application of BP article recommendations or the implementation of especially rigorous methodology. In fact, we see less than ideal practices in these articles as well. Out of the forty-eight articles citing Spector and Brannick (2011) that were identified as appropriate use citations, only twelve provided sufficient theoretical justification for at least one of their included control variables. These twelve articles make up roughly half of the twenty-three articles in our total sample that provided sufficient theoretical justification. In other words, articles citing Spector and Brannick (2011) rarely applied this specific recommendation, regardless of their degree of use. Although authors reported theoretical justification for the inclusion of their controls in articles identified as appropriate citations, a subset of articles in this category still failed to meet the full recommendations presented by Spector and Brannick (2011).

Relatedly, in the sample of articles citing Cho and Kim (2015) that were classified as appropriate use citations, three of the ten articles reported “composite reliability” estimates in addition to or in place of coefficient alpha. By comparison, only three of the thirty-one articles in the satisfactory use category reported composite reliability estimates. Cho and Kim (2015) discuss composite reliability in their article and suggest that composite reliability is a broad family of reliability estimates that compute the reliability of the composite score. Coefficient alpha is included in this family; nevertheless, 30% of the articles that met our criteria for appropriate use citations of Cho and Kim (2015) stated the use of “composite reliability” instead of reporting the specific estimate used (i.e., unidimensional omega).

Finally, in articles appropriately citing Newman (2014), sensitivity analyses were conducted at a similar frequency in the appropriate use category compared to the articles classified in the satisfactory use category. When person-level missingness was above 30%, 10.5% of the articles in the appropriate use category did not conduct a sensitivity analysis compared to only 5.3% of articles in the satisfactory use category. Newman (2014) clearly states that sensitivity analysis should be conducted when person-level missingness exceeds 30% and yet some articles still failed to do so.

Hence, our classification of appropriate use is a liberal estimate as we chose to include articles that met most (not necessarily all) of the recommendations of the BP article they cited. In spite of these imperfections, a few articles classified in the appropriate use category were exemplary demonstrations of the appropriate application of BP recommendations. We acknowledge and review these exemplar citations in the discussion section, encouraging authors to implement similar use and citation practices in future work.

Satisfactory Use of BP Article Recommendations

In total, 47.7% of our sample (265 out of a total 556 articles) were satisfactory in their application of the recommendations of the BP article that they referenced. Articles in this category followed some (but not most) of the recommendations provided in the cited BP article. For example, Spector and Brannick (2011) propose a reasonable case should be made for the inclusion of control variables, notably claiming, “that merely stating a theory or pointing to observed relationships among control and substantive variables in the past is insufficient (pg. 296).” Therefore, articles that merely pointed to prior research or previous empirical findings as sole justification for their control variables were classified as satisfactory. Relatedly, articles that merely cited a theory as justification for their control variables (without elaboration or further detail) were categorized as satisfactory use. These articles provided justification, although weak and/or insufficient, which we acknowledge as satisfactory compared to articles that provided no justification for control variables (which we classified as misuse). Another example of weak justification would be an author only testing their models with and without control variables and retaining the control variables if there were significant differences between the two models. Last, when articles provided justification for some (but not all) of the control variables, they were classified as satisfactory use.

Regarding the recommendations of Cho and Kim (2015), one example of a satisfactory citation was an article that acknowledged the limitations of alpha but did not provide an alternative estimate of reliability. These authors correctly cite Cho and Kim (2015) when discussing the limitations or misconceptions of alpha, yet they do not follow a key recommendation to report an alternative estimate of reliability. Likewise, authors who reported additional estimates of reliability but did not acknowledge the limitations of using alpha as a reliability estimate were classified as satisfactory use. These authors are following the recommendation to report additional estimates of reliability; however, they are missing key arguments by Cho and Kim (2015) regarding the misconceptions and limitations of using alpha. A last example of satisfactory use was an article that noted limitations of alpha and provided additional estimates of reliability for some (but not all) of its scales (e.g., only provided additional estimates for scales that had lower alphas). While these authors are following some of the recommendations provided in Cho and Kim (2015), they are missing key guidelines stated in the BP piece regarding the use (and misconceptions) of alpha.

When citing Newman (2014), articles in this category may have used an appropriate method to handle missing data but did not provide information on response rates and level of missingness of their data. Alternatively, some articles in this category were adequate in reporting response rates but did not use the appropriate method to handle missing data based on the level of missingness (e.g., an author who used pairwise deletion when Newman (2014) recommends they use multiple imputation or maximum likelihood based on their level of missing data).

Evidently, nearly half of the time, authors are adhering to some of the recommendations from the BP article that they are citing. However, they are missing key elements from the suggested best practices, which make their usage insufficient.

Misuse of BP Article Recommendations

Last, 34.5% of our sample (192 out of a total of 556 articles) misused the recommendations of the BP article that they cited. These articles misrepresented the content of the BP article, with frequent violations of the article's recommendations.

For example, Spector and Brannick (2011) advise that researchers be explicit about the hypothesized role for all variables in an analysis—including controls. They explicitly note that merely pointing to past research is insufficient justification for including control variables, and they urge authors to provide theoretical justification and evidence upon which to base their suppositions. Further, the authors urge researchers to eliminate the use of demographics as control variables, especially when used as proxies for other variables of interest. Consequently, we classified articles as misusing a citation of Spector and Brannick (2011) when (a) authors provided no justification for the inclusion of their control variables, and (b) used demographic variables as control variables or proxies with no justification of their inclusion. Forty-two (12.4%) of the articles citing Spector and Brannick (2011) solely included demographic control variables and offered no justification for their inclusion. Once again, the recommendations of Spector and Brannick (2011) are quite clear, yet some authors are still not following all of their best practice guidance.

Incidences of misuse are further evidenced in citations of Cho and Kim (2015). Notably, the highest proportion of articles classified within the misuse category were citing Cho and Kim (2015). This is most likely due to situational constraints placed on a researcher in computing reliability coefficients, including a lack of access to software or tools to compute additional estimates of reliability and potential expectations from journal editors and reviewers to report coefficient alpha at a certain magnitude to receive credibility. In their BP article, Cho and Kim (2015) note, “alpha has typically been referred to as a reliability coefficient rather than a lower-bound estimate; however, the latter is a more correct description in a strict sense.” (pg. 211). Yet, an overwhelming majority of articles referred to alpha as reliability (83.3%), despite Cho and Kim’s (2015) reservations. In fact, only 5.5% of our sample (4 articles) appropriately referred to alpha as the lower-bound estimate of reliability. Relatedly, Cho and Kim (2015) advise, “alpha does not indicate internal consistency in any definitions of psychometric properties…there is little utility in using the term internal consistency from the perspective of clarity and usefulness.” (pg. 216). Again, a majority of our sample referred to alpha as internal consistency despite Cho and Kim’s (2015) advice. Further, Cho and Kim (2015) explicitly recommend that researchers avoid using alpha alone as a reliability coefficient. In spite of this, more than half of the articles that cited Cho and Kim (2015) in their methods section reported alpha in isolation.

Instead of applying the specific recommendations of Cho and Kim (2015), we noted that articles identified as inappropriate citations occasionally cited Cho and Kim (2015) in sentences justifying or explaining “low” coefficient alpha values (i.e., at or below 0.70). Cho and Kim (2015) recommend that cutoff criteria not be applied unwittingly to all contexts and that researchers instead interpret the magnitude of the coefficient value in the context of the research stage and research purpose. One article justified, “The Cronbach's alpha was higher than 0.70 in five of the six constructs…The construct quality did not satisfy this condition but its Cronbach's alpha remained within an acceptable range (Cho & Kim, 2015).” This reference offers no explanation for why the cutoff value of 0.70 is applicable to all other measures in the study at this research stage and for this research purpose but is not applicable to the single measure that had coefficient alpha values below the cutoff. In these examples, authors are misinterpreting or misapplying the reasoning behind the BP article recommendation. Despite the clarity of Cho and Kim’s (2015) guidance and the description for each recommendation, researchers are nonetheless failing to adhere to the other specific recommendations regarding the limitations of alpha when referencing Cho and Kim (2015).

We found similar examples of inappropriate citations in articles citing Newman (2014). According to Newman, “listwise deletion and single imputation are never recommended” (p. 374). Newman (2014) is clear in his recommendation for using all available data when running analyses. Based on this guidance, we classified articles as ‘misuse’ when authors: (a) used listwise deletion, (b) used single imputation, or (c) deleted data. Seventeen articles (out of the 144) deleted data or used single imputation to handle missingness and incorrectly cited Newman (2014) when doing so, representing nearly 12% of our sample. An additional 36.1% of our sample did not report using any technique for handling missing data or were unclear in their discussion of their missing values. Thus, nearly half of the articles that cite Newman (2014) are misusing the recommendations provided when handling missing data. Newman (2014) also recommends detailed and transparent reporting of response rates, including person-level (i.e., overall response/nonresponse rate), full (i.e., respondents who answered every scale) and partial (i.e., respondents who answered some but not all of the scales) response rates. Yet, surprisingly, less than half of our sample reported person-level response rates and less than a quarter of the articles reported the full and partial response rates. Clearly, authors need to be more transparent and detailed when reporting their response rates and levels of missing data.
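The proportions reported for Newman (2014) citations can be retraced with a few lines of arithmetic. The Python sketch below is illustrative only: the counts come from the text and Table 4, while the helper `pct` and the assumption that the 69 articles in Table 4 that did not use recommended techniques consist of the 17 deletion or single-imputation articles plus the unreported or unclear articles are ours, not part of the original coding scheme.

```python
# Counts taken from the text and Table 4; variable names are illustrative.
N_NEWMAN = 144               # articles citing Newman (2014)
DELETED_OR_SINGLE_IMP = 17   # deleted data or used single imputation
NOT_RECOMMENDED_TOTAL = 69   # Table 4: did not use recommended techniques


def pct(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal."""
    return round(100 * count / total, 1)


# Assumption: the remainder of the 69 articles are those that reported no
# technique or were unclear about how missing values were handled.
unreported_or_unclear = NOT_RECOMMENDED_TOTAL - DELETED_OR_SINGLE_IMP

print(pct(DELETED_OR_SINGLE_IMP, N_NEWMAN))   # 11.8 ("nearly 12%")
print(pct(unreported_or_unclear, N_NEWMAN))   # 36.1
print(pct(NOT_RECOMMENDED_TOTAL, N_NEWMAN))   # 47.9 ("nearly half")
```

Under that assumption, the two subgroups sum to the 47.9% shown in Table 4.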

Overall, the levels of use vary by degree—ranging from exemplary use to absolute misrepresentation. Most articles were merely satisfactory in their implementation and reporting of methodological best practices, indicating authors can and should do better at executing, citing and representing best practices in their methods and analyses.

Degrees of Use by Journal Tier and Lead Author

To investigate whether citation (mis)use varied systematically by journal quality (e.g., whether higher impact journals have less citation misuse than lower impact journals), we categorized each article based on the impact factor of the journal in which it was published. The majority of our sample was published in journals outside of the top quartile of impact factor rankings (N = 365 of 556 articles). Nevertheless, our findings showed substantial variation in how authors are citing and using these three BP articles regardless of whether the article was published in a top journal or not. Figure 1 displays the distribution of the degrees of use across the two journal tiers and shows that the distribution was almost identical between top tier journals and journals outside the top tier. Thus, the variation in citing and representing ORM BP articles within our sample is present independent of journal quality. Top quality journals within our field are not immune to poor citation practices and changes should be made to avoid continued misuse in the future.

Last, to address concerns that our findings may be a reflection of ‘bad apples’ (i.e., the result of the same author misusing the BP article in multiple publications) as opposed to a ‘bad barrel’ (i.e., a pervasive problem in the field), we analyzed the frequency of researchers listed as first authors of articles citing each BP piece. Our total sample contained more articles written by unique first authors (N = 411) than articles written with shared first authors (N = 145). Importantly, the distribution of the degrees of (mis)use between these two groups was almost identical (see Figure 2), indicating BP citation practices are largely independent of first author and represent a more widespread issue in the field.

Figure 2.
Degree of use by first author.

Discussion

Citation behaviors, specifically the representation of past work in present studies, are fundamental to the way we conduct scientific research and contribute our findings to the broader community. The appropriate use of references is a craft (Colquitt, 2013) and an expected standard (APA, 2019; Harzing, 2002) that has numerous implications for the rigor and validity of scientific work. The present study sought to evaluate the representation of methodological best practice articles from ORM specifically because of the frequency with which these works are cited and the reality that methodological citations directly influence the research design and procedures chosen by author teams.

We focused our investigation on three highly cited ORM BP articles that we identified as providing clear and explicit recommendations to researchers. Ultimately, our results showed that there is significant variation in how people are citing these works and applying the best practice recommendations. Although we have no way of knowing whether it is accidental or intentional, authors are frequently, but not always, misrepresenting the original works that they cite. These results provide significant lessons in how to cite appropriately and guard against the consequences presented by BP citation misuse.

Exemplars of Appropriate Use

We would like to acknowledge some exceptional cases that met all recommendations put forth by the BP piece that they cited. These articles thoroughly and accurately represented the main contribution of the BP piece and should serve as exemplars of how BP articles should be cited in future works.

Koopman et al. (2016) provided in-depth empirical justification for their control of daily task performance when exploring the relationship between organizational citizenship behaviors (OCBs) and work goal progress. The authors state, “In order to fully understand whether OCB interferes with individuals’ ability to accomplish their daily work goals—or what the individual wanted to accomplish on a given day—it is important to partial out what the employee did accomplish on that day (i.e., their level of task performance).” The authors go on to cite prior work that explains how task performance could influence the relationship between their variables of interest, following Spector and Brannick’s (2011) advice on providing sufficient justification for the usage of control variables. A second example of an appropriate citation of Spector and Brannick’s (2011) BP article is evident in work by Martinez et al. (2017). These authors acknowledged the potential limitations of control variables and avoided using demographic variables as controls in their studies. They elaborated, “as there is no reason to believe that these characteristics could contaminate the measurement of or cause spurious relations between our focal variables, we follow current recommendations and omit these characteristics as control variables.” The authors did consider two potential controls in their second study (i.e., organizational support and transgender identity centrality), but provided both empirical and theoretical justification for doing so.

Next, Bijttebier et al. (2018) appropriately cited Cho and Kim (2015) when discussing calculations of internal consistency. The authors use three estimates to evaluate internal consistency for each scale: coefficient alpha, omega (ω), and omega h (ωH). The authors acknowledged the limitations of relying on coefficient alpha as the sole estimate of internal consistency, per Cho and Kim’s (2015) discussion of the misconceptions of alpha as an estimate of reliability. In a second exemplar citing Cho and Kim (2015), Garneau et al. (2020) (a) provide support that their items measure a single factor, (b) demonstrate that the test items are essentially tau-equivalent in statistical similarity, and (c) report that the error scores of the items are uncorrelated prior to estimating alpha. The authors also refer to alpha as the lower-bound estimate of reliability when reporting. Additionally, the authors provide an alternative estimate of reliability, McDonald's omega, to provide further evidence of reliability for their scales.

Helfgott et al. (2020) cite Newman (2014) when discussing the handling of their missing data. Following Newman’s (2014) advice, the authors used multiple imputation to handle missing data (instead of single imputation or listwise deletion). Further, the authors provide a table depicting means, standard errors of the mean, and percent missing for each construct in the study. Incorporating a column into a descriptive table that captures missing data percentages for each construct is a clear and concise way to report construct-level missingness. The authors also report person-level and item-level response rates in the manuscript. In a final exemplar, Bieling et al. (2015) first acknowledge the limitations of using listwise deletion when handling their missing data, stating “listwise case deletion might bias results through variance reduction and is less reliable in estimating model fit and parameter estimates, especially in smaller samples.” Instead, the authors use full information maximum likelihood (FIML) to handle missing data, citing Newman (2014) in their decision to do so. The authors also performed various robustness checks to determine whether the data were missing completely at random (MCAR), missing at random (MAR), or missing not at random (MNAR). Further, they checked for early/late respondents and external variables that might indicate a pattern that could explain the missing values. The above articles provided great detail when discussing response rates, missing values, and the procedures involved with handling missing values at each level, serving as exemplars when it comes to citing and using Newman’s (2014) best practice recommendations.

The above examples serve as positive demonstrations of proper usage and citations of ORM BP articles. We encourage authors to review the above examples when citing BP articles in the future, and model similar citation behaviors.

Implications of Satisfactory Use or Misuse

Unfortunately, the inappropriate or satisfactory use of BP article recommendations occurred more frequently than the appropriate use demonstrated in the previous exemplars. This misrepresentation of ORM BP articles has considerable repercussions both for the research study where it is misrepresented and for the field more broadly. For instance, using inappropriate missing data techniques and including meaningless control variables can bias study results (Newman, 2014; Spector & Brannick, 2011). Similarly, improper reliability estimates produce false confidence in the reliability of our measures (Cho & Kim, 2015) and impact observed relationships. Consequently, not adhering to the methodological recommendations offered in best practice articles hinders the rigor of the study and brings into question the confidence in findings. Additionally, and as seen in past works (Heggestad et al., 2019; Lance et al., 2006), inappropriate citation behaviors have a whisper-down-the-lane effect. The presence of a citation error in one article can result in subsequent misuses and distortions of the true contributions and original recommendations of the ORM BP article. Distortions of methodological practices then lead to distortions in the validity of our scientific results and conclusions.

Notably, we selected the three BP articles for our sample because they contained clear guidelines and/or figures depicting best practice recommendations, with specific procedures for authors to follow. Yet, we still observed a fairly high frequency of misuse in citations (see Tables 3 and 4) despite the clarity of recommendations. If we applied the rate of appropriate use, satisfactory use, and misuse to citations of the top 35 cited BP articles between 2010 and 2020 (N = 8,301 citations) and assume that 81% were cited within the method section as found in our study (N = 6,724 BP article citations in method and results sections), only 1,170 articles may be appropriately applying the practices. In contrast, we would estimate that 3,207 articles are citing BP articles in a satisfactory manner and 2,320 articles are citing BP articles and representing the recommendations inappropriately. This would need to be empirically verified; nevertheless, this is illustrative of the low rate of appropriate use and citation of BP articles relative to other levels of use and the scale of this problem is potentially immense. We would suspect an even higher frequency of misuse in citations of BP articles without the same level of directness (e.g., BP articles without figures or bulleted recommendations). Thus, we believe our findings are a conservative representation of the actual level of misuse occurring in the field, expecting increased misrepresentation and misuse in articles that cite more ambiguous BP recommendation articles.
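The extrapolation in the preceding paragraph can be reproduced with a short calculation. The Python sketch below is a minimal illustration, not the procedure used in the study: the function name `extrapolate` and the choice to round the method-section count to a whole number before applying the rates are assumptions, and the 17.4% appropriate-use rate is taken from the abstract.

```python
# Inputs as reported in the article; names are illustrative.
TOTAL_CITATIONS = 8301   # citations of the top 35 BP articles, 2010-2020
METHOD_SHARE = 0.81      # share of citations appearing in the method section
RATES = {
    "appropriate": 0.174,   # rate reported in the abstract
    "satisfactory": 0.477,
    "misuse": 0.345,
}


def extrapolate(total: int, share: float, rates: dict) -> tuple:
    """Apply the observed use rates to the estimated method-section citations.

    Assumption: the method-section count is rounded to a whole number first,
    and each rate is then applied to that rounded count.
    """
    method_citations = round(total * share)
    estimates = {k: round(method_citations * r) for k, r in rates.items()}
    return method_citations, estimates


method_citations, estimates = extrapolate(TOTAL_CITATIONS, METHOD_SHARE, RATES)
print(method_citations)  # 6724
print(estimates)  # {'appropriate': 1170, 'satisfactory': 3207, 'misuse': 2320}
```

These values match the figures quoted above; as the text notes, they are illustrative projections that would need empirical verification.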

Table 4.
Reporting Practices in Articles Citing ORM ‘Best Practice’ Articles.

Reporting Practices in Articles That Cited Spector and Brannick (2011); N = 340 Articles | N (%) | ( + ) / (-)
Used demographic variables as control variables without justification | 93 (27.4%) | -
Tested their model with and without control variables | 190 (55.9%) | +
Provided theoretical and/or empirical justification for control variables | 204 (60%) | +
Used control variables to test alternative explanations for a finding | 103 (30.3%) | +

Reporting Practices in Articles That Cited Cho and Kim (2015); N = 72 Articles | N (%) | ( + ) / (-)
Referred to alpha as reliability | 60 (83.3%) | -
Referred to alpha as internal consistency | 44 (61.1%) | -
Reported alpha in isolation | 42 (58.3%) | -
Provided an alternative estimate of reliability for all scales | 23 (31.9%) | +

Reporting Practices in Articles That Cited Newman (2014); N = 144 Articles | N (%) | ( + ) / (-)
Did not use recommended techniques [i.e., Listwise Deletion, Single Imputation, No Technique Applied] | 69 (47.9%) | -
Used recommended techniques [i.e., FIML, Expectation Maximization, Multiple Imputation] | 75 (52.1%) | +
Reported person-level response rate (e.g., overall response/nonresponse rate) | 58 (40.3%) | +
Reported full response rate (e.g., full respondents who answered every scale) | 35 (24.3%) | +
Reported partial response rate (e.g., respondents who answered some but not all of the scales) | 36 (25%) | +

Note. ( + ) = the practice followed recommendations of the BP article (e.g., appropriate use); (-) = the practice did not follow recommendations of the BP article (e.g., misuse); FIML = Full Information Maximum Likelihood.

Given the nature of our data, the implications of inappropriate citation behaviors in methods sections, and the possibility that BP article misuse is more egregious than what was presented in this study, we developed seven recommendations for various stakeholders in this issue.

Recommendations

Our suggested response to the trends in citation practices of ORM BP pieces depends on the origin of the issue. The explicit reasons for misrepresented ORM best practices are unknown, but there are a few potential explanations. For one, the authors citing BP articles may not understand the content of the article. In this scenario, the onus for proper citation behaviors is not solely on the authors of articles citing BP pieces. However, in other circumstances, the author teams citing BP pieces may be failing to apply BP recommendations due to their own individual shortcomings (e.g., not reading the BP article). Regardless of the reason behind the misrepresentation, there are practices targeted towards different stakeholders that can combat further misuse of methodological BP articles. We used findings from our literature coding to generate specific recommendations for multiple stakeholders in methodological BP articles, including authors citing BP articles, authors of BP articles, and journal editors and reviewers. We believe all of these stakeholders share responsibility for reducing citation misuse; that responsibility does not rest solely with the author teams citing BP articles.

Authors Citing BP Articles

Adhere to Existing Citation Standards

First and foremost, authors citing BP articles should follow existing citation standards to avoid misrepresenting ORM BP articles and to prevent the continuance of the whisper-down-the-lane effect. The most direct way of preventing this misuse is for authors to read the original ORM BP article each time they follow a recommended practice and cite a BP article, rather than relying on memory or citing and using the article the way it was represented in another scholarly work. Authors citing BP articles should also make a concerted effort to check their reference lists prior to submitting to a journal to ensure that the referenced articles are accurately represented in the body of the article.

Follow BP Article Recommendations in Their Entirety

Authors should apply BP article recommendations in their entirety rather than use BP articles to justify singular decisions. These articles provide a collection of best practices to researchers, and authors should not pick and choose which best practices to follow (and which to ignore). Our results showed that authors are not adhering to all best practice recommendations made in ORM BP articles. The selection of one or two best practices to follow while citing a BP article is a misrepresentation of the article's full contribution of appropriate methodological behaviors. Additionally, the adherence to one best practice does not equate to good methodology nor does it guard against the body of negative consequences that may follow. For instance, an article may cite Spector and Brannick (2011) to justify excluding demographic control variables, but without sufficient theoretical and empirical justification for the controls that are included, there is still a risk of reporting biased results. Thus, authors citing BP articles should make every effort to follow all BP recommendations.

We acknowledge there are incidences where certain BP recommendations may not apply to an author's research. For example, Spector and Brannick (2011) suggest demographic variables “should be avoided as mere control variables.” (pg. 297). Yet, one of the articles we coded included both age and gender as control variables in its research study and cited Spector and Brannick (2011) in its discussion of controls. However, the authors do provide extensive justification for using these two demographic variables as controls (e.g., citing research and theory on the Big Five personality traits). Thus, in this circumstance, it was appropriate to deviate from Spector and Brannick’s (2011) recommendations and use demographic variables as controls considering the substantial prior empirical and theoretical findings.

Interestingly, no articles coded in our study explicitly mentioned not following all recommendations from the BP article they cited, nor provided justification for doing so. When authors choose not to apply certain recommendations, they should transparently note what specific practices they chose not to implement and provide sufficient justification for their decision.

Consider BP Articles During Research Design and Planning

We encourage authors to develop a plan for their methods and analyses prior to data collection so they can more closely follow the best practices recommended in ORM articles. Certain methodological decisions (e.g., the inclusion of control variables) must be made prior to data collection. Accordingly, researchers should consult ORM best practice articles during the research design phase to appropriately integrate the recommendations into their study. This behavior also encourages researchers to reread BP articles instead of relying on how previous researchers have used the BP article citation in their work. Thoroughly reading the BP article increases the likelihood of the author team including all (or most) best practice recommendations in their research design and improves the description of BP article recommendations in their final manuscript.

Report BP Adherence Fully and Transparently

Finally, authors citing BP articles should provide very detailed and transparent reporting of how they used the recommendations presented in the BP article. In some articles coded in our study, the text describing the methodological decisions affiliated with a BP article citation was unclear. For example, an article in our sample citing Spector and Brannick (2011) tested multiple models but did not explicitly state that the models were run with and without controls or that they chose this method to evaluate the relevance of the measured control variables. Similarly, some articles citing Newman (2014) were unclear on what response rate they were reporting (e.g., person-level, full, partial) or what method they used to handle missing data. Ambiguous descriptions of methodological decisions hold significant consequences, including the inability to replicate research and improper interpretation of study results. Other researchers reading the article could also have an inaccurate impression of the BP article based on how it is represented and described in the article referencing it. As previously mentioned, this could lead to further misuse of the BP article as recorded in past studies on methodological traditions (Lance et al., 2006).

We recognize that page and word count constraints imposed upon author teams often limit the amount of space they can dedicate to methods-related reporting in the final journal manuscript. To avoid deleting critical methodological decisions and justification for those decisions from the manuscript, we encourage authors to better utilize footnotes and/or appendices as a way to include more detail regarding their methodology and analyses without interfering with article flow or compromising manuscript space. Additionally, we recommend that researchers investigate the willingness of the journal to post online supplemental materials with the journal article. We also recommend that researchers leverage existing repositories and sites, such as the Open Science Framework (OSF), to provide additional materials to readers as we did in the present article. This has the dual advantage of (a) allowing authors sufficient space to elaborate on the recommendations that they followed from the referenced BP article and (b) creating opportunities for researchers to engage in open science practices that are increasingly used in our field today. See Banks et al. (2019) for more detailed information on open science practices and strategies for posting research materials through OSF.

Authors of BP Articles

Clearly Present BP Recommendations

Authors of BP articles have a responsibility to present their findings and recommendations in a manner that minimizes misunderstanding. Authors of BP pieces should present their recommendations in a form that makes it easier for researchers to follow them and to demonstrate that they have adhered to the best practices (e.g., provide checklists or charts). Oftentimes, ORM BP authors number their recommendations or identify them explicitly as recommendations (as we did here) to increase the readability of their article and very clearly communicate to researchers what practices they are arguing should be utilized. When possible, consider presenting a visual to help authors identify and follow recommendations.

Cho and Kim (2015) offer very clear recommendations surrounding the selection of an estimate of reliability and the appropriate language and application of coefficient alpha, specifically. However, they do not clearly state whether researchers should report that the conditions for using coefficient alpha as a reliability estimate are met. As a result, when authors reported alpha as a measure of internal consistency and cited Cho and Kim (2015), only 42% provided evidence of tau equivalency, 42% reported that the error scores were uncorrelated, and 50% demonstrated that their measure represented a single factor. Recommendations that were not clearly defined or identified as necessary steps to conduct and report were found to be more frequently ignored or misrepresented in our study.

Newman (2014) arguably contained the clearest and most formulaic recommendations of the articles included in our sample, owing in part to its topic of missing data. We suspect this clarity is also due in part to the presence of a decision tree figure in the BP article that clearly outlines the methodological best practice in different scenarios of missing data. Newman (2014) clearly defined which missing data techniques should be avoided and which techniques should be used in various contexts. The consequence of these clear definitions was a fairly low number of articles misusing these recommendations. Newman (2014) had the highest proportion of article citations in the appropriate use category. When recommendations are clearer, researchers are better able to recognize and interpret the key takeaways and research behaviors that they should employ when citing the BP article. This ultimately enhances the usability of the recommendations and decreases the chance of misinterpretation or of authors selecting some practices over others (e.g., cherry picking best practices).

Explicitly State the Necessity of the BP Recommendations

Second, and relatedly, we would advise authors of future best practice articles to design recommendations in a way that limits researchers’ flexibility to make certain methodological decisions. Consider the Nunnally (1978) cutoff as an example. Even the clear statement that the cutoff was only useful in certain research contexts seemed insufficient, given the successive misapplication of the cutoff. Authors of BP articles should explicitly identify when and how best practices should be applied so researchers have more stringent guidelines to follow, and recommendations are not left up to interpretation.

We recognize that in certain scenarios, best practice recommendations are largely dependent on multiple factors and contexts that must be interpreted by the researcher. This could make the generation of mandatory best practice recommendations difficult or impossible. However, there are strategies to increase recommendation clarity without sacrificing the nuance of the best practice (e.g., Newman's decision tree, recommendations listed in order of importance, etc.), and we would encourage BP article authors to consider these strategies and limit researcher flexibility in applying BP recommendations.

Journal Editors and Reviewers

Increase Reviewer Education

After establishing standards for citation behaviors or the use of best practice recommendations, reviewers should be educated on the standards to which authors should be held accountable in their use of best practices. Reviewer training and education in best practice recommendations would raise reviewer standards for citations within methods sections and increase scrutiny of methodological citation practices during the review process. Both of these outcomes have the potential to reduce inappropriate use of best practice recommendations in the literature.

Incorporate Alternative Reviewing Approaches for Research Methods

Journal editors should consider other reviewing approaches that would reduce the likelihood that authors misrepresent best practice recommendations in our literature. One example is the assignment of a methods reviewer, an approach used by the Journal of Management (JoM). This reviewer dedicates their attention to the methodological decisions of the author team and evaluates the rigor and validity of these decisions in the context of the theoretical and practical justification provided within the article. Methods reviewers could also evaluate the references author teams employ in justifying their decisions. As methodological experts themselves, methods reviewers should be familiar with methodological best practices and can use that knowledge to act as best practice reference gatekeepers. In a similar vein, Todd and Ladle (2008) recommend that journals have reviewers conduct a random audit of references at some stage of the review process to combat poor citation behaviors. Randomly evaluating a sample of references used by author teams could prevent misrepresented citations from being published. Moreover, the authors’ understanding that reference lists will be critiqued would encourage them to ensure the appropriate use of citations at each stage of the research process.
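A random audit of this kind could be operationalized in many ways. The following Python sketch shows one simple possibility: drawing a reproducible simple random sample of reference entries for a reviewer to check by hand. The function name `sample_references_for_audit`, the sample size, and the seed are our own hypothetical choices; Todd and Ladle (2008) do not prescribe a specific procedure.

```python
import random


def sample_references_for_audit(references, sample_size, seed=None):
    """Return a simple random sample of reference entries to check by hand."""
    if sample_size > len(references):
        raise ValueError("sample_size cannot exceed the number of references")
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(references, sample_size)


# Hypothetical reference list, abbreviated to author-year labels.
reference_list = [
    "Becker (2005)",
    "Cho and Kim (2015)",
    "Newman (2014)",
    "Spector and Brannick (2011)",
    "Todd and Ladle (2008)",
]
to_audit = sample_references_for_audit(reference_list, sample_size=2, seed=2008)
print(to_audit)
```

Recording the seed alongside the review would let an editor reproduce exactly which references were drawn for the audit.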

The final reviewing approach that we recommend journals adopt is to encourage or (in some cases) require authors submitting to their journals to pre-register their research hypotheses and/or research design. The process of pre-registration forces researchers to plan their methodology and analyses upfront, which we identified in a previous recommendation to authors as a way to prevent the misuse of best practice article recommendations. Not only does pre-registration encourage the researcher to think through methodological and/or analytical decisions early in the research process, it may also prevent or mitigate the tendency to engage in questionable research practices (QRPs). For example, the selection of one missing data technique over another could alter the significance of a study's findings. As a result, authors may select a missing data technique that increases the statistical significance of their findings rather than the technique that is recommended by Newman (2014). Although we are not accusing authors who inappropriately cite Newman (2014) of questionable research practices such as p-hacking, we recognize that the current system for publishing research may unintentionally incentivize QRPs (John et al., 2012). This system could continue to influence a researcher's motivation to make certain methodological decisions. To prevent the continued misuse of BP article recommendations, whether the misuse is due to last-minute planning of the research design or to QRPs, study pre-registration should be leveraged. Specifically, authors can disclose that if their dataset contains missing data, they will follow Newman's (2014) guidelines. Once methodological decisions (e.g., missing data techniques, control variables) are pre-registered, authors citing BP pieces can be held accountable for following them in their final manuscript.

Publish Editorial Statements on and Standards of Best Practice Methodologies

We encourage editors to publish editorial statements and set standards about the use of these practices. Currently, there are few instances in which journal editors or reviewers explicitly set standards for the appropriate use of BP recommendations or the use of citations more broadly. Colquitt (2013) made an editorial statement on the expectations of citing behaviors within the Academy of Management Journal, but this conversation was not replicated in other journals or emphasized in most journal standards.

Currently, journals such as the Journal of Applied Psychology (JAP) have existing standards for other methodological decisions such as the reporting of specific information in empirical studies. For example, authors are expected to report scale anchors and the scale range in primary quantitative studies. JAP also released Transparency and Openness Promotion (TOP) Guidelines to encourage open science practices. Author engagement in open science practices would alleviate some of the concerns found in this study; however, these standards do not ameliorate all concerns presented in this article. Additional standards should be created and communicated by journals to further reduce inappropriate citation behaviors in our science and specifically within methods sections. While the above standards provide opportunities to hold authors accountable for methodological reporting, they do not explicitly incorporate up-to-date methodological best practices nor enforce the application and appropriate use of BP articles.

In the same way that journals have a checklist for transparency and basic methods reporting practices, journals should consider institutionalizing methodological best practices through updated methodological reporting standards or a new methodological best practice checklist. Journals would be responsible for creating these lists and consistently updating them to reflect current best practices. Best practice checklists would benefit both authors and reviewers. Authors would have a summary list of methodological decisions to address in the design, implementation, and write-up of their studies, and reviewers would be able to rely on a checklist, rather than on their education alone, to evaluate the rigor and appropriateness of an article's methods and methodological citations. The creation and implementation of best practice standards in the form of a checklist supports our previous recommendation of ongoing reviewer education on best practices in research methods and relieves reviewers of the burden of methodological expertise. To our knowledge, no journal currently offers clear criteria to researchers or reviewers on how they expect best practice articles to be used and represented in methods sections within their journal.

Limitations

We made every effort to fairly evaluate the articles citing ORM BP pieces. Ultimately, we categorized the use and misuse of methodological BP articles under the assumption that the authors read the BP article that they cited and understood the content of the BP article. The APA (2019) specifically states that authors should cite “only works that you have read and ideas that you have incorporated into your writing.” Given the explicit guidelines and training provided to researchers on how to appropriately incorporate past articles that are included in their reference lists, the comprehensive training offered to graduate students and guidance presented to researchers by Harzing (2002), the APA, and other resources, we operated under the assumption that researchers generally understand the expected standards for reference lists and citation behaviors. How researchers represent prominent methodological pieces with the knowledge of those well-known standards was the central focus of our inquiry.

We recognize that there are unmeasured influences on an author's citation behaviors, including an author's motivation or intent for citing an article in their study (among other influential factors), that may result in the misrepresentation of a BP article. An author's motivation for citing an article varies. As evidenced by Lance et al. (2006), seminal or notorious articles are sometimes cited to bring credibility or approval to a particular research design or study. We cannot and do not wish to speak to the motivations or decision process of the authors citing BP articles; therefore, we only compared the methodological decisions they reported to the explicit recommendations presented in the BP articles they cite. We made no additional inferences about why the author teams made certain decisions in how they represented and used BP articles in their study. We solely sought to record the varying degrees of use and misuse of BP articles and to make recommendations to improve the use and representation of BP articles based on our systematic observations of current citation practices.

Conclusion

Citation behaviors are fundamental to our science. Across a sample of three ORM best practice articles, only 17.4% of articles citing these pieces were identified as appropriately representing and applying their recommendations. In considering the results of our literature coding and existing standards and behaviors in the journal publication process, we presented nine key recommendations to three stakeholder groups to prevent further misuse of best practice articles. Adherence to and application of the recommendations above can increase the appropriate use of ORM best practice articles, which in turn increases the overall methodological rigor of our science.

Coding Round	BP Article	Citations in Method or Results Sections	Citations Outside of Method or Results Sections	Classification
Pilot Coding Round 1	Becker (2005)	19	6	Appropriate Use: 3; Satisfactory Use: 9; Misuse: 13
	Rogelberg and Stanton (2007)	12	13	Appropriate Use: 3; Satisfactory Use: 13; Misuse: 9
Pilot Coding Round 2	Cho and Kim (2015)	18	7	
	Newman (2014)	17	8	
	Aguinis et al. (2011)	14	11	
	Spector and Brannick (2011)	18	7	

Classification	Spector and Brannick (2011)	Newman (2014)	Cho and Kim (2015)
Appropriate Use (17.4%)	Provided appropriate justification for the inclusion of controls; Refrained from using demographic variables as controls unless theoretically relevant to the study	Reported both the response rate/s and used an appropriate method to handle missing data	Acknowledged the limitations of alpha and provided additional estimates of reliability
Satisfactory Use (47.7%)	Provided justification for some, but not all controls; Provided sufficient, but not comprehensive justification for the inclusion of controls (i.e., solely justified their inclusion by testing models with and without control variables)	Used appropriate method to handle missing data (e.g., FIML, multiple imputation), but did not report response rate/s; Reported response rate/s but did not use appropriate method to handle missing data	Used an alternate estimate of reliability but did not acknowledge limitations of alpha; Acknowledged limitations of alpha but did not report an alternative estimate; Reported alternative estimates for some but not all scales
Misuse (34.5%)	Provided no justification for control variables; Used demographic variables as controls with no justification; Only used demographic variables as controls	Used listwise deletion; Used single imputation; Deleted data; Provided no information on response rate/s; Unclear how the author/s handled the missing data	Cited a cutoff value for alpha and provided no additional estimate/s; Reported an alpha below 0.70; Reported an alpha value with no further information

BP Article	Misuse (%)	Satisfactory Use (%)	Appropriate Use (%)
Spector and Brannick (2011)	36.2%	49.7%	14.1%
Cho and Kim (2015)	43.1%	43.1%	13.8%
Newman (2014)	27.1%	45.8%	27.1%
Overall	34.5%	47.7%	17.4%

Reporting Practices in Articles That Cited Spector and Brannick (2011); N = 340 Articles	N (%)	( + ) / (-)
Used demographic variables as control variables without justification	93 (27.4%)	-
Tested their model with and without control variables	190 (55.9%)	+
Provided theoretical and/or empirical justification for control variables	204 (60%)	+
Used control variables to test alternative explanations for a finding	103 (30.3%)	+
Reporting Practices in Articles That Cited Cho and Kim (2015); N = 72 Articles	N (%)	( + ) / (-)
Referred to alpha as reliability	60 (83.3%)	-
Referred to alpha as internal consistency	44 (61.1%)	-
Reported alpha in isolation	42 (58.3%)	-
Provided an alternative estimate of reliability for all scales	23 (31.9%)	+
Reporting Practices in Articles That Cited Newman (2014); N = 144 Articles	N (%)	( + ) / (-)
Did not use recommended techniques [i.e., Listwise Deletion, Single Imputation, No Technique Applied]	69 (47.9%)	-
Used recommended techniques [i.e., FIML, Expectation Maximization, Multiple Imputation]	75 (52.1%)	+
Reported person-level response rate (e.g., overall response/nonresponse rate)	58 (40.3%)	+
Reported full response rate (e.g., full respondents who answered every scale)	35 (24.3%)	+
Reported partial response rate (e.g., respondents who answered some but not all of the scales)	36 (25%)	+
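
The percentages in the reporting-practices table above follow directly from the counts and the number of citing articles for each BP article. As a transparency aid, the short Python sketch below recomputes each percentage from the tabled counts and lists any that disagree with the stated value. The variable and function names are hypothetical; the numbers are transcribed from the table.

```python
# Counts and article totals transcribed from the reporting-practices table above.
# Each entry: total citing articles, then (count, stated percentage) per row.
reported = {
    "Spector and Brannick (2011)": (340, [(93, 27.4), (190, 55.9), (204, 60.0), (103, 30.3)]),
    "Cho and Kim (2015)": (72, [(60, 83.3), (44, 61.1), (42, 58.3), (23, 31.9)]),
    "Newman (2014)": (144, [(69, 47.9), (75, 52.1), (58, 40.3), (35, 24.3), (36, 25.0)]),
}


def percent(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Collect every row whose recomputed percentage differs from the stated one.
mismatches = [
    (article, count, stated)
    for article, (total, rows) in reported.items()
    for count, stated in rows
    if percent(count, total) != stated
]
print(mismatches)  # prints []
```

An empty list indicates that every stated percentage matches its count divided by the corresponding N.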

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship and/or publication of this article.

ORCID iD

Liana M. Kreamer

Author Biographies

Liana M. Kreamer is a doctoral student in Organizational Science at the University of North Carolina at Charlotte. Her current research interests include leadership, workplace meetings, and virtual teams. Liana serves as an Assistant Editor for the Journal of Business and Psychology. She is also a member of UNCC’s Yet! and OSSI outreaches. Before attending the OS program, Liana earned her bachelor’s degree in Psychology at Pennsylvania State University and her master’s degree in Industrial and Organizational Psychology at UNC Charlotte.

Betsy H. Albritton is a doctoral student in Organizational Science at the University of North Carolina at Charlotte. Her current research interests include measurement, big data science, and leadership.

Scott Tonidandel is a Professor of Management in the Belk College of Business at the University of North Carolina - Charlotte and is a faculty member of the Organizational Science PhD program and the School of Data Science. Scott received his M.A. & Ph.D. in industrial-organizational psychology from Rice University and his B.A. from Davidson College. Scott’s research interests include issues related to leader effectiveness, the impact of diversity in organizations, and research methods and statistics. His recent work focuses on people analytics and the interface of big data and the organizational sciences.

Steven G. Rogelberg holds the title of Chancellor’s Professor at UNC Charlotte for distinguished national, international and interdisciplinary contributions. He is a Professor of Organizational Science, Management, and Psychology as well as the Director of Organizational Science. He has nearly 150 publications addressing issues such as team effectiveness, leadership, engagement, health and employee well-being, meetings at work, and organizational research methods. He is the Editor of the Journal of Business and Psychology, recipient of the Humboldt Award, and is currently President of SIOP.

References

Aguinis, H., Pierce, C. A., Bosco, F. A., Dalton, D. R., & Dalton, C. M. (2011). Debunking myths and urban legends about meta-analysis. Organizational Research Methods, 14(2), 306–331. https://doi.org/10.1177/1094428110375720

Aguinis, H., Ramani, R. S., & Villamor, I. (2019). The first 20 years of Organizational Research Methods: Trajectory, impact, and predictions for the future. Organizational Research Methods, 22(2), 463–489. https://doi.org/10.1177/1094428118786564

American Psychological Association. (2019, September). In-text citations. APA Style. https://apastyle.apa.org/style-grammar-guidelines/citations

Banks, G. C., Field, J. G., Oswald, F. L., O’Boyle, E. H., Landis, R. S., Rupp, D. E., & Rogelberg, S. G. (2019). Answers to 18 questions about open science practices. Journal of Business and Psychology, 34(3), 257–270. https://doi.org/10.1007/s10869-018-9547-8

Becker, T. E. (2005). Potential problems in the statistical control of variables in organizational research: A qualitative analysis with recommendations. Organizational Research Methods, 8(3), 274–289. https://doi.org/10.1177/1094428105278021

Bieling, G., Stock, R. M., & Dorozalla, F. (2015). Coping with demographic change in job markets: How age diversity management contributes to organisational performance. German Journal of Human Resource Management, 29(1), 5–30. https://doi.org/10.1177/239700221502900101

Bijttebier, P., Bastin, M., Nelis, S., Weyn, S., Luyckx, K., Vasey, M. W., & Raes, F. (2018). Temperament, repetitive negative thinking, and depressive symptoms in early adolescence: A prospective study. Journal of Psychopathology and Behavioral Assessment, 40(2), 305–317. https://doi.org/10.1007/s10862-017-9624-8

Cho, E., & Kim, S. (2015). Cronbach’s coefficient alpha: Well known but poorly understood. Organizational Research Methods, 18(2), 207–230. https://doi.org/10.1177/1094428114555994

Colquitt, J. A. (2013). Crafting references in AMJ submissions. Academy of Management Journal, 56(5), 1221–1224. https://doi.org/10.5465/amj.2013.4005

Cortina, J. M., Sheng, Z., Keener, S. K., Keeler, K. R., Grubb, L. K., Schmitt, N., Tonidandel, S., Summerville, K. M., Heggestad, E. D., & Banks, G. C. (2020). From alpha to omega and beyond! A look at the past, present, and (possible) future of psychometric soundness in the Journal of Applied Psychology. Journal of Applied Psychology, 105(12), 1351–1381. https://doi.org/10.1037/apl0000815

Faunce, G. J., & Job, R. F. (2001). The accuracy of reference lists in five experimental psychology journals. American Psychologist, 56(10), 829–830. https://doi.org/10.1037/0003-066X.56.10.829

Garneau, M., Laventure, M., & Temcheff, C. E. (2020). Internal structure and measurement invariance of the Dominic Interactive among Indigenous children in Quebec. Psychological Assessment, 32(2), 170–181. https://doi.org/10.1037/pas0000775

Harzing, A. W. (2002). Are our referencing errors undermining our scholarship and credibility? The case of expatriate failure rates. Journal of Organizational Behavior, 23(1), 127–148. https://doi.org/10.1002/job.125

Heggestad, E. D., Scheaf, D. J., Banks, G. C., Monroe-Hausfeld, M., Tonidandel, S., & Williams, E. B. (2019). Scale adaptation in organizational science research: A review and best-practice recommendations. Journal of Management, 45(6), 2596–2627. https://doi.org/10.1177/0149206319850280

Helfgott, J. B., Parkin, W. S., Fisher, C., & Diaz, A. (2020). Misdemeanor arrests and community perceptions of fear of crime in Seattle. Journal of Criminal Justice, 69, 1–19, 101695. https://doi.org/10.1016/j.jcrimjus.2020.101695

John, L. K., Loewenstein, G., & Prelec, D. (2012). Measuring the prevalence of questionable research practices with incentives for truth telling. Psychological Science, 23(5), 524–532. https://doi.org/10.1177/0956797611430953

Koopman, J., Lanaj, K., & Scott, B. A. (2016). Integrating the bright and dark sides of OCB: A daily investigation of the benefits and costs of helping others. Academy of Management Journal, 59(2), 414–435. https://doi.org/10.5465/amj.2014.0262

Krippendorff, K. (2004). Measuring the reliability of qualitative text analysis data. Quality and Quantity, 38, 787–800. https://doi.org/10.1007/s11135-004-8107-7

Lance, C. E., Butts, M. M., & Michels, L. C. (2006). The sources of four commonly reported cutoff criteria: What did they really say? Organizational Research Methods, 9(2), 202–220. https://doi.org/10.1177/1094428105284919

Liang, L., Zhong, Z., & Rousseau, R. (2014). Scientists’ referencing (mis)behavior revealed by the dissemination network of referencing errors. Scientometrics, 101(3), 1973–1986. https://doi.org/10.1007/s11192-014-1275-x

Martinez, L. R., Sawyer, K. B., Thoroughgood, C. N., Ruggs, E. N., & Smith, N. A. (2017). The importance of being “me”: The relation between authentic identity expression and transgender employees’ work-related attitudes and experiences. Journal of Applied Psychology, 102(2), 215. https://doi.org/10.1037/apl0000168

Mitchell-Williams, M. T., Skipper, A. D., Alexander, M. C., & Wilks, S. E. (2017). Reference list accuracy in social work journals: A follow-up analysis. Research on Social Work Practice, 27(3), 348–352. https://doi.org/10.1177/1049731515578536

Newman, D. A. (2014). Missing data: Five practical guidelines. Organizational Research Methods, 17(4), 372–411. https://doi.org/10.1177/1094428114548590

Nunnally, J. C. (1978). Psychometric theory (2nd ed.). McGraw-Hill.

Rogelberg, S. G., & Stanton, J. M. (2007). Introduction: Understanding and dealing with organizational survey nonresponse. Organizational Research Methods, 10(2), 195–209. https://doi.org/10.1177/1094428106294693

Spector, P. E., & Brannick, M. T. (2011). Methodological urban legends: The misuse of statistical control variables. Organizational Research Methods, 14(2), 287–305. https://doi.org/10.1177/1094428110369842

Spivey, C. A., & Wilks, S. E. (2004). Reference list accuracy in social work journals. Research on Social Work Practice, 14(4), 281–286. https://doi.org/10.1177/1049731503262131

Todd, P. A., Guest, J. R., & Chou, L. M. (2010). One in four citations in marine biology papers is inappropriate. Marine Ecology Progress Series, 408, 299–303. https://doi.org/10.3354/meps08587

Todd, P. A., & Ladle, R. J. (2008). Citations: Poor practices by authors reduce their value. Nature, 451, 244. https://doi.org/10.1038/451244b