Abstract
This research studied changes in theory building and testing levels, reported in 668 articles published in three leading Human Resource Development (HRD) journals from 2000 to 2017. Using a modified taxonomy of theory building research, we found evidence that the trajectory of theory building and testing efforts in HRD is consistent with the field having reached the stage of a mature discipline. The study found that some types of research have become less important as the discipline matured (e.g., Reporters, or articles reporting observation of phenomena) or have remained stable (Modifiers of existing theories), whereas others have steadily grown in importance (theory Builders and Expanders). Correlating the results of citation analysis with types of articles, we found that articles that propose and test new theoretical constructs at the same time, or those that expand new theories, enjoy significantly higher levels of citations, compared with articles that report observations of practice or duplicate earlier studies.
As an academic field, Human Resource Development (HRD) is still young when compared with such related fields as management and organization studies, adult education, and industrial and organizational psychology. In the late 1990s and early 2000s, as the field was still going through the early maturation stages, vibrant debates about the state and future of theory building in HRD were conducted at the annual meetings of the Academy of Human Resource Development (AHRD) and in major HRD publications. Important contributions to these debates were made by essays on the foundations and boundaries of HRD, published by Gary McLean (1998) and Richard Swanson (1995, 1999) and by articles on definitions and theoretical underpinning of HRD, appearing in a 2001 issue of Human Resource Development International (HRDI; Lee, 2001; McGoldrick, Stewart, & Watson, 2001; McLean & McLean, 2001). In 2002, a special issue of Advances in Developing Human Resources (ADHR), devoted to HRD theory building methods, made a major contribution to our growing understanding of the types of theory building research relevant to this applied field (e.g., Lynham, 2002a; Torraco, 2002). Articles in the special issue addressed, among other things, existing gaps in HRD theory building and possible future directions for theory development efforts.
Since then, the field has made tremendous progress toward becoming an established applied research discipline. The leading academic professional association in the field, the AHRD, celebrated its 25th anniversary in 2018. The Academy sponsors four peer-reviewed journals: Human Resource Development Quarterly (HRDQ), Human Resource Development Review (HRDR), HRDI, and ADHR. Two of these publications (HRDQ and HRDR) are listed in the Journal Citation Reports, with impressive impact factors. The range of topics covered in articles published in the leading HRD journals is highly diverse. This is not surprising, given that HRD is an interdisciplinary field that attracts, in addition to scholars and practitioners trained in HRD doctoral programs, academics and scholar-practitioners with backgrounds in various social science disciplines (e.g., sociology, industrial and organizational psychology, management, adult education, and career and technical education). Likewise, in terms of research methods, HRD publications display openness to a wide range of epistemological stances and to various quantitative, qualitative, and mixed-method designs.
Despite these advances, questions remain whether HRD as a field is making steady progress in strengthening the rigor and sustainability of theory building efforts. A limited number of articles have attempted to identify topics and themes covered in HRD journals over a period of time (Garavan, O’Donnell, McGuire, & Watson, 2007; Ghosh, Kim, Kim, & Callahan, 2014), and more recently, Han, Chae, Han, and Yoon (2017) provided an overview of the evolving body of the HRD research literature by tracing the changes in definitions of HRD, citations to influential work, and emerging research topics. However, to our knowledge, no attempts have been made to systematically study the process and outcomes of theory building efforts in the field.
Swanson and Chermack (2013) asserted that a mature discipline needs to have a body of formal theories, developed through a succession of studies that empirically test theoretical propositions or hypotheses. Such tests lead to refinement (and, sometimes, refutation) of theories as a result of finding the new evidence that often contradicts the initial assumptions. Turner, Baker, and Kellner (2018) indicated that, as a discipline, we have not yet answered such questions as “What are HRD’s most researched and pragmatic theories? What are HRD’s formal theories? Are these formal theories represented in the current literature?” (p. 35). Specifically, we do not have empirical data demonstrating that new formal HRD theories are not only proposed in conceptual articles but also subjected to rigorous testing in empirical research.
Thus, the main goal of this article is to take stock of the advances in theory building and testing, made in the field of HRD since the beginning of the 21st century. To achieve this goal, we analyzed articles, published in three leading HRD journals (HRDQ, HRDI, and ADHR) between the years 2000 and 2017. In our analysis, we used a modified version of the taxonomy of theory building and testing efforts, proposed by Colquitt and Zapata-Phelan (2007). Our data set was limited to empirical articles drawing on Bacharach’s (1989) notion that empirical adequacy is an important criterion for evaluating theoretical contributions. Empirical articles published in rigorous peer-reviewed journals consist of both conceptual (theory building) and empirical (theory testing) content, while conceptual articles primarily focus on theoretical development without presenting the results of theory testing (Yadav, 2010).
We acknowledge that conceptual articles play an important role in theory advancement by enhancing conceptual adequacy (Corley & Schinoff, 2017; Shrivastava, 1987). The conceptual adequacy in applied disciplines, like HRD, has a function of generating theoretically adequate conceptual frameworks based on underlying disciplines such as psychology, economics, and sociology (Dubin, 1978). Conceptual articles facilitate theory development by presenting theoretical syntheses (e.g., theory review articles) or introducing new ideas (e.g., proposing new theories or formulating propositions). Therefore, in future studies, we hope to expand our research to include the assessment of contributions of conceptual articles (especially those published in HRDR).
In this study, we focused on four research questions: What are the levels of theory building and theory testing of empirical studies, reported in articles published in three leading HRD journals? What are the differences among the journals in levels of theory building and theory testing contributions? What are the changes in theory building and theory testing over time? and What is the impact of various types of articles, as measured by citation analysis?
The rest of this article is structured as follows. First, we provide an overview of the literature on taxonomies of theory building efforts. Second, we review briefly the literature on theory building efforts in HRD. Third, we explain the methods and procedures, used in our study. Fourth, we present our findings, organizing them in four sections according to the above-listed research questions. The article concludes with a discussion of study implications.
Literature Review and Research Questions
We start this section by discussing literature on what constitutes a theoretical contribution. Next, we discuss taxonomies of theoretical contributions and review HRD literature on theory building contributions and article impact analysis. Finally, we present four research questions, derived from the literature review.
Types of Theoretical Contributions
Academic disciplines mature through articulating theories that address important phenomena for their research domains (Nerur, Rasheed, & Natarajan, 2008; Rust, 2006; Turnbull, 2002a; Weber, 2012). Torraco (2004) maintained that different types of theoretical research are needed depending on the development stages of an academic discipline. However, theory testing in social science disciplines seems to be lagging behind the generation of new conceptual frameworks. For example, Kacmar and Whitfield (2000) found that only 9% of theoretical frameworks, published in the Academy of Management Research Journal, have been tested through empirical research.
Before we elaborate on the notion of theoretical contribution, we need to ask a more fundamental question: “what is a theory?” Although there is a wide range of definitions of theory in the social science literature, many scholars conceptualize theory as either the explanation or the prediction of relationships between independent and dependent variables that are used to describe a phenomenon (Colquitt & Zapata-Phelan, 2007; Turner et al., 2018). Comparing theory with data, variables, and hypotheses, DiMaggio (1995) and Sutton and Staw (1995) emphasized that theory should account for the relationship between variables and go beyond narratives by clarifying a set of boundary conditions and empirical tests of the plausibility of the narrative. Jaccard and Jacoby (2010) maintained that theories are socially constructed and evolve over time. In this study, we define theory as a framework that can be used to explain or predict a phenomenon within a set of boundary conditions and can be tested by empirical research.
There is no agreed-upon answer to a question regarding what constitutes a theoretical contribution. Important contributions to this debate were made by Bacharach (1989), Bartunek et al. (2006), Corley and Gioia (2011), K. G. Smith and Hitt (2005), and Van de Ven (1989), to name a few. Corley and Gioia (2011), based on their review of the literature on theoretical contributions, argued that one way of assessing a study’s theoretical contribution is by looking at its originality, with studies ranging on a continuum from incremental to revelatory. The incremental contributions are made within the old, time-tested paradigm of the normal science, whereby theory advancements are achieved when scholars add to the knowledge base by testing and refining the earlier formulated theories or by proposing new theories to change the status quo in understanding the phenomena of interest (Kuhn, 1962). Thus, theoretical bases of a field are continually evolving through collective effort of multiple players (Snow & Thomas, 1994; Turner et al., 2018). Turner and his colleagues (2018) suggested a conceptual map of a theory’s life cycle, which shows that informal theories that provide new perspectives or challenge existing theories could obtain rigor and become formal theories after passing multiple tests over time, forming a foundation of a new body of knowledge within an academic discipline.
The revelatory type of contributions is made “when theory reveals what we otherwise had not seen, known, or conceived” (Corley & Gioia, 2011, p. 17). Or, as asserted by Mintzberg (2005), a good theoretical contribution challenges previous assumptions, surprises us, and/or looks at the phenomena in new, unconventional ways.
Bacharach (1989) maintained that theoretical contributions of studies may be evaluated using the criteria of falsifiability and utility. Falsifiability refers to the possibility of empirical refutation of a theory, and utility refers to whether a theory can explain or predict phenomena that occur in the empirical world. Theoretical contributions can be made by either conceptual or empirical studies (Corley & Schinoff, 2017). However, Bacharach emphasized that conceptual research needs to be tested to build complete theories, mentioning that the goal of research is “to ensure that theoretical systems and statements can be empirically tested, and provide some source of explanation and prediction” (p. 512).
Reay and Whetten (2011) explained two different ways through which studies can contribute to the body of knowledge. First, they improve the explanatory power of existing theories by modifying them through new insights. Second, studies reject existing theories and create new theories to replace them. Generally, the first trend is observed more often than the second.
Whetten (1989) used Dubin’s (1978) framework to propose that a good theory must possess four elements: factors (variables, constructs, and concepts), the relationships between factors, underlying assumptions to justify choosing factors and relationships, and a set of boundary conditions. Whetten argued that the impact of theoretical contributions of articles could be different depending on which of the four elements they address (and improve). Even an improvement in only one of the elements could make a strong contribution to theory development. Thus, changes in factors and relationships among them, or introduction of new factors and relationships, can significantly impact the body of knowledge. The modifications of underlying assumptions can, in turn, lead to changes in factors and relationships of the theory.
However, it is assumed that changes in boundary conditions alone cannot significantly improve a body of knowledge. In other words, applying theories to a new setting by itself cannot contribute to improving a theory. Nevertheless, when a theory does not work in a new setting, this finding by itself can provide an impetus for further theoretical improvements: The discrepancy between research findings and the existing theory prompts theorists to focus their attention on understanding the source of the identified anomalies. Finally, when a theory works in a new setting, such theory testing results can contribute to the growth of the body of knowledge by providing a new reference point for theory generalization.
Taxonomy of Theory Building and Testing Efforts
According to Colquitt and Zapata-Phelan (2007), studies can contribute to the disciplinary body of knowledge in two ways: through theory building and theory testing. In certain cases, theory testing is one of the steps in a sequence of efforts aimed at building a new theory (e.g., Dubin, 1978; Lynham, 2002b; Van de Ven, 2007). However, in other cases, scholars do not have a goal of developing a new theory and, instead, focus on testing existing theories, and this activity by itself can play an essential role in the incremental growth of theoretical bases of a field. Colquitt and Zapata-Phelan’s (2007) taxonomy (Figure 1) shows two different types of theoretical contributions. The authors maintained that most articles are making contributions to both theory building and theory testing, but the extent of such contributions varies. This constitutes a departure from the previously prevailing notion that articles should be classified as contributing to either theory building or theory testing (e.g., Snow & Thomas, 1994). As Figure 1 shows, the taxonomy consists of axes for theory building and testing, each with a 5-point scale ranging from 1 to 5. The authors measured the level of theory building by using Whetten’s (1989) criteria, discussed earlier. In addition, they referred to Sutton and Staw’s (1995) work that described the difference between theory and other concepts, such as hypotheses.

Figure 1. A taxonomy of theoretical contributions for empirical articles.
Colquitt and Zapata-Phelan (2007) classified articles into five types depending on the scores assigned to them in theory building and testing: Reporters, Testers, Qualifiers, Builders, and Expanders. Reporters show low levels of theoretical contribution in both theory building and theory testing. However, these articles play an essential role by attracting attention to previously unnoticed phenomena. According to Colquitt and Zapata-Phelan, “Testers are defined as empirical articles that contain high levels of theory testing but low levels of theory building” (p. 1286). Such articles use, as a rule, well-developed and established theories as their theoretical frameworks and test these theories using empirical methods. Furthermore, the authors define Qualifiers as articles that “contain moderate levels of both theory testing and theory building. Such articles qualify previously established relationships or processes using conceptual arguments rooted in the extant literature” (p. 1286). Builders are, as a rule, quite high on the theory building dimension but low on testing. Quite often these are inductive studies that propose new constructs and relationships but conduct only limited tests of the proposed theory, using small samples. Finally, Expanders are high in both theory building and testing. These studies articulate new relationships or constructs and, at the same time, test the existing theories and models (and, usually, add new constructs to the tested models).
This taxonomy has been acknowledged as a useful framework to evaluate the levels of theory building and testing (e.g., Aguinis, Dalton, Bosco, Pierce, & Dalton, 2010; Fink, 2013). However, the use of such terms as mediator and moderator suggests that this taxonomy is based on postpositivist assumptions. Compared with management research, where postpositivism is the predominant paradigm, HRD research tends to be based on a wider variety of epistemological stances and approaches (Turnbull, 2002b). However, Lynham (2002a) stated that although different theory building methods seem to use different processes, “there is an inherently generic nature to theory building” (p. 221), and an applied theory building process is a recursive system of five distinct phases: “conceptual development, operationalization, application, confirmation or disconfirmation, and continuous refinement and development (of the theory)” (p. 229). Therefore, we assume that the taxonomy, proposed by Colquitt and Zapata-Phelan, could be applied to a variety of empirical studies, whether they are qualitative or quantitative, or based on inductive or deductive research logic, if appropriate modifications are made to categories on the two axes of the taxonomy.
Theory Building in HRD
As HRD is a relatively young academic discipline, the debate about its boundaries, philosophies, and definitions is still ongoing (Garavan, Gunnigle, & Morley, 2000; Han et al., 2017; Ruona, 2016). Previous research has contributed to solidifying the realm of HRD by delineating definitions, major topics of interest, and related disciplines (e.g., Ghosh et al., 2014; Han et al., 2017; Jeung, Yoon, Park, & Jo, 2011; Ruona, 2016).
Ruona (2016) maintained that although various definitions usually focus on one of the aspects of HRD work, for instance, either learning or performance improvement, most of them incorporate the emphasis on learning and development in the workplace (e.g., Hamlin & Stewart, 2011; Ruona & Swanson, 1998; Weinberger, 1998). Ruona further emphasized that while HRD focused on learning and development of individuals at the initial stages of the discipline’s evolution, at the later stages its scope has broadened to include the organizational level and organization development as one of the core areas of HRD practice.
Han et al.’s (2017) study used network and scientometric analysis and discussed changes in the identity and boundaries of HRD, separating the history of HRD into three big waves. The first wave, from 1960s to 1970s, was a period in which HRD was establishing its goals and identity as an academic discipline. Articles, published at that time, seemed to agree that the role of HRD is to develop human resources to achieve the organization’s goals.
The second wave, from 1980s to 1990s, was characterized by two developments. The first was the expansion of the focus of research, from individual to organizational learning and development (Ruona, 2016). At the same time, leadership training and management education had become essential HRD practices (Garavan, 1991; Smith, 1990). The other theme was the debate between learning and performance as the ultimate outcome of HRD. In the third wave, starting in 2000s, HRD scholars have focused on the role of globalization and technological advancements in changing workplace environments.
A limited number of articles in HRD journals used content and citation analysis to determine the evolution of the thematic focus and impact of HRD research. Thus, Ghosh et al. (2014) analyzed 939 articles published in ADHR, HRDI, HRDQ, and HRDR in the years 2002 to 2011 to track the changes in research topics through frequency and content analysis. They reported that although the share of research articles on learning and training decreased during this period, the number of studies on leadership, performance, and knowledge increased rapidly.
Jeung et al. (2011) identified 20 HRD articles cited most often by publications in disciplines other than HRD. The study sample included articles, published in four HRD journals, sponsored by the AHRD, from 1990 to 2009. The study found that HRD articles in three areas (training transfer and evaluation, learning organization, and knowledge sharing and creation) were cited most frequently by articles published in business, management, education, learning, and performance literature.
Summary and Research Questions
In summary, the above review shows that theoretical contributions can be evaluated based on a number of criteria, and the most fruitful direction is to identify an article’s contribution along two axes: contribution to theory building and contribution to theory testing. These contributions can be classified into a number of categories. Furthermore, the levels of contributions are likely to change over time as the discipline matures. In addition, it is possible to identify the impact of various types of contributions, using citation analysis techniques and correlating the citation results with contribution types. Applying this logic to our study of theoretical contributions of HRD articles, we formulated the following research questions:
Method
In this section, we discuss our data sources and analysis procedures. We also explain why we decided to modify the taxonomy of theory contributions and present our modified taxonomy.
Data Collection
Our data set consists of all peer-reviewed articles from three journals sponsored by the AHRD (ADHR, HRDI, and HRDQ), published between 2000 and 2017. We downloaded all peer-reviewed articles from the three journals into an Excel file, arranging them by journal, year, and issue. As mentioned earlier, we decided to focus on empirical articles only, following the arguments of Dubin (1978) and Bacharach (1989) that theory building can be considered complete when proposed theories are empirically tested. Empirical research articles (668 total) that make a theoretical contribution by building and testing theory simultaneously were retained in the database, while the articles, identified as nonempirical (e.g., conceptual, theoretical, or essay articles) were removed (Table 1). HRDR was excluded from our study since the journal aims to publish conceptual work as opposed to articles reporting empirical studies.
Table 1. Peer-Reviewed and Empirical Articles by Journal.
Note. ADHR = Advances in Developing Human Resources; HRDI = Human Resource Development International; HRDQ = Human Resource Development Quarterly.
First issues of the three journals were published in different years (1999 for ADHR, 1998 for HRDI, and 1990 for HRDQ). To begin our review with the same year, we selected the year 2000 as a starting point, instead of 1999, since the first volume of ADHR consisted of conceptual articles, discussing research methods, or providing overviews of various HRD topics, rather than empirical studies.
Data Analysis
After an initial attempt to analyze a sample of articles, we concluded that the taxonomy, as presented in Figure 1, is mostly geared toward the analysis of articles that are based on postpositivist assumptions and does not allow for a more fine-tuned analysis of other empirical contributions (e.g., interpretive, ethnographic, and critical theory-based studies, often found in the HRD publications). To make the taxonomy more applicable to the analysis of a wider range of studies, we borrowed ideas from Reay and Whetten (2011). In terms of the level of theory building, the extent to which an article modifies existing theories or creates new theories was evaluated based on Reay and Whetten’s (2011) argument that there are two different ways to expand theory. To assess the level of theory testing, we followed the method that Colquitt and Zapata-Phelan (2007) used.
As we were developing our rating procedures, we agreed that raters may have different perspectives, biases, and levels of understanding about the rating criteria. To address these possible imbalances, three authors independently rated 20 randomly selected articles, and then discussed each individual’s rationale for their choices. These assessments and discussions took place over the course of a month and involved multiple meetings. Through this debating and learning process, we were able to establish common standards for rating the remaining articles.
After the above-described initial assessment, in which all three authors participated, the rating of the remaining articles in the data set was conducted independently by the first two authors. Despite our efforts to eliminate potential mismatches between raters, the ratings may vary due to raters’ mistakes, their ability to rate properly, or miscommunication between raters (Landers, 2015). To check reliability of ratings between the two raters, we conducted an interrater reliability test using the intraclass correlation coefficient (ICC) with 30 randomly selected articles (Landers, 2015; Shrout & Fleiss, 1979). In social science, the usefulness of ICC for checking reliability between raters is well documented (Bartko, 1966). ICC has been widely used to investigate reliability between raters in various fields of study (Amabile, Conti, Coon, Lazenby, & Herron, 1996; Colquitt & Zapata-Phelan, 2007; Shear et al., 2001). ICC (2) adopted in our interrater reliability test allowed us to identify the consistency of ratings between the two raters. A recent study suggested that ICC values between .5 and .75 indicate moderate reliability and values between .75 and .9 indicate good reliability (Koo & Li, 2016). We obtained a value of .61 for theory building ICC (2) and .76 for theory testing ICC (2). This result indicated that the reliability between the two raters was adequate. After checking the reliability, one of the authors coded articles with even numbers and the second author coded those with odd numbers. Because the articles were listed in the spreadsheet according to journal, year, and issue, both raters had about the same number of articles to rate in each issue. To keep a trail of evidence, the two raters entered not only ratings but also detailed notes on each article. For instance, when coding theory testing, we noted what theory, model, or concept was used or tested. For theory building, we recorded comments to indicate that an article identified a new relationship or developed upon an existing relationship.
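The interrater check described above can be illustrated with a minimal sketch. It assumes that "ICC (2)" corresponds to the two-way random effects, single-rater form ICC(2,1) of Shrout and Fleiss (1979); the text does not state whether the single-rater or the average-rater form was used, so this is an assumption. The function names `icc2_single` and `interpret_icc` are hypothetical, and the small ratings matrix is the widely reproduced six-target, four-judge example from Shrout and Fleiss, not the ratings collected in this study. The labels below .5 and above .9 follow the Koo and Li (2016) guideline.

```python
def _mean(values):
    values = list(values)
    return sum(values) / len(values)


def icc2_single(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows: one row per rated article (target),
    one column per rater.
    """
    n = len(ratings)        # number of rated articles
    k = len(ratings[0])     # number of raters
    grand = _mean(x for row in ratings for x in row)
    row_means = [_mean(row) for row in ratings]
    col_means = [_mean(row[j] for row in ratings) for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)                  # between-targets mean square
    jms = ss_cols / (k - 1)                  # between-raters mean square
    ems = ss_error / ((n - 1) * (k - 1))     # residual mean square
    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)


def interpret_icc(value):
    """Verbal label following the Koo and Li (2016) guideline."""
    if value < 0.5:
        return "poor"
    if value < 0.75:
        return "moderate"
    if value < 0.9:
        return "good"
    return "excellent"


# Illustrative data only (Shrout & Fleiss, 1979), not the study's ratings.
EXAMPLE_RATINGS = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]

if __name__ == "__main__":
    print(round(icc2_single(EXAMPLE_RATINGS), 2))
    print(interpret_icc(0.61), interpret_icc(0.76))
```

Applied to the values reported above, the labeling step places .61 (theory building) in the moderate band and .76 (theory testing) in the good band.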
We evaluated the impact of the articles using Google Scholar (GS) citation counts. Several researchers have shown that the Institute for Scientific Information’s (ISI’s) citation counts that were most commonly used in the past have limited coverage (Harzing & Van der Wal, 2008; Sanderson, 2008). Although the GS citation algorithm has the disadvantages of double counting and inclusion of nonscholarly citations, overall GS is considered an important alternative source of citation data that is both accurate and based on more comprehensive coverage, especially once double counting is eliminated through the use of specialized software like “Publish or Perish” (Harzing & Van der Wal, 2009).
Modified Taxonomy
After making an initial attempt to classify the articles using the Colquitt and Zapata-Phelan’s taxonomy, we realized that it had another gap that was making our classification effort difficult. As acknowledged by the authors of the taxonomy themselves, the limitation is that some of the cells were not defined and, thus, the taxonomy does not provide comprehensive coverage of various types of contributions. Therefore, to compensate for the drawback of Colquitt and Zapata-Phelan’s classification, we added four types of empirical articles into their taxonomy: Confirmers, Modifiers, Advanced Builders, and Advanced Qualifiers (Figure 2).

Figure 2. The modified taxonomy of theory building and theory testing efforts.
As mentioned earlier, Reporters are defined as articles that report observations or duplicate previous findings. In our modified taxonomy, the Reporters received one to two points for both theory building and theory testing. For example, an article by Wood and Sella (2000), classified by us as a Reporter, explained, through the analysis of in-depth qualitative interviews, the limitations and possibilities that internal training efforts faced in the textile industry in South Africa. The article did not discuss either the theoretical foundations of the investigation or the theoretical and conceptual implications of the findings. Confirmers are defined as articles reporting studies testing conceptual arguments that have never been empirically tested or those testing arguments that have had mixed results in previous empirical studies. We categorized an article as a Confirmer when the article received 2.5 to 3.5 points in theory testing and one to two points in theory building. Bates, Holton, and Hatala (2012) is a good example of a Confirmer. The study tested the Learning Transfer System Inventory (LTSI) to confirm the discrepancy identified in previous studies. Testers are defined as articles reporting studies that test an existing theory under different boundary conditions. Testers received one to two points in theory building and four to five points in theory testing. For example, Chen, Holton, and Bates (2005) tested LTSI in Taiwan to see whether LTSI created in the United States works in different cultures.
We define Modifiers as studies that elaborate on an existing theory by adding moderators and mediators based on previous findings. Articles in this category received 2.5 to 3.5 points in theory building and one to two in theory testing. Rai and Singh (2013) explored the mediating variables such as interpersonal communication in the relationship between feedback and employee performance, grounding their model in multiple previous research findings. Qualifiers are defined as articles that improve an existing theory using additional conceptual arguments. Qualifiers received 2.5 to 3.5 points in both theory building and testing. We classified Tansky and Cohen’s (2001) article as a Qualifier because the authors explored the relationship among organizational support, employee development, and organizational commitment after advancing new conceptual arguments based on a literature review. We define an Advanced Qualifier as a study that improves existing theory using different theories to create new hypotheses. Advanced Qualifiers received 2.5 to 3.5 points in theory building and four to five points in theory testing. For example, Ehrhardt, Miller, Freeman, and Hom (2011) studied the effect of volition in team participation on the relationship between perceived training comprehensiveness and organizational commitment. The authors used the social exchange theory to generate a theoretical framework.
A study in the Builders category builds a new theory by using inductive methods such as grounded theory. Builders received four to five points in theory building and one to two points in theory testing. Coates’s (2017) article is an example of a Builder. The author identified essential themes in the construction of the meaning of work by representatives of Generation Y through a phenomenological study. Advanced Builder is defined as an article that introduces new constructs or relationships based on conceptual arguments. This category received four to five points in theory building and 2.5 to 3.5 points in theory testing. For example, Belle, Burley, and Long (2015) tested hypotheses related to high-intensity teleworkers’ feeling of belonging. They used previous theoretical arguments on organizational belonging to construct their theory related to telework settings. Finally, an Expander is defined as a study that develops and tests new constructs or variables based on existing theory. This category received four to five points in both theory building and testing. Soane and colleagues’ (2012) research can be categorized as an Expander because the study introduced a new model of employee engagement that was different from existing perspectives (e.g., Kahn’s, 1990, or Meyer and Gagne’s, 2008, models).
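The score bands described above map onto the nine article types in a regular three-by-three pattern, which can be sketched as a lookup. This is a minimal illustration, not the authors' coding instrument: the names `band`, `ARTICLE_TYPES`, and `classify_article` are hypothetical, and the sketch assumes that scores falling between the stated bands (for example, 2.25) do not occur and should be rejected.

```python
def band(score):
    """Map a 1-5 score to the low / mid / high band used in the taxonomy."""
    if 1 <= score <= 2:
        return "low"
    if 2.5 <= score <= 3.5:
        return "mid"
    if 4 <= score <= 5:
        return "high"
    # Assumption: values outside the stated bands are treated as invalid.
    raise ValueError(f"score {score} falls outside the defined bands")


# Keys are (theory building band, theory testing band).
ARTICLE_TYPES = {
    ("low", "low"): "Reporter",
    ("low", "mid"): "Confirmer",
    ("low", "high"): "Tester",
    ("mid", "low"): "Modifier",
    ("mid", "mid"): "Qualifier",
    ("mid", "high"): "Advanced Qualifier",
    ("high", "low"): "Builder",
    ("high", "mid"): "Advanced Builder",
    ("high", "high"): "Expander",
}


def classify_article(building_score, testing_score):
    """Return the article type for a pair of theory building/testing scores."""
    return ARTICLE_TYPES[(band(building_score), band(testing_score))]


if __name__ == "__main__":
    print(classify_article(1.5, 4.5))   # low building, high testing
    print(classify_article(4, 3))       # high building, mid testing
```

Keeping the mapping in a dictionary keyed by band pairs makes it easy to check that every cell of the modified taxonomy has exactly one label.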
Findings
In this section, we report the results of our analysis, addressing the four research questions. We used the scores assigned to each article in several statistical procedures, including calculation of means, one-way analysis of variance (ANOVA), and hierarchical multiple regression.
Levels of Theory Building and Theory Testing: Overall Results and Results by Journal
To answer our first two research questions, we calculated composite mean scores for all three journals combined, as well as mean scores for each journal separately. Table 2 shows the combined mean scores, the mean scores for each journal, and the results of the one-way ANOVA comparing the theory building and theory testing levels among the three journals. According to the theory building and theory testing mean scores (ranging from 2.24 to 2.87), the articles published in the three journals between 2000 and 2017 focused mostly on expanding the understanding of existing knowledge through exploring new mechanisms of an existing relationship, using previous research findings as grounds for investigation. The one-way ANOVA showed a statistically significant difference among journals in terms of theory testing, F(2, 666) = 8.56, p < .001. Post hoc analysis using Tukey’s honestly significant difference (HSD) demonstrated that the theory testing means of HRDQ (M = 2.64, SD = 0.91) and HRDI (M = 2.51, SD = 0.89) were higher than that of ADHR (M = 2.24, SD = 0.85). However, the effect size for this result was small, η² = .025. The theory building means of the three journals did not differ significantly, F(2, 666) = 1.57, p > .05. ADHR’s mean (M = 2.74, SD = 0.85) was lower than HRDI’s mean (M = 2.81, SD = 0.80) and HRDQ’s mean (M = 2.87, SD = 0.66).
Table 2. ANOVA Comparison of Theory Building and Theory Testing Levels by Journal.
Note. ADHR = Advances in Developing Human Resources; HRDI = Human Resource Development International; HRDQ = Human Resource Development Quarterly; ANOVA = analysis of variance.
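For readers who want to see how the F statistic and η² reported above relate to each other, the following Python sketch computes both from raw group scores. It is a minimal illustration under stated assumptions: the function name `one_way_anova` is hypothetical, the three short score lists are invented toy values rather than the study's data, and the p value and the Tukey HSD post hoc test are not computed here.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within, eta_squared) for a list of samples."""
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)
    grand_mean = sum(all_values) / n_total

    # Between-groups sum of squares: group size times squared deviation
    # of each group mean from the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: squared deviations from each group's own mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_stat, df_between, df_within, eta_squared


# Toy theory testing scores per journal (illustrative values only).
adhr = [2.0, 2.5, 2.0, 1.5]
hrdi = [2.5, 3.0, 2.5, 2.0]
hrdq = [3.0, 2.5, 3.0, 2.5]

f_stat, df_b, df_w, eta_sq = one_way_anova([adhr, hrdi, hrdq])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}, eta squared = {eta_sq:.3f}")
```

Because η² is the share of total variance that lies between groups, a significant F can coexist with a small η² when the sample is large, which is the pattern reported for theory testing.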
Trends in Theory Building and Theory Testing Over Time
Figure 3 demonstrates the changes in theory building and theory testing means from 2000 to 2017. The number of analyzed articles was on average 37.2 a year, ranging from 16 in 2000 to 50 in 2010. The means of theory building gradually increased from 2.70 in 2000 to 3.00 in 2017. In contrast, the theory testing means displayed much wider fluctuation, although the trend was also upward. The means of theory building were higher than those of theory testing throughout the period covered in this study. To test whether the linear trend was statistically significant, we carried out simple linear regression by year (Table 3). The results of the regression analysis indicated that the year was a statistically significant predictor of theory building, β = .22, p < .001. The trend line for theory testing was also statistically significant, β = .15, p < .001. These data suggest that the overall level of theoretical contribution made by articles published in these three journals has been growing.

Figure 3. Trends in theory building and theory testing from 2000 to 2017.
Table 3. Simple Regression Analysis for the Effect of Year on Theory Building and Theory Testing.
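The trend test described above can be sketched as a simple linear regression of a score on year. In a regression with a single predictor, the standardized coefficient β equals the Pearson correlation between the predictor and the outcome, which the Python sketch below makes explicit. The function name `trend` and the four data points are hypothetical, and whether the authors regressed article-level scores or yearly means on year is not stated in this section.

```python
from math import sqrt


def trend(x, y):
    """Return (unstandardized slope, standardized beta) for y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    slope = sxy / sxx
    # Rescaling the slope by SD(x) / SD(y) gives the standardized beta,
    # which for a single predictor is the Pearson correlation.
    beta = slope * sqrt(sxx / syy)
    return slope, beta


# Toy example: a perfectly linear rise of 0.1 points per year.
years = [2000, 2001, 2002, 2003]
scores = [2.7, 2.8, 2.9, 3.0]
slope, beta = trend(years, scores)
print(f"slope = {slope:.3f} per year, beta = {beta:.3f}")
```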
In addition, we explored the change over time in the proportion of Reporters, Confirmers, Testers, Modifiers, Qualifiers, Advanced Qualifiers, Builders, Advanced Builders, and Expanders. Because the number of articles differed from year to year, we examined the trends in the nine categories by calculating the percentage of each category among all the articles for each year (Figure 4). The percentage of Qualifiers and Modifiers together remained quite stable at more than 50%, which was the largest portion. Although the portion of Reporters was the second largest in 2000-2002 (36.99%), it decreased to 12.61% in 2015-2017. In contrast, the proportion of Builders, Advanced Builders, and Expanders increased to about 8% in 2015-2017. The percentage of Testers fluctuated widely over the study period. Simple linear regression was conducted to test whether the trend lines were statistically significant (Table 4). The trend line of Builders was statistically significant, β = .525, p < .05. The trend lines of Reporters (β = −.68, p < .001) and Expanders (β = .58, p < .05) were also statistically significant. However, the effect of the year on Confirmers, Modifiers, Qualifiers, Testers, and Advanced Qualifiers was not statistically significant.

Figure 4. Trends in article type from 2000 to 2017.
Table 4. Simple Regression Analysis for the Effect of Year on Article Types.
Table 5 shows the percentages of articles by article types. In general, about 70% of articles published in the three HRD journals were Reporters, Modifiers, or Qualifiers. Other categories, which fall into the high theoretical contribution types according to the taxonomy, accounted for less than 30%.
Table 5. Percentages by Article Type.
As shown in Table 5, there is a significant difference between the articles published in 2000-2002 and articles from the latest period, 2015-2017. In 2000-2002, 36.99% of articles were Reporters. In contrast, the proportion of Reporters dramatically decreased to 12.61% in 2015-2017. At the same time, the percentage of Expanders and Builders has increased. Furthermore, the level of confirmation research remained rather low throughout the whole period studied: it stood at 2.88% in 2006-2008, grew to 8.51% in 2009-2011, and decreased to 6.31% in 2015-2017.
Article Impact: Relating the Citation Analysis to Trends in Theory Building and Testing
We conducted hierarchical multiple regression analysis using a three-step process to examine the relationship between the levels of theory building and theory testing and article citations. Table 6 shows the results of regression analysis for variables predicting article citations. In Step 1, a control variable (i.e., year) was entered. The control variable explained 12.5% of the variance in article citations (adjusted R² = .124). The year was statistically significant (β = −.354, p < .001). Theory testing and theory building were entered in Step 2. The results showed that the Step 2 model accounted for 19.3% of the variance in article citations (adjusted R² = .189). The effects of theory building (β = .196, p < .001) and theory testing (β = .122, p < .001) on article citations were statistically significant. Step 3 examined the interaction effect between theory building and theory testing. However, the interaction effect was not statistically significant, which means that the effects of theory building and theory testing on article citations were independent.
Table 6. Hierarchical Regression Analysis for Variables Predicting Citations.
*p < .05. **p < .01. ***p < .001.
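The three-step procedure can be illustrated with a small Python sketch that fits nested ordinary least squares models and compares their R² values. Everything in it is an assumption made for illustration: the helper names (`solve`, `r_squared`, `adjusted_r2`), the eight toy data rows, the use of years counted from 2000, and the uncentered product term in Step 3. It is not the authors' analysis, and it omits significance tests.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def r_squared(predictors, y):
    """Fit OLS with an intercept via the normal equations and return R squared."""
    rows = [[1.0] + list(p) for p in predictors]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    coef = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((t - f) ** 2 for t, f in zip(y, fitted))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1 - ss_res / ss_tot


def adjusted_r2(r2, n, k):
    """Adjusted R squared for n observations and k predictors (intercept excluded)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Toy data (illustrative values only).
years = [0, 2, 5, 8, 11, 14, 16, 17]            # years since 2000
building = [2.5, 3.0, 4.0, 2.5, 3.0, 4.5, 3.0, 4.0]
testing = [2.0, 2.5, 3.0, 1.5, 4.0, 3.0, 2.5, 4.0]
citations = [60, 70, 85, 30, 45, 40, 15, 20]

step1 = [[yr] for yr in years]                                              # control only
step2 = [[yr, b, t] for yr, b, t in zip(years, building, testing)]          # add main effects
step3 = [[yr, b, t, b * t] for yr, b, t in zip(years, building, testing)]   # add interaction

r2_steps = [r_squared(p, citations) for p in (step1, step2, step3)]
delta_r2 = [later - earlier for earlier, later in zip(r2_steps, r2_steps[1:])]
print([round(r, 3) for r in r2_steps], [round(d, 3) for d in delta_r2])
```

As a consistency check on the reported figures, and assuming all 668 articles entered the regression, `adjusted_r2(0.125, 668, 1)` and `adjusted_r2(0.193, 668, 3)` round to .124 and .189, matching the values given for Steps 1 and 2.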
In addition, we used dummy variables to scrutinize the article citations associated with article types. Reporters were selected as the reference group. Table 7 shows the results of regression analysis. We entered a control variable, year, in Step 1. Eight dummy variables representing Confirmers, Testers, Modifiers, Qualifiers, Advanced Qualifiers, Builders, Advanced Builders, and Expanders were entered in Step 2. Step 2 accounted for 21.8% of the variance in article citations (adjusted R² = .207). The unstandardized regression coefficient for Advanced Builders showed that an article in that category obtained 116 more citations on average than a Reporter article. Expanders received 65 more citations, and Builders obtained 57 more citations than Qualifiers. In contrast, categories that Colquitt and Zapata-Phelan (2007) defined as making a low theoretical contribution (including Qualifiers and Modifiers) received fewer citations than articles from high theoretical contribution categories such as Testers, Builders, and Expanders. Finally, Confirmers received 15 more citations than Qualifiers, and results for Testers were not statistically significant.
Table 7. Hierarchical Regression Analysis for Variables Predicting Citations.
Note. Article type was represented as eight dummy variables with Reporters as the reference group.
*p < .05. **p < .01. ***p < .001.
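The dummy coding described in the table note can be sketched as follows. With Reporters as the reference group, a Reporter article is coded as all zeros, and every other type switches on exactly one of the eight indicators, so each regression coefficient is read as a difference from Reporters with the control variable held constant. The names `CATEGORIES`, `DUMMY_NAMES`, and `dummy_code` are hypothetical and only illustrate the encoding, not the fitted model.

```python
CATEGORIES = [
    "Reporter", "Confirmer", "Tester", "Modifier", "Qualifier",
    "Advanced Qualifier", "Builder", "Advanced Builder", "Expander",
]
REFERENCE = "Reporter"
# One indicator per non-reference category: eight dummies for nine types.
DUMMY_NAMES = [c for c in CATEGORIES if c != REFERENCE]


def dummy_code(article_type):
    """Return the eight 0/1 indicators for an article type."""
    if article_type not in CATEGORIES:
        raise ValueError(f"unknown article type: {article_type}")
    return [1 if article_type == name else 0 for name in DUMMY_NAMES]


print(dummy_code("Reporter"))          # reference group: all zeros
print(dummy_code("Advanced Builder"))  # a single 1 in the Advanced Builder position
```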
Discussion
This study attempted to trace the progress of theory building and testing in HRD by using a modified version of the taxonomy proposed by Colquitt and Zapata-Phelan (2007). In this section, we discuss the major trends that we observed and our conclusions and implications for further research.
The first important finding of this study is that the levels of both theory building and theory testing increased between the years 2000 and 2017, although these levels fluctuated from year to year in at least some of the categories of articles. This trend is similar to what Colquitt and Zapata-Phelan (2007) found in management research.
In addition to the general upward trends in the levels of theory building and theory testing, the study demonstrated a pronounced shift away from simple reporting of observed phenomena toward increasingly sophisticated attempts at developing and testing existing or new theories. In the late 1990s and early 2000s, leading HRD scholars observed that HRD was still a young discipline that was moving toward the maturation stage and still needed to move beyond theoretical explanations and engage in solid theory building efforts (e.g., Holton, 1999; Swanson, 2001). The results of our study provide empirical support for the argument, made more recently by a number of HRD scholars, that the discipline has reached a stage at which it can be classified as mature (Swanson & Chermack, 2013; Turner et al., 2018). According to Snow and Thomas (1994), it is inevitable that articles that describe interesting phenomena and stimulate further research will constitute a dominant category at the early stages of a discipline’s development. As a discipline matures, the proportion of Reporters tends to decline (Agarwal & Hoetker, 2007). This trend was confirmed in our study: The portion of Reporters declined to 12.61% in 2015-2017. However, compared with management research, the decline in the percentage of Reporters was much less precipitous. Although Colquitt and Zapata-Phelan (2007) indicated that by the mid-2000s, “the Reporters that were so common in the 1970s and 1980s have become largely extinct in the pages of AMJ, replaced by articles that make a more significant theoretical contribution” (p. 1293), the Reporters group among HRD articles was still the third largest of the nine groups of studies in 2017.
The Qualifiers and Modifiers categories were the next largest after Reporters in 2000-2002 (sharing the second and third place, at 22.99% each) and were the first and second largest categories in 2015-2017. The percentage of Modifiers remained stable (23% in 2015-2017), whereas that of Qualifiers grew somewhat (to 27%). These two categories are characterized by blending of both theory testing and theory building efforts.
At the same time, the Builders, Advanced Builders, and Expanders categories continued to grow, although the overall percentage of articles in these categories still remained modest. Although an increase in these categories may be regarded as a positive trend, there may also be a “dark” side to it. Some authors describe this potentially harmful effect as construct proliferation, when the body of knowledge in a discipline is fragmented and further addition of new constructs leads to even stronger fragmentation, instead of contributing to strengthening of several more significant streams of research (Barley, 2006; Colquitt & Zapata-Phelan, 2007).
In terms of citation counts, an important finding is that Qualifiers, Builders, Advanced Builders, and Expanders were cited more often than Reporters. This result points toward further maturation of the field and, at the same time, growing recognition of HRD research among scholars from related fields.
Overall, the change in percentages of article types should be interpreted with caution. The decline in the number of Reporters does not mean that there are no longer new interesting phenomena left to explore in our field or that all of the observed phenomena can be explained or predicted by existing theories. Ghosh et al. (2014) and Han et al. (2017) reported a rapid increase in the diversity of HRD research topics, which is driven, among other things, by globalization and advancements in new technologies. Furthermore, Fenwick (2004) stated that research interests have moved from traditional topics such as knowledge and skill development to new topics such as justice, fairness, and equity. Consequently, HRD scholars are trying to understand or explain constantly emerging new phenomena through modifying or expanding existing theories or introducing new theories, which results in the decline of the relative percentage of Reporter studies and relative increase of other types of research.
Criticizing the research approach that dominates management science, Hambrick (2007) argued that excessive focus on new theory building is likely to prevent the reporting of interesting phenomena because members of the scholarly community assume that such reporting makes only a limited theoretical contribution to the discipline and, thus, is not publishable in top journals. The trends we identified suggest that the same dangerous tendency may be developing in the HRD community. Editorial policies of HRD journals may demotivate researchers from reporting new and unique observations that could stimulate future research. Thus, one of the conclusions of our study is that it would be productive to conduct discussions in our academic community around the value and place of different types of theoretical contributions. It is important to realize that, although research with higher scores on theory building and/or testing is more likely to have higher impact on the field, there is also an important place for all other types of research, even at more advanced stages of the discipline’s development.
Another interesting finding is that the proportion of Confirmers and Testers was relatively small and stable at around 5% over the studied period. This trend is not unique to HRD. Hambrick (2007) lamented that few management scholars pay attention to theory testing because journal editors and reviewers perceive the theoretical contribution of such articles as limited and tend to reject them. This contradicts the long-standing tradition of advancing theory by subjecting it to multiple iterations of testing (Kenworthy & Sparks, 2016).
Furthermore, ADHR, HRDI, and HRDQ have published many articles that contain theoretical arguments but do not test them empirically (note that we had to exclude from our sample close to 50% of all articles in the initial database). The low percentage of Confirmers may indicate that too few of the newly proposed theoretical models are tested in empirical research. This result may suggest that, as an academic community, we are forgetting that theoretical arguments and constructs are not theory unless they have been tested (Bagozzi & Phillips, 1982).
Finally, this research shows that the percentage of Modifiers and Qualifiers has remained rather high and stable throughout the studied period. This result is inconsistent with Colquitt and Zapata-Phelan’s (2007) study, which reported a significant upward trend for Qualifiers. This inconsistency may reflect the fact that Colquitt and Zapata-Phelan’s database extended much further back (starting in the 1960s), to a time when various new research methods were just being introduced (Alasuutari, 2010).
The comparison of percentages of different types of articles suggests some additional implications. Thus, low percentages of Confirmers and Testers suggest that there is a need to pay more attention to theory testing to further develop more complete theories and/or to convert informal theories into formal theories.
Limitations and Future Research
This research has a number of limitations. First, the study used data collected from three major journals sponsored by the AHRD. These three journals were selected because AHRD is the most influential scholarly association in the HRD field both in the United States and globally. However, articles published in other important HRD journals (e.g., the European Journal of Training and Development) and HRD-related articles found in a range of other publications (e.g., in management, organization studies, higher education, and adult education fields) were not included. Therefore, future studies should include a wider range of relevant publications. Second, this study excluded conceptual articles that could potentially make significant contributions to the advancement of HRD theories. Future studies would need to expand our research to evaluate the changes in theory building contributions of conceptual articles over time.
Third, although the three authors spent a considerable amount of time discussing the data analysis strategy, calculated ICC(2) to check interrater reliability at the initial stages, and reevaluated ratings of difficult cases together, we still assume that there could be problems with the accuracy of some of the ratings, related to subjectivity and individual biases. Future studies may benefit from checking reliability again after raters evaluate all the data. Finally, although we studied the changes in types of theory building and testing articles, we did not attempt to trace trends in emphasis on specific theories or constructs. Future research could expand our study by tracing changes in theory building efforts and correlating them with the frequency of publication of articles focused on the most prominent theories and models in our field.
Conclusion
This research traced the changes in theory building and theory testing efforts, reported in 668 articles, published in three leading HRD journals over the past 18 years. We conclude that the efforts in theory building and theory testing in our field are following the growth and expansion trajectory, previously observed in a related discipline (management), and continue to contribute to strengthening of the legitimacy of HRD as an academic discipline. The study has confirmed that some types of research are becoming less important as the discipline matures (Reporters) or remain stable (Modifiers), whereas others continue to grow (Builders and Expanders). The citation analysis results suggest that articles that both propose new theoretical constructs and test them at the same time or those that modify and expand new theories enjoy higher levels of citations, compared with articles that report observations of practice or duplicate earlier studies.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
