Putting the Methodological Cart Before the Theoretical Horse? Examining the Application of SEM to Connect Theory and Method in Public Administration Research

Abstract

The application of psychometric statistical techniques, such as confirmatory factor analysis and structural equation modeling, has grown significantly in public administration research over the past three decades. Given this growth, we take stock of the ability of these statistical approaches to advance public administration theory by examining their use in two areas of research: public service motivation and red tape. We further argue that theoretical and methodological diversity in public administration is desirable, so long as scholars recognize that the application of new and multiple methods in a single study does not inherently lead to better tests of theory. Instead, scholarship should emphasize that each theoretical and methodological approach makes a significant, yet partial, contribution to public administration scholarship.

Keywords

research methods, public service motivation, red tape, discipline, public administration theory

Introduction

Since public administration originated as a self-identified field of knowledge, scholars have examined it from myriad theoretical vantages. As such, it is not surprising that our contemporary understanding of the field coalesces around the belief in, and acceptance of, public administration as an interdisciplinary field of inquiry (Frederickson, Smith, Larimer, & Licari, 2015; Hou, Ni, Poocharoen, Yang, & Zhao, 2011; Raadschelders, 2011a, 2011b; Riccucci, 2010; Wright, 2011). As an interdisciplinary field, we have recognized that many of the problems faced by public administrators are “open-ended, multidimensional, ambiguous, and unstable” and also that our process for researching said problems “cannot be bounded and managed by classical approaches to the underlying phenomena” (Klein, 1996, p. 142; citing Mason & Mitroff, 1981; Rittel & Webber, 1973). Instead, as a field, we have applied diverse theoretical lenses to integrate knowledge and advance our understanding of government (Klein, 1996; Raadschelders, 2011b).

Although interdisciplinarity, with its aim of the accumulation and integration of knowledge, is not without challenges and pitfalls (see, for example, Mainzer, 1994; Pfeffer, 1993), the ability to bring multiple perspectives to bear when examining complex phenomena is widely viewed as a necessary advancement in the knowledge production process itself, one that leads to better solutions, and as a powerful tool for classifying and organizing intricate concepts and ideas (Blume, 1985; Klein, 1996). Practically, interdisciplinary inquiries in a given field necessitate merging elements from different disciplines to solve problems, which further requires that researchers aggregate disparate epistemologies, concepts, theories, and methodologies. Yet despite efforts to integrate disparate perspectives into a coherent whole, an inherent by-product of interdisciplinarity is theoretical and methodological pluralism.

Like many other public administration scholars (e.g., Raadschelders, 2011; Riccucci, 2010), we start from the assumption that interdisciplinarity generates sounder, more robust knowledge in the field. We also assume that the resulting theoretical and methodological pluralism of interdisciplinarity is generally a net gain. Pluralism allows researchers to examine a single problem or classes of problems from multiple perspectives and using multiple tools. In this sense, pluralism is akin to the concept of triangulation in research methods, which assumes applying two or more methodologies in a study allows researchers to compare and validate findings as well as to “map out, or explain more fully, the richness and complexity of human behavior by studying it from more than one standpoint” (Cohen & Manion, 1986, p. 254). Bringing multiple theoretical and methodological perspectives to bear in the field of public administration acknowledges that each perspective “makes a useful, but partial, contribution” to our understanding of core issues (Mainzer, 1994, p. 360).

Although we view theoretical and methodological pluralism as largely positive, we would also argue that the field has, at times, poorly integrated or applied different methodological orientations when grappling with diverse interdisciplinary theories. And, at times, this tendency naturally and unsurprisingly spills over into our evaluation of homegrown theories too. In part, our struggle to integrate different methodological orientations occurs precisely because we are so willing to borrow from other disciplines and fields. At the risk of sounding like a broken record, we borrow not only their theories but also their methods and then are required to integrate these methods into something resembling a coherent whole. Unfortunately, the road to integrating diverse methods and methodologies can be fraught with significant challenges, even more so when diverse methodologies are used to evaluate and advance theory. As such, in our greatest strength also resides our principal weakness.

Put another way, with an expanded methodological toolkit, we too frequently assume multiple methodological hammers pounding on one theoretical nail drive it better than a single hammer. It is our central thesis in this article that we should not assume the use of multiple methodological approaches in a single study—no matter how advanced or numerous—results in a better test of theory than the application of a single method.

Yet, in our view, a significant body of current public administration research falls prey to this mistake, too frequently assuming the use of multiple methodological tools leads to inherently superior tests of theory. One area where this problem has grown increasingly pronounced involves the application of psychometric statistical techniques, such as structural equation modeling (SEM).

There are at least two problems associated with the current application of SEM in public administration—neither having anything to do with the relative merits of SEM as a statistical tool itself. First, some studies use SEM as a part of a “kitchen sink” approach to statistical analysis wherein scholars simultaneously use several different data analytic techniques in a single study. Doing so implicitly assumes that using SEM in conjunction with other statistical techniques compensates for the partial contribution offered by any single approach. The application of multiple—often incompatible—statistical techniques does not necessarily enhance research quality nor does it inherently provide a stronger test of theory. Rather, it confuses, muddies, and undermines the potential theoretical and methodological insight offered in a given study.

Second, there are instances where confirmatory factor analysis (CFA) and SEM are used in an exploratory manner. Both CFA and SEM are techniques that are fundamentally confirmatory and not exploratory in nature. At a minimum, the misuse or misapplication of a statistical tool or set of tools produces inaccurate statistical conclusions. Unfortunately, this, in turn, has spillover effects on theory testing and development. When the statistical conclusions rendered in a research study are inaccurate, theory development suffers. As a consequence, public administration research would benefit by acknowledging that SEM, and the type of questions it appropriately addresses, offer a partial, but useful, contribution to public administration knowledge. However, to fully capitalize on the benefits of SEM and related techniques, researchers must place the theoretical horse in front of the methodological cart. To do so necessitates recognizing what SEM is and is not as well as how SEM can and should be used to examine public administration theory.

To address our central argument, we proceed in four steps. First, we describe SEM and its underlying measurement model, CFA. Practitioners of SEM will be well versed in much of this information. However, our discussion is designed to be an elementary introduction for those not well versed in the technique. We will articulate what scholars can expect SEM to reasonably accomplish as well as its advantages relative to other statistical techniques. Second, we will offer an assessment of the rise of SEM in public administration, its early uses, and how scholars commonly use the technique today in public management research. To achieve this aim, we will examine how SEM has been used in two core research areas: public service motivation (PSM) and bureaucratic red tape. Third, we will assess the application of SEM in conjunction with multiple statistical approaches and the necessity of matching method to theory. Finally, we will conclude with some comments on the future of SEM and how it can contribute to theory building and testing in public administration.

CFA and SEM: A Description

Although many applied researchers tend to conceptualize SEM as a single technique, two statistical processes underlie the approach. Prior to estimating a model that examines cause and effect relationships between constructs, researchers must estimate a CFA. The CFA model deals specifically with measurement accuracy and examines the relationships between a set of indicators (e.g., questionnaire items) and the latent, unobservable construct they are intended to measure (Brown, 2015; Kline, 2016).¹ Indicators are observable and directly measurable, but CFA assumes that any given indicator represents only an approximation for the intended construct. Thus, CFA requires multiple indicators for any given construct. The common variance between items is assumed to measure the construct whereas the unique variance associated with each item is assumed to represent measurement error (Brown, 2015; Kline, 2016).
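One common way to write the measurement model described above is sketched below. The notation is an assumption borrowed from standard CFA conventions rather than taken from the text: x_i is the i-th indicator, xi is the latent construct with variance phi, lambda_i is the factor loading, and delta_i is the unique (error) term with variance theta_i, assumed to be uncorrelated with the construct and with the other error terms.

```latex
\begin{align}
x_i &= \lambda_i \xi + \delta_i, \qquad i = 1, \dots, p \\
\operatorname{Var}(x_i) &= \lambda_i^2 \phi + \theta_i \\
\operatorname{Cov}(x_i, x_j) &= \lambda_i \lambda_j \phi, \qquad i \neq j
\end{align}
```

Under these assumptions, the covariances among indicators carry only the common (construct) variance, whereas theta_i absorbs the unique variance that the model treats as measurement error.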

Ideally, researchers should use a minimum of three indicators to specify a given construct. Using at least three indicators ensures that the construct is identified: with exactly three, the known information in the data set (the indicator variances and covariances) equals the number of parameters to be estimated. However, three-indicator constructs, or “just identified” constructs, always fit the data perfectly when estimated in isolation, so their fit cannot be tested. Researchers may decide to select more indicators per construct when the primary research question involves evaluating measurement accuracy. Multiple indicator constructs can empirically test for construct validity when indicators are appropriately selected (Little, Lindenberger, & Nesselroade, 1999). This approach offers more valid and reliable measurement of constructs when compared with traditional regression analysis, which assumes that a single indicator is a perfect measure of a construct.
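The counting argument behind the three-indicator minimum can be sketched in a few lines of Python. This is a minimal illustration, assuming a single-factor model with uncorrelated errors and one scaling constraint (for example, a marker loading fixed to 1); the function names are hypothetical and do not come from any SEM package.

```python
def known_information(n_indicators):
    """Unique variances and covariances among the observed indicators."""
    return n_indicators * (n_indicators + 1) // 2


def free_parameters_single_factor(n_indicators):
    """Free parameters in a one-factor CFA with uncorrelated errors.

    Loadings + error variances + factor variance, minus one parameter
    that is fixed to set the scale of the latent construct.
    """
    loadings = n_indicators
    error_variances = n_indicators
    factor_variance = 1
    scaling_constraint = 1
    return loadings + error_variances + factor_variance - scaling_constraint


def degrees_of_freedom(n_indicators):
    """Known information minus free parameters."""
    return known_information(n_indicators) - free_parameters_single_factor(n_indicators)


for p in (2, 3, 4):
    print(p, "indicators -> df =", degrees_of_freedom(p))
```

Under these assumptions, two indicators leave the construct underidentified (negative degrees of freedom), three yield exactly zero degrees of freedom (the “just identified” case), and four or more leave degrees of freedom with which fit can be tested.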

An important distinction between CFA and other factor analytic techniques frequently used in public administration involves specification of the relationships between an indicator and construct (Brown, 2015). In exploratory factor analyses (EFAs), the relationship between construct and indicator, as well as the number of constructs resulting from a given number of indicators, is statistically derived. EFA approaches treat the number of factors to be extracted from a given set of items as an empirical question, and the researcher can rotate the axes to find the best fit to the data. Although EFA results in a model that fits the data, the resulting factors could be the product of chance relationships among indicators. CFA models, by contrast, require the researcher to specify the relationships between an indicator and construct based on sound theoretical argument. Researchers must also specify a priori the number of constructs in the data, which means that the number of factors for a given set of items is a theoretical, rather than an empirical, question. If those theoretical assertions are not empirically tenable, model fit suffers. One of the primary assumptions behind CFA is that the researcher should possess some prior knowledge concerning the appropriate number of constructs to examine as well as the indicators used to measure each construct.

Following the specification of a CFA, and a determination that the factor model specified adequately fits the data, the researcher can then examine the structural model. SEM analyses amount to regression analysis conducted with latent variables verified through CFA. SEM also offers researchers a few specific advantages over traditional regression models in addition to correcting for measurement error. First, SEM can handle complex patterns of relationships between multiple latent constructs. Given the flexibility of SEM, the researcher is free to establish any pattern of relationships between the constructs examined. As such, SEM is particularly well suited for examining models where mediation is a theoretical possibility. Second, even where mediation is not a possibility, SEM can handle multiple “y” variables in a single analysis. In other words, unlike in traditional ordinary least squares (OLS) regression, the researcher is not limited to examining a single dependent variable. Finally, SEM can offer distinct advantages when constructs change over time. A construct measured at time point 1 can be used to predict the same construct at time point 2, thereby controlling for its prior level (Burkholder & Harlow, 2003). As such, SEM is useful in the context of both cross-sectional and longitudinal data. Likewise, SEM is also useful for examining multilevel/hierarchical data. Nested data structures violate the independent and identically distributed assumption associated with traditional statistical models, which can result in biased parameter estimates when ignored (Snijders & Bosker, 1999). Therefore, the ability to account for nestedness in a multilevel SEM context is a considerable strength.² Given these advantages, a scholar’s capacity to test theory increases with the flexibility of SEM. Although these features help researchers test theoretically relevant questions in public administration, there are also pitfalls associated with SEM.
The relative advantages and pitfalls of SEM include the following:

Advantages

Ability to model latent variables

Identification of measurement error

Simultaneous estimation

Capture direct, indirect, and total effects

Capable of showing reciprocal causal relationships

Numerous applications: multilevel data, multiple group comparisons, longitudinal data

Flexibility when modeling

Pitfalls

Model must be theoretically grounded

Easy to misinterpret

Flexibility allows for model manipulation

Generalization constraints due to sampling and selection effects

Confirmation bias (equivalent models)

Limited/single indicators

Requires large sample size
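As a brief illustration of the direct, indirect, and total effects noted among the advantages above, the following Python sketch decomposes effects in a simple linear mediation model (X to M to Y). It assumes the conventional product-of-coefficients approach with a single mediator; the function name and the path values are hypothetical and are not estimates from any study discussed here.

```python
def decompose_effects(a, b, c_prime):
    """Decompose effects in a linear mediation model X -> M -> Y.

    a:       path from the predictor X to the mediator M
    b:       path from the mediator M to the outcome Y
    c_prime: direct path from X to Y, controlling for M
    """
    indirect = a * b              # effect of X on Y transmitted through M
    total = c_prime + indirect    # direct effect plus indirect effect
    return {"direct": c_prime, "indirect": indirect, "total": total}


# Hypothetical standardized path coefficients, for illustration only.
effects = decompose_effects(a=0.5, b=0.5, c_prime=0.25)
print(effects)
```

In an SEM, all three paths would be estimated simultaneously with latent variables, which is what distinguishes the approach from running separate regressions for each path.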

In sum, there are two hurdles researchers must clear when conducting research that applies SEM. First, the researcher must specify a measurement model that theoretically relates indicator to construct. If the measurement model fails to adequately fit the data, the measurement expectations articulated by the researcher are not supported. Second, if the data support the researcher’s theoretical measurement expectations, the researcher may then examine the structural relationships between the constructs. The relationships between constructs can be examined with a latent regression model. By combining an empirical test of measurement accuracy and a statistical examination of the relationships between constructs, CFA and SEM provide valuable tools for theory building and testing in public administration research. The following section examines the early application of these techniques in public administration and how their use has evolved.

CFA and SEM in Public Administration Research: Two Examples

CFA and SEM were little utilized in public administration research 25 years ago, but both approaches are now widely used to address a host of questions. Nevertheless, certain areas within public administration appear to have been more widely colonized by psychometric statistical techniques such as CFA and SEM. Not surprisingly, given the nature of research questions and constructs examined, research streams associated with public management more broadly apply CFA and SEM. Frederickson et al. (2015) identified public management as a subfield of public administration research more closely associated with theories originating in psychology and generic organization studies. Because public management research more frequently adapts theory from psychology and organization studies, the methodology associated with those disciplines also tends to characterize studies conducted within those streams of literature.

Although our list is far from exhaustive, the research traditions associated with two core concepts in public management illustrate nicely the growth of CFA and SEM in public management research: PSM and bureaucratic red tape. Admittedly, CFA and SEM are widely applied to examine other concepts in public administration research. However, a chronicle of all instances of research that uses CFA or SEM is beyond the scope of this article. Given the prominence of these two research topics, they serve as meaningful cases to examine the potential of CFA and SEM to connect theory and method in public administration.

PSM has become one of the most widely researched, if not core, topics in public management (Bozeman & Su, 2015; Ritz, Brewer, & Neumann, 2016; Vandenabeele, Brewer, & Ritz, 2014). The original conceptualization of PSM characterized it as “an individual’s predisposition to respond to motives grounded primarily or uniquely in public institutions and organizations. The term ‘motives’ is used here to mean psychological deficiencies or needs that an individual feels some compulsion to eliminate” (Perry & Wise, 1990, p. 368, emphasis added). Given the initial conceptualization, much of the theoretical base used to understand PSM originates in psychology (Perry & Hondeghem, 2008; but see Le Grand, 2003, for an alternative explanation). Arguably, the widespread application of psychometric statistical techniques, such as CFA and SEM, to examine PSM is due not only to its theoretical foundations in psychology but also to the early application of CFA to examine construct measurement (Perry, 1996). Since that time, CFA procedures have been used to examine the validity of the PSM measurement instrument with different types of populations and in different countries (Coursey, Perry, Brudney, & Littlepage, 2008; Jensen & Vestergaard, 2017; Kim et al., 2013; Vandenabeele, 2008a). Given the initial measurement focus within PSM research, several studies also seek to test research hypotheses with SEM (Davis, 2010; Davis & Stazyk, 2014; Pandey, Wright, & Moynihan, 2008; Park & Rainey, 2008; Stazyk & Davis, 2015; Wright, Moynihan, & Pandey, 2012).

Based on the trajectory of PSM research, many uses of psychometric techniques were designed to assess construct measurement (Coursey et al., 2008; Kim et al., 2013; Perry, 1996). Although PSM researchers continue to focus on construct measurement issues, the research has evolved to use SEM as a hypothesis-testing technique. Admittedly, the application of SEM as a hypothesis-testing mechanism is far from universal in the PSM research community. Nevertheless, the widespread application of psychometric techniques is evident to even the passing observer of the PSM literature. In sum, much like any other research community, PSM researchers face methodological challenges (Wright, 2008), yet this stream of research serves as an example of methodological progress in the sense that it has capitalized on the accessibility of modern statistical techniques such as CFA and SEM. Whether the application of psychometric techniques in PSM serves as a mechanism for theoretical advancement, however, remains an open question.

In addition to PSM, bureaucratic red tape serves as a central public management concept examined extensively in volumes of research over several decades (Bozeman & Feeney, 2011; Pandey & Scott, 2002). Unlike PSM, however, the study of bureaucratic red tape was not initially grounded firmly in theories of psychology. The initial conceptualization of red tape included organizational red tape, which was rooted in objective organizational structure, and stakeholder red tape, which included organization members’ subjective interpretation of rules (Bozeman, 1993). Although examining objective organizational structure need not necessarily use psychological theories, exploring individual perceptions is firmly rooted in the realm of psychological inquiry. Given that red tape partially involved individual perceptions of organizational rules, subsequent scholarship sought psychological explanations for how individuals experience rule demands and rule burden (Davis & Pink-Harper, 2016; Pandey & Kingsley, 2000; Pandey & Welch, 2005; Scott & Pandey, 2005). The psychological foundations of red tape eventually contributed to increased application of CFA and SEM to explore its measurement, causes, and consequences (Coursey & Pandey, 2007; Davis, 2013; Davis & Stazyk, 2014; Moynihan, Wright, & Pandey, 2012).

Nonetheless, the application of psychometric techniques to examine red tape has been comparatively slow relative to PSM. Arguably, the slower adoption of psychometric statistical techniques by the red tape research community was due to limited attention to construct measurement in early studies. Although red tape researchers have consistently examined questionnaire data since the origins of the concept (Bozeman & Feeney, 2011; Buchanan, 1975; Rainey, Pandey, & Bozeman, 1995), scholarship has generally searched for both objective and subjective measures of the phenomenon (see Bozeman & Feeney, 2011; Pandey & Scott, 2002, for an extensive discussion of red tape measurement strategies). Importantly, syntheses of red tape scholarship highlight myriad measurement strategies used over the past 25 years. As such, the concerted emphasis on measurement in the infancy of the concept was not as pronounced as it was in PSM research. The lack of early measurement emphasis is not a critique of the red tape literature; rather, it highlights why CFA and SEM took longer to become mainstream techniques in this area.

The following section uses these examples to examine how and to what extent psychometric statistical techniques have contributed to sound theory development in public administration research. Specifically, we assess whether the application of SEM in conjunction with other statistical approaches has generally enhanced or hindered theory testing and development in these areas.

A Sound Merging of Method and Theory?

As stated above, CFA and SEM have become widely applied statistical techniques used to examine public management’s homegrown theories. Nevertheless, these techniques could become the methodological cart that precedes the theoretical horse, which could stall theory development and discourse. In this section, we examine whether the increasing ease with which scholars can use psychometric statistical techniques has given the false impression of reliable and meaningful theory development in the areas considered. The two areas examined are far from exhaustive. However, given the volume of research on PSM and red tape, each serves as a useful venue to evaluate whether methodological advances have resulted in sound theoretical advances.

We take up two issues in our assessment. First, we examine several instances when scholars have used a “kitchen sink” approach to statistical analysis. Using multiple statistical techniques in a given study can give the illusion of providing sounder tests of theory, but there are also good reasons to assume it can lead to inaccurate conclusions. Second, we examine instances where the application of psychometric techniques, particularly in the realm of measurement, deviates substantially from the logic underlying CFA and SEM. Unfortunately, these deviations have, in some cases, been used to inaccurately redefine theory.

The “Kitchen Sink” Approach to Statistical Analysis

The number of statistical options available to researchers has increased drastically in the past decade. This is partially due to (a) research progress among statisticians, (b) the increased availability of computing resources, and (c) increased interdisciplinarity. Nevertheless, the increased availability of modern data analysis techniques does not suggest that a given study should use every type of statistical analysis available. On one hand, one may assume that using multiple statistical approaches in a single study remedies the flaws or weaknesses inherent in any one analytical technique. On the other hand, it is also reasonable to conclude that using various techniques in one analysis accentuates research flaws, particularly when two statistical techniques are based on competing statistical assumptions. The simultaneous application of CFA procedures and traditional regression analysis represents one place where statistical assumptions compete. The competition between these approaches centers on assumptions regarding measurement accuracy. Researchers applying CFA assume, either explicitly or implicitly, that no single measure is a perfect representation of a construct. Rather, CFA researchers assert that any given indicator includes variation due to the construct and variation due to several sources of measurement error (Brown, 2015). Researchers applying regression analysis, however, assume that all the variation in a single measure defines the construct and is free of measurement error or other biases that might compromise measurement accuracy (Wooldridge, 2006).³

Given competing assumptions about measurement accuracy between CFA and traditional regression analyses, applying both types of analysis in the same study does not necessarily serve as a sound methodological strategy nor is it certain to advance theory. In the context of public management research, several examples of studies using these two techniques simultaneously exist (Andersen, Heinesen, & Pedersen, 2014; Chen, Hsieh, & Chen, 2014; Giauque, Ritz, Varone, & Anderfuhren-Biget, 2012; Jensen & Vestergaard, 2017; Kjeldsen & Hansen, 2016; Park & Rainey, 2008; Rose, 2013; Vandenabeele, 2008b). In several of these examples, researchers use CFA results to justify creating an index to include in subsequent regression models; in at least one instance, an SEM was also estimated (Park & Rainey, 2008).

Based on our review of the literature, the simultaneous application of CFA and regression analyses is more prevalent in PSM than in red tape research. This could be due to Perry’s (1996) initial application of CFA to validate PSM measures. Unfortunately, switching between CFA’s underlying measurement model and index variables in regression analysis slows theory development by providing inaccurate methodological conclusions regarding cause and effect relationships. More specifically, inaccurate depictions of the statistical relationships between variables occur because moving from the CFA to the regression model reintroduces measurement error into those variables estimated in the regression analysis.
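The consequence of reintroducing measurement error can be illustrated with a small simulation. The sketch below is a simplified, hypothetical setup rather than a reanalysis of any study cited here: it assumes a single error-free construct, an observed index equal to the construct plus random noise, and a simple bivariate regression. All names and parameter values are illustrative assumptions.

```python
import random


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    return cov_xy / var_x


def simulate_attenuation(true_slope=0.5, error_sd=1.0, n=20000, seed=1):
    """Compare slopes using the error-free construct vs. a noisy index."""
    rng = random.Random(seed)
    # Error-free construct scores (unobservable in practice).
    construct = [rng.gauss(0, 1) for _ in range(n)]
    # The outcome depends on the construct itself, not on its measure.
    outcome = [true_slope * c + rng.gauss(0, 1) for c in construct]
    # Observed index: construct plus random measurement error.
    index = [c + rng.gauss(0, error_sd) for c in construct]
    return ols_slope(construct, outcome), ols_slope(index, outcome)


slope_construct, slope_index = simulate_attenuation()
print("slope using error-free construct:", round(slope_construct, 3))
print("slope using error-laden index:   ", round(slope_index, 3))
```

With equal construct and error variances, the index has a reliability of 0.5, so the slope estimated from the index should be roughly half the true slope. The estimate is biased toward zero even though the underlying relationship is unchanged, which is one way an index-based regression can misstate a relationship that a latent variable model would recover.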

It is important to note that the incentive to include multiple, but potentially incompatible, forms of statistical analysis in a single study may emanate from sources outside the control of researchers. As argued above, theoretical diversity within public management contributes to methodological pluralism. To some extent, this suggests that those responsible for reviewing articles are not versed in all the possible, and acceptable, methodological approaches pursued in public administration scholarship. Reviewers may ask for the inclusion of a traditional regression analysis when the primary method used in a study is SEM without realizing the contradictions in assumptions and the trade-offs of each. Unfortunately, demands that researchers simultaneously include regression and CFAs do not attenuate the weaknesses of a single analysis; instead, they accentuate the problems inherent in both approaches.

Finally, readers should not construe our focus on CFA and SEM as suggesting these statistical tools are inherently superior to others. Instead, we are of the opinion that different statistical tools provide different and unique bits of information that can be useful when testing and evaluating theory. We simply suggest that the application of multiple statistical tools in a single paper is often more likely to generate inaccurate results than would a comparison of results across several papers, each using a single and perhaps different method.

Matching Measurement Options to Theory

At their foundation, CFA and SEM are statistical techniques that require sound theoretical expectations for model building. Moreover, these techniques not only require plausible cause and effect arguments for the relationships between constructs but also require plausible cause and effect arguments regarding the relationships between the indicator and construct. As suggested earlier, this means that the researcher should have some previous theoretical understanding regarding the number and nature of constructs that will be examined and of the pattern connecting indicators to constructs. However, under some circumstances in public management research, CFAs are used in a more exploratory fashion (Coursey & Pandey, 2007; Coursey et al., 2008; Perry, 1996; Vandenabeele, 2008b). At some level, these studies are not purely exploratory in the sense that each based the pattern of relationships between construct and indicator on prior theoretical explanations. In exploratory applications of CFA, however, indicators are discarded because they do not fulfill empirical expectations (Perry, 1996), or dimensions are excluded due to poor empirical performance in previous studies (Coursey et al., 2008). These types of modifications are empirically expedient as opposed to theoretically justified. Poorly fitting CFA models suggest that scholars should revisit theory rather than make empirically convenient model adjustments.

In addition to trimming indicators, connecting indicators to constructs involves theoretically specifying the relationship between a second-order construct and its subdimensions. A second-order construct assumes that the relationship between the indicator and a construct passes through intermediary latent constructs. In the context of public management research, PSM was initially defined, both theoretically and empirically, as a second-order construct (Perry, 1996; Perry & Wise, 1990). Conversely, red tape was depicted as having two fundamentally different forms, organizational and stakeholder red tape (Bozeman, 1993). For red tape researchers, there was no prior assumption that these forms of red tape collectively defined overall red tape. That said, scholars have used latent red tape concepts in broader analyses (Davis & Stazyk, 2014; Moynihan et al., 2012; Stazyk, Pandey, & Wright, 2011). Moreover, there is some emphasis in the literature on examining red tape as a second-order construct (Coursey & Pandey, 2007). To this end, examining the dimensionality of red tape is challenging given that limited theoretical emphasis has been invested in identifying construct subdimensions. Red tape may serve as an area ripe for simultaneous progress in both theory and method.

Since its initial conceptualization, PSM has been defined as the product of self-sacrifice, commitment to public service, compassion, and attraction to policy making (Perry & Wise, 1990). As such, there was a greater theoretical foundation for examining PSM as a second-order construct than red tape. However, research consistently indicated that a four-factor PSM construct did not adequately fit the data due to the attraction to policy-making dimension, which encouraged some scholarship to suggest that PSM was a second-order construct with only three subdimensions (Coursey et al., 2008). Unfortunately, second-order constructs with three subdimensions cannot be empirically distinguished from three first-order constructs allowed to correlate. A CFA model with three latent constructs allowed to freely correlate and a second-order CFA model with three subdimensions have identical degrees of freedom, which means that these models are mathematically equivalent. The model fit for each would be identical, assuming no covariates are included in the model (Kline, 2016). To be sure, there is still room for significant theoretical progress in PSM measurement. However, the redefinition of PSM as a three-dimension, second-order construct remains, first and foremost, a theoretical endeavor rather than a simple matter of empirical expedience or an opportunity to showcase a cutting-edge statistical technique.
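The equality in degrees of freedom can be checked by counting parameters. The following Python sketch is a minimal illustration under stated assumptions: every indicator loads on exactly one first-order factor, errors are uncorrelated, each latent variable is scaled by fixing one loading to 1, and no covariates are included. The function names and the choice of four indicators per factor are hypothetical.

```python
def known_information(n_indicators):
    """Unique variances and covariances among the observed indicators."""
    return n_indicators * (n_indicators + 1) // 2


def df_correlated_factors(n_factors, indicators_per_factor):
    """Degrees of freedom for first-order factors allowed to correlate freely."""
    p = n_factors * indicators_per_factor
    free = (
        (p - n_factors)                        # loadings (one marker per factor fixed)
        + p                                    # indicator error variances
        + n_factors                            # first-order factor variances
        + n_factors * (n_factors - 1) // 2     # factor covariances
    )
    return known_information(p) - free


def df_second_order(n_factors, indicators_per_factor):
    """Degrees of freedom when one second-order factor explains the first-order factors."""
    p = n_factors * indicators_per_factor
    free = (
        (p - n_factors)      # first-order loadings (one marker per factor fixed)
        + p                  # indicator error variances
        + n_factors          # disturbance variances of the first-order factors
        + (n_factors - 1)    # second-order loadings (one fixed for scaling)
        + 1                  # second-order factor variance
    )
    return known_information(p) - free


for m in (3, 4):
    print(m, "subdimensions:",
          "correlated df =", df_correlated_factors(m, 4),
          "| second-order df =", df_second_order(m, 4))
```

Under these assumptions, the two specifications have the same degrees of freedom with three subdimensions, whereas with four subdimensions the second-order model gains two degrees of freedom and becomes a testable restriction. This is consistent with the point that a three-dimension, second-order specification cannot be justified on model fit alone.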

Another variation of specifying the relationships between indicators and constructs involves making theoretical assertions regarding whether the construct causes variation in the indicator or vice versa. Generally speaking, CFA assumes that the construct under examination causes variation in the indicator or dimension. However, under some circumstances, researchers assume that the indicator (or dimension) causes variation in the construct. This distinction points to differences between what are called reflective and formative CFA analyses. Analyses that assume the construct causes variation in the indicator are reflective, whereas analyses that assume the indicator causes variation in the construct are formative. These two types of analysis rest on fundamentally different theoretical assumptions. Importantly, both red tape and PSM have been examined using both reflective and formative specifications (Coursey & Pandey, 2007; Kim, 2011). In the case of red tape research, it may be possible to theoretically justify examining a model with formative specification, as did Coursey and Pandey (2007). Bozeman (1993) initially conceptualized red tape as having two fundamentally different forms. As such, each form could cause overall red tape insofar as each causes separate and unique variation in the overall concept. PSM, however, was clearly specified in terms of dimensionality from the outset. Perry and Wise (1990) suggested, and many PSM scholars have maintained, that PSM is defined by variation in the four subdimensions, which means that variation in each dimension of PSM is caused by the overall PSM concept. In the case of PSM, formative specification makes little theoretical sense. Though the distinction between formative and reflective specification represents a sophisticated methodological approach, its application must be carefully matched to theory to advance public management knowledge. The ability of scholars to apply formative model specification should not be mistaken for a theoretical advance in the field.
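The two specifications contrasted above can be written compactly. This is a generic textbook sketch rather than the notation of any study cited here; the symbols (construct \eta, indicators x_i, loadings \lambda_i, weights \gamma_i, error terms \varepsilon_i and \zeta) are assumptions introduced only for illustration.

```latex
% Construct causes variation in each indicator:
x_i = \lambda_i \eta + \varepsilon_i, \qquad i = 1, \dots, k

% Formative specification: indicators jointly cause variation in the construct:
\eta = \sum_{i=1}^{k} \gamma_i x_i + \zeta
```

In the first form, the indicators are expected to covary because they share a common cause. In the second, nothing in the model requires the indicators to covary at all, which is why the choice between the two is a theoretical claim about the construct and not a modeling convenience.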

Conclusion

The purpose of this article is to broadly examine the extent to which methodological progress (more specifically, the use of new or multiple methods in a single study) may be mistakenly assumed to generate stronger tests of theory. We argue that the public administration research community should be open to diversity in both theory and method as it applies to research on public organizations. However, we also argue that the acceptance of theoretical and methodological pluralism requires a better understanding of methodological assumptions. There are times when the underlying assumptions of different methodologies are clearly incompatible; using them jointly as a “coherent, integrated” test of theory actually has the effect of slowing and perhaps even harming theory development and progress by producing inaccurate statistical results.

To support these points, we have focused on two streams of literature in public management: PSM and red tape. These areas of research provide a useful avenue to assess how researchers have, at times, applied new and multiple methods in stand-alone studies. There are clearly some excellent examples of scholarship in public administration that effectively advance theory by using sophisticated methodological approaches. Unfortunately, some studies appear to assume the use of more complex and numerous statistical techniques in a single study inherently improves methodological conclusions and, therefore, results in stronger evaluations of theory.

As a research community that hopes to both retain its legitimacy and confront pressing social issues, we should strive to ensure that theory and method mesh in ways that allow us to confront public administration’s deepest challenges. To do this means we must remain open to pluralist approaches. However, if we are to borrow theories and methods from other fields, we must also be exceedingly careful in our efforts to merge and apply these theories and methods, lest we stall theoretical progress and our ability to contribute to the field’s most pressing issues.

Our concern, throughout this article, has been that the application of new and complex methods has stalled theoretical progress by producing inaccurate statistical results, which are inaccurate because they are generated through incompatible tests. In the face of methodological and theoretical pluralism, we need a field that is both tolerant of different approaches and conservative and discriminating in its selection and application of them.

Practically, we would suggest the peer review process is the best place to meet these aims, requiring something of both authors and reviewers. First, scholars must themselves (a) be extremely wary of using new and different methods and methodologies until they have a sound grasp of the fundamental assumptions built into them and (b) craft a compelling case supporting the appropriateness of using new and multiple methodologies in a single study. Second, reviewers must (a) be vocal in challenging and questioning the use of new and multiple methods in a study and (b) avoid encouraging authors to include additional statistical techniques that may be incompatible with a study’s existing method and theoretical approach.

As regular users of CFA and SEM, we would also note that both are useful tools for advancing theory. So are other techniques and methods. We would, however, argue that researchers who routinely apply CFA and SEM must start by recognizing that measurement is a theoretical endeavor. Measurement models that do not adequately fit the data suggest theoretical rather than empirical problems. Trimming indicators from a construct or dimensions from a second-order construct may be empirically convenient, but those activities must also be theoretically tenable. Successfully merging theory and method means that researchers must be willing not only to trim parameters from a poorly fitting model but also to acknowledge the flaws in their theoretical assertions, a step that CFA and SEM users in public administration too frequently skip.

Based on our examination, some streams of research in public administration invested significant theoretical attention in concept definition prior to turning toward measurement issues. Alternatively, some scholars have attempted to redefine theoretical concepts based largely on matters of empirical convenience. Although theoretical assertions and empirical evidence are intertwined, concept measurement must be firmly grounded in theoretical logic. Stable theoretical foundations serve as the bedrock of high-quality measurement. Given the concepts we examined here, red tape research seems to have devoted more attention to theoretical definition early on, whereas the theoretical definition of PSM has constantly shifted to accommodate empirical inconveniences. We do not intend to suggest that one of these research streams is qualitatively superior to the other. Rather, we offer both as examples of empirical research in public administration that can effectively use psychometric statistical techniques to merge theory and method.

In sum, empirical research applying CFA and SEM should ensure that the foundations of research have a stable theoretical base from which to progress. This is not to suggest that empirical evidence cannot redirect a stream of research based on evidence that fails to confirm theory. Instead, it suggests that the continuous interplay between theoretical and methodological advancement drives the scientific accumulation of knowledge so long as logical theoretical expectations guide empirical modeling strategies. To adequately use CFA and SEM to address the most pressing problems in public administration, scholars must put the theoretical horse in front of the methodological cart. Anything else mistakes methodological choices and progress for theoretical advancement.

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

Author Biography

Randall S. Davis’s research examines issues of organizational behavior and human resources management in public organizations. His research interests involve exploring the psychological factors that give rise to job performance in the public sector workplace. Professor Davis has conducted research on several topics in public management including goal ambiguity, role stress, job satisfaction, public service motivation, and unionization.

Edmund C. Stazyk’s research focuses on the application of organization theory and behavior to public management, public administration theory, and human resources issues. His primary interests are in the areas of organizational and individual performance with an emphasis on motivation. Professor Stazyk also conducts research in the areas of bureaucracy and organizational design, ethics, and human capital. He teaches courses in public administration, organization theory and behavior, public management, and human resource management.

References

Andersen, L. B., Heinesen, & Pedersen, L. H. (2014). How does public service motivation among teachers affect student performance in schools? Journal of Public Administration Research and Theory, 24, 651-671.

Blume (1985). After the darkest hour: Integrity and engagement in the development of university research. In Wittrock & Elzinga (Eds.), The university research system: The public policies of the home of scientists (pp. 139-163). Stockholm, Sweden: Almqvist and Wiksell International.

Bozeman (1993). A theory of government “red tape.” Journal of Public Administration Research and Theory, 3, 273-304.

Bozeman, & Feeney, M. K. (2011). Rules and red tape: A prism for public administration theory and research. Armonk, NY: M.E. Sharpe.

Bozeman (2015). Public service motivation concepts and theory: A critique. Public Administration Review, 75, 700-710.

Brown, T. A. (2015). Confirmatory factor analysis for applied research (2nd ed.). New York, NY: The Guilford Press.

Buchanan (1975). Red-Tape and the service ethic: Some unexpected differences between public and private managers. Administration & Society, 6, 423-444.

Burkholder, G. J., & Harlow, L. L. (2003). An illustration of a longitudinal cross-lagged design for larger structural equation models. Structural Equation Modeling, 10, 465-486.

Chen, C. A., Hsieh, C. W., & Chen, D. Y. (2014). Fostering public service motivation through workplace trust: Evidence from public managers in Taiwan. Public Administration, 92, 954-973.

Cohen, & Manion (1986). Research methods in education. London: Croom Helm.

Coursey, D. H., & Pandey, S. K. (2007). Content domain, measurement, and validity of the red tape concept: A second-order confirmatory factor analysis. The American Review of Public Administration, 37, 342-361.

Coursey, D. H., Perry, J. L., Brudney, J. L., & Littlepage (2008). Psychometric verification of Perry’s public service motivation instrument: Results for volunteer exemplars. Review of Public Personnel Administration, 28, 79-90.

Davis, R. S. (2010). Blue-collar public servants: How union membership influences public service motivation. The American Review of Public Administration, 41, 705-723. doi:10.1177/0275074010392367

Davis, R. S. (2013). Unionization and work attitudes: How union commitment influences public sector job satisfaction. Public Administration Review, 73, 74-84.

Davis, R. S., & Pink-Harper, S. A. (2016). Connecting knowledge of rule-breaking and perceived red tape: How behavioral attribution influences red tape perceptions. Public Performance & Management Review, 40, 181-200.

Davis, R. S., & Stazyk, E. C. (2014). Making ends meet: How reinvention reforms complement public service motivation. Public Administration, 92, 919-936.

Frederickson, H. G., Smith, K. B., Larimer, C. W., & Licari (2015). The public administration theory primer. Boulder, CO: Westview Press.

Giauque, Ritz, Varone, & Anderfuhren-Biget (2012). Resigned but satisfied: The negative impact of public service motivation and red tape on work satisfaction. Public Administration, 90(1), 175-193.

Hou, Ni, A. Y., Poocharoen, O.-O., Yang, & Zhao, Z. J. (2011). The case for public administration with a global perspective. Journal of Public Administration Research and Theory, 21(suppl. 1), i45-i51.

Jensen, U. T., & Vestergaard, C. F. (2017). Public service motivation and public service behaviors: Testing the moderating effect of tenure. Journal of Public Administration Research and Theory, 27(1), 52-67.

Kim (2011). Testing a revised measure of public service motivation: Reflective versus formative specification. Journal of Public Administration Research and Theory, 21, 521-546.

Kim, Vandenabeele, Wright, B. E., Andersen, L. B., Cerase, F. P., Christensen, R. K., . . . Liu (2013). Investigating the structure and meaning of public service motivation across populations: Developing an international instrument and addressing issues of measurement invariance. Journal of Public Administration Research and Theory, 23, 79-102.

Kjeldsen, A. M., & Hansen, J. R. (2016). Sector differences in the public service motivation–job satisfaction relationship: Exploring the role of organizational characteristics. Review of Public Personnel Administration. Advance online publication. doi:10.1177/0734371X16631605

Klein, J. T. (1996). Interdisciplinary needs: The current context. Library Trends, 45(2), 134-154.

Kline, R. B. (2016). Principles and practice of structural equation modeling (4th ed.). New York, NY: The Guilford Press.

Le Grand (2003). Motivation, agency, and public policy: Of knights and knaves, pawns and queens. Oxford, UK: Oxford University Press.

Little, T. D., Lindenberger, & Nesselroade, J. R. (1999). On selecting indicators for multivariate measurement and modeling with latent variables: When “good” indicators are bad and “bad” indicators are good. Psychological Methods, 4, 192-211.

Mainzer, L. C. (1994). Public administration in search of a theory: The interdisciplinary delusion. Administration & Society, 26(3), 359-394.

Mason, R. O., & Mitroff (1981). Challenging strategic planning assumptions: Theory, cases, and techniques. New York, NY: Wiley.

Moynihan, D. P., Wright, B. E., & Pandey, S. K. (2012). Working within constraints: Can transformational leaders alter the experience of red tape? International Public Management Journal, 15, 315-336.

Pandey, S. K., & Kingsley, G. A. (2000). Examining red tape in public and private organizations: Alternative explanations from a social psychological model. Journal of Public Administration Research and Theory, 10, 779-800.

Pandey, S. K., & Scott, P. G. (2002). Red tape: A review and assessment of concepts and measures. Journal of Public Administration Research and Theory, 12, 553-580.

Pandey, S. K., & Welch, E. W. (2005). Beyond stereotypes: A multistage model of managerial perceptions of red tape. Administration & Society, 37, 542-575. doi:10.1177/0095399705278594

Pandey, S. K., Wright, B. E., & Moynihan, D. P. (2008). Public service motivation and interpersonal citizenship behavior in public organizations: Testing a preliminary model. International Public Management Journal, 11, 89-108.

Park, S. M., & Rainey, H. G. (2008). Leadership and public service motivation in U.S. federal agencies. International Public Management Journal, 11, 109-142.

Perry, J. L. (1996). Measuring public service motivation: An assessment of construct reliability and validity. Journal of Public Administration Research and Theory, 6, 5-22.

Perry, J. L., & Hondeghem (2008). Motivation in public management: The call of public service. Oxford, UK: Oxford University Press.

Perry, J. L., & Wise, L. R. (1990). The motivational bases of public service. Public Administration Review, 50, 367-373.

Pfeffer (1993). Barriers to the advance of organizational science: Paradigm development as a dependent variable. Academy of Management Review, 18(4), 599-620.

Raadschelders, J. C. (2011a). The future of the study of public administration: Embedding research object and methodology in epistemology and ontology. Public Administration Review, 71, 916-924.

Raadschelders, J. C. (2011b). Public administration: The interdisciplinary study of government. Oxford, UK: Oxford University Press.

Rainey, H. G., Pandey, & Bozeman (1995). Research note: Public and private managers’ perceptions of red tape. Public Administration Review, 55, 567-574.

Riccucci, N. M. (2010). Public administration: Traditions of inquiry and philosophies of knowledge. Washington, DC: Georgetown University Press.

Rittle, H. W. J., & Webber, M. M. (1973). Dilemmas in a general theory of planning. Policy Sciences, 4(2), 167-169.

Ritz, Brewer, G. A., & Neumann (2016). Public service motivation: A systematic literature review and outlook. Public Administration Review, 76, 414-426.

Rose, R. P. (2013). Preferences for careers in public work: Examining the government–nonprofit divide among undergraduates through public service motivation. The American Review of Public Administration, 43, 416-437.

Scott, P. G., & Pandey, S. K. (2005). Red tape and public service motivation: Findings from a national survey of managers in state health and human services agencies. Review of Public Personnel Administration, 25, 155-180. doi:10.1177/0734371x04271526

Snijders, T. A. B., & Bosker, R. J. (1999). Multilevel analysis: An introduction to basic and advanced multilevel modeling. Thousand Oaks, CA: Sage.

Stazyk, E. C., & Davis, R. S. (2015). Taking the “high road”: Does public service motivation alter ethical decision making processes? Public Administration, 93, 627-645.

Stazyk, E. C., Davis, R. S., Sanabria, & Pettijohn (2014). Working in the hollow state: Exploring the links between public service motivation, contracting, and collaboration. In Dwivedi, Y. K., Shareef, M. A., Pandey, S. K., & Kumar (Eds.), Public administration reformation: Market demand from public organizations (pp. 124-143). New York, NY: Routledge.

Stazyk, E. C., Pandey, S. K., & Wright, B. E. (2011). Understanding affective organizational commitment: The importance of institutional context. The American Review of Public Administration, 41, 603-624.

Vandenabeele (2008a). Development of a Public Service Motivation Measurement Scale: Corroborating and extending Perry’s measurement instrument. International Public Management Journal, 11, 143-167.

Vandenabeele (2008b). Government calling: Public service motivation as an element in selecting government as an employer of choice. Public Administration, 86, 1089-1105.

Vandenabeele, Brewer, G. A., & Ritz (2014). Past, present, and future of public service motivation research. Public Administration, 92, 779-789.

Wooldridge, J. M. (2006). Introductory econometrics: A modern approach (3rd ed.). Mason, OH: Thomson Southwestern.

Wright, B. E. (2008). Methodological challenges associated with public service motivation research. In Perry, J. L., & Hondeghem (Eds.), Motivation in public management: The call of public service (pp. 80-98). New York, NY: Oxford University Press.

Wright, B. E. (2011). Public administration as an interdisciplinary field: Assessing its relationship with the fields of law, management, and political science. Public Administration Review, 71, 96-101.

Wright, B. E., Moynihan, D. P., & Pandey, S. K. (2012). Pulling the levers: Transformational leadership, public service motivation, and mission valence. Public Administration Review, 72, 206-215.