Abstract
This article discusses some ontological and epistemological differences in qualitative and quantitative approaches to concepts and measurement. Concept formation inevitably raises the issue of ontology because it involves specifying what is inherent and important in the empirical phenomenon represented by a concept, e.g. ‘What is democracy?’ Qualitative researchers adopt a semantic approach and work hard to identify the intrinsic necessary defining attributes of a concept. Quantitative scholars adopt an indicator-latent variable approach and seek to identify good indicators that are caused by the latent variable. Concepts and measurement also raise epistemological issues about the nature and quality of knowledge. In quantitative analyses, the challenges of knowledge generation are closely linked to ‘error’, understood as the difference between an estimated value and a true value. By contrast, in qualitative analyses the challenges of knowledge generation are more closely linked to ‘fuzziness’, understood as partial membership in a conceptual set.
Quantitative and qualitative scholars differ systematically and often dramatically in their approaches to concepts and measurement (Goertz, 2005). Some of these differences are not particularly surprising. Qualitative scholars have long, involved, ‘wordy’ discussions about the meaning of concepts. In this respect, they resemble (political) philosophers, who also spend much time on concept analysis. By contrast, quantitative scholars need data for their statistical models. Accordingly, they focus attention on the nature and quality of quantitative measures. They spend less time on the concept and more time on operationalization, aggregation and resulting datasets.
The present article focuses on two important differences between the quantitative and qualitative approaches to concepts and measurement. The first concerns ontology. Concept formation inevitably raises the issue of ontology because it involves specifying what is inherent and important in the empirical phenomenon represented by a concept, e.g. ‘What is democracy?’ As we shall see, quantitative and qualitative researchers differ sharply in their approaches to specifying the ontology of concepts. Qualitative researchers adopt a semantic approach and work hard to identify the intrinsic necessary defining attributes of a concept. Quantitative scholars assume an unmeasured or latent variable and then seek to identify good indicators that have a causal relationship with the latent variable.
The second big difference concerns epistemology. In quantitative approaches, the challenges of knowledge generation are closely linked to ‘error’. The whole field of statistics is concerned with producing valid knowledge in a context in which error is present. By contrast, in the qualitative tradition, the challenges of knowledge generation are linked to ‘fuzziness’, understood in the sense that cases may have partial degrees of membership in conceptual sets. While the problem of fuzziness might seem similar to the problem of error, in fact they are different. Fuzziness is an ontological claim about the real world (i.e. some cases are partial members of conceptual sets) that has epistemological implications. By contrast, error is inherently epistemological: it is about the quality of our knowledge.
Ontology
‘Ontology’ might seem like a strange rubric for an article on concepts and measurement. Yet, most concepts are intended to represent phenomena in the empirical world as they actually exist. Thus, when one asks about the meaning of a concept, one is asking about the nature of reality. When scholars debate the meaning of a concept, they are arguing about the substance of the empirical world.
In the qualitative approach, discussions and debates about concepts concern semantics – i.e. they concern the meaning of concepts. It is completely standard to ask questions about and have discussions concerning the definition of concepts. For example, one might ask, ‘What is your definition of the welfare state?’ A typical qualitative answer involves presenting a list of attributes or characteristics that constitute the concept. There is nothing too mysterious about this practice, because it is basically what dictionaries do. Dictionaries and qualitative scholars try to specify the characteristics that make an entity what it is.
Within quantitative research, discussions and debates about concepts focus on issues of data and measurement, and less on semantics and meaning. While some discussion of the definition of a concept is normally necessary for gathering data about that concept, it is not the focus of attention. In the case of quantitative measurement articles, a concept section may not exist at all. Instead, researchers focus on the ‘operationalization’ and ‘measurement’ of the concept. Operationalization typically involves finding ‘indicators’ composed of numerical data that are correlated with each other and thus with the unmeasured, latent variable. Once such indicators are found, more or less involved measurement procedures can be applied for purposes of coding cases. These procedures range from simple addition to complex Bayesian latent variable models. The measurement model generates the data scores for cases vis-à-vis the concept (variable) of interest.
As an example of this quantitative approach, consider the Global Terrorism Database (GTD – see CETIS, 2007), which is now often used in statistical work on terrorism. Terrorism is a notoriously problematic concept (see Schmid & Jongman, 1988, who discuss dozens of definitions). If one reads the GTD codebook, the problematic nature of the concept is clearly acknowledged in the introduction, but almost all of the codebook is about the data. Once it is acknowledged that the concept is hard to define, the definitional issue drops out of consideration. Discussions of the data proceed without reference to definitional issues. In fact, to find out what the GTD concept of terrorism actually is, one must read an appendix.
Not needing numerical data for large numbers of cases, qualitative scholars are freer to debate about concepts and their defining attributes. One hazard of this freedom lies in increasing the complexity of the concept. Qualitative definitions can be long, complicated – even Byzantine – in character. One of our favorite examples is the influential definition of corporatism developed by Schmitter:
Corporatism can be defined as a system of interest representation in which the constituent units are organized into a limited number of singular, compulsory, noncompetitive, hierarchically ordered and functionally differentiated categories, recognized or licensed (if not created) by the state and granted a deliberate representational monopoly within their respective categories in exchange for observing certain controls on their selection of leaders and articulation of demands and supports. (Schmitter, 1974: 93–94)
This definition has many different attributes, some of which are contained within others. If one were to try to unpack Schmitter’s definition into individual characteristics, there might be ten or more features. Moreover, different people might come up with different lists.
In quantitative methodology, the process of coding data on corporatism involves the use of indicators. These indicators may not be explicitly mentioned in the definition of the concept. For example, labor centralization has been used as a quantitative indicator of corporatism (see Kenworthy, 2003, for a discussion of other quantitative indicators). It is a matter of interpretation how this indicator fits within the abstract language of Schmitter’s definition. In general, the move from concept to concrete data will almost always involve significant simplification. Often this simplification entails redefining a concept to include a more limited number of defining dimensions.
For qualitative researchers, the failure of indicators to represent well all the defining attributes of a concept raises concerns. For these researchers, the attributes of a concept are obligatory features that literally are the concept. Each must therefore be measured. Qualitative researchers resist extreme modes of simplification. They believe that concepts must be defined independently of data considerations. The definition of a concept should not be driven by the data that happen to be available to measure that concept.
Unlike the attributes that constitute a concept, indicators are optional, substitutable and not necessarily definitional. Different indicators are all measures of the same conceptual entity. Treier & Jackman’s (2008) discussion of the Polity measure of democracy illustrates nicely the difference between defining attributes and indicators. Polity defines democracy in terms of five attributes, and it suggests that each of these attributes is an inherent feature of democracy. For Treier & Jackman, however, these attributes are simply indicators for the same latent concept (democracy). Two of the five indicators do not meet the statistical requirements of their methodology and are thus discarded. Their final measure of democracy consequently uses only three of the five Polity dimensions. Bollen & Grandjean’s (1981) use of fairness of elections as an indicator of political democracy offers another example. To the qualitative scholar, fair elections are a defining and obligatory attribute of democracy. Fair elections are not optional; this attribute is necessary for democracy (Bowman, Lehoucq & Mahoney, 2005; Mainwaring, Brinks & Pérez-Liñán, 2001).
In the literature on concepts, the language of indicators disappears and is replaced by language such as minimal requirements for a democracy. A good example is Collier & Levitsky’s influential notion of a diminished subtype: ‘For example, “limited-suffrage democracy” and “tutelary democracy” are understood as less than complete instances of democracy because they lack one or more of its defining attributes’ (Collier & Levitsky, 1997: 436–437). The very idea of a diminished subtype makes little sense if the attributes of the root concept are optional features that are not necessary for conceptual membership.
One can think of the two approaches in terms of differing emphases on concepts versus measurement. Qualitative scholars focus on concepts but do not think much about measurement models or how to aggregate defining dimensions. Conversely, quantitative scholars focus their attention on measurement, and devote less attention to concepts. We think this opens up the possibility for interesting work that cuts across these cleavages. Qualitative scholars need to think more about measurement and aggregation. They should not be content with default aggregation and dichotomous measurement. Quantitative scholars need to think more clearly about concepts and their defining dimensions and what that implies for measurement. This kind of cross-cutting work would also almost by definition build bridges between qualitative and quantitative scholars, because each side would be taking the other side’s concerns very seriously.
A related difference involves attributes/concepts in qualitative research versus indicators/variables in quantitative research. In qualitative research, the relationship between attributes and concept is a semantic, definitional one. In the quantitative approach, the relationship between indicators and latent variable is a causal one: in the standard view, the latent variable causes the indicators (Bollen, 1989).
Figure 1 illustrates a typical latent variable model. The causal arrows in the figure run from the latent variable to the indicators. This helps to make sense of the idea that indicators are substitutable factors. The latent variable might be the cause of many different things, and the scholar is just choosing some of them. As noted above, tension arises when qualitative researchers believe these indicators are really defining attributes. In the qualitative approach, defining attributes cannot be causally related to the concept of interest; they cannot even be temporally separated from the concept. They are the concept.

Figure 1. Latent variable model of democracy. Source: Bollen & Grandjean (1981).
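The causal logic of the latent variable model can be made concrete with a small simulation. The sketch below is illustrative only: the function names, the three loadings and the sample size are our own assumptions and are not taken from Bollen & Grandjean (1981). It draws a latent score for each case, generates each indicator as that score times a loading plus independent noise, and then shows that the indicators correlate with one another only because they share a common cause.

```python
import random


def pearson(xs, ys):
    """Pearson correlation computed from first principles."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_indicators(n_cases=2000, loadings=(0.9, 0.8, 0.7), seed=0):
    """Draw a latent score per case, then generate each indicator from it.

    The loadings are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)
    latent = [rng.gauss(0, 1) for _ in range(n_cases)]
    indicators = [
        # indicator = loading * latent + noise, scaled to unit variance
        [lam * z + rng.gauss(0, 1) * (1 - lam ** 2) ** 0.5 for z in latent]
        for lam in loadings
    ]
    return latent, indicators


def mean_index(columns):
    """Aggregate indicators by simple averaging, case by case."""
    return [sum(values) / len(values) for values in zip(*columns)]


latent, indicators = simulate_indicators()

# Indicators are correlated with each other through the latent variable.
r_12 = pearson(indicators[0], indicators[1])
r_13 = pearson(indicators[0], indicators[2])

# Two indices built from different subsets of indicators both track the
# latent variable, which is the sense in which indicators are substitutable.
index_a = mean_index(indicators[:2])
index_b = mean_index(indicators[1:])
r_a = pearson(index_a, latent)
r_b = pearson(index_b, latent)

print(round(r_12, 2), round(r_13, 2), round(r_a, 2), round(r_b, 2))
```

Under these assumptions the expected correlation between two indicators is the product of their loadings, so dropping one indicator changes the precision of the index but not what it measures. That is exactly the property that a qualitative scholar would deny for a defining attribute.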
The idea that measurement should be based on causal theories has a long and distinguished history. Hempel (1952) made this connection using the natural sciences as his focus. For example, the usefulness of a thermometer as a measure of temperature depends on a causal theory of heat expansion. In the social sciences, the same idea underpins the large literature on latent variables and measurement (Bollen, 1989). Our purpose is not to call into question the view that indicators and latent variables should be causally related. Within the quantitative tradition, and for many phenomena, this view makes perfect sense. It seems reasonable to believe that one’s intelligence – latent variable – might affect one’s performance on tests of intelligence. Likewise, it seems reasonable that one’s political ideology might affect one’s answers on a questionnaire about politics.
From a qualitative standpoint, nevertheless, the key issue will remain addressing the meaning of the concept of interest. These researchers will press the quantitative scholar by asking, ‘What exactly is the definition of intelligence or political ideology?’ They will be dissatisfied with any answer that suggests the concept can be defined in terms of the indicators that are used to measure it. From the qualitative perspective, the quality of indicators must always be assessed in light of the meaning of the concept being measured. Many quantitative researchers will agree in principle, but the concerns of this group lead the discussion to center more on issues of measurement and indicators than on issues of meaning and definition.
Epistemology
When coding cases, the qualitative and quantitative traditions exhibit important epistemological differences in their beliefs about the quality of our knowledge. Cases that are considered to be good candidates for accurate description and coding in one approach are often considered to be poor candidates in the other. The kind of case that the quantitative researcher assumes is subject to higher levels of measurement error is often precisely the kind of case that the qualitative researcher assumes is subject to the least amount of measurement error (and vice versa).
We suggest that fuzzy logic is the natural way to model the qualitative approach to concepts and measurement. Classic work on concepts such as that by Sartori is explicitly founded on logic and makes reference to the classics of philosophical logic. Fuzzy sets are the natural extension from Aristotelian logic to continuous logics. Fuzzy logic is indeed a theory of applied semantics, which includes a theory of measurement. ‘Set membership’ is a way to associate meaning to numbers in the zero to one range. Fuzzy logic was originally designed to model human language (see McNeill & Freiberger, 1994, for a nice non-technical history and introduction; see Ragin, 2000, 2008, for social-science applications). While it goes beyond the scope of this paper, fuzzy logic connects semantics and measurement. For example, with fuzzy logic we can make sense of the statement ‘the plane is very full today’, by linking what the pilot means by ‘very full’ to the number of seats occupied (see Ragin, 2008, for details).
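The ‘very full’ example can be sketched as a membership function. Everything numeric here is an assumption made for illustration: the thresholds `empty_at` and `full_at` are hypothetical anchors, the linear ramp between them is only one possible shape, and modeling the hedge ‘very’ by squaring follows a common convention in fuzzy logic rather than the specific procedure in Ragin (2008).

```python
def full_membership(seats_occupied, total_seats, empty_at=0.5, full_at=0.95):
    """Degree of membership in the set of 'full' planes, from 0.0 to 1.0.

    empty_at and full_at are hypothetical occupancy shares: at or below
    empty_at the plane is fully out of the set, at or above full_at it is
    fully in, and membership rises linearly in between.
    """
    share = seats_occupied / total_seats
    if share <= empty_at:
        return 0.0
    if share >= full_at:
        return 1.0
    return (share - empty_at) / (full_at - empty_at)


def very(membership):
    """Apply the hedge 'very' by squaring, a common fuzzy-logic convention."""
    return membership ** 2


# A 200-seat plane with 145 seats occupied sits halfway up the ramp.
half_full = full_membership(145, 200)
print(half_full, very(half_full))
```

The point of the sketch is that the number returned is a statement about the plane itself, tied to what the speaker means by ‘full’, and not an estimate with an error term attached.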
To explore differences between the quantitative and qualitative approaches, we can begin by clearing up the relationship between ‘error’, which is central to all statistics, and ‘fuzziness’, which is an important idea in fuzzy-set analysis. These two concepts might, at first glance, seem quite similar. When quantitative scholars hear the word ‘fuzzy’, they might initially believe that it suggests a lack of clarity, which in turn implies ‘uncertainty’ or ‘error’. Yet, in fact, the analogy between error and fuzziness is quite misleading.
We can go back to the two main ideas of this article, ontology and epistemology, to see why this analogy is misleading. As Peter Hall (2003) has noted, there is a tendency to blur differences between epistemology and ontology. In statistics, error estimates concern epistemology in the sense of ‘quality or nature of knowledge’. Indeed, what distinguishes statistics from other ways of making numeric estimates about the world is precisely the inclusion of a stochastic element to give us an idea about the quality of our estimate.
By contrast, fuzzy-set membership values are ontological; they are statements about features of the world. For instance, if one asserts that a case has a fuzzy-set membership value of 0.75, one is making a claim about the empirical nature of that case – i.e. the case is mostly but not entirely within a given conceptual set. Using our example above, the membership value might represent the meaning of ‘very full’, which is a description of the contents of the plane. One is not making any assumptions or statements about error or the quality of knowledge. Nor is the claim probabilistic in any sense.
A better analogy is the idea that a fuzzy-set membership score for a case is similar to a value on a given variable for a particular observation in a quantitative dataset. Although fuzzy membership values and variable values cannot be mechanically translated from one to the other (see Ragin, 2008), there is a parallel between the two. The big remaining difference is then that fuzzy membership values usually have no error or uncertainty estimates associated with them. At the same time, quantitative datasets also often report values that have no error estimates attached to them, e.g. the Polity or Freedom House datasets for democracy.
Although they do not present explicit and numerical estimates of error, qualitative researchers do routinely discuss the difficulty of accurately coding particular cases. They may include elaborate and thoughtful discussions of their reasoning behind certain codes for individual cases. They may ground their decision in the existing expert literature or their own specialized knowledge. From a quantitative perspective, it might seem strange that these researchers would worry so much about the specific codings for a few individual observations. Within the quantitative approach, it is usually not a good investment of time and resources to focus so closely on a small number of problematic observations.
These qualitative coding decisions often are made by assessing the extent to which a case corresponds to an ‘ideal-type’, or a pure and complete example of a given concept. The ideal-type serves as a standard against which all empirical cases can be evaluated; scholars ‘calibrate’ (Ragin, 2008) their case codes in light of this standard. In terms of a scale of fuzzy-set membership scores, the ideal-type is at one extreme of the scale: ideal-typical cases have a membership value of 1.00. These cases unambiguously have all of the defining characteristics of the concept in question. Similarly, a membership score of 0.00 means that the defining characteristics are completely absent.
With this qualitative approach, the general intuition is that cases closer to the ideal-type are easier to code, and thus that the error associated with these codings is lower (Eliason & Stryker, 2009; Ragin, 2008). Likewise, cases with scores of 0.00 are usually easier to code, since it is often clear when something is not at all a member of the concept. By contrast, cases with fuzzy membership scores of 0.50 exhibit maximal fuzziness and can be especially difficult to code. Thus, as one moves from the close approximations of ideal-types, with fuzzy membership scores of 1.00, toward maximal fuzziness of 0.50, it becomes more difficult to code accurately and error is more likely. As one moves down from 0.50 scores and approaches the 0.00 pole, coding again becomes easier and error less prevalent. Thus, in practice, there is often a roughly curvilinear relationship between level of fuzziness and level of error. This relationship is contingent and depends on the particular phenomenon being measured, but we think it is probably pretty common, at least in social-science settings.
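The claim that fuzziness peaks at 0.50 and vanishes at both poles can be written down directly. The index below is one simple convention that we assume for illustration; it is not a measure proposed in the sources cited above, and it describes fuzziness only. Whether coding error actually tracks it is, as noted, a contingent empirical matter.

```python
def fuzziness(membership):
    """Return 0.0 at either pole of the set and 1.0 at the 0.50 crossover."""
    if not 0.0 <= membership <= 1.0:
        raise ValueError("membership must lie in [0, 1]")
    return 1.0 - abs(2.0 * membership - 1.0)


# Fuzziness rises toward the middle of the scale and falls toward each pole.
for score in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(score, fuzziness(score))
```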
In the statistical tradition, the relationship between variable values and error follows the opposite pattern. If one asks where the largest error estimates will occur for a continuously coded variable, the statistical answer is among the cases with extreme (i.e. high or low) values. Error is greatest at the upper and lower bounds of a variable, and lowest in the middle. This relationship is something that students learn early in their statistical training when they see a confidence band around a regression line. The estimated error is at its minimum at the mean of X and the mean of Y. It gets progressively higher the further one moves from that middle point.
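The shape of the confidence band follows from the textbook formula for the standard error of a fitted value in simple linear regression, s·sqrt(1/n + (x0 − x̄)² / Sxx), which is smallest when x0 equals the mean of X (where the fitted value equals the mean of Y). The sketch below applies that formula to invented toy data; the data-generating line and the noise are assumptions made purely for illustration.

```python
import random


def fitted_mean_se(xs, ys, x0):
    """Standard error of the fitted regression line at x0 (simple OLS)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s = (rss / (n - 2)) ** 0.5  # residual standard error
    return s * (1 / n + (x0 - x_bar) ** 2 / sxx) ** 0.5


# Toy data on a -10..10 scale; the true line and noise are made up.
rng = random.Random(1)
xs = list(range(-10, 11))
ys = [2.0 + 0.5 * x + rng.gauss(0, 1) for x in xs]

se_at_mean = fitted_mean_se(xs, ys, 0)    # x0 equals the mean of X
se_midway = fitted_mean_se(xs, ys, 5)
se_at_edge = fitted_mean_se(xs, ys, 10)   # x0 at the upper bound
se_at_low_edge = fitted_mean_se(xs, ys, -10)

print(se_at_mean, se_midway, se_at_edge)
```

Because the squared distance from the mean enters the formula, the band widens symmetrically toward both extremes regardless of what the data look like there.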
A good illustration of the relationship between variable values and error in the statistical tradition is Treier & Jackman’s (2008) measurement model of democracy illustrated in Figure 2. Normally, the codes for democracy in a quantitative dataset (e.g. Polity or Freedom House) do not include any explicit error estimate. For example, one case may have a democracy value of 3 and another case a value of 6, but the error associated with those values is not estimated. Treier & Jackman’s model provides a basis for making these estimates. Figure 2 presents Treier & Jackman’s estimates of error for democracy codes with Polity data. One sees the classic shape of a confidence band: it is narrowest in the middle and widest at the extremes.

Figure 2. Error in statistical measurement. Source: Treier & Jackman (2007, fig. 2). Note: The democracy scale is arbitrary.
From a qualitative perspective, these estimates of error for democracy seem counterintuitive. It is usually easy to code cases that are fully democratic or definitely not democratic; the ones that are hard to code are the ones in the middle. Everyone agrees that Sweden is fully democratic, and that Cuba is definitely not democratic. But how should we code ‘borderline’ cases like contemporary Guatemala, Venezuela and Honduras?
One way to explore this issue is to ask about the level of agreement between different datasets on democracy. Presumably, these datasets will tend to agree on the easy-to-code cases and be more likely to disagree on the hard-to-code cases. In Figure 3, we report the variance for Polity and Freedom House codes using the country-years where the two datasets overlap (see Goertz, 2008, for details). When both datasets completely agree, the variance between them is zero. As their disagreement grows, the variance between them increases. In Figure 3, we see clearly that the variance is lowest at the extremes of autocracy and democracy (i.e. −10 and 10); that is, there is little disagreement when Polity or Freedom House code an extreme autocracy (a score of −10) or full democracy (a score of +10). As we move toward the gray zone in the middle (a score of 0), we see that the variation in how they code a given country-year increases significantly. In fact, as we move down from a score of 10 to a score of 0, the variance increases nearly 1000-fold (from 0.025 to 22.6). A large shift also happens as we move up from extreme autocracy (−10) to a score of 0, though the increase is ‘only’ by a factor of 10.
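The agreement check described above can be sketched as follows. This is a minimal illustration under several assumptions that are ours and not the article's: the Freedom House codes are taken to be already rescaled to the −10 to 10 range, country-years are grouped by their Polity score, disagreement is the sample variance of each pair of codes, and the rows are invented toy values rather than real data. The procedure actually used is described in Goertz (2008).

```python
from collections import defaultdict
from statistics import mean, variance


def disagreement_by_score(rows):
    """Average the within-pair variance of the two codes at each Polity score.

    rows: iterable of (polity_score, rescaled_freedom_house_score) pairs,
    one per country-year.
    """
    by_score = defaultdict(list)
    for polity, freedom_house in rows:
        # For two values, the sample variance equals (a - b)^2 / 2.
        by_score[polity].append(variance([polity, freedom_house]))
    return {score: float(mean(vals)) for score, vals in sorted(by_score.items())}


# Invented toy country-years, NOT real Polity or Freedom House data.
toy_rows = [
    (10, 10), (10, 10), (10, 9),      # full democracies: near agreement
    (0, 4), (0, -5), (0, 3),          # gray zone: large disagreement
    (-10, -10), (-10, -8),            # extreme autocracies: near agreement
]

result = disagreement_by_score(toy_rows)
print(result)
```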

Figure 3. Variance in the gray zone: qualitative approach.
Contrasting Figures 2 and 3 illustrates the two approaches and their sharply different views about the location of measurement error. Figure 2 is what a quantitative researcher naturally expects to find; Figure 3 is what a qualitative scholar thinks will happen.
Conclusion
This article has discussed how quantitative and qualitative scholars differ on ontological and epistemological issues related to concepts and measurement. With respect to ontology, they have different views about the basic nature of concepts, which are the fundamental building-blocks for any description of reality. In the qualitative tradition, in general, concepts are constructed through a semantic process, one in which the researcher specifies the meaning of a concept by identifying the attributes that constitute it. In the quantitative tradition, in general, concepts are constructed through the identification and aggregation of indicators that are caused by (or perhaps cause) the concept of interest.
On the epistemological side, the two traditions have different views on when and where error is likely to be most common. Qualitative scholars feel most certain about their estimates when working with cases that have extreme values, such as cases that approximate ideal-types. They are least certain for cases with values in the middle. By contrast, quantitative scholars feel the least certain about cases with extreme values and most certain for cases with middle values.
The ontological and epistemological orientations of each approach are reasonable, with long-standing histories behind them. But such deep-seated differences make it hard to go back and forth between the traditions or to arrive at some compromise synthesis. This helps to explain why the qualitative literature on concepts and the quantitative literature on measurement have had so little to say to one another.
At the same time, there is much room for crossing the boundaries between these two traditions. Qualitative scholars can benefit by thinking more about issues of measurement and error. Quantitative scholars can profit by considering more seriously issues concerning the defining characteristics of their concepts and their implications for measurement. Looking ahead, we believe that the most fruitful work on concepts and measurement will take seriously the traditional concerns of both qualitative and quantitative research.
Footnotes
This research received no specific grant from any funding agency in the public, commercial or not-for-profit sectors.
