Abstract
This inquiry developed during the process of “quantitizing” qualitative data the authors had gathered for a mixed methods curriculum efficacy study. Rather than providing the intended rigor to their data coding process, their use of an intercoder reliability metric prompted their investigation of the multiplicity and messiness that, as they suggest here, are inherent to work that crosses the epistemological boundaries of academic fields and research paradigms. Even as the authors developed a deeper understanding of—and appreciation for—the nature of quantitative rigor, they moved toward a more deeply constructivist view of the research process itself. Ultimately, the authors abandoned the neat study of intercoder reliability that they had envisioned and moved toward the present dialogic report to reveal and examine their process.
Keywords
The complex and extended inquiry we recount here began life rather simply. In conjunction with a larger mixed methods study we were conducting, we planned a small-scale methodological self-study of our use of Krippendorff’s alpha—an intercoder reliability metric first developed for content analysis in media research, but not yet widely used in educational research. Our choice of a mixed methods design, both for the original curriculum study and for the methodological self-study, reflected the growing popularity of this model in the social sciences generally (see Green & Preston, 2005), as well as the idea that its complementarity might offer substantial advantages for investigating complex human systems (see Greene, 2005).
As will become clear, the design of the original study was fundamentally critical, arising from concerns over constructions of identity, power, and authority in college curricula. The first author is particularly concerned with the current governmental and popular preoccupation with rendering educational research “scientific” (see also Seltzer-Kelly, 2008). Yet our original research design accepted the data coding practice and its underlying methodology rather uncritically, reflecting the largely technical view of quantitizing that, as Sandelowski, Voils, and Knafl (2009) have argued, tends to predominate: one that glosses over “the foundational assumptions, judgments, and compromises involved” in the process of rendering qualitative data quantitative (p. 208). To at least some degree, in retrospect, this was because the decision to quantitize our qualitative data was driven by a combination of influences and hopes that, as Sandelowski et al. note, are rather common: the idea that it might help us recognize patterns that could otherwise be overlooked, and the more political concern that it might lend an increased aura of rigor to our findings.
Issues related to incommensurability between the qualitative and quantitative research paradigms manifested very early in our planning process, and as our work progressed, our conviction grew that some measures of quantitative value, rather than imparting greater rigor, can actually undermine the research process as their use becomes separated from crucial parts of the logic underpinning both qualitative and quantitative research. Our study rather quickly became a deconstruction of the approach to mixed methods research we had used, particularly the process of quantitizing, which was followed by a far lengthier reconstruction of our understandings of what it means to engage in a mode of scientific research that draws from a range of methods to allow for deeper and more contextual study. Paradoxically, too, even as our analysis of our use of a quantitative tool undermined our own claims to construct validity, it led toward what we believe is richer and more complex research. Thus, our work extends the efforts of Sandelowski et al. (2009) to engage in critical and ongoing inquiry into quantitizing, while also acknowledging its value in some circumstances. We further enlarge on these ideas to consider their implications for mixed methods research generally.
This article is organized to reflect our three rounds of investigation. We begin with an overview of the original small-scale methodological study, and share our subsequent attempts to rethink the assumptions behind that very same study and its methodology. We then recount our second round of analysis, which took a qualitative approach to consider the methodological issues of researcher reflexivity for research teams, and especially the unique challenges of multidisciplinary teamwork. In the final section, we offer our reconstruction of the notion of a mixed methods approach to inquiry that draws on the Deweyan perspective that research method is a mode of evolutionary epistemology (Popp, 2007; Seltzer-Kelly, 2008), and thus fundamentally post hoc and autobiographical in its articulation (Biesta & Burbules, 2003). In particular, we came to embrace the need for recursive and iterative redirection between the research question(s) and methodology in light of new information as it emerges through inquiry (Dewey, 1938/1986)—a view that stands in sharp contrast to the firewall that is commonly erected between the articulation of a research question and the method that is selected to direct inquiry into it.
We provide a more developed discussion of methodological issues in the second section, but pause here to note that the spirit of this article is phenomenological in the sense that lived experience is the main object of our inquiry. Our methodology is hermeneutic insofar as it deals with our attempts to interpret. Since, in this case, our methodology is inseparable from questions of lived experience (i.e., what we want to interpret is lived experience), we follow Laverty (2003) in using the term hermeneutic phenomenology to describe our project. This hermeneutic phenomenological approach ultimately demonstrated to us “how philosophical structures are immanent in empirical work,” and prompted us “to rethink the relation between empirical work and philosophy in a way that posits an engagement with not knowing as an ethical and political move” (Lather, 2009, p. 343). This represents our active engagement in, and transparency as to, a very messy and deeply revelatory research process: our move to embody an alternative vision of rigor, one that we believe is imparted through truthfully conveying the research process in all its multiplicity and fractionality (Law, 2004).
As will become apparent, our use of different vernaculars in this work reveals our varied research paradigms, as well as the spirit of multidisciplinarity and exchange that informed our project. We seek to ground the discussion that follows in the range of traditions that informed our process—using a narrative analysis to develop the story of the interactions between our epistemic frameworks and our interpretations of the data, and resorting to more traditional analytic writing to encapsulate our quantitative findings and associated theoretical perspectives as they emerged. In this, we follow Macbeth’s (2001) advocacy for “positional reflexivity” as a research exercise that “rather than ‘leveling’ the world with a singular, objectivizing narrative voice, [ . . . ] preserves and recovers the polysemy of multiple positions, interests, and agencies in the settings it analyzes” (p. 39).
Round 1: A Methodological Self-Study of the Use of Krippendorff’s Alpha While Quantitizing
The empirical data that generated the present inquiry were collected during a curriculum efficacy study of the diversity component of an undergraduate-level arts and humanities course created and taught by the first author. 1 Concern had been growing among faculty and students that the university’s common core curriculum was insufficiently attentive to the increasingly diverse campus community. Although there had long been a requirement that students take a designated diversity course, some consensus had emerged that this was insufficient, and there was tentative interest in replacing this with infusion of diversity topics throughout the core curriculum. The first author’s course was intended to model that approach, and the study was designed to assess its potential to effect changes in students’ attitudes as they encountered diversity-related materials across the semester.
Debbie, the first author, is a curriculum studies specialist with a deep grounding in the humanities. Her research perspective is Deweyan; she is concerned by the ways in which the educational system itself, as well as approaches to research into it, can become mechanisms for the reification of the existing economic and social systems (see Seltzer-Kelly, 2008). Although she is primarily a qualitative researcher, our decision to use a mixed methods approach for this study was driven by her interest in its possibilities for complex investigation and analysis, as noted above. Sean, the second author, was, at the time we began the study, a master’s student in communication. He is now a doctoral candidate in that field. His educational background in, and professional experience with, quantitative research methods made the mixed methods approach even more appealing since we would have a team member with expertise in each realm.
The formative stages of the study were marked by lively debates between the two, reflecting their differing disciplinary commitments and research perspectives. Debbie and Sean disagreed about issues ranging from the possibility and/or desirability of realizing anything that might be considered to be “objective” knowledge through the research process, to whether they could—or should—make any claims regarding the generalizability of their conclusions. Although not rejecting an interpretivist posture, Sean would argue that approximations of objective data are useful in modeling complex human experiences in scientific study (see Box & Draper, 1987). Debbie’s research and teaching perspective, in contrast, is Deweyan and deeply interpretivist; although she is interested in the ways in which quantitative research design can reveal patterns that might otherwise be overlooked, her approach to its analysis is markedly qualitative. Still, they were ultimately able to agree on a design for a series of anonymous surveys to elicit participants’ perspectives on identity and diversity issues generally, and to monitor experiences with these constructs as the course progressed.
The surveys combined scaled variables to provide a baseline for participants’ attitudes and perspectives with open-ended questions to allow elaboration on the ways in which the curriculum affected students’ thinking on issues of race, identity, and diversity education. Five rounds of surveys were administered to approximately 100 participants: students enrolled in two sections of the course, both taught by the first author using an identical curriculum, during a single semester. One administration occurred at the beginning of the semester, three during it, and one at the end of the semester. All responses were anonymous, so no tracking of changes to individual attitudes over time was possible; we intended rather to look for group-level shifts in responses to the questions across the semester.
The qualitative responses were intended to function in three ways. First, since we planned to look for correlations among the qualitative and scaled responses, the qualitative responses would be coded to render them “amenable to statistical assimilation” with the scaled responses (Sandelowski et al., 2009, p. 210). Second, excerpts from the qualitative data would be reported in conjunction with the quantitative results, providing elaboration on them (Brannen, 2005). Finally, they would serve as the foundation for independent qualitative analysis, a study that has been completed and reported elsewhere (Seltzer-Kelly, Westwood, & Peña-Guzman, 2010).
In addition to their differing methodological perspectives, Debbie and Sean disagreed fundamentally about what our research might show regarding racism on campus. Debbie is a White female and a member of the so-called Baby Boom generation, with a significant and long-standing commitment to social justice. Based on prior research (Chesler, Lewis, & Crowfoot, 2005; Duncan, 2002; TuSmith & Reddy, 2002), as well as her own past experiences on our campus and elsewhere, she expected our study data to reflect racism as a persistent force, accompanied by widespread denial of that persistence on the part of White students. Sean is a White male member of “Gen Y.” His view, grounded in his experiences among diverse peers on our campus, was that race was not the most consistent or implicit source of oppression for most youth, and was therefore not the most useful category for analysis of identity and interaction. He was also (rather ironically given our personal identities and generational experiences) far more attentive to gendered sources of oppression than was Debbie in the early stages of our work together.
Our third investigator and author, David, joined the study shortly after data collection began. David, at the time an undergraduate, contributed a background in philosophy and gender studies; he was primarily a philosophical and theoretical researcher. Over the years that have spanned this inquiry, David has moved into a doctoral program in philosophy and specialized in 20th-century Continental philosophy and political theory. His research posture generally tends to incorporate strains of critical theory, Nietzschean interpretivism, and speculative philosophy. However, rather more surprisingly, he also works in philosophy of science (biology) and can sometimes lean toward an empirical–realist interpretation of scientific research. He, like Sean and our participants, is a member of Gen Y, and he is also a Latino immigrant. David added logistical support in data collection, and, more critically, he provided an additional perspective to the divergent views of Debbie and Sean. To our surprise, however, the differences between Debbie and Sean were not resolved by David’s entry into the project; rather, they were magnified exponentially.
Even before we saw any coding results to support this belief, it had become clear to us from our early conversations that we would more than likely disagree in our interpretations of the qualitative responses related to racism. So, as we prepared for data coding, we also constructed reflexive hypotheses about what our data might show, subject to our known differences in perspective. This is seemingly a small step, but from a quantitatively oriented perspective, since we came to this agreement before data analysis, we were predicting an outcome that was testable with our data. This was also the point at which we formulated our methodological self-study, since we thought we might be able to model a useful process for other diverse teams.
The Method: Working With Assumptions of Coder Variance
In quantitizing data through coding, the compression of rich data to categorical values is accomplished by humans operating within the norms and practices of their research fields, which filter and frame assumptions and interpretations. Even using the most well-defined instruments, and with a group of seasoned researchers, the complex coding task will create agreement and disagreement by random chance and not because of the instrument or the data (Krippendorff, 1980). The literature on content analysis clearly articulates the need for an intercoder reliability metric to diagnose, assess and, ideally, remove individual variance, reproducibly systematizing the coding process (Krippendorff, 1980; Neuendorf, 2002; Tinsley & Weiss, 1975).
Many advanced measures of assessing coding reliability exist, but even the most statistically powerful only provide a one-way measure for assessing reliability: similarity of perspectives. The measures do not correct for problems of interpretation through statistical means that identify the location of significance or insignificance, and rather simply state that there is or there is no variance in coder classification. Thus, using a measure such as simple percent agreement may inflate actual agreement or deflate agreement. 2 Given our desire to accurately search for variance, rather than simply reducing it to an “acceptable” amount, we selected Krippendorff’s alpha. As a highly conservative measure it seemed likely to yield lower agreement values, but we selected it for its two primary strengths: the inclusion of confidence intervals for all alpha scores, and corrections for random agreement or disagreement among coders. We believed that this rigorous intercoder reliability procedure would enrich our discussion of our conclusions—that it would enable us to explore the complexity of our interpretations and to more substantively interact with the existing body of scholarship.
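The contrast between simple percent agreement and a chance-corrected coefficient can be made concrete with a minimal sketch. This is not the procedure used in the study: it assumes nominal codes, no missing values, and entirely hypothetical ratings, the function names are illustrative, and it omits the confidence intervals mentioned above.

```python
from collections import Counter
from itertools import permutations


def percent_agreement(units):
    """Share of ordered coder pairs, pooled over all units, that match."""
    pairs = [(a, b) for codes in units for a, b in permutations(codes, 2)]
    return sum(a == b for a, b in pairs) / len(pairs)


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal codes, assuming no missing values.

    `units` is a list of tuples, one tuple per coded response, holding the
    code each coder assigned to that response.
    """
    # Coincidence matrix: each ordered pair of codes within a unit counts
    # 1 / (m - 1), where m is the number of coders who coded that unit.
    coincidences = Counter()
    for codes in units:
        m = len(codes)
        if m < 2:
            continue  # a unit coded by one coder carries no pairing information
        for a, b in permutations(codes, 2):
            coincidences[(a, b)] += 1 / (m - 1)

    # Marginal totals per code, and the grand total n.
    totals = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())

    # Observed and chance-expected counts of mismatched pairs.
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        return float("nan")  # undefined when only one code is ever used
    return 1 - observed / expected


# Hypothetical ratings: two coders, ten responses, one rarely used category.
ratings = [("no", "no")] * 8 + [("yes", "no"), ("no", "yes")]
print(round(percent_agreement(ratings), 2))           # 0.8
print(round(krippendorff_alpha_nominal(ratings), 3))  # -0.056
```

In this invented example the coders match on 80% of responses, yet alpha falls slightly below zero: with one category dominating, that much agreement is about what chance alone would produce.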
Round 1 Results and Discussion: Still Seeking Intercoder Reliability
Categories for coding were, in the ordinary way, created prior to our examination of the sample data set, and were based to a large degree on Debbie’s grounding in the relevant literature. Our coding on our sample set showed that our intercoder reliability was acceptable overall for drawing tentative conclusions (α = .77), 3 with agreement sufficient for meaningful reliability (ranging from α = .81 to α = .94) on many items. However, it fell considerably below the traditional threshold for statistical validity on some scattered variables (α = .38 to α = .44).
Discussions of our inconsistencies quickly revealed that many of our problems were a result of differing interpretations of categorical definitions. For example, one question was “Do you believe that diversity education (about varieties of cultural identity, including racial/ethnic, socioeconomic, religious, sexual identity/orientation, ability/disability, etc.) is an important part of a college education? Why or why not?” Our coding categories included the idea that college is meant to be a generally broadening experience, for which our reliability was quite low (α = .44). This had been intended to contrast with the concept that the focus of diversity education should be the inculcation of specific diversity-related skills, a category for which we were quite consistent (α = .94). Clearly, our definition for the first category had been inadequate, whereas the second had functioned well.
Our coding results for responses to one question were particularly striking, in light of our initial hypothesis for the methodological self-study. The question was “Do you agree that people who are members of minorities experience the world very differently from those who are members of the majority? Why or why not?” The categories of particular interest to us, and their respective measures for intercoder reliability, were (a) whether racism is/was a genuine problem/challenge (α = .81); (b) whether racism is a current problem/challenge (α = .44); and (c) whether racism as a problem/challenge is largely manufactured by the wider society, families, or schools (α = .88).
Again proceeding along traditionally accepted lines in our quest for quantitative legitimacy, we began the process of trying to convert disagreement to agreement. We decided to revise our instrument with more comprehensive definitions in the hope that this would facilitate agreement. David seized the opportunity to learn more about content analysis, and spent hours consulting the literature and exploring possible sources of valid definitions within existing published research. His efforts produced a far more extensive codebook, with complex definitions and examples for the analysis of our textual data.
We independently reexamined our coding of the data, and, as we had anticipated, nearly all our differences were resolved. However, our reliability was now actually a bit lower (α = .42) on a single variable: whether racism is a current issue or largely a thing of the past. The problem arose in our coding of the round of surveys that had been administered after a classroom viewing of Devil in a Blue Dress, a film set in 1948 Los Angeles that highlights the racial divisions of the time. Depending on our coding decisions on the sample set of responses, student comments such as “In our modern age, it is more socially acceptable to have a diverse society,” “It’s a jungle out there & [sic] I’m glad I don’t live in those days,” and “I think that it’s good to learn about the past and what it used to be like” might tend to affirm the perspective that racism is largely a thing of the past and that opportunity is now equal regardless of color—or not.
Debbie, whose immersion in the literature of the field had driven our initial selection of categories, consistently coded these responses as relegating racism to the past. However, neither of the two other authors “saw” this pattern reflected in these or similar responses—not, as we had predicted, Sean, nor, more remarkably, David, who had personally experienced what both he and Debbie characterized as overtly racist treatment in both the wider community and within the university. This persistent inability to code consistently clearly derived from deeply held differences that were grounded in research literature and personal experience. And, although we had removed most of our disagreements, we had now produced a codebook that was, in terms of word count, longer than the total quantity of the text we were studying!
The next step in resolving intercoder differences would customarily have been to adopt a “majority rules” approach, or for the most “authoritative” or “expert” member of the team to make the decision. But who was the most authoritative member of the team? Debbie was the senior member of the team and the one with the greatest research and teaching experience in diversity. Did her ability to situate our data within the larger field imply greater objectivity and/or insight? If so, should her perspective override that of the majority, comprising Sean and David? Alternatively, Sean had the most expertise and experience in the actual data-gathering and analytic procedures we were using, and his age and life experience seemed the most similar to those of the White students who comprised the majority of the subject pool. David’s age also placed him on an equal footing with the students we were surveying, while his status as a racialized immigrant gave him the most lived experience to speak on the subject of race in the classroom. We had engaged the crisis of representation: We had recognized that we were attempting to forge consensus over data in ways that might or might not reflect the voices of its actual creators.
Implications: Deconstructing Embedded Assumptions
As Sandelowski et al. (2009) have argued, the entire notion of data is constructed as researchers make the decision whether the respondents’ responses are actually representative, not just individually but in relation to one another: whether the response is “a valid representation of . . . ‘real’ experience” and thus data (p. 219). Relatedly, then, interview data (or, we would argue, survey data) may fail to reflect actual experience, or tend to indicate experience that is, in fact, absent. More critically, the selection of scale levels embeds the cultural norms and standards of the researchers into the research instrument (Lee, Jones, Mineyama, & Zhang, 2002).
Our experiences placed us in tension with these critiques of our approach; moreover, through our more fundamental concern with methodology, we had engaged an additional layer of confounding issues. We had begun by seeking to be conscious of the ways in which we had selected what would be considered “data” and how and for what purpose we had gone about quantitizing the qualitative portion; now we had begun to unpack additional complexities within the coding process itself.
Even by its own standards, as we had begun to realize, the validity claimed through intercoder reliability measures rests on shaky ground—even when there is no significant disagreement among the coders. At the pragmatic level, intercoder reliability means that individuals agree consistently on how to place an input within output categories. Fundamentally, it is assumed that agreement among coders is equal to valid interpretation; researchers claim internal validity because coders agree with one another, not because the data agree with the coders. But coders are a self-selected or purposive sample drawn because of personal relationships, academic requirements, or need for money; they are not randomly assigned members of the population and are not representative of the population. Furthermore, even if a group of coders were drawn from the population in a representative process, the work of meaning formation they would undertake is still the result of the data processing, not the actual data.
The conversion of rich textual objects to coded variables is understood to distance analysis from the raw data, but this is considered acceptable because it is assumed that the conversion process preserves some core meaning in the reduction of rich text to limited categorical values. There are, we argue, few reasons to make this assumption. As our research process indicates, intercoder reliability procedures can and do conceal significant disagreements among coders. Hak and Bernts (1996), in fact, have argued that these procedures actually comprise a mode of socialization to the desired perspective, rather than merely serving to iron out misinterpretations and misunderstandings among individuals. An additional layer of obfuscation occurs when the data are analyzed using only the final version of the coded items; statistical significance, and thus greater perceived weight, can easily be produced using data about which there is, in fact, considerable disagreement. Thus, the deification of numbers and statistics as a proxy for valid interpretations of data can very effectively serve to mask meaning.
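The masking effect of analyzing only the final coded values can be sketched in a few lines. The example is an illustration under stated assumptions, not a reconstruction of the study's data: the three coders, the five responses, and the function name `resolve_by_majority` are all hypothetical.

```python
from collections import Counter


def resolve_by_majority(codes):
    """Return the most frequent code among the coders for one response."""
    # With three coders and two categories a tie cannot occur; other
    # designs would need an explicit tie-breaking rule.
    return Counter(codes).most_common(1)[0][0]


# Hypothetical codes from three coders for five responses:
# "past" = racism relegated to the past, "current" = racism seen as ongoing.
responses = [
    ("past", "current", "current"),
    ("past", "current", "current"),
    ("past", "current", "current"),
    ("past", "current", "current"),
    ("current", "current", "current"),
]

final_codes = [resolve_by_majority(codes) for codes in responses]
contested = sum(1 for codes in responses if len(set(codes)) > 1)

print(final_codes)  # every response is recorded as "current"
print(contested)    # 4
```

Any analysis run on `final_codes` alone sees five identical values and no trace of the four responses on which one coder dissented.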
We were aware that it is routine for researchers to continue to refine a failed coding frame, to fire problematic coders, or to file away unreportable findings. At this point, though, we became interested in the problems that our experiences suggested for the conduct of research and the interpretation of findings.
Round 2: A Methodological Inquiry Into the Meanings of Researcher Reflexivity
Our emerging realizations, clearly, had been triggered by our differences. As Siltanen, Willis, and Scobie (2008) discovered when they began their research together, the vast majority of the existing literature on reflexivity presumes a lone researcher. Like us, they were concerned about the ways in which self–other relationships inform the team process, and they understood reflexivity not “as a form of meta-analysis about the researcher’s relation to the research process . . . [but] an always-present aspect of all interpretive practices” (p. 47). However, unlike ours, their self-study had as its focus the process of forging common understandings and interpretations of their data. We, in contrast, were interested in what the lack of common understandings meant, both for our data and for the larger field of research. Thus, we moved toward hermeneutic phenomenology to explore our rather unusual attempt to work reflexively together on an empirical research team that crossed disciplinary boundaries.
A Revised Methodology
As Altheide and Johnson (1994) noted in connection with the reflexive turn of qualitative research in the latter half of the 20th century, “the scientific observer is part and parcel of the setting, context, and culture he or she is trying to understand and represent.” Thus, ethnography, for example, may become less about “what happens in the field” than “what takes place ‘back in the office’ when the observer or researcher is ‘writing it up’” (Altheide & Johnson, 1994, pp. 486-487). We were confronted by a similar dilemma as our attention turned from the interpretation of data, to consideration of our own interpretive processes. To some degree, of course, this falls within the normal realm of researcher reflexivity. However, as the generational and cultural differences among us emerged as a subject of study, we felt ourselves pulled further toward the autoethnographic. We found that we experienced uncertainty in balancing “between story and context” (Holman Jones, 2005, p. 764), struggling over how much of ourselves to include. At times, our project seemed likely to become a “personal text as critical intervention in social, political, and cultural life” (Holman Jones, 2005, p. 763) as we turned new light into what we were beginning to think of as some rather shadowy corners of the world of mixed methods research—a very different focus from the rather standard research report we had envisioned.
While we created and have preserved many autoethnographic artifacts, including individual and joint writings and scores of e-mails considering these issues and their larger meanings, in the end we turned toward hermeneutic analysis. In Gadamer’s (1976/2004) terms, we confronted the hermeneutic problem as we moved across thought and tradition strange to each of us, while also reexamining and sharing our individual perspectives on our own familiar bodies of research and literature. We worked through hermeneutical reflection to bring our own prior knowledge to full consciousness, and to reconsider it in relation to new readings from a range of scholarly traditions, accompanied by our conversations with one another about those meanings. Thus, the object of interest for us involved not only the age-old traditions of textual study and interpretation but also our conversational interactions as we sought to distance ourselves as much as possible from our own previously acquired meanings and to arrive together at new and enlarged meanings that incorporated the new phenomena we had observed and experienced.
Round 2 Findings and Discussion: The Paradigmatic Nature of the Research Process
As Morgan (2007) has argued, even as the term paradigm has come to pervade social sciences research, it has been defined in varied and conflicting ways. Certainly, it seems accurate to define our “broad differences in . . . assumptions about the nature of knowledge and the appropriate ways of producing such knowledge” (Morgan, 2007, p. 52) as a matter of the differences among our individual paradigms as they inform our epistemological stances. At the same time, we agree with Morgan that it may be more helpful to consider paradigms as “shared beliefs within a community of researchers . . . about which questions are most meaningful and which procedures are most appropriate for answering those questions” (p. 53). We lean toward this latter view because, for the duration of this study, we became a small community of researchers and this work became a case study of our paradigmatic stances and inquiries.
At the very least, we would argue that the assumption that consensus can be achieved in the research process—in this case through intercoder reliability measures—seems to presume an overreliance on similar backgrounds and views among researchers—a phenomenon well supported by Kuhn’s (1962/1996) articulation of the paradigmatic nature of research communities. As we saw with Debbie’s initial approach to our study and our coding categories, the power and authority of prior literature in the field of diversity research had, in combination with her life experiences, informed her research paradigm, influencing her understandings as to questions, methodology, and interpretation of data. Adding complexity to our situation, we were experiencing two different versions of paradigmatic difference (Morgan, 2007): the epistemological level, as our beliefs and practices differed in the selection of research questions and methods and the ways in which our respective domains of knowledge influenced our interpretations and metaphysically, seen in Debbie and Sean’s disagreements over the nature of quantitative data and the data’s ability to represent reality. The question became, were these truly incommensurate, or did a possibility of shared ground exist?
More critically, as we began to explore the crafting of a new and different research article, we began to wonder whether and how we could communicate effectively with one another, as well as with our eventual readers. As the meta-analytic elements of our work swamped simple research reporting, we moved toward a narrative structure, and our reflexivity came to consist not only of locating and deconstructing “the intersections of author, other, text, and world” but also of “penetrating the representational exercise itself” (MacBeth, 2001, p. 35). Through our reflexive turn, we moved to privilege the story of our decision to “breach canonical conventions and expectations” (Bochner, 1997, p. 434), rather than to report our results in a way that would be more conventional for any or all of our respective fields.
This process also became another of the many administrative conflicts between fields in cross-disciplinary research. Debbie and David found the narrative approach natural and embraced it immediately, whereas Sean resisted using it for the interrogation and representation of methodology and for the philosophy of meaning he offered in the article. We were united in our focus on reflexive meaning, but spent pages of e-mails and several telephone conversations debating these issues. We have written a great deal individually, and large segments of those writings are preserved here. As the first author, Debbie functioned as a centralizing force, weaving our individual writings and our conversations into this work. Still, there are portions that represent authorial consensus, and others that highlight the differing meanings we brought to this inquiry and the understandings we returned to our individual realms.
Some of Sean’s voice is revealed in the insertions of more technical language where needed to accurately convey our justification for considering the possibility of significant difference before the start of analysis. Yet Sean and Debbie struggled over language use in relation to audience throughout our writing process; segments such as the following, while clearly appropriate for Sean’s usual quantitatively oriented audience, struck Debbie as potentially problematic for researchers more accustomed to working in a qualitative tradition:
We might even, we thought, be able to discuss our findings in terms of replicability and generalizability. From this perspective, the a priori construction was vital because it allowed us to confront the issue of significance with a prepared stance, rather than forcing logic on significance or insignificance after data analysis in infamous “hunts for significance.” Anticipating variance, we began to evaluate means for assessing intercoder reliability in light of our desire to accurately search for variance, rather than simply to hone in on an “acceptable” amount. (From a draft dated March 21, 2009)
The final section of the present article represents the synergy of Sean’s and Debbie’s joint methodological analysis, conveyed primarily through Debbie’s writing. Most notably, while several of our conclusions extend Debbie’s earlier research, that realization was actually Sean’s, so that his insights drove the overall trajectory of that section.
David’s authorial voice predominates through much of our philosophical analysis of incommensurability and precarity, although other philosophical elements of this work were the product of considerable negotiation. One particularly striking example of this arose rather late in our process, in our choice of terms to describe our overall methodological approach. Debbie, whose perspective on hermeneutics is Gadamerian, had originally regarded our method as primarily hermeneutic with occasional autoethnographic insertions. In response to a reviewer’s comment on the phenomenological aspects of our work, she moved toward Laverty’s (2003) model of hermeneutic phenomenology. David, drawing from his own purely philosophical background and not yet having seen Laverty, pointed out:
Since we don’t really have any phenomenological analyses in here or quote any phenomenologists or even deal with “mental content” in the way the phenomenologists do, I think we should drop the word entirely. . . . I don’t think our project is phenomenological in so far as it does not deal with the appearance of mental objects to consciousness. (E-mail communication, January 22, 2011)
Once we had all read Laverty, and after another round of discussions among the three of us, David’s suggestions for clarifying language were adopted in relation to our use of this term.
As an individual power struggle this is uninteresting, and the results of our negotiations appear without much contention throughout this article. As Sean and Debbie realized after one particularly intense “squabble” (to use Sean’s term), they were no longer certain who had actually written which individual portions, and both felt their views had been represented. It is also, of course, quite unusual to discuss this issue in a research report (Siltanen et al., 2008), but the assignment of power and the implications for results that power structures hold are particularly important in constructivist analysis.
Adding tension to our negotiations over meaning and expression was our multidisciplinarity. As Wear (1999) notes, in order for team-oriented self-reflexivity to have a positive effect on the research process, academics participating in interdisciplinary or cross-disciplinary research must be willing to question the primacy of their field and recognize that it is just one of many. This means that when conflicts in interpretation of data or mere differences in opinion impede the advancement of the research, academics from different fields must learn “how to separate the major from the minor debates, the major from the minor players, and the debatable from the given” (Wear, 1999, p. 299). In other words, when incommensurability impedes the progression of research, critical self-reflexivity must be taken up by the research team “through a comprehensive ‘give-and-take’ progression” (Pickett, Burch, & Grove, 1999, p. 302). Investigators must be willing to negotiate questions of difference, step outside their comfort zones (i.e., their specializations), and learn the language of other fields to facilitate discussion. Although at first glance this seems intuitive enough, in practice, as seen in our examples above, it can result in long and tortuous debates among coauthors, especially because the language has been unmoored from its disciplinary precision.
Philosophically speaking, the conversations and exchanges that lead to a common ground from which researchers can be said to speak as a collective without one voice dominating the others must be informed by an ethic of precarity. Ettlinger (2007) depicts precarity as “a condition of vulnerability relative to contingency and the inability to predict” (p. 319). Precarity is the state of mind that is produced when the speaking subject lacks control over “the surety to navigate social, political, economic, and cultural life through everyday discursive and material practices” (p. 325). In the context of research, the state of precarity materializes when individual researchers finally realize that “the whole (that is, the interdisciplinary team) should be greater than the sum of the individual parts” (Turner & Carpenter, 1999, p. 276); when it becomes clear that no particular member of the team has full control over the discursive practices of the group. In our case, the ethic of precarity permeated our entire project. It was there when we had to agree on seemingly inconsequential details of our research such as whether we should use first or last names or locutions such as “the first author” to designate authors. And it was also there when we sought to answer larger methodological questions regarding the implications of our study, such as whether our experience showed that quantitative metrics of analysis are “unfit” for qualitative study, or merely that they should be accompanied by critical self-reflection.
In these circumstances, because the individuality and autonomy of the researcher can be said to be somewhat lost in the group, the researcher is under deconstruction. The cross-disciplinary research process itself functions as a constant reminder of the limitations of the researcher’s ideology. What the speaking “I” believes is no longer the axiomatic norm by which “truth” and “value” are determined; rather, it becomes only a supplement to what the team believes. Through self-critique and reflection as well as the ethic of precarity, the self becomes displaced and decentralized. The researcher is thus deconstructed by the process of research itself in a way that aligns with Spivak’s definition of “deconstruction as that which helps people think against themselves” (as cited in Lather, 2009, p. 348).
Round 3: Reconstructing Method and Reclaiming the Mantle of Science
We begin this discussion by acknowledging that our emerging theory base is underdetermined by our existing evidence. We have continually been challenged by what David has characterized as an “overflow of meaning”—the very quality that is lost, as we had learned, when multifaceted qualitative variables are cataloged into a structural quantitative metric of analysis. For this reason, our presentation is dialogic rather than definitive; we seek to broaden the discussion of the questions we believe our experiences raise.
Our work embodies an interesting and troubling paradox: An interpretivist posture and our own arguments seem to reject the possibility or desirability of a true mixed methods research design, where quantitative techniques are introduced and used in combination with qualitative approaches. At the same time, the rigorous deployment of a particular quantitative instrument, Krippendorff’s alpha, yielded what we believe are interesting and suggestive data that contributed to our qualitative understandings, especially to our appreciation of reflexivity. Furthermore, it suggested new directions for analysis that might have gone unnoticed during a purely qualitative analysis.
Although our results with regard to diversity at the postsecondary level are far too limited to explicitly question results of prior studies, they do suggest that it is time to incorporate more exploration and more varied analyses of youth perspectives on racism—perhaps through expanded use of multigenerational, multicultural, and multidisciplinary research teams. Our range of interpretations might, for example, prompt us to question whether perhaps the findings of earlier diversity studies on college campuses more accurately reflected the ontologies of social justice–oriented, middle-aged researchers, or those of the students/subjects. Concretely, if we coded our initial data following Debbie’s interpretations, we emerged with a study that tended to echo and affirm the larger body of research showing that racism, coupled with denial of its existence, is a lingering issue on college campuses nationally (Chesler et al., 2005). Conversely, working from Sean’s perceptions, we might move to concepts other than racism to explain our data.
The addition of David’s perspective, combined with our embrace of dialogic process, resulted in a far more complex view. David sees racism as a category that is not necessarily antiquated, but that is overdeployed. Like Debbie, he expected race to be a significant factor in the surveys; however, unlike her, he believed that the label “racism” should be reserved for overtly offensive and discriminatory replies and not for responses that had a racialized element. For David, the attitudes revealed by our subjects’ responses were quite commonplace and mainstream, so—perhaps, admittedly, because of his own desensitization from overexposure—he has ceased to see them as “racism,” per se. This perspective, including David’s acknowledgment that it may be informed by experiences and sensitivities specific to his Latino heritage, suggests to Debbie that voter studies that have indicated the existence of so-called “Black exceptionalism”—Black/White relations as a category for considerations of racism that is distinct from Latino/White, Asian/White (Sears, Fu, Henry, & Bui, 2003; Sears & Savalei, 2006)—might provide a valuable adjunct to analyses of racial attitudes on postsecondary campuses. This is particularly relevant to our team analysis, given that Debbie’s perspective on these issues is grounded to a large degree in the works of African American scholars and her own professional and personal experiences with African American communities and individuals.
The range of interpretations of deceptively simple data also suggests to us that existing practice with regard to reflexivity even within the qualitative research community may be inadequate. Customarily, individual researcher lens(es) are discussed early in the research report as a part of the series of disclaimers and caveats that form the boundaries of any study. We argue that presentation and discussion of research findings and their implications should also include explicit consideration of the researcher(s) to be genuinely reflexive. Much as quantitative omnibus statistical tests function as a first step for further post hoc analysis, processes such as creating and revising codebooks and efforts to increase the intercoder reliability score should open discussion as to agreement/disagreement and its ontological basis among coders, rather than being limited to how to produce significant results through instrument adjustment. Our thought thus came to echo Schwandt’s (2005) analysis, which follows Oakley in proposing that “qualitative methods could do with more self-criticism about the mediation of their research findings by partial, researcher-driven perspectives by more caution, openness and accountability in relation to the findings claimed” (p. 291). We would extend this perspective to mixed methods approaches, and particularly to the quantitative aspects of those designs.
Cycles of Deconstruction and Reconstruction
Still, though, as much as we identified with these perspectives and initially thought they would bring us to closure, our work together was not quite finished. Returning the results of our inquiries together to where they had begun, we considered the construction of complementarity that had underlain our initial decisions. Greene (2005) argues that mixed methods research can seek “understanding that is woven from strands of particularity and generality, contextual complexity and patterned regularity, inside and outside perspectives . . . not so much convergence as insight” (p. 208). These insights, as Greene notes and we had found, can be highly generative, troubling simplistic answers and leading to further inquiry. However, this perspective is hardly unproblematic in today’s politicized research environment, even though it echoes decades of scientific thought.
Kuhn (1977) argued that science has progressed historically from the combination of what he called “divergent” and “convergent” thinking. Bateson (1941/2000) argued similarly for the importance of a variety of approaches, given that “the advances in scientific thought come from a combination of loose and strict thinking, and this combination is the most precious tool of science” (p. 86). However, Bateson’s ideas move beyond Kuhn’s historical view and seem to predict the generative possibilities we had experienced, as he advises that researchers consciously alternate between them, that they “accept and enjoy this dual nature of scientific thought and be willing to value the way in which the two processes work together to give us advances in understanding of the world” (p. 86). Critically for our emerging considerations, Bateson also expressed skepticism as to any possibility of rigor and/or relevance for either method when used in isolation from the other. The proposal for “analytic alternation,” in which quantitative research is accompanied by qualitative reporting as to its “in vivo” process seems to us to draw on these ideas as well (Maynard & Schaeffer, 2000; Sandelowski et al., 2009).
According to Dewey (1903/1977), too, robust inquiry naturally progresses recursively between what we now term the quantitative and qualitative realms: Although the generation and testing of scientific hypotheses to yield universal laws is the natural aim of science, defining these is not the end of the process—to end there ignores the vital question of “how it comes to do so, and what it does with the universal statements after they have been secured” (p. 9). This reflects Dewey’s deep engagement with what he termed particularity: the unique qualities and interactions of specific contexts. Concern with particularity, for Dewey (1929/1984), is necessarily combined with the need for abstractions that enable researchers to generalize across contexts. The danger is that these abstractions may be mistaken for actual knowledge, since “erected into complete statements of reality as such, they become hallucinatory obsessions” (Dewey, 1929/1984, p. 174).
In contrast to the complementarity and recursivity envisioned by Bateson and by Dewey, we would argue that current discussion of rigor in mixed methods approaches too often seeks to keep the respective ontologies and epistemologies segregated. For the most part, the assumption seems to be that the quantitative and qualitative data will be considered separately (Dellinger & Leech, 2007), so that the tendency is to defer to established norms for the respective spheres (Leech & Onwuegbuzie, 2009). Schwandt (2005), however, uses Dewey’s thought and that of Oakley to argue that these discussions of method artificially segment qualitative and quantitative approaches “as though these were inherently opposed, rather than simply being aspects of the way we all live in and make sense of this world” (p. 291).
What made our process particularly interesting, in this light, was the way in which we had experienced these modes of thought ourselves. We were researchers from different traditions and different fields, and actually alternated among our comfort zones—a process that also created substantial tension because at the time we did not necessarily see these modes as alternating; we often saw them as mutually competitive zones of thought. Prompted by Sean’s sense that this alternation might embody a Deweyan approach to evolutionary epistemology (Popp, 2007; Seltzer-Kelly, 2008), Sean and Debbie began to theorize the ways in which we had constructed and reconstructed not only our method(s) but also our research question(s) in light of our emerging data.
Traditionally, a method or combination of methods is selected as the appropriate tool to be used for the purposes of the specific inquiry, and then something of a firewall is erected; revisiting of the research question in response to developments that accrue through the method seems to be generally considered illegitimate among qualitative and quantitative research communities alike. This view is seen clearly in Eisenhart’s (2005) analysis of the need for both “variance” (also known as quantitative) and “process” (also known as qualitative) approaches to investigate causation: “They lead to different kinds of research questions and, in turn, to different research designs and methods” (pp. 245-246). Here, clearly, research questions determine design and method and precede data collection.
Howe (2005) has argued that researchers should follow Laudan and embrace a “‘naturalistic theory of methodology’ . . . an element just as subject to testing and revision as other hypotheses and theories” (p. 316). However, it is not clear that this would apply within a given inquiry; we believe that his advocacy still assumes the controlling view of research methodology, which sees the definition of the inquiry and the articulation of method as separate, sequentially distinct, and unidirectional. In contrast, we suggest here that it is more useful and more realistic to look at methods (and their associated methodologies) as tools that can be rebuilt within the research process to produce a multimodal research approach that evolves from research needs: a constructivist stance at the methodological level. From a Deweyan (1938/1986) perspective, these adaptations must respond to the actual conditions of the inquiry as they emerge, rather than relying on an algorithm planned out in advance, as in most mixed methods studies.
Our interpretivist perspective also accords with Dewey’s (1929/1984) rejection of the dualism of Western philosophy, where the knower is separated from that which is known. This Deweyan pragmatist sensitivity to context makes it useful to think of researchers as situated actors working within ontological and epistemological “environments,” as opposed to the positivist view of researchers as objective searchers of truth. It also connects to Suchman’s (1987) reformulation of idealized versions of action to more human (and more flawed) improvisations, so that “rather than attempting to abstract action away from its circumstances and represent it as a rational plan, the approach is to study how people use their circumstances to achieve intelligent action” (p. 50). Although Suchman focuses on research subjects, we extended this understanding, as did Dewey, to include the researchers.
As Law (2004) argues, attempts at neat segregation in social sciences research are largely a matter of deception anyway, more a matter of “covering up the traces” (p. 36) of what took place than of reflecting what actually has transpired. The danger of this approach is that it serves to “enact realities” (p. 38), while simultaneously concealing the politics that guided the approach and its reported results; it becomes part of “a series of mechanisms for avoiding the appearance and the experience of multiplicity; for expelling it into invisibility” (p. 66). This is deeply problematic, since it “makes it impossible to think about partial connections” (p. 66) and this suppression of fractionality inevitably obscures the political (in terms of the values, concerns, and goods involved) and appropriates a gloss of impartiality and objectivity while doing so. Thus, building from Denscombe (2008) as well, we have come to believe that the notion of “research paradigm” must accommodate inconsistency and variability. In fact, and perhaps more strongly, we argue here that coherence and consistency serve only to “enact realities” and “cover up the traces,” in Law’s sense, concealing deep and vital political issues.
The alternative embraces Law’s (2004) proposal that research method(ology) is, inherently and inevitably, a process of assemblage, incorporating all the discourses that surround “the fluidities, leakages and entanglements” (p. 41) that define research, including tacit knowledge, research skills and resources, and political agendas—the uncertainties and the unfolding nature of the process. The term assemblage becomes both a noun and a verb, signaling the “recursive self-assembling in which the elements put together are not fixed in shape, do not belong to a larger pre-given list but are constructed at least in part as they are entangled together” (p. 42).
As Denscombe (2008) argues the mixing of methods ought to be, this is a deeply pragmatic move: Elements are selected in relation to their instrumental value, rather than in response to an overarching principle, and retained or discarded accordingly. This has the additional advantage of resting firmly, in methodological terms, within a long-standing tradition that encompasses not only Dewey and Mead, as Denscombe points out, but also Bateson. We agree with Denscombe in his Kuhnian analysis that the reliance of communities of practice on this type of pragmatist approach is not a threat to the emergence of new kinds of knowledge and practices. Furthermore, our experience with the intentional introduction of multidisciplinarity in a research team seems to suggest that this can create a particularly fruitful disruption of researcher paradigm.
In ways that reflect the tangled assemblage we came to embody, we struggle to provide a neat conclusion. The degree of adaptability, precarity, vulnerability, reflexivity, and transparency that we now advocate may open more questions than a single journal article can contain. Our only firm advocacy is that this discussion is not only interesting but also vital—and that it must come about through a construction of rigor that is focused on transparency and reflexivity rather than on empty employment of methods grounded in a technicist version of scientism. Given all that is at stake, ranging from financial support for research into pressing social needs, to the question of what areas of study will be supported within higher education, we choose to “take the side of the messy” (Lather, 2009) in our presentation of this work, in the hope that it will contribute to further inquiry and continuing experimentation.
Authors’ Note
The original methodological self-study described in the section titled “Round I: A Methodological Self-Study of the Use of Krippendorff’s Alpha While Quantitizing” of this article was presented at the American Educational Research Association Annual Meeting and Conference in 2009.
Acknowledgements
The authors wish to thank all the participants in that session, as well as the editors and reviewers of this journal, for their insightful comments and helpful critique.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
