Abstract
This paper introduces models and techniques for synthesizing multiple qualitative studies on a topic. Qualitative research synthesis is a diverse set of methods for combining the data or the results of multiple studies on a topic to generate new knowledge, theory and applications. Use of qualitative research synthesis is rapidly expanding across disciplines. Aggregative and interpretive models of qualitative research synthesis are defined and distinguished. Several interpretive models are detailed. Their strengths are identified, and their limitations and areas of methodological ambiguity are critically examined. The steps of qualitative research synthesis are discussed and challenges specific to doing qualitative synthesis are identified and explored.
Synthesizing multiple qualitative research studies on a topic is a valuable way to extend knowledge and theory. Rigorously synthesizing qualitative research provides a method to address the “one off” problem of failing to build upon prior work. As early as 1959, sociologist C. Wright Mills (1959: 65) said: “there are never enough bricks and too few good synthesizers to search out the bricks and put the wall together.” Solesbury (2002: 92) pointed out that most qualitative research is new, primary research, yet on most topics there is a vast body of prior work that is ignored: “Social science is very bad at the cumulation and re-use of past research results.” Indeed, a qualitative synthesis completed by Pound et al. (2005) on how people actually “take” medicine demonstrated that many researchers had not located or had not cited prior relevant qualitative research reports on the same topic. Systematically locating, critically examining and synthesizing qualitative studies on a topic can expand theoretical knowledge and better support practice and policy (Ruggiano and Perry, 2019).
Qualitative research synthesis (QRS) has become much more common across disciplines in recent years, with over 500 reports in 2015 (Booth, 2016), up from roughly 65 in 2009 (Tong et al., 2012). Yet a search of Social Work Abstracts in March 2019 yielded just two publications, neither of which applied a named and well-described methodology. Nonetheless, social workers have begun to publish QRS studies and have introduced a QRS method (Aguirre and Bolton, 2014). QRS is more apparent in other professions and disciplines: using the search term “qualitative research synthesis,” the PsycINFO database revealed 206 QRS reports and the Web of Science database revealed 2921 reports.
Synthesizing qualitative research studies raises a range of epistemological, methodological and practical challenges. Qualitative research has many different purposes, draws on a wide range of epistemological premises, applies a vast array of data collection and data analysis methods, and is written up in varied formats. These challenges have led to the development of distinct approaches to QRS.
Aggregative and interpretive synthesis approaches
There are two broad camps of QRS methods: aggregative and interpretive. Aggregative methods seek to identify practice and policy applications from qualitative study data. They generally adopt realist or pragmatist epistemological frameworks, seeking to comprehensively summarize prior work in a descriptive manner. Data are not re-interpreted (Lockwood et al., 2015). Aggregative approaches to QRS often draw on quantitative or frequency-based analytic methods which are applied to the original data found in qualitative studies. They may address large or small bodies of prior work, though they may be most beneficial when applied to fairly large numbers of studies (Lockwood and Pearson, 2013).
In contrast, interpretive methods generally seek to enhance or elaborate prior conceptualization and theory. They tend to adopt interpretivist/constructivist epistemological frameworks (also called “idealist” in the European literature). Interpretivist methods seek to interpret and synthesize the “results” of prior qualitative work, rather than the original data per se. These methods may be applied to large or small bodies of prior work. Some interpretivist syntheses also critically analyze and problematize or deconstruct both prior work and research methods as well as synthesizing results. There may be notable differences in the focus, conclusions, and implications of qualitative syntheses completed using aggregative versus interpretive methods.
Yet, both approaches to QRS emphasize an insider perspective and may attend to differences in contexts and across human diversity. They both can offer information on processes of care, their effects, quality of life, personal identity issues, and service or health care disparities. Both approaches to QRS can generate and even test new knowledge not found or apparent in any single study (Popay et al., 2006).
Unfortunately, poor scholarship and overlapping, confusing terminology both make it harder to learn and to carry out strong syntheses (Frost et al., 2016). This paper seeks to clarify some key differences among QRS methods and to describe widely applied approaches to doing QRS. To begin, there are two main types of qualitative syntheses, each with a different purpose.
Aggregative syntheses
Aggregative syntheses focus on assembling, pooling, and summarizing descriptive qualitative data. The questions guiding aggregative syntheses are typically defined at the outset of the project (Dixon-Woods et al., 2006). Aggregative syntheses fit best where conceptualization is well developed and deemed adequate. Their focus is descriptive rather than seeking to develop or refine concepts and theory. Aggregative syntheses allow for the compilation or integration of original data across studies rather than for researcher (re)interpretations of results. Included studies may be solely qualitative or mixed method. Transparent description of data is a key concern. Popay and Roen (2003) argue that aggregative syntheses enhance data sources for use in evidence-based medicine and evidence-based practice. Lockwood et al. (2015) also state that aggregative syntheses are closest to the models used in quantitative evidence-based practice research.
Aggregative syntheses assume that access to the original raw qualitative data is complete or extensive. Notably, access to extensive raw data is often extremely limited in qualitative research publications where very brief data summaries are required by reviewers and page length limitations (Drisko, 1997, 2013). Aggregative syntheses may be very limited if solely based on the data available in published article-length reports. Defined standards for study quality may be used as inclusion or exclusion criteria, but included studies must always present clear, thorough, and credible data.
There are, however, recent sources of access to original qualitative research data sets. For example, the UK’s Qualidata (Corti and Backhouse, 2000), the German based Qualiservice (2016), and UK Data Service’s Qualibank (ERSC, Universities of Essex and Manchester, and Jisc, 2018) offer access to “nearly 1000” qualitative data sets (Bishop and Kuula-Luumi, 2017: 1). Growth in use of this resource is very recent, with most usage occurring after 2013 (Bishop and Kuula-Luumi, 2017). Similarly, Finland’s Aila qualitative research data archive has been in place since 2014 (Finnish Social Science Data Archive, 2016). Growth in qualitative research data archives comes at a moment when much qualitative data are “born digital” or collected in a manner that both aids transparency and allows for easy archiving (Corti and Fielding, 2016; Crow and Edwards, 2012). Access to raw data may also help to limit investigator bias (Fielding, 2004) and makes external audits possible.
Aggregative synthesis methods
Meta-aggregation is an aggregative QRS method which Lockwood et al. (2015, abstract) argue “is most transparently aligned with accepted conventions for the conduct of high-quality systematic reviews”… in which “the reviewer avoids re-interpretation of included studies, but instead accurately and reliably presents the findings of the included studies as intended by the original authors.” Meta-aggregation typically draws upon a pragmatic or realist epistemological perspective. Pearson et al. (2011) note that data from studies are categorized based on similarity of meaning using a standardized a priori protocol targeting the initial research question. These preliminary categories are then synthesized into higher order categories that aggregate meanings. Accurately summarizing the meaning found in the original authors’ reports is prioritized. The yield is “lines of action” to guide practitioners and policy makers (Hannes et al., 2017). Meta-aggregation was developed by the Joanna Briggs Institute, a leading Australian resource for evidence-based practice. The overall process intentionally parallels that of quantitative systematic reviews. Indeed, Munn et al. (2014) offer the ConQual method for grading QRS reports, heavily weighted to the meta-aggregation approach.
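The two-level grouping described above (findings assigned to preliminary categories, and categories combined into higher order categories) can be illustrated with a minimal sketch. It is not the Joanna Briggs Institute's procedure or software. The study labels, findings, categories, and the mapping between them are all hypothetical, and in practice the assignment of findings to categories is a judgment made by reviewers, not by code. The sketch only shows that findings are carried through verbatim rather than re-interpreted.

```python
from collections import defaultdict

# Hypothetical extracted findings: (study, finding as reported, preliminary category).
FINDINGS = [
    ("Study A", "Participants skipped doses to avoid side effects", "side-effect concerns"),
    ("Study B", "Fear of dependence led to reduced use", "side-effect concerns"),
    ("Study B", "Clear instructions from nurses supported routine use", "professional guidance"),
    ("Study C", "Family reminders helped participants keep to schedules", "family support"),
]

# Hypothetical a priori protocol: each preliminary category maps to one higher order category.
CATEGORY_TO_SYNTHESIS = {
    "side-effect concerns": "barriers to taking medicine",
    "professional guidance": "supports for taking medicine",
    "family support": "supports for taking medicine",
}


def aggregate(findings, category_to_synthesis):
    """Group findings by category, then group categories by higher order category.

    Findings are stored exactly as reported; nothing is re-worded or re-interpreted.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for study, finding, category in findings:
        higher = category_to_synthesis[category]
        grouped[higher][category].append((study, finding))
    return {higher: dict(categories) for higher, categories in grouped.items()}


synthesis = aggregate(FINDINGS, CATEGORY_TO_SYNTHESIS)
for higher, categories in synthesis.items():
    print(higher)
    for category, items in categories.items():
        print(f"  {category}: {len(items)} finding(s)")
```

Keeping the study label attached to each finding preserves an audit trail from every higher order category back to the original reports.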
Thomas and Harden (2008, abstract) describe “thematic synthesis” as drawing on thematic analysis, a method that is often used to analyze data in primary qualitative research. Thematic synthesis involves three stages: line-by-line coding of the primary documents, development of “descriptive themes” that remain close to the original data found in the primary studies, and finally the development of “analytic themes” in which the synthesists generate new explanations or hypotheses. For example, regarding healthy eating, descriptive themes might include “food preferences” and “perceived health benefits.” Analytic themes describe the relationships among descriptive themes. Food preferences and perceived health benefits might be aggregated as components of “chosen foods.” While Thomas and Harden (2008: 7) use the terminology “go beyond” to describe the development of the final analytic themes, they mean this in terms of aggregation and summarizing rather than reinterpreting or deconstructing the data found in the primary studies. Ultimately, barriers and facilitators of healthy eating were sought to guide education, practice, and policy.
Adaptations of basic content analysis have also been used to aggregate qualitative data (Evans, 2002). Qualitative coding of texts is completed and then analyzed using descriptive statistics. Cavanagh (1997) and Nandy and Sarvela (1997) both use a priori categories in QRS, then adapt basic content analysis methods using frequencies of occurrence to determine importance or impact. Quantitative methods are used to ensure inter-coder reliability.
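The frequency-based logic of these content analysis adaptations can be sketched in a few lines. The example below is illustrative only. The category labels and coded segments are hypothetical, and Cohen's kappa is used here as one common inter-coder reliability statistic, which is an assumption of this sketch rather than the statistic reported by the authors cited above.

```python
from collections import Counter


def category_frequencies(codes):
    """Count how often each a priori category was assigned."""
    return Counter(codes)


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who labeled the same segments in the same order."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of segments")
    n = len(coder_a)
    # Observed agreement: share of segments given the same code by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: based on each coder's marginal use of every category.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    categories = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes assigned by two coders to the same six text segments.
coder_a = ["barrier", "barrier", "facilitator", "barrier", "facilitator", "other"]
coder_b = ["barrier", "facilitator", "facilitator", "barrier", "facilitator", "other"]

frequencies = category_frequencies(coder_a)
kappa = cohens_kappa(coder_a, coder_b)
print(frequencies.most_common())
print(round(kappa, 3))
```

In this toy example the coders agree on five of six segments. Kappa is lower than the raw agreement rate because some agreement would be expected by chance alone.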
Two other aggregative QRS methods are Bayesian meta-analysis and Bayesian synthesis. Roberts et al.’s (2002) Bayesian meta-analysis employs qualitative data to identify quantitative variables and data that are included in the meta-analysis. Voils et al.’s (2009) Bayesian synthesis seeks to balance qualitative and quantitative data in research synthesis.
A key strength—and a simultaneous limitation—of aggregative synthesis is its realist focus on the findings or data found in qualitative studies rather than the researchers’ summarized results, which are viewed as potentially problematic interpretations. However, as noted above, the extent of “raw” data found in many qualitative reports is limited. Analysis based on the frequency of content is generally used to determine its importance. Such analysis may be limited by sampling and reporting limitations and may miss or omit useful interpretations. The methods and theories used in the publications in aggregative synthesis are not problematized. The focus is solely on research data or results and their practical applications.
Interpretive synthesis
In contrast to aggregative approaches are the interpretive QRS approaches. These approaches are generally constructivist or interpretivist in epistemology, allowing for multiple meanings and emphasizing the importance of culture and context. “Data” in qualitative research publications are viewed as “first-order interpretations.” First-order interpretations are the emic meanings made by participants who originally shared their experiences and views. Researchers report “second-order interpretations” or their understandings of the participants’ views filtered through the theories, goals, and purposes of the research project and the researchers’ cultures and personalities. Interpretive QRS centers on “third-order interpretations” or the understandings of the synthesists of the second-order interpretations of the publishing researchers. In this approach, reported interpretations rather than raw data are the materials combined in the synthesis.
Interpretive synthesists may also deconstruct or employ a Critical theory perspective regarding the adequacy, completeness, and utility of the reported (second-order) interpretations, theories, and methods they synthesize. This allows for interpretations that expand on the (first-order) “data” per se in order to point out the strengths and limitations of works included in the synthesis. The second-order interpretations, rather than the “data” found in published reports, are the materials to be synthesized in interpretive QRS.
Interpretive syntheses focus on the development of new concepts, theory, and perspectives. However, they do not necessarily start out with a clear and specific focus. “What works” is often too simple a starting point and presumes a well-developed conceptual and empirical base for the synthesis—which may not always be present. Analysis is inductive, iterative, and evolving, as is common in qualitative research. Further, Paterson (2012) proposes that the constructed and situated nature of diverse social realities and meanings poses a challenge for interpretive QRS in that it may reduce the richness, detail, and uniqueness of content and participants. On the other hand, interpretive QRS may make contextual and human diversity features more apparent across studies, making the implicit or contextual more explicit. Finally, study quality—the credibility, verisimilitude, and completeness of the second order interpretations—can vary widely across included studies. Assessing the quality of included studies is a challenge for interpretive QRS. Interpretive synthesists tend to accept as credible and meaningful the second-order views of the researchers whose work they analyze, at least for reports that meet their established inclusion or quality criteria. Yet, it is a fair critique that such inclusion of the original researchers’ interpretations may lead to further analysis of “thin” data and concepts, unless careful quality criteria are applied.
Key methods of interpretive synthesis
Noblit and Hare’s (1988) pioneering meta-ethnography is the first defined QRS method and one that remains widely used across disciplines (Campbell et al., 2011). Originally intended, as the name implies, to synthesize ethnographic studies, it introduced many key methods and procedures in QRS (Forte, 2010). Meta-ethnographic techniques are now widely used or adapted to synthesize qualitative studies using many different research methods, although authors of other methods frequently fail to mention or cite the prior work on which they draw (Barnett-Page and Thomas, 2009).
Noblit and Hare’s (1988) seven-step QRS approach parallels that now used in quantitative systematic reviews: (a) defining the topic or area, (b) setting parameters for the search and for study quality, (c) comprehensively searching the literature, (d) reviewing study quality, (e) extracting interpretations, (f) synthesizing the extracted material, and (g) writing up the study methods and results. Most QRS approaches apply these steps or some very similar methods, though the extent of the literature search and review process, and the analytic methods, vary considerably among named methods. (Note that Noblit and Hare’s work preceded the standardization of the current quantitative systematic review processes.)
Noblit and Hare’s (1988) other methodological innovation was the creation of several comparison and contrast analytic methods. They describe three distinct analytic methods: (1) reciprocal translation analysis, (2) refutational analysis, and (3) lines of argument synthesis. Reciprocal translation analysis involves looking for similarities and differences across the studies included in the QRS project. To begin, the study results must prove to be similar enough to allow “translation” and back-translation into each other’s concepts and terminology. Starting from the codes, concepts, or terminology of one well-done study, the synthesist explores whether or not other studies are reporting the same or very similar concepts in other words, with or without differences in nuance, setting or sample. The most useful and revealing approaches to coding and conceptual developments are kept and distinctions across the studies are identified. By contrast, refutational analysis centers on exploring the contradictions and differences across study concepts and terminology. It begins with studies whose interpretations do not allow reciprocal translation, but that are comparable, not incommensurate. Comparison and contrast allow for identification of what distinguishes or differentiates the yield and terminology of the several included studies. Refutational analysis may be used jointly with reciprocal translation analysis, allowing comparison and contrast where cross-translation is not possible or revealing. Where the results of studies appear to be incommensurate, lines of argument synthesis involves building up a multi-dimensional model of a whole from the divergent, incommensurate results of studies on the same topic (Noblit and Hare, 1988). Such differences and distinctions may be due to differences in purposes, epistemology, theory, interpretation, or settings. 
Creating multiple lines of argument provides researchers with a variety of approaches and ways of understanding the same area as viewed very differently across researchers and studies. Lines of argument synthesis may also produce an overarching description of a phenomenon across studies and settings. Thus, Noblit and Hare (1988) offered analytic models for similar or overlapping results and concepts found among studies; for comparison of contrasts and differences among studies; and for synthesis of divergent results, terminology, and concepts.
Grounded formal theory (GFT) (Kearney, 1998), not to be confused with Glaser’s (2006) formal grounded theory (GT), applies Glaser and Strauss’ (1967) iterative and constant comparative methods to QRS. GFT synthesizes findings from GT reports to create a more encompassing “formal” theory. It is intended to clarify concepts, processes, and the contextual elements surrounding them. The iterative steps of open, systematic/relational, and variational sampling, as well as use of open, axial (focused), and selective coding are followed as is done within a single GT study. This process yields categories, subcategories and dimensions, and core categories, but now across multiple studies on the chosen topic. GFT fits well with synthesizing GT studies as it applies consistent methods (Kearney, 1998, 2001). A small literature, however, may pose serious challenges in applying the GT steps fully. For instance, it is not immediately clear how a GFT synthesis would allow for challenging, clarifying, or adding complexity if the available literature is small and not much varied. There might be little room for variational sampling and selective coding to further challenge and clarify prior (open and axial) coding, which is their main function within individual GT studies. Nonetheless, it may be possible to achieve saturation of codes and concepts within the terms of the GFT synthesis and available literature. Further, the role of selective coding to test open and axial codes may be problematic if there is not a large and varied literature on which to test out the fit and utility of the codes developed in GFT. GFT seeks to stay close to, or grounded in, the reported results from the original studies.
Meta-study (Zhao, 1991) is a three-part QRS method that aims for scope and comprehensiveness. The first part, Meta-data analysis, like meta-ethnography, is interpretive and seeks similarities and differences across study results. The next part, Meta-method, entails critical appraisal of epistemologies and methodologies used across the included studies. The third and final part, Meta-theory, centers on review of applied theories and contexts (Paterson et al., 2001). Together, the three parts allow for development of new theory, for deeper consideration of prior theory, or for a new overarching (more widely applicable) theory, all while keeping an insider perspective. The three component parts are synthesized and reported, jointly yielding the complete meta-study. Just how such synthesis is done, however, is not fully transparent in the meta-study methodological literature. The comprehensiveness of this model is its key strength. Analysis in each part, synthesis of interpreted data, and review of theory and method appear to draw heavily on the analytic methods of meta-ethnography. Comparisons using reciprocal translation and refutational analysis, or methods very similar to them, are applied to each of the three aspects of each included study. It is not fully clear how incommensurate data, epistemologies, theories, and methods are addressed in meta-studies, although the model would be useful in identifying them.
Critical interpretive synthesis (CIS) (Dixon-Woods et al., 2006) involves both synthesis and, as proves relevant, a fundamental (Critical) critique of included studies, including questioning “taken-for-granted” assumptions. Calling Noblit and Hare’s reciprocal translation analysis limiting and reductionist for large (>50 publications) and diverse literatures, and noting that refutational analyses appear in few publications, the authors argue for “a critical and reflexive approach to the literature, including consideration of contradictions and flaws in evidence and theory” (p. 6). Just how large literatures pose analytic challenges in meta-ethnographies is not fully explicated. The volume of material will be challenging but does not inherently conflict with meta-ethnographic methods. The clear strength of CIS is its reflective and critical perspective. Again, it is not clear why such a perspective could not be part of meta-ethnography or meta-study, but an analysis using a Critical theory lens is not explicitly part of either of the other methods.
Like the meta-study, CIS allows for review of findings, epistemology, methods, and theories found in the literature to be synthesized. CIS allows for a wide-ranging and comprehensive synthesis of prior qualitative work on a topic, and for commentary on its strengths, limitations, and omissions. A key purpose of CIS is to include in the synthesis consideration of, and reflection/reflexivity on, the credibility and quality of the evidence. This may include Critical judgments about how each concept builds the synthesizing argument, while staying rooted in appropriate critique of existing evidence and concepts. Dixon-Woods et al. (2006: 5) argue that CIS enhances meta-ethnography’s lines of argument synthesis by “offering more insightful, formalised, and generalisable ways of understanding a phenomenon.” They argue that in CIS, a synthesizing argument can be produced by “detailed analysis of the evidence included in a review, analogous to the analysis undertaken in primary qualitative research” (p. 5). Such analysis may generate “synthetic constructs” built by transforming the study evidence into new conceptual forms. Synthetic constructs are based upon, or grounded in, study evidence, yet built from an interpretation of all the included evidence, allowing several disparate aspects of a phenomenon to be unified in new, useful, and explanatory ways. While synthetic constructs are a key part of CIS, they appear analogous to the possible yield of reciprocal translation analysis, refutational analysis or lines of argument synthesis. Lines of argument synthesis appears to allow for Critical interpretations and identifications of omissions in prior qualitative studies, although the use of a Critical theory lens is not specifically included or addressed by Noblit and Hare (1988). Similarly, GFT and meta-study could also allow for Critical interpretations in the synthesis.
There is one specific social work QRS method, Aguirre and Bolton’s (2014: 283) “qualitative interpretive metasynthesis” (QIMS). This method seeks to apply a synergistic, “akin to” person-in-environment, perspective to QRS, although this seems potentially to be a core aspect of all holistic interpretive QRS methods so long as the researchers choose to apply it. Aguirre and Bolton outline the steps of QIMS following Noblit and Hare’s seven steps, and argue for including multiple “traditions” that cross disciplinary and epistemological differences. (“Traditions” is a term used by Jacob (1987) referring to named approaches to qualitative research and their disciplinary and epistemological differences.) “Theme extraction” applying Noblit and Hare’s meta-ethnographic methods is central to the synthesis proper. QIMS, however, does not directly employ Noblit and Hare’s methods. QIMS themes instead include concepts, metaphors and terms. These are synthesized with attention to Patton’s (2002) types of triangulation: methods, tradition, sources, and analysts. QIMS adds a focus on different analysts to the three domains addressed in the meta-study model. As an alternative to the QIMS approach, applying Denzin’s (1970) four types of triangulation (methodological triangulation [different data collection methods], data triangulation [sources, times, settings], investigator triangulation, and theory triangulation) might be more comprehensive and preferable. Yet, it is unclear if triangulation is lacking in other named QRS methods, or if data of each type will always be present in a given set of reports available for synthesis. It is also unclear if QRS methods that focus on specific disciplinary concerns significantly add to knowledge on synthesis methods in general. QIMS focuses on a holistic perspective but other approaches may also allow or encourage such a perspective.
There are several additional interpretive QRS methods: Meta-synthesis, thematic synthesis, textual narrative synthesis, realist synthesis, framework synthesis, ecological triangulation, qualitative cross-case analysis, and meta-narrative (Mohammed et al., 2016). These methods are briefly or “simply” detailed by their creators and those who apply them (Barnett-Page and Thomas, 2009). However, published synthesis reports applying these additional methods do not make fully clear how their analyses are completed.
Strengths and limitations of QRS methods: Comparison and contrast
Identifying the synthesis topic or question
Noblit and Hare’s (1988) seven steps outlining qualitative synthesis appear to be (more or less) adopted across all aggregative and interpretive synthesis approaches. First, having a clear research question or topic area is crucial to a successful qualitative synthesis, although the area may be broadly defined (e.g. taking medicines or coping with domestic violence). Paterson (2012) adds that defining likely goals or outcomes, and intended audience, will guide the QRS project. The focal question may be refined or may evolve in interpretive syntheses. Further, the composition and views of the research team (realist or idealist in epistemology; experienced or novice qualitative researchers; broad experience with multiple qualitative methods; and/or single discipline versus multidisciplinary) may also guide the choice of aggregative versus interpretive QRS projects and team membership.
Searching the qualitative literature
Second, the scope and completeness of the literature search, and the amount and quality of available literature on a topic, shape and limit the potential yield of a QRS study. Smaller literatures on the chosen topic may lack variety in samples, methods, and theories applied in the original qualitative studies. This may limit translation and refutation across studies regardless of synthesis method. The use of GFT may be severely limited with very small literatures, as the set of iterative stages central to this process may not be possible to complete with a small sample. Smaller literatures may also yield failed syntheses if included studies are too alike or too different (Noblit and Hare, 1988). Larger literatures are difficult to manage but do not appear to pose inherent limitations for any QRS method. However, larger literatures might yield both commensurate studies suitable for reciprocal translation or refutational analysis and studies that are incommensurate and require some type of lines of argument synthesis. Applying a Critical and reflexive emphasis, as is made explicit in CIS and meta-study, may also require specific types of data and/or published results to which these methods can be applied.
The scope of the review (i.e. whether a narrow or a broader, comprehensive search will be conducted) is initially determined by how well the topic has been researched and by the availability of primary studies on it. Generally, if there are few apparent publications of interest in the area and/or it is a recent area of interest, a more comprehensive and exhaustive search approach is likely to be advantageous. More comprehensive searches are likely to result in larger but manageable numbers of topically or conceptually relevant studies. These become the initial QRS “sample.”
The challenges of searching for qualitative reports
Comprehensive searching of the qualitative literature may be very difficult regardless of QRS method. In contrast to quantitative studies, which often include defined keywords or are distinguished in online databases by research design type (e.g. experiment or quasi-experiment), few such qualitative terms are widely employed. For example, there are very few available PubMed MeSH search terms to filter searches on this extensive database. Bates et al. (2017) found in a systematic review that web search engines were not as effective as bibliographic databases in aiding the location of academic publications. While they did not specify whether they sought a particular research methodology (i.e. qualitative or quantitative) or specific content, they found that scholars cannot be certain that Boolean logic will be correctly and fully applied in web search engines. This is especially so when the order of entry and/or content of groupings is altered. Further, Bates et al. point out that web search engines lack truncation functions, especially for missing words, and often do not reveal abstracts of publications as do database searches. Overall, bibliographic databases provided better sensitivity (the percentage of all relevant materials that a search retrieves) than did web search engines, but both provided limited precision (the percentage of retrieved materials that are relevant). They also found that using multiple search engines and databases is likely to improve location of relevant results.
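Sensitivity and precision are simple ratios, and a small worked example may make the distinction concrete. The sketch below uses the standard information retrieval definitions: sensitivity is the share of all relevant reports that a search retrieves, and precision is the share of retrieved reports that are relevant. The report identifiers and search results are hypothetical and are not drawn from Bates et al. (2017).

```python
def sensitivity(retrieved, relevant):
    """Share of all known relevant reports that the search retrieved."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("at least one relevant report is needed")
    return len(set(retrieved) & relevant) / len(relevant)


def precision(retrieved, relevant):
    """Share of retrieved reports that are actually relevant."""
    retrieved = set(retrieved)
    if not retrieved:
        raise ValueError("the search retrieved nothing")
    return len(retrieved & set(relevant)) / len(retrieved)


# Hypothetical gold standard: five reports known to be relevant to the topic.
relevant = {"r1", "r2", "r3", "r4", "r5"}

# Hypothetical results from one bibliographic database and one web search engine.
database_hits = {"r1", "r2", "r3", "r4", "x1", "x2", "x3", "x4"}
web_hits = {"r1", "r5", "x1", "x5", "x6", "x7", "x8", "x9", "x10", "x11"}

db_sensitivity = sensitivity(database_hits, relevant)
db_precision = precision(database_hits, relevant)
web_sensitivity = sensitivity(web_hits, relevant)
web_precision = precision(web_hits, relevant)

# Combining sources: the union of both result sets.
combined_sensitivity = sensitivity(database_hits | web_hits, relevant)

print(db_sensitivity, db_precision)
print(web_sensitivity, web_precision)
print(combined_sensitivity)
```

In this toy case the combined search retrieves every relevant report even though neither source does so alone, which illustrates why searching multiple sources can improve sensitivity. It does so at the cost of more irrelevant records to screen.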
Specific to qualitative research in social work, McGinn et al. (2016), examining the topic of intimate partner violence, found that 92% of materials were found by six databases, and that no additional materials were found using web searches. They also found that databases performed inconsistently across case studies, affirming the use of a range of databases when searching for qualitative research publications.
To aid QRS, Cooke et al. (2012) created the SPIDER search approach. SPIDER addresses Sample, Phenomenon of Interest, Design, Evaluation, and Research type, but is nonetheless limited by the lack of methodological search terms for qualitative studies. Booth (2016) encourages structured and transparent search and review methods (based on Cochrane Collaboration and ESQUIRE standards and objectives). Booth further recommends that the selected databases and search terms/filtering strategies employed be fully and transparently documented, which parallels the PRISMA (2009) standards for reporting quantitative systematic reviews and meta-analyses. Booth notes that “pearl growing” techniques (i.e. following frequent citations in the reference lists of included articles) are valuable supplemental strategies to electronic searches. Indeed, not all qualitative reports on a topic will be located by electronic database searches. For example, Pound et al. (2005) completed a meta-ethnography on asthma and medications to treat it. They found 43 relevant reports, 21 of which were not located by electronic search. A combination of electronic and pearl growing techniques appears optimal.
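To illustrate how the SPIDER elements might be turned into a documented, reproducible search string, the following minimal sketch assembles a Boolean query from lists of synonyms. The search terms are hypothetical, and the way the five elements are grouped with AND and OR is an assumption of this sketch rather than a rule taken from the sources above. Truncation symbols and phrase syntax also differ across databases, so any generated string would need to be adapted to the database being searched.

```python
def or_group(terms):
    """Join synonyms with OR, quoting multi-word phrases."""
    if not terms:
        raise ValueError("each SPIDER element needs at least one term")
    quoted = [f'"{term}"' if " " in term else term for term in terms]
    return "(" + " OR ".join(quoted) + ")"


def spider_query(sample, phenomenon, design, evaluation, research_type):
    """Build one Boolean search string from the five SPIDER elements.

    Assumed grouping: (S AND PI) AND ((D OR E) AND R).
    """
    s = or_group(sample)
    pi = or_group(phenomenon)
    d = or_group(design)
    e = or_group(evaluation)
    r = or_group(research_type)
    return f"({s} AND {pi}) AND (({d} OR {e}) AND {r})"


# Hypothetical terms for a synthesis on taking asthma medicines.
query = spider_query(
    sample=["adult*"],
    phenomenon=["asthma", "medicine taking"],
    design=["interview*", "focus group"],
    evaluation=["experience*", "view*"],
    research_type=["qualitative"],
)
print(query)
```

Storing the term lists alongside the generated string is one way to meet the recommendation that search terms and filtering strategies be fully documented.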
For aggregative syntheses, the availability of trustworthy and extensive data is a key requirement. Ample data from original publications or online qualitative data sources are necessary for a quality aggregative synthesis.
Approaches to sampling the located literature differ widely. Some authors argue that very large literatures (e.g. medication compliance, falls by elders) can make sampling a feasibility issue (Gough, 2007). Where literatures are large, several QRS scholars argue for a purposive search of the literature with the goal of establishing a holistic interpretation of the topic under study (Dixon-Woods et al., 2006; Ring et al., 2011; Suri, 2011). Others argue that smaller located literatures should be exhaustively searched. Researchers must also bear in mind that what seems relevant or necessary may iteratively evolve during the literature search and/or the analysis. Thus, sampling the literature may risk failing to locate studies that might be beneficial to the QRS before analysis begins. Rigorous scholarship appears to require exhaustive searches in QRS projects.
Ethical concerns
There is little discussion of ethical issues in the QRS literature. The assumption appears to be that the use of secondary data, most often already in the public domain, limits ethical challenges. However, Ruggiano and Perry (2019) note that re-use of existing data sets, as is done in aggregative synthesis, must be performed in the context of maintaining the privacy and confidentiality of human research participants. Researchers must ensure that the original data were collected with appropriate and documented ethical protections in place (such as institutional IRB or tribal IRB approval). These obligations hold for all researchers undertaking QRS projects.
Study quality appraisal
Appraising the quality of the located literature is an under-developed area in the QRS literature. Inclusion criteria initially center on relevance, scope or breadth, as well as methodological quality. Once studies are located, some authors argue that “weak” studies should be excluded and offer detailed appraisal checklists (Campbell et al., 2003, 2011; Saini, 2007/2012). Other authors argue against study quality appraisal, noting that “surface mistakes” may lead to discarding ultimately important and revealing studies (Sandelowski and Barosso, 2007). Dixon-Woods et al. (2006: 4) emphasize study relevance over “type” or design and methodological standards, arguing for signal (relevance) over noise (the inverse of methodological quality) to maximize conceptual variety. In practice, Dixon-Woods et al. (2007) found that 50% or more of published QRS reports did not use quality criteria. Aggregative syntheses need both high quality and ample data to be completed meaningfully. More conceptual and empirical work on this topic is needed.
Despite the minimal state of the study quality literature, checklists of study quality are available. For example, the UK’s Critical Appraisal Skills Programme (CASP, 2018) offers a 10-point checklist that organizes assessment of study quality. The topics are broad and specific elements are rarely addressed. Purposefully, no scoring system is offered. The Joanna Briggs Institute (2017) offers the QARI Critical Appraisal Checklist for Qualitative Research, a comprehensive outline of study elements with a scoring system. A follow-up section also elaborates on each scored element. Other quality standards for qualitative research, not based on a checklist approach, may also be used to guide study quality assessment.
Reading, re-reading, extracting data, and synthesizing results
Given a well-defined topic of merit and worth, synthesists must complete extensive reading and re-reading of included studies to familiarize themselves with study content in detail (Drisko, 1997, 2013). This step begins with coding to determine what data or results to extract and compare across studies. Sandelowski and Barosso (2007) suggest the use of a spreadsheet to organize such documentation, a step that overlaps with analysis in aggregative syntheses. Qualitative data analysis software (ATLAS.ti, EPPI-Reviewer, etc.) can also be used for this purpose. Familiarization with the included studies should also help reveal differences in participants, time periods, settings, cultures, and other aspects of human diversity that may differentiate results. The key purpose of re-reading is to begin to identify similarities, differences, incongruities, and absences among the included studies that guide the synthesis. Repeated, in-depth, iterative review of the studies may also suggest the need for changes to the literature search or to the inclusion and exclusion criteria. In this way, the QRS process may be iterative, like much qualitative research.
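One way to picture the spreadsheet-style documentation described above is the following minimal sketch, which stores one extracted result per row and then tabulates which studies report, or do not report, a given theme. The column names, helper functions, and all study content are hypothetical and serve only to illustrate the layout; they are not a format prescribed by any QRS method.

```python
import csv
import io

# Assumed columns for an extraction sheet; a real project would define its own.
FIELDS = ["study", "setting", "participants", "theme", "supporting_excerpt"]


def write_extraction_sheet(rows):
    """Serialize extraction rows as CSV text that any spreadsheet can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def theme_by_study(rows):
    """Map each theme to the sorted list of studies in which it was coded."""
    matrix = {}
    for row in rows:
        matrix.setdefault(row["theme"], set()).add(row["study"])
    return {theme: sorted(studies) for theme, studies in matrix.items()}


def absent_from(rows, theme):
    """List included studies in which the given theme was not coded."""
    all_studies = {row["study"] for row in rows}
    reporting = {row["study"] for row in rows if row["theme"] == theme}
    return sorted(all_studies - reporting)


# Hypothetical extraction rows, invented for illustration only.
rows = [
    {"study": "Study A", "setting": "urban clinic", "participants": "adults",
     "theme": "cost of medication", "supporting_excerpt": "excerpt 1"},
    {"study": "Study A", "setting": "urban clinic", "participants": "adults",
     "theme": "fear of side effects", "supporting_excerpt": "excerpt 2"},
    {"study": "Study B", "setting": "rural practice", "participants": "older adults",
     "theme": "fear of side effects", "supporting_excerpt": "excerpt 3"},
    {"study": "Study C", "setting": "community sample", "participants": "parents",
     "theme": "cost of medication", "supporting_excerpt": "excerpt 4"},
]

sheet = write_extraction_sheet(rows)
matrix = theme_by_study(rows)
missing = absent_from(rows, "cost of medication")
print(matrix)
print(missing)
```

Keeping the setting and participant columns beside each theme makes it easier to ask whether an absence reflects a real difference among study populations or simply a topic the original researchers did not pursue.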
Across QRS methods, concepts, themes, or categories are initially identified. Their “grounding” in specific data or published statements should be clearly documented. Next, the dimensions of the concepts or themes, as relevant, and the links among them are established. Multiple, often iterative, coding reviews lead to overarching themes or mid-level theories. How this is done is typically stated in general terms given the variety of materials that may be appropriate to a given QRS project. Immersion in study materials may help clarify omitted topics and factors rendered invisible in prior studies. The use of peer reviewers is also recommended. Some QRS methods, such as GFT, apply steps to challenge their initial results and concepts using iterative sampling changes (i.e. selective, variational). That is, efforts are made to identify gaps in what is documented and to locate materials that illustrate exceptions to what is most often reported. For example, is income a factor in pill taking when poor families may have to choose between buying food and medication?
Notably, all methods of interpretive QRS appear to employ aspects of Noblit and Hare’s (1988) three core meta-ethnography analytic methods. Comparison and contrast, and/or the development of a set of lines of argument where literatures are incommensurate, are central interpretive QRS methods. Meta-study purposefully includes three domains (data, theory, and methods) in its comparisons and contrasts. CIS adds a wide-ranging critical analysis which may involve critique of omissions and the application of new reflexive perspectives to study results. Across these several methods, how holism is established is often minimally addressed or is left implicit. Application of some qualitative research methods, such as GFT’s use of variational sampling and selective coding, may require large and diverse sets of publications on the synthesis topic. Ensuring saturation or information required for triangulation would also seem to require large and diverse sets of publications, as well as varied results within the literature. QRS methods often invoke widely used and possibly applicable analytic methods, but too often do not explain their use in detail nor address their potential limitations in synthesis use.
The details of doing QRS could be much more fully stated. A common complaint about QRS publications is that they fail to fully describe the analytic choices made and the rationales for these decisions (Booth, 2006; France et al., 2019). Another important concern is that efforts to locate and incorporate potentially challenging views and results are not a routine part of QRS analyses.
Writing up QRS
Once the analysis is completed, the QRS results must be written up in a clear but thorough manner. QRS reports vary from article- to monograph-length publications. Summarizing QRS results in 20 pages may be difficult and appears to require condensing or omitting aspects of the analysis. This challenge may be one reason QRS analytic methods are often poorly detailed in article length reports.
Flow charts and matrices can be useful ways to convey the overall yield of a QRS project (Verdinelli and Scagnoli, 2013). Matrices are most commonly published, with network maps and taxonomies frequent as well. Visual diagrams also offer a holistic view of the included study results.
Several guides for reporting QRS projects have been developed and are similar to those used for quantitative systematic reviews (PRISMA, 2009). Booth (2006) recommends the standardized reporting of QRS literature search components regardless of research approach. His STARLITE is a mnemonic for addressing Sampling strategies, Type of study, research Approach, Range of years, Limits, Inclusion and exclusions, search Terms used, and Electronic sources examined. The goal is transparency and clarity in reporting literature searches. France et al.’s (2019) eMERGe project seeks to clarify reporting of meta-ethnographies. The 19 standards proposed emphasize transparency in the reporting of the analytic steps in meta-ethnography. In many respects, the eMERGe project argues for clear descriptions of methods, and discussion of analytic choices—what analytic decisions were made and how they shaped, and may have limited, the reported results. The project also argues for identification and discussion of alternative interpretations developed during the synthesis. Of course, one consequence of such clarity is to lengthen QRS reports.
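As an illustration of how the STARLITE elements might be tracked while drafting a report, the sketch below holds one text entry per element and flags any element left undocumented. The class name `StarliteRecord`, its field names, and the sample entries are hypothetical conveniences; Booth (2006) proposes a reporting mnemonic, not a data format.

```python
from dataclasses import dataclass, fields


@dataclass
class StarliteRecord:
    """One free-text entry per STARLITE element (field names are assumed)."""
    sampling_strategy: str = ""
    type_of_study: str = ""
    approaches: str = ""
    range_of_years: str = ""
    limits: str = ""
    inclusion_and_exclusions: str = ""
    terms_used: str = ""
    electronic_sources: str = ""

    def missing_elements(self):
        """Return the names of elements that have not yet been documented."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Hypothetical, partially completed search report, for illustration only.
record = StarliteRecord(
    sampling_strategy="purposive",
    type_of_study="qualitative studies of any design",
    range_of_years="2000-2018",
    terms_used="example search terms",
    electronic_sources="two bibliographic databases",
)

# The field initials spell out the mnemonic, which guards against omissions.
mnemonic = "".join(f.name[0] for f in fields(StarliteRecord)).upper()
missing = record.missing_elements()
print(mnemonic, missing)
```

A check of this kind only confirms that each element has been addressed; whether the description is adequate remains a matter for authors and reviewers.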
Summary
QRS is a useful way to combine the data or results of multiple qualitative reports to enhance their usefulness and impact. There are two key approaches: aggregative and interpretive. Aggregative approaches employ methods similar to content analysis, while interpretive methods all appear to employ some version of Noblit and Hare’s (1988) compare and contrast synthesis methods. Other interpretive methods add critical analyses of included reports or intentionally address the results, methods, and conceptual frames applied in the included publications. QRS projects need to be reported in detail, yet published reports often fail to fully describe their analytic processes. Nonetheless, QRS can be a valuable research methodology to expand professional knowledge and to guide both policy and practice.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
