Abstract
In response to this special issue, concerned with methods and measurements, a comprehensive review of the last 5 years of qualitative research was conducted in the top five journals that primarily publish articles pertaining to the hospitality industry. A total of 197 articles were read and analyzed for this review with a focus on the state of trustworthiness in the contemporary hospitality literature. An outline of the methods, techniques, and successes is presented in this review, as are recommendations for scholars, journal editors, journal reviewers, and our partners in industry who use qualitative data for many purposes, including but not limited to employee satisfaction surveys, market focus groups, and employee exit interviews. In addition, the relatively novel and nascent ideas regarding empirical rigor, such as transparency and replicability, are introduced to the hospitality field.
Keywords
Qualitative research focuses on exploring and investigating the world by studying human behaviors. There are two identified features of those behaviors: (a) They are always influenced by the environment and (b) they always go beyond what has been observed (Schmid, 1981). With a consideration of those features, Kirk and Miller (1986) developed a working definition of qualitative research as “a particular tradition in social science that fundamentally depends on watching people in their own territory and interacting with them in their own language, on their own terms” (p. 9). Because the value of qualitative results relies heavily on an investigator’s observation and judgment of a specific cultural and geographical setting, evaluating the veracity of the inferences made becomes challenging.
To evaluate the quality of research, we first acknowledge our own social paradigm and corresponding desirability for trustworthiness in qualitative research. Our ontological and epistemological perspectives align with a postpositivist view of the world. Being critical realists, our belief is that there is a “truth” in existence, but through human observation, we can only approximate that truth (Cook & Campbell, 1979). Furthermore, our modified dualist stance assumes that objectivity remains an unachievable ideal, and that findings are subject to falsification (Lincoln & Guba, 1985). Morrow (2005) suggested the postpositivist social paradigm accounts for the majority of assumptions made in social science research. Researchers can thus conduct “a rigorous and systematic procedure that requires multiple researchers to examine the data and come to a complete consensus about their meaning” and to “reduce[s] bias inherent to just a single researcher analyzing data and employ[s] a series of internal audits to ensure the quality of data analysis and fidelity of the findings” (Tu et al., 2019, p. 90). Given our postpositivist ontological and epistemological perspectives, we see value in Lincoln and Guba’s (1985) framework for qualitative research, in that, at a broad level, it should be credible, transferable, dependable, and confirmable to be trustworthy.
In the contemporary literature, the utilization of qualitative research and naturalistic inquiry has become more commonplace in various areas of study such as sociology, psychology, occupational therapy, information and communication, and health services; however, in-depth discussions regarding the establishment of its scientific rigor have been relatively neglected (Forero et al., 2018; Krefting, 1991; Nowell et al., 2017; Shenton, 2004). Quantitative research has a long tradition of being evaluated by its reliability and validity, yet those evaluative criteria are not necessarily a good fit for qualitative research (Agar, 1986).
However, it is still vitally important to make sound inferences in qualitative studies. Although some scholars may disagree with our particular evaluative criteria for qualitative research based on their own ontological and epistemological perspectives or specific methodologies, there should still be an interest in enhancing the scholarly conversation around topics and methods that are important to individual researchers and that inform theory in hospitality research.
For results of studies to be of value and interest to the field, they must provide readers with information that can be built upon, an assertion made by Aguinis and Solarino (2019), who highlighted this as a problem in contemporary qualitative research. If qualitative studies are deficient in being able to produce “quasi-replication” results, defined as “robustness of prior studies to different empirical approaches” or the “generalizability of prior studies’ results to new contexts” (Ethiraj et al., 2016, p. 2191), which Aguinis and Solarino (2019) argue are the most important type of results for business scholars, we need to improve the level of rigor currently represented in the field. In fact, a majority of published qualitative papers have been found to be unable to provide results that allow even for conceptual replication (Aguinis & Solarino, 2019), making theory building and further empirical probing problematic. Our postpositivist paradigm meshes well with the trustworthiness framework established by Lincoln and Guba (1985), which the studies we reviewed cite throughout as a basis for their methodological decisions; we therefore adopt it as the basis for our inquiry. We seek to accomplish two aims with this study, the first of which is to understand the current state of trustworthiness in hospitality research. Our second aim is to identify ways authors, reviewers, and editors can work together to improve the level of trustworthiness in the field.
Literature Review
The idea of the “trustworthiness of inquiry” as well as the four criteria of credibility, transferability, dependability, and confirmability developed by Lincoln and Guba (1985) have been widely accepted by most scholars when creating or evaluating the rigorousness of qualitative research. As described by Lincoln and Guba, trustworthiness is related to a consideration of why the findings of a particular qualitative study are “worth paying attention to” (p. 290). Before trying to persuade the readers and audiences that qualitative results are “worth taking account for,” researchers are encouraged to ask questions regarding what the four aspects of trustworthiness establish in terms of methodological rigor. Like all scientific inquiry, qualitative research, as well as quantitative research, must establish the following four constructs to be considered rigorous: truth value, applicability, consistency, and neutrality, which will be discussed next.
Truth value refers to the confidence in the truth of the findings of a particular inquiry (Lincoln & Guba, 1985), and corresponds with the establishment of a study’s credibility. One notable thing here is that the confidence of the truth is established based on the particular research design, informants, and context (Krefting, 1991). In qualitative research, truth value should be subject oriented and generated from observing human behaviors, not defined a priori by the researcher (Sandelowski, 1986), or as Creswell (1998) suggested, the mark of quality in qualitative research is verification. Creswell went on to list several ways that an author can produce rigorous qualitative work: prolonged engagement, persistent observation, triangulation, peer review or debriefing, negative case analysis, clarification of research bias (reflexivity), member checking, rich-thick description, and external audits, all of which are techniques highlighted by Lincoln and Guba (1985). Indeed, Creswell agrees that using the trustworthiness framework is a sound framework to ensure a study is sufficiently rigorous.
The concept of transferability helps to establish the applicability construct, which refers to research that is able to apply findings to people other than the researchers themselves or those specific individuals who participated in the study, as expressed by Hammarberg et al. (2016):
Applicability, or transferability of the research findings, is the criterion for evaluating external validity. A study is considered to meet the criterion of applicability when its findings can fit into contexts outside the study situation and when clinicians and researchers view the findings as meaningful and applicable in their own experiences. (p. 500)
In quantitative research, sampling from an increasingly larger pool can help to establish the generalizability of the findings by ensuring the representativeness of the population of interest (Thompson, 2012). In qualitative research, a larger sample size does not help to establish applicability, because larger samples can dilute an individual’s voice and create a problem of breadth over depth in the research (Hammarberg et al., 2016). In addition, a larger sample size in qualitative research can hamper the ability to analyze the data adequately, creating problems in applying the findings made. Rather than an increasingly large or representative sample size as typically favored with quantitative methods, scholars should instead focus on data saturation that allows theories, themes, and other ideas to be inferred through the analytic process (Charmaz, 2005). Data saturation refers not to when the sample is statistically representative of the population, but encourages naturalistic inquiry to stop when informants stop providing new information and it is reasonable that further data collection will offer no new insights into the phenomenon (Glaser & Strauss, 1967). Sample sizes in qualitative research are thus linked more to the phenomenon of interest and not the particulars of a population, which is why the samples tend to be much smaller than in quantitative studies, and differ wildly from qualitative study to qualitative study.
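The stopping rule implied by data saturation can be illustrated with a minimal sketch. It assumes each interview has already been coded into a set of theme labels, and it treats "informants stop providing new information" as a fixed number of consecutive interviews that add no new codes. That threshold (the hypothetical `patience` parameter), the function name, and the sample codes are our illustrative assumptions, not part of Glaser and Strauss's (1967) formulation.

```python
from typing import Iterable, Optional, Set


def saturation_point(coded_interviews: Iterable[Set[str]], patience: int = 2) -> Optional[int]:
    """Return the 1-based number of the interview at which collection could stop.

    Saturation is approximated (an assumption) as `patience` consecutive
    interviews that contribute no code that has not been seen before.
    Returns None if that condition is never met.
    """
    seen: Set[str] = set()
    quiet_run = 0  # consecutive interviews that added no new codes
    for number, codes in enumerate(coded_interviews, start=1):
        new_codes = codes - seen
        seen |= codes
        quiet_run = 0 if new_codes else quiet_run + 1
        if quiet_run >= patience:
            return number
    return None


# Hypothetical coded interviews, for illustration only.
interviews = [
    {"workload", "tips"},
    {"tips", "scheduling"},
    {"workload"},
    {"scheduling", "tips"},
    {"turnover"},
]
stop_at = saturation_point(interviews, patience=2)
print(stop_at)  # 4: interviews 3 and 4 added nothing new
```

In practice the judgment that further data collection will offer no new insights is interpretive, so a count like this can at most support, not replace, the researcher's reasoning.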
Consistency or dependability of the results (reliability in quantitative research) refers to the idea that another scholar would find similar results (Lincoln & Guba, 1985). It is important to note, and Hammarberg et al. (2016) stressed, that consistency does not mean that other researchers would find the same results replicating a study, but that similar patterns would emerge between the two studies. To help establish consistency in qualitative inquiries, researchers often seek maximum variation in the experience of a phenomenon, not only to illuminate it but also to discourage fulfilment of limited researcher expectations (for example, negative cases or instances that do not fit the emerging interpretation or theory should be actively sought and explored). (Hammarberg et al., 2016, p. 500)
Furthermore, Morse and Richards (2002) suggest this is often the stage of a study where researchers will seek to have their findings verified by another team member, or it is also plausible to verify findings with an independent person who can critically review the initial findings.
Finally, neutrality is a concept that suggests a level of objectivity or the ability for results to be confirmed (Lincoln & Guba, 1985) and corresponds with the confirmability of the study. Given the subjective nature of naturalistic inquiry (Schmid, 1981), it is important for scholars to acknowledge their own background and position and how that might influence the framing of the study, data collection, analysis, interaction with informants, and any other step of the research process (Malterud, 2001). Such a concept is referred to as reflexivity. Malterud (2001) went on to state that “Preconceptions are not the same as bias, unless the researcher fails to mention them” (p. 484), because without acknowledging a researcher’s own lens in which the world is viewed, the results of the study could be skewed toward the worldview of the researcher. As such Lincoln and Guba’s (1985) trustworthiness framework suggested several techniques that can help the researcher remain neutral to help establish confirmability in the results.
Trustworthiness in naturalistic inquiries is a parallel to rigor in conventional inquiries, achieving the same ends, but in different ways (Lincoln & Guba, 1985). As a summation, conventional inquiries establish truth value through internal validity, applicability through external validity, consistency through reliability, and neutrality through objectivity. As a point of distinction, the requirements of naturalistic inquiry are different and, therefore, they establish the same constructs through different means: truth value through credibility, applicability through transferability, consistency through dependability, and neutrality through confirmability. Table 1 highlights how qualitative researchers can use trustworthiness to help improve the veracity of the inferences made in their research and how that parallels the efforts made by conventional researchers. The next section describes each of the four tenets of trustworthiness in greater depth and detail.
Table 1. Comparing Rigor Across Research Methods.
Trustworthiness Criteria
The first criterion of trustworthiness is credibility. To be specific, it deals with the question of “how congruent are the findings with reality” (Merriam, 2009, p. 213) and it is one of the ways to establish trustworthiness. Lincoln and Guba (1985) went on to suggest five techniques for ensuring credibility as outlined in Table 2. In addition to those basic five techniques, aimed at advancing credibility in qualitative research, previous scholars also tried to add more criteria according to their own research experiences, as listed in Table 3.
Table 2. Lincoln and Guba’s (1985) Techniques of Establishing Credibility.
Table 3. Additional Techniques of Establishing Credibility.
The second aspect of trustworthiness is transferability, which asks about the degree to which the findings from a particular qualitative study can be applied to other contexts or settings, or with other subjects (Lincoln & Guba, 1985). Unlike in quantitative research, the hypotheses of a qualitative study can only be set “with a description of the time and context in which they were found to hold”; therefore, naturalists cannot simply specify the external validity of an inquiry, but what they can provide is a thick description (Lincoln & Guba, 1985, p. 316). In qualitative research, naturalists only make “transferability judgments possible” for their audiences or future scholars (p. 316).
Dependability is the strategy that can be used to ensure that a study is replicable and sufficient to inform future studies. It is the technique that responds to the third aspect of trustworthiness, consistency. In quantitative research, reliability is the criterion concerned with the stability, consistency, and equivalence in the study (Sandelowski, 1986). In quantitative work, researchers are able to check reliability by employing techniques to show whether the work can be repeated and similar results can be generated by using the same methods and the same participants in the same context; however, the changing nature of the phenomena studied by qualitative researchers makes this impossible in naturalistic inquiry (Marshall & Rossman, 1999). Guba (1981) suggested the term auditable, which means that, given a well-structured and rich description, other researchers can clearly understand and follow the decision trail used by the investigator.
The last aspect of trustworthiness in qualitative research is confirmability, which represents whether the research is free from bias; in other words, the findings are a function solely of the informants and conditions of the research and not of other biases, motivations, and perspectives (Guba, 1981; Sandelowski, 1986). Lincoln and Guba (1985) suggested the major technique here is a confirmability audit, and its operationalization came from the work of Halpern (1983), who specified the components and process of the audit. Halpern identified six audit trail categories: (a) raw data (field notes, video and audio recordings), (b) data reduction and analysis products (quantitative summaries, condensed notes, working hypotheses), (c) data reconstruction and synthesis products (thematic categories, interpretations, inferences), (d) process notes (procedures and design strategies, trustworthiness notes), (e) materials related to intentions and dispositions (study proposal, field journal), and (f) instrument development information (pilot forms, survey format, schedules). These various trustworthiness criteria are used to accomplish the two aims of this inquiry: to understand the current state of trustworthiness in the field and to make recommendations to improve the future of the scholarly conversation.
Method
To determine the current state of qualitative research in the extant literature, a systematic review was conducted of the top five journals that publish primarily hospitality research (Scimago Institutions Rankings, 2019). Although other journals may provide excellent contributions to the field, we had to focus on a select number of journals to prevent the scope of the work from becoming unwieldy for our discussion of methodological rigor. Although five is an arbitrary cutoff, we believe that studies older than 5 years may not necessarily reflect the “current state” of the literature; therefore, we reviewed what we felt was a reasonable time frame to discuss the topic in a comprehensive and intelligible way. During the review process, every article from each journal was read. If the paper did not employ qualitative methods, it was disregarded. If the paper did use qualitative methods, we proceeded to review the methods, results, and discussion sections for elements of trustworthiness. We excluded systematic reviews to focus only on research that collected firsthand data that the authors in turn analyzed. We also disregarded research that used qualitative methods only to develop scales as instruments, because when conducting a scale-development study, validity and reliability are the goals, not trustworthiness.
A total of 197 articles published in the last 5 years were reviewed. Among those five journals, International Journal of Contemporary Hospitality Management (IJCHM) had published the most qualitative papers in hospitality (N = 86), followed by International Journal of Hospitality Management (IJHM; N = 69; Table 4). We also observed an ascending trend in the number of qualitative papers being published in hospitality journals. In terms of method, the largest share of the studies employed thematic analysis (22.84%), followed by content analysis (15.74%) and case study (15.23%). Fourteen studies employed grounded theory (7.11%), whereas few used methods such as critical incident technique (N = 8), ethnography (N = 6), and phenomenology (N = 5). A number of articles (N = 43) did not report the method they used.
Table 4. Number of Articles Published by Journal.
Note. IJHM = International Journal of Hospitality Management; CQ = Cornell Hospitality Quarterly; IJCHM = International Journal of Contemporary Hospitality Management; JHMM = Journal of Hospitality Marketing and Management; JHTR = Journal of Hospitality and Tourism Research.
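The shares reported for the 197 reviewed articles can be cross-checked with simple arithmetic. The sketch below converts the counts stated in the text into percentages and, in the other direction, converts stated percentages into the whole-number counts they imply. The function names are ours, and the implied counts are inferred by rounding; they are not figures reported in the review.

```python
TOTAL_ARTICLES = 197  # number of articles reviewed, as stated in the text


def share(count: int, total: int = TOTAL_ARTICLES) -> float:
    """Percentage of the reviewed articles, rounded to two decimals."""
    return round(100 * count / total, 2)


def implied_count(percent: float, total: int = TOTAL_ARTICLES) -> int:
    """Nearest whole number of articles implied by a reported percentage."""
    return round(percent * total / 100)


# Counts stated in the text, converted to percentages.
grounded_theory = share(14)  # the text reports 7.11%
method_missing = share(43)   # 43 articles did not report a method

# Percentages stated in the text, converted to inferred counts.
thematic = implied_count(22.84)
content = implied_count(15.74)
case_study = implied_count(15.23)

print(grounded_theory, method_missing)  # 7.11 21.83
print(thematic, content, case_study)    # 45 31 30
```

Read this way, thematic analysis accounts for roughly 45 of the 197 articles, which makes it the single most common method but well short of half of the sample.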
The sampling method was analyzed, and the results indicated that three sampling methods were frequently chosen by hospitality researchers. Purposive sampling was the most popular, adopted by a majority of the published studies (61.42%). An additional 7.11% of the studies used a combination of purposive and snowball sampling. Convenience sampling was employed by 27.41% of the studies, one study used polar sampling, and two studies used random sampling, whereas the rest did not report their sampling method (4.06%). As for data collection, most of the studies had interviews as part of their data collection process (see Table 5 for details). Thirty-two studies collected data using third-party sources, and 18 studies used three sources for data collection (such as a study using direct observation, semistructured interviews, and publicly available third-party data). A small number of studies (N = 8) used focus groups as their only means of data collection. The rest of the studies did not report their data collection process (5.58%). Sample size varied from 2 to 234, with an average of 36. About 72% of the studies reported their coding procedure, whereas the rest failed to provide that information. In terms of participants, our results indicated that most studies recruited hospitality workers (N = 130), followed by guests (N = 68) and others (N = 28). Some studies sampled both guests and employees.
Table 5. Data Collection Methods.
Furthermore, the trustworthiness of these studies was evaluated following the criteria suggested by Lincoln and Guba (1985). To evaluate credibility, seven techniques were considered (i.e., prolonged engagement, persistent observation, triangulation, peer debriefing, negative case analysis, referential adequacy, and member checking). Our analysis indicated that the majority of the studies employed at least one of the seven techniques to establish credibility (N = 167). Thirty studies did not include any of the seven techniques in their studies. More than 60% of the studies conducted research using either prolonged engagement (67.51%) or triangulation (65.48%; see Figure 1 for details). In addition, 13 studies employed five techniques to establish credibility, whereas two studies used six out of seven techniques.

Figure 1. Techniques for Establishing Credibility.
In terms of transferability, Lincoln and Guba (1985) suggested the use of thick description as a technique to achieve external validity. Thick description is about describing a phenomenon in sufficient detail that readers can evaluate the extent to which the conclusions are transferable to other times or settings (Lincoln & Guba, 1985). In this research, we examined the number of studies that used thick description. Our results indicated that 76.65% of the studies incorporated this technique in their research. Similarly, dependability was evaluated using the inquiry audit technique, which refers to the process of having inquiry auditors review the inquiry processes to be certain that they fall within the norms of “good professional practice” (Lincoln & Guba, 1985). Results showed that only 14.21% of the studies employed an inquiry audit to establish dependability. A majority of the research either did not adopt any technique to achieve dependability or failed to report it. Finally, four criteria were used to evaluate confirmability: confirmability audit, audit trail, triangulation, and reflexivity. Results indicated that audit trail was the most common technique used by hospitality researchers to establish confirmability (79.70%), followed by triangulation (62.44%) and reflexivity (48.22%). Confirmability audit was used to a lesser extent (14.21%; see Figure 2). In addition, 28 studies did not contain any of the confirmability elements. The rest of the 197 studies employed at least one technique, and 12 studies used all four elements to establish confirmability.

Figure 2. Techniques for Establishing Confirmability.
Our review of prior literature showed that a single study rarely addresses all elements of trustworthiness sufficiently, and a few studies failed to communicate any of them. However, some studies from our retrieved hospitality literature have done an exemplary job of establishing one or multiple elements of trustworthiness.
Examples of Trustworthiness in the Hospitality Literature
In terms of credibility, although prolonged engagement and triangulation are the two most commonly employed techniques, negative case analysis has received the least attention. For instance, in Aslan’s (2016) study, the author conducted 47 semistructured interviews to examine sexual intimacy between male hotel workers and female tourists in hotels in Marmaris, Turkey. The author communicated a high level of prolonged engagement with the research context and persistent observation through performing 47 interviews with 49 interviewees all by himself during an extended time span of 4 years, spending adequate time speaking with interviewees to develop relationships (as both a researcher and a coworker) and rapport with them, and conducting about 390 hr of participant observation in two hotels. Triangulation was achieved in this study by obtaining multiple data sources (interviews, participant observation) to understand the phenomenon of interest and to safeguard the richness and robustness of the findings. For member checking, the author shared the research findings with his interviewees from the last four interviews to seek their feedback on the findings and to avoid selective representation of findings. All of them agreed that the findings were plausible. In another study by Presenza and Petruzzelli (2019), the authors adopted an in-depth case study approach to understand the role and behavior of an Italian chef–entrepreneur Romito. The authors tracked the progress of the behavioral management of chef Romito from 2000 to 2018. During this long period of analysis, one of the authors met chef Romito in several informal meetings as well as formal interviews, to explore the entrepreneur’s personality, character and expectations, the history, the situation, and various factors influencing his innovation. They also paid 24 direct visits on site. All of these practices conveyed a high level of prolonged engagement and persistent observation.
In addition, the authors collected data from multiple sources for triangulation, including information from direct observations; from local, national, and international press; from scientific journals and several specialized sources (e.g., restaurant magazine and official websites for chefs and foodies); from the personal accounts (five books, website, and social media profiles) of chef Romito; from discussions and observations of students and teachers of the Niko Romito Academy; and from 16 interviews with Niko Romito that lasted a total of 171 min. When analyzing the data, the authors first discussed their data interpretation to develop a preliminary understanding and then performed a further series of iterations between their data and the literature.
Transferability is another important element when establishing trustworthiness in qualitative research. A number of exemplary studies have clearly communicated the transferability and dependability of their research and findings. In Binder et al.’s (2016) study, the authors performed semistructured in-person interviews with owners or general managers of 12 small and medium enterprises in the Viennese hotel sector, aiming to identify how different forms of organizational innovativeness may result in different innovation outcomes. In recognition of the significance of addressing the transparency of data collection and data analysis, the authors provided a detailed account of the hotels analyzed based on theoretical sampling, and further described in sufficient detail both the research method and the analysis methods (i.e., the structural qualitative content analysis and detailed structure analysis). As another example, Khoo-Lattimore et al. (2016) compared the food perceptions of Taiwanese and Malaysian Chinese consumers using a projective technique, where respondents were instructed to collect images and then participate in a 90-min in-depth interview. The authors presented a thick description of the unique research technique they employed (a psychoanalytical tool called the Zaltman metaphor elicitation technique), respondents’ recruitment, a nine-step data collection method, as well as data analysis of the transcribed interview data. Smith et al. (2017) established transferability by revisiting their coded results in relation to the previously referenced family purchase behavior literature.
Dependability is often achieved by involving external researchers during the research process. The goal is to safeguard the accuracy of the findings and interpretations. For instance, Binder et al. (2016) invited three independent individuals to analyze the data and verify the findings. All three individuals had no prior exposure to the interview materials and no knowledge of the businesses under investigation, thus fostering an independent and unbiased approach. Similarly, Putra and Cho (2019) involved three doctoral students who were independent of their study and were well trained in qualitative methodologies. These students were invited to check and refine the data analysis as well as the identified categories in addition to the two authors. Practices such as these were recognized to allow an outsider to challenge the findings and to enhance the rigor of a qualitative research study.
As discussed above, confirmability can be achieved by applying four techniques, namely, a confirmability audit, audit trail, triangulation, and reflexivity. Several studies were identified as having utilized multiple techniques to establish a high level of confirmability of their research process and findings. For example, Smith et al. (2017) aimed to examine the dynamics of a shared decision-making process of couples when choosing a luxury resort. To observe the process in real time, the authors employed both observations and video recordings of the decision-making process (triangulation of multiple data sources). During the coding, the assigned codes were independently audited by a third researcher. During the debriefing of the analysis results, all (except for three) participating couples indicated that the quasi-experiment they were in was either “pretty close” or “very close” to their actual decision-making process, offering evidence for the confirmability of the results, that is, the extent to which the findings were shaped by the participants rather than by researcher bias, motivation, or interest. For all the steps, from the design and execution of the experiment to the analysis and the reporting of the findings, the authors provided a transparent description, further evidencing the confirmability of their findings. Another good case in point is Alford and Duan (2018), who conducted an in-depth case study by rigorously compiling data from semistructured interviews and internal destination management organization (DMO) documents (triangulation of data sources) to identify the key factors influencing collaborative innovation in a DMO. In recognition of potential bias related to the interviewer’s role as well as a need to address reflexivity and subjectivity, the authors followed a process of bracketing as introduced by Jones et al. (2013), during which they proactively took into consideration the possible biases and had the study’s coresearcher challenge any assumptions that might have been made based on overfamiliarity with the case (researcher triangulation). These practices helped safeguard a degree of neutrality, or the confirmability, of the study findings.
Conclusion and Future Directions
Although it was not the focus of this review, it is important to note that while gathering data for this article, we did not encounter a review of the literature as it related to qualitative methodological rigor within the hospitality field. In fact, most studies that discuss qualitative methods do so as a way to inform the reader regarding research concepts or to discuss improvements to specific methodologies (see Creswell, 1998; Hammarberg et al., 2016; Morrow, 2005; Sandelowski, 1986), rather than reviewing the current ways in which scholars are employing various research techniques to improve rigor. As such, given the absence of a systematic review of hospitality journals, we not only provide a snapshot of the current state of the most recent research (2014–2019) but also identify the relative strengths and weaknesses that we as a field have demonstrated. Therefore, our contribution to the scholarly conversation is to provide a guide not only for authors to write better papers, or graduate students to learn more about their craft, but also for reviewers who may be subject matter experts but are not familiar with naturalistic inquiry to be better able to consider qualitative papers on their merits and contributions.
Suggestions for Authors, Reviewers, and Editors
Establishing trustworthiness
The present research offers significant theoretical implications as well as practical implications. First of all, findings of this research offer guidelines for journal reviewers and editors on what criteria should be used to help improve the quality of naturalistic inquiry. Of the 197 papers we report on here, 141 (71.6%) reported how their data were coded, leaving more than a quarter of published studies without any information on how the data were coded. Also, 43 studies (21.8%) did not report the qualitative method employed. The pattern is not unique to hospitality: Aguinis and Solarino (2019) commented on a similar pattern in the strategic management literature and lamented the lack of transparency in published qualitative studies. Methodological flexibility in qualitative research is unavoidable because each naturalistic inquiry is a unique process, which requires the researcher to craft his or her own method or make changes in accordance with the circumstances (Tuval-Mashiach, 2017). As such, there has been an emphasis on tolerance of methodological ambiguity as an inescapable, even desired, component of qualitative research (Morrow, 2005). However, an opposing view has emerged in recent years that calls for defining criteria for the evaluation of the rigor of qualitative research and for more systematization in the research process (Guba & Lincoln, 2005; Tracy, 2010). The need for transparency has become increasingly important to qualitative research (Hiles & Čermák, 2007) as it enables readers not only to learn about the trustworthiness of a study but also to replicate it or adopt the study’s methods in their own future research (Tuval-Mashiach, 2017).
To improve transparency in qualitative research, the present research suggests that reviewers and editors should start with the four factors discussed in this article (i.e., credibility, transferability, dependability, and confirmability), and use them as a checklist to determine whether additional measures are needed to establish trustworthiness.
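As a rough illustration of how such a checklist might be applied, the following Python sketch flags which of the four criteria have no reported technique in a manuscript. The function name `missing_criteria`, the dictionary layout, and the example manuscript are all hypothetical; this is a sketch under those assumptions, not a validated review instrument.

```python
from typing import Dict, List

# The four trustworthiness criteria discussed in this article
# (Lincoln & Guba, 1985).
CRITERIA = ("credibility", "transferability", "dependability", "confirmability")


def missing_criteria(reported: Dict[str, List[str]]) -> List[str]:
    """Return the criteria for which a manuscript reports no technique at all."""
    return [c for c in CRITERIA if not reported.get(c)]


# Hypothetical manuscript: the technique entries are illustrative only.
manuscript = {
    "credibility": ["triangulation", "member checks"],
    "confirmability": ["audit trail"],
    "dependability": [],
}

print(missing_criteria(manuscript))  # ['transferability', 'dependability']
```

A reviewer could treat any criterion returned here as a prompt to ask the authors what additional measures were taken, rather than as an automatic reason for rejection.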
Within our review, only 28 studies or 14.2% of papers discussed some type of external audit to help establish trustworthiness. By subjecting the raw data, methods, coding books, and coding procedures to an external audit, the authors also help to establish transparency as well as dependability and confirmability. By collecting and analyzing data as a team, then allowing a person who was not a part of that process to audit both the process and product of the research, the authors can significantly strengthen the veracity of their findings (Creswell, 1998; Lincoln & Guba, 1985). Beyond being a technique that helps improve the trustworthiness of the article, by enhancing the study’s rigor, external audits help to foster accuracy in results, and provide an outsider the platform to challenge the process and findings of the authors (Creswell, 1998; Lincoln & Guba, 1985), while also delivering benefits to the authors such as help summarizing preliminary findings, giving the authors an opportunity to assess the data’s adequacy, and gathering important feedback that assists in articulating final results and the development of stronger theory. Given the extent and amount of work an external auditor would conduct, we suggest these people may be included in a research effort by being granted authorship and be included in institutional review board approval for a study. With institutional review board approval, they can have access to potentially sensitive raw data. To truly be an external auditor, however, they would not have played a role in the study until the study’s main authors were ready to have their processes and initial findings audited.
Consent and confidentiality issues
Authors of qualitative studies may have unique institutional review board concerns that quantitative researchers do not. Naturalistic inquiry collects exceptionally rich and sometimes particularly sensitive data that may be more difficult to share with others than a spreadsheet of numbers with the identifying information scrubbed off. As such, full transparency with the data in the way Aguinis and Solarino (2019) suggest may be difficult for authors of qualitative studies. In addition, depending on the study, informants may not consent to participating if their responses and stories are going to be shared raw and unedited with people they do not know or trust. We suggest that authors conduct external audits with people who have institutional review board approval to be a part of the study and consequently are up-to-date on all ethical training and research guidelines from their home institutions. By sharing the study with such an auditor, the authors are taking a step toward both replication and transparency, while also increasing the trustworthiness of the study, and are able to do so without creating an additional burden through the institutional review board approval process or in obtaining informed consent.
In some other cases, sharing our qualitative data may actually be an easier feat given that 32 (16.2%) studies collected only publicly available third-party data. With the advent of websites such as TripAdvisor and Yelp, conducting “netnographies” (qualitative research using publicly available online posts) is becoming possible as a research technique. Interestingly, only six of those 32 papers (roughly the same proportion as among studies that collected data by speaking directly to informants) conducted some form of external audit on their data and methods. Although consent to share information from interviewees or focus group participants may be difficult to obtain because of ethical considerations or privacy concerns, when the data are publicly available, these concerns are alleviated and authors have an opportunity to be more transparent with their research process.
Special concerns for studies with third-party data
One of the main considerations for authors who collect third-party data, and consequently the reviewers who are charged with evaluating their manuscripts, is the richness of the data collected. Reviewing thousands of posts on an online forum may sound good to the quantitative researcher’s ear. However, as Hammarberg et al. (2016), Charmaz (2005), and Glaser and Strauss (1967) all suggest, collecting data from more informants not only fails to enhance the quality of qualitative research but can actually be detrimental to the inquiry. This is because authors may not be able to probe the phenomenon of interest deeply enough, may have trouble analyzing data from so many different sources, and, by collecting beyond the data saturation point, would be analyzing redundant information. Collecting data that cannot be probed by the researcher and where no reflexivity can be added to the data collection process can inhibit the richness of the study and lead to trustworthiness issues. Rather, playing a defined role in the way the researcher interacts with informants can help to elicit more detailed and rich data while also helping the interviewer keep any predispositions in check, or as Lothane (2011) suggests, conducting dramatological interviews will create better inferences than relying on the narrative alone. Thus, we suggest data collected without any interaction or direct observation between the researcher and the informants (i.e., prolonged engagement and persistent observation) be viewed with skepticism and that reviewers of such studies demand the diligent application of trustworthiness and transparency.
Finally, studies that collect qualitative data only from third-party sources should be viewed as able to support only exceptionally narrow inferences, which could be appropriate for some topics such as evaluating the first impressions of a newly opened restaurant and seeing real-time responses to travel disruptions caused by a crisis (e.g., how travelers coped with flying during the COVID-19 pandemic).
Beyond the lack of reflexivity with data collected from third-party sources, establishing credibility may also be difficult. When reading online reviews and using software such as NVivo to analyze the results, having prolonged engagement or persistent observation of the phenomenon may be difficult to establish. Furthermore, if all data are collected through a single third-party source, triangulating the data, conducting member checks, and engaging in peer debriefings may also be impossible. Therefore, the only way to help to establish credibility for studies that collect data exclusively through third-party sources would be to establish some type of referential adequacy and to conduct negative case analysis. These two techniques were rather uncommon in the identified literature of the previous 5 years, with only 35 studies (17.8%) using referential adequacy and only six studies (3%) conducting negative case analysis, making the latter the most infrequently used technique overall. Not only should these be the preferred methods to establish credibility for studies with this type of data, but reviewers should demand to see how the authors engaged with these techniques.
In fact, when data are collected from an independent third-party source, using referential adequacy and negative case analysis might be easier than when collecting data from interviews. Once data can be mined from a site such as TripAdvisor or Yelp, an almost endless supply of posts is available to researchers. If authors were to set aside a certain amount of data to be analyzed after initial findings were established, they could more easily use the referential adequacy technique to establish credibility than if a research team were conducting interviews and set aside every fifth interview to be analyzed after the fact. The speed and ease with which third-party data are collected would make it easier for researchers to reach the data saturation point and still reserve cases to be analyzed later; in fact, this might be one of the biggest advantages of collecting data from third-party sources. Likewise, conducting negative case analyses might also be easier because all the posts are available to authors, who do not have to seek out and identify negative cases, a task that may be extremely difficult for researchers conducting interviews. For example, when looking at posts of restaurant reviews, a research team would have both positive and negative reviews, as well as reviews of other establishments in the area, with which to conduct negative case analyses, whereas a team conducting interviews may find it more difficult to locate people who are not part of a group or who have an experience counter to what is being studied. Again, another strength of collecting third-party data is easy access to so many negative cases.
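The set-aside procedure described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names `reserve_holdout` and `unanticipated_codes`, the 20% reserve, the fixed random seed, and the placeholder posts and codes are all hypothetical choices for illustration, not a procedure reported by any of the reviewed studies.

```python
import random
from typing import List, Sequence, Set, Tuple


def reserve_holdout(
    posts: Sequence[str], holdout_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Split posts into a working set and a reserved set for later analysis."""
    if len(posts) < 2:
        raise ValueError("need at least two posts to reserve any")
    if not 0 < holdout_fraction < 1:
        raise ValueError("holdout_fraction must be between 0 and 1")
    shuffled = list(posts)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_reserved = min(len(shuffled) - 1, max(1, round(len(shuffled) * holdout_fraction)))
    return shuffled[n_reserved:], shuffled[:n_reserved]


def unanticipated_codes(
    initial_themes: Set[str], reserved_codes: Sequence[Set[str]]
) -> Set[str]:
    """Return codes found in the reserved posts that the initial themes lack."""
    return set().union(*reserved_codes) - initial_themes


# Hypothetical data: ten placeholder posts and hand-assigned codes.
posts = [f"post {i}" for i in range(10)]
working, reserved = reserve_holdout(posts, holdout_fraction=0.2, seed=42)
print(len(working), len(reserved))  # 8 2

initial_themes = {"service", "price"}
reserved_codes = [{"service"}, {"noise", "price"}]
print(unanticipated_codes(initial_themes, reserved_codes))  # {'noise'}
```

Recording the seed and the reserve fraction in the manuscript would let an external auditor reconstruct exactly which posts were withheld. Any code that surfaces only in the reserved material is a signal that the initial themes may need revisiting.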
Alternatives to trustworthiness
Finally, although our inquiry focuses on trustworthiness, it should be noted that several other criteria exist for evaluating the rigor of qualitative research. An emerging point of view in the scientific rigor debate comes in the form of transparency and replicability. The emerging discussion on transparency and replicability has mostly focused on quantitative methods (e.g., Bergh et al., 2017; Bettis, Ethiraj, et al., 2016; Bettis, Helfat, & Shaver, 2016), which makes Aguinis and Solarino’s (2019) review of qualitative studies particularly intriguing. As such, we discuss transparency and replicability separately, following their framework.
Replicability is the idea that another scholar could conduct a similar study and find supportive results. Replicability is also broken down into three parts: (a) exact replication, where one study is replicated using the same methods and samples from the same population; (b) empirical replication, where one study is replicated using the same methods, but samples from a different population; and (c) conceptual replication, where one study uses replicated sampling from the same population, but employs a different method (Aguinis & Solarino, 2019). Unlike in quantitative research, replication may not always be a desired outcome for qualitative studies, exemplified by ethnographic work where the researcher takes on the role of a study instrument (Welch & Piekkari, 2017). Other qualitative work may be well suited for replication, such as grounded theory, which was developed with generating midrange theories as a desired outcome (Glaser & Strauss, 1967). The point then of replicability in scientific inquiry is to build replicable and cumulative knowledge (Bettis, Helfat, & Shaver, 2016). Replicability thus is an outcome of research that should be aided by enhanced rigor. By engaging in trustworthy research according to the outlined tenets discussed in this review, researchers should then be able to not only begin to replicate each other’s work but also build knowledge cumulatively, but only if the methods employed are transparent. For a scholar to replicate and build on previous work, to be the dwarf standing on the shoulders of giants, the earlier work needs to be transparent. Most researchers regardless of their chosen methodology would most likely argue for the concept of transparency.
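The three-part typology above turns on two questions: whether the method is the same and whether the population is the same. A small Python sketch makes the mapping explicit. The function name `replication_type` is hypothetical, and the label for the case in which both method and population differ is our own assumption, because the typology as summarized here does not name it.

```python
def replication_type(same_method: bool, same_population: bool) -> str:
    """Classify a follow-up study using the three-part typology in the text."""
    if same_method and same_population:
        return "exact"
    if same_method:
        return "empirical"  # same methods, different population
    if same_population:
        return "conceptual"  # same population, different method
    # Assumption: this case is not named in the typology as summarized here.
    return "outside the typology"


print(replication_type(same_method=True, same_population=False))  # empirical
```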
Transparency was defined by Aguinis and Solarino (2019) as the degree to which authors share information about their methods, data, and conclusions so that any potential errors are found and the ability to question the inferences made is retained. The more transparent researchers are regarding their study, the easier it will be to identify potential errors, question inferences, and build toward a greater understanding of the phenomenon in question. To assist authors in becoming more transparent, Aguinis and Solarino created 12 factors that would help to establish transparency in a qualitative study and assigned each factor as supporting a different type of replicability; the 12 factors and how they relate to replication can be seen in the appendix of this study. In arguing for greater transparency in qualitative research, Aguinis and Solarino posited that it will also enhance the trustworthiness of the study, and cited Denzin and Lincoln (1994) when they stated, “Moreover, there is also a need to understand the trustworthiness, meaning, and implications of a study’s results for theory and practice and this can also be achieved more easily with a greater degree of transparency” (p. 1292). It is important to note here that replicability and transparency are conceptualized as being supported by trustworthy techniques. When a high degree of transparency is present, which allows for a greater degree of replicability, authors are often reporting on the four elements of trustworthiness to a greater degree. Therefore, the ideas of Lincoln and Guba (1985) and Aguinis and Solarino seem to be synergistic.
Although the twin factors of replicability and transparency may be excellent ways for future researchers to improve the quality of their work, they are not a focus of our current study, because it would be unreasonable to assume scholars would have already adopted criteria from 2019 in the current published literature. In addition, the scholarly debate regarding replicability and transparency is fairly nascent, and nothing near a scientific consensus on best practices has been established as of yet. Therefore, to accomplish the first aim of this study, we focus on the well-established criteria of trustworthiness to first discuss the current state of qualitative research in the field of hospitality. We encourage future researchers not only to aim to make their work as trustworthy as possible but also to be as transparent as possible.
A Practitioner’s Guide
Although our review and analysis concentrate on what future journal editors, study authors, and journal reviewers should focus on, we also provide a rough working guide for our counterparts in industry to better analyze employee exit interview data, market focus group data, or qualitative data received as part of customer and employee satisfaction surveys. Firm managers must understand their employees and customers to continually make improvements to the company as well as to scan the environment for innovative new ideas. By reviewing these qualitative research techniques, we urge managers to collect and analyze qualitative data in a scientifically rigorous manner as recommended below to achieve the most accurate results possible.
Transparency and standardization of methods should be important for midsized to large companies, because, for example, each individual hotel would conduct exit interviews with resigning managers and team members. By standardizing the method by which the data are collected and analyzed and sharing that method across properties, corporate managers may begin to uncover themes related to the decision to quit or to working conditions in general. If each person conducting the exit interviews is transparent and is similarly trained to analyze the data, the company’s leaders can begin to learn in aggregate, potentially contributing to company-wide responses. In addition, managers who gather and analyze focus group data could share their results and processes with people who were not a part of the focus group study. If the corporate marketing team gathers customer feedback in the form of focus groups, a separate team of operational managers could review the findings as an external audit to improve the trustworthiness of the conclusions made.
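To illustrate how standardized coding could support learning in aggregate, the following Python sketch counts how many properties each exit-interview theme appears in. The hotel names, theme labels, function name `themes_across_properties`, and two-property threshold are all hypothetical, and the sketch assumes every property applies the same shared codebook.

```python
from collections import Counter
from typing import Dict, List, Set


def themes_across_properties(
    coded: Dict[str, List[Set[str]]], min_properties: int = 2
) -> Dict[str, int]:
    """Count the properties in which each theme appears at least once,
    keeping only themes found in at least `min_properties` properties."""
    property_counts: Counter = Counter()
    for interviews in coded.values():
        themes_here = set().union(*interviews)  # every theme seen at this property
        property_counts.update(themes_here)
    return {t: n for t, n in property_counts.items() if n >= min_properties}


# Hypothetical exit-interview codes; each set holds the themes from one interview.
coded = {
    "Hotel A": [{"scheduling", "pay"}, {"scheduling"}],
    "Hotel B": [{"scheduling", "supervisor"}],
    "Hotel C": [{"pay"}],
}

print(sorted(themes_across_properties(coded).items()))
# [('pay', 2), ('scheduling', 2)]
```

Counting properties rather than individual interviews is a deliberate choice in this sketch: it keeps a single large hotel from dominating the company-wide picture.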
The techniques outlined in this article could also be employed by managers in the field when analyzing written responses to employee satisfaction surveys. Open-ended questions in employee satisfaction surveys are not publicly available but present the same opportunities and challenges for managers as does feedback posted online in the form of reviews. The data on surveys cannot be probed, no reflexivity can be applied during the data collection, the comments are anonymous, and the organization cannot engage with the people providing the feedback, all of which are similar to publicly posted reviews. Managers can use referential adequacy to ensure the veracity of their conclusions by gauging whether their take on the comments meshes well with unread responses. Managers have an advantage over researchers when examining this type of data, in that they can conduct interviews with employees to triangulate the data reported on and gain greater clarity than a researcher who is constrained to what has been posted.
Lastly, an in-depth case study approach is applicable to hospitality businesses that desire to understand the roles and behaviors of their star or exemplary employees. We suggest long periods of engagement and analysis for achieving prolonged engagement; multiple data sources for triangulation, such as information collected from direct and persistent observations, press releases, scientific journals and trade magazines, and personal accounts; and external inquiry audits for improving the dependability of analysis results. In fact, working with scholars and consultants trained in these specialized methods may be particularly valuable for hospitality managers who are not as well versed in qualitative methods, who want to better understand their customer and employee satisfaction surveys, or who feel the “big data” they are collecting are not giving them the full picture of their customer and employee needs.
Appendix
The 12 Factors of Transparency According to Aguinis and Solarino (2019).
| Transparency Identification Criterion | Required for Exact Replication | Required for Empirical Replication | Required for Conceptual Replication |
|---|---|---|---|
| Kind of qualitative method | | | |
| Research setting | | | |
| Position of the researcher along the insider–outsider continuum | | | |
| Sampling procedures | | | |
| Relative importance of the participants/cases | | | |
| Documenting interactions with participants | | | |
| Saturation point | | | |
| Unexpected opportunities, challenges, and other events | | | |
| Management of power imbalance | | | |
| Data coding and first-order codes | | | |
| Data analysis and second- and higher order codes | | | |
| Data disclosure | | | |
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, or publication of this article.
