Reliability and Validity of a Cold–Heat Pattern Questionnaire for Traditional Chinese Medicine

Abstract

Objective:

The aim of this study was to develop and evaluate a questionnaire for Cold and Heat pathologic pattern identification in the context of Traditional Chinese Medicine (TCM). This questionnaire was intended to classify subjects into Cold or Heat pattern groups, a distinction that is useful in clinical trials of both herbal and acupuncture treatments.

Methods:

A questionnaire that had been developed in a previous study was completed by 63 patients (Group A) and 64 patients (Group B) from two TCM hospitals. Each patient was diagnosed by a TCM doctor as having one of three patterns: Cold, Heat, or Complex. The questionnaire results were analyzed for internal reliability and for validity against the doctor's diagnosis.

Results:

Cronbach's α coefficients were 0.579 for the 10 Cold items and 0.718 for the 10 Heat items. There were significant differences in the mean questionnaire scores between the Cold and Heat groups. The classification accuracy of this questionnaire for Group A was 92.3%, and for Group B it was 94.9%.

Conclusions:

Our results suggest that the questionnaire meets certain basic and fundamental requirements and that it may be useful as an adjunct diagnostic tool. Further studies using a greater number and variety of patients will be needed to evaluate its usefulness in clinical trials and in basic physiologic research.

Introduction

Traditional Chinese Medicine (TCM) uses a unique pathologic pattern identification system as a basic diagnostic tool for treatments such as acupuncture and herbal formulae.¹ Pathologic pattern identification is a process of globally analyzing clinical data to determine the location, cause, and nature of a patient's disease.² Clinical TCM trials use this system increasingly to investigate the effects of traditionally administered acupuncture.³ Several recent studies have utilized the TCM pathologic pattern identification system for this purpose. For example, one study examined recurrent headaches and compared TCM pattern diagnoses and acupuncture point selection.⁴ Another acupuncture study suggested an improved outcome in women with recurrent cystitis following a diagnosis of kidney yang/qi xu as opposed to other TCM syndromes.⁵

However, this diagnostic method does not always meet requirements in terms of objectivity and interpractitioner reproducibility. TCM diagnosis studies have shown considerable variability across practitioners, even when they are diagnosing the same subject.⁶,⁷ The standardization of TCM diagnosis remains an important issue in clinical trials. For pattern identification, practitioners use a mixture of reasoning based on “the four diagnostic methods,” namely, inspection, listening and smelling, inquiring, and palpation.⁸ Because this process relies on subjective features, it contributes to the lack of objectivity. Therefore, diagnostic devices and questionnaires have been developed to improve the objectivity and reliability of this method.⁹ A questionnaire is a useful method for objectifying and quantifying subjective feelings and sensations and may therefore be an effective TCM tool. Previous questionnaires, such as the Yin Deficiency and the Sasang Diagnosis Questionnaires, are good examples of this.¹⁰,¹¹

In pathologic pattern identification, there exist both detailed TCM patterns such as “qi stagnation transforming into Heat with Dampness accumulation”¹² and broader pattern categories such as yin, yang, Cold, Heat, Excess, and Deficiency. Some advantages of the broad approach are that it may be easier for practitioners to agree on the fundamental characteristics than on more detailed evaluations and that it captures the essential elements of TCM diagnosis.¹³ Therefore, this basic pathologic pattern modality may be valuable both as a clinical tool and as a research tool. Cold and Heat pathologic patterns form one pair of the “Eight Guiding Principles,” which are four paired principles that describe the nature and location of the imbalance in the body: Heat/Cold, Exterior/Interior, Excess/Deficiency, and the summary principles of yang/yin.² Cold and Heat pathologic patterns can be caused by the interaction between yin/yang energy and cold/heat pathogenic factors.²

A study was undertaken to develop a Korean version of the questionnaire for classifying Cold–Heat, but its validity and applicability were not clinically evaluated.¹⁴ For this reason, we developed another version of the Cold–Heat Pattern Questionnaire (C-H PQ) and evaluated its internal consistency as a reliability check in a pilot study.¹⁵ However, validity was not evaluated, and the questionnaire-based reliability evaluation was inadequate because of the small number of subjects chosen from a single site. Therefore, in the present study, we evaluated the validity and internal reliability of the C-H PQ as a means of adding objectivity to the process of pathologic pattern identification.

Methods

Cold–Heat Pattern Questionnaire

Our chosen C-H PQ was developed in a previous study using a multistage process. First, symptoms to be included in the questionnaire were selected based on the TCM literature and textbooks by a panel of statisticians and experts, each of whom had more than 5 years of clinical experience. Second, a C-H PQ prototype featuring various symptoms was applied to a test group of healthy individuals. It was then reexamined by the expert panel and adjusted to maximize its validity. Through this process, the final version of the C-H PQ consisting of 20 symptom entries (10 Cold items and 10 Heat items) was developed (Appendix 1). Each question was designed to be answered either “yes” or “no” according to the patient's condition during the week preceding the test. Other studies have used either a visual analog scale or a verbal rating scale and diverse statistical methods. In this study, however, the two extreme states, namely, the presence or absence of symptoms, were the only criteria considered necessary to answer each question. We also opted for this method because simply answering “yes” or “no” is easier and more convenient for patients.

Collection of C-H PQ data

We obtained questionnaire results from a group of 63 patients (Group A) who were being treated at the Wonkwang TCM Hospital in Seoul, along with diagnoses from a TCM doctor. The same questionnaire was used with another group of 64 patients (Group B) from the Saesarang TCM Hospital in Seoul. The patients in Groups A and B were recruited at one department of each hospital over a 2-week period and were diagnosed by the same doctor at each hospital. All of the participants gave consent in advance to be part of the study. The chief complaints of the patients were various chronic and acute internal diseases such as digestive, respiratory, and urinary problems. The doctor's diagnosis was obtained independently of the questionnaire. Each patient was diagnosed as one of the three patterns: Cold, Heat, or Complex.

Statistical analysis

Analyses were performed with SPSS version 14.0 (SPSS Inc.). Cronbach's α coefficient was computed to test the internal consistency of the 10 items for each pattern. Patients who exhibited a Complex pattern were excluded from the analysis, since our main goal was a direct comparison of Cold and Heat patterns. The Cold questionnaire score (CQ score) and the Heat questionnaire score (HQ score) were calculated by adding up all the relevant item scores. The HQ score minus the CQ score was defined as the Heat–Cold questionnaire score (H-CQ score). To test the discriminant validity, which describes the degree to which the operationalization diverges from other operationalizations that it theoretically should not be similar to,¹⁶ we used an independent-samples Student's t test to compare the two pattern groups. The classification line determined from Group A was applied to Group B to test the predictive validity¹⁷ (described in the Results). Data are presented as means ± standard deviation unless stated otherwise. The threshold for statistical significance was set at a p-value of 0.05.
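The scoring rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `score_questionnaire` and the example answers are hypothetical, and the coding of each “yes” as 1 point and each “no” as 0 points is an assumption, since the text states only that the relevant item scores were added up.

```python
def score_questionnaire(cold_answers, heat_answers):
    """Return (CQ, HQ, H-CQ) for one respondent.

    Each argument is a list of 10 booleans (True = "yes", False = "no").
    Assumption: a "yes" counts as 1 point and a "no" as 0 points.
    """
    if len(cold_answers) != 10 or len(heat_answers) != 10:
        raise ValueError("expected 10 Cold answers and 10 Heat answers")
    cq = sum(1 for answer in cold_answers if answer)  # Cold questionnaire score
    hq = sum(1 for answer in heat_answers if answer)  # Heat questionnaire score
    return cq, hq, hq - cq                            # H-CQ = HQ minus CQ


# Hypothetical respondent: 4 Cold symptoms and 1 Heat symptom present.
cold = [True, True, False, True, False, False, False, True, False, False]
heat = [False, True, False, False, False, False, False, False, False, False]
print(score_questionnaire(cold, heat))  # (4, 1, -3)
```

Under this coding, each of the CQ and HQ scores ranges from 0 to 10, and the H-CQ score ranges from −10 to 10, with negative values pointing toward the Cold pattern.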

Results

Sample characteristics

Of the 63 patients in Group A, 29 (46.0%) were diagnosed as Cold pattern and 23 (36.5%) as Heat pattern. Of the 64 patients in Group B, 34 (53.1%) were diagnosed as Cold and 25 (39.1%) as Heat. There was no significant difference in terms of either gender or age between Groups A and B (Table 1).

Table 1.

Age and Sex Distribution of Subjects

	Group A	Group B
N	63	64
Male/female	24/39	26/38
Age (years)	38.8 ± 25.2	41.4 ± 20.1

Reliability analysis

The internal consistency of the 10 items in each pattern was checked through Cronbach's α coefficients obtained from Groups A and B combined. It has been suggested that values of α > 0.5 are acceptable, although ideally values should be >0.7.¹⁷,¹⁸ Cronbach's α coefficient was 0.579 for the 10 Cold items and 0.718 for the 10 Heat items, and neither value changed significantly when any one of the 10 items was removed. Both values indicated acceptable internal consistency (Cronbach's α coefficient >0.5), although only the Heat items exceeded 0.7.
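For readers who wish to reproduce this kind of reliability check outside SPSS, the following is a minimal sketch of the standard Cronbach's α formula, α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The function name `cronbach_alpha` and the toy data are hypothetical; the study's item-level responses are not reported, so the published values cannot be recomputed here.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents.

    rows: list of respondents, each a list of k item scores
    (for example, 0/1 answers to the 10 Cold items).
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the summed score across respondents.
    total_var = variance([sum(row) for row in rows])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Toy data (hypothetical): items that always agree give alpha = 1,
# and items that are unrelated to each other give alpha = 0.
consistent = [[1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0]]
unrelated = [[1, 0], [0, 1], [1, 1], [0, 0]]
print(round(cronbach_alpha(consistent), 3), round(cronbach_alpha(unrelated), 3))
```

The two toy data sets illustrate the end points discussed in the text: α approaches 1 when the items move together and falls to 0 when they share no common variance.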

Validity analysis

Discriminant validity was assessed with independent-samples Student's t tests, an approach used in previous questionnaire and health-state studies.¹⁰,¹⁹ We compared the means of the three types of scores (CQ, HQ, and H-CQ scores) between the two pattern groups. The mean CQ score in the Cold pattern group was significantly higher than that in the Heat pattern group. The mean HQ score in the Heat pattern group was significantly higher than that in the Cold pattern group. The mean H-CQ score in the Heat pattern group was significantly higher than that in the Cold pattern group. Similar results were obtained from both Group A and Group B (Table 2).

Table 2.

Discriminant Validity of Groups A and B

	Group A			Group B
	Cold pattern group	Heat pattern group	p-value	Cold pattern group	Heat pattern group	p-value
Cold pattern score	3.90 ± 2.06	1.87 ± 1.66	<0.001	3.94 ± 1.61	1.64 ± 1.52	<0.001
Heat pattern score	2.10 ± 1.70	4.09 ± 2.66	0.004	1.38 ± 1.18	5.32 ± 1.86	<0.001
Heat–Cold pattern score	−1.79 ± 1.32	2.22 ± 1.98	<0.001	−2.56 ± 1.89	3.68 ± 1.95	<0.001

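Values are means ± standard deviation. p-values are from independent-samples Student's t tests comparing the Cold and Heat pattern groups; differences were considered significant at p < 0.05.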
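As an illustration of the comparison behind Table 2, the sketch below computes the pooled-variance Student's t statistic from summary statistics alone. The function name `student_t_from_summary` is hypothetical. The inputs are the rounded Group A Cold pattern scores from Table 2 and the Group A group sizes (29 Cold, 23 Heat), so the result is only approximate and is not the value reported by SPSS.

```python
from math import sqrt


def student_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance (equal variances assumed) Student's t statistic.

    Returns (t, degrees_of_freedom) for two independent groups
    described by their means, standard deviations, and sizes.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Group A, Cold pattern score: Cold group 3.90 ± 2.06 (n = 29)
# versus Heat group 1.87 ± 1.66 (n = 23), as listed in Table 2.
t, df = student_t_from_summary(3.90, 2.06, 29, 1.87, 1.66, 23)
print(round(t, 2), df)  # 3.84 50
```

Converting t to a p-value requires the t distribution with the given degrees of freedom, which a statistics package provides; that step is omitted here to keep the sketch free of external dependencies.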

To evaluate predictive validity, which refers to functional relations between a predictor and criterion events,¹⁷ we derived a classification line that divided the respondents of Group A into Heat or Cold categories, and we then applied that line to the Group B respondents. The classification line was obtained from the H-CQ scores of the Group A respondents: it was set at the midpoint between the mean H-CQ score of the Heat pattern group and the mean H-CQ score of the Cold pattern group. A respondent whose H-CQ score was above the classification line was classified into the Heat pattern group, and a respondent whose score was below the line was classified into the Cold pattern group. The classification accuracies for Groups A and B were 92.3% and 94.9%, respectively (Table 3).

Table 3.

Results Using the Classification Line

		Group A			Group B
		Doctor's diagnosis			Doctor's diagnosis
		Cold pattern group	Heat pattern group	Total	Cold pattern group	Heat pattern group	Total
Classification result	Cold pattern group	29 (87.9%)	4 (12.1%)	33 (100%)	32 (97.0%)	1 (3.0%)	33 (100%)
	Heat pattern group	0 (0%)	19 (100%)	19 (100%)	2 (7.7%)	24 (92.3%)	26 (100%)
	Total	29 (55.8%)	23 (44.2%)	52 (100%)	34 (57.6%)	25 (42.4%)	59 (100%)
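The classification procedure and the accuracies derived from Table 3 can be checked with a short sketch. The function names are hypothetical, the Group A mean H-CQ scores (−1.79 and 2.22) are the rounded values from Table 2, and treating a score exactly on the line as Cold is an assumption made only so that the function is total; with whole-number scores and a fractional line, such ties do not occur.

```python
def classification_line(mean_cold, mean_heat):
    """Midpoint between the mean H-CQ scores of the two pattern groups."""
    return (mean_cold + mean_heat) / 2


def classify(h_cq, line):
    """Heat if the H-CQ score is above the line, otherwise Cold.

    Assumption: a score exactly on the line is labeled Cold.
    """
    return "Heat" if h_cq > line else "Cold"


def accuracy(confusion):
    """Share of respondents whose classification matches the diagnosis.

    confusion maps (classification result, doctor's diagnosis) to a count.
    """
    correct = sum(n for (result, diagnosis), n in confusion.items()
                  if result == diagnosis)
    return correct / sum(confusion.values())


line = classification_line(-1.79, 2.22)  # Group A means from Table 2

# Counts copied from Table 3.
group_a = {("Cold", "Cold"): 29, ("Cold", "Heat"): 4,
           ("Heat", "Cold"): 0, ("Heat", "Heat"): 19}
group_b = {("Cold", "Cold"): 32, ("Cold", "Heat"): 1,
           ("Heat", "Cold"): 2, ("Heat", "Heat"): 24}

print(round(line, 3))                     # 0.215
print(round(100 * accuracy(group_a), 1))  # 92.3
print(round(100 * accuracy(group_b), 1))  # 94.9
```

Because the line falls between 0 and 1, an H-CQ score of 1 or more is labeled Heat and a score of 0 or less is labeled Cold in this sketch.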

Discussion and Conclusions

In this study, we developed a C-H PQ that enabled us to classify Cold and Heat patterns, and we tested its internal reliability and validity. Reliability and validity are two important factors in designing a questionnaire. Reliability is concerned with the repeatability or reproducibility of measurements, and validity reflects the accuracy of data and ensures that responses are a true reflection of the issues of interest.²⁰ To test reliability, we evaluated the internal consistency by using Cronbach's α. Alpha equals zero when the true score is not measured at all and the data show only error or noise. Cronbach's α equals 1.0 when all of the items measure the true score alone without any error contributions. It has been indicated that 0.7 is a good reliability coefficient, but lower thresholds are sometimes used in the literature.¹⁷,¹⁸ We combined Groups A and B for calculation purposes, and our results showed that Cronbach's α was 0.579 for the 10 Cold items and 0.718 for the Heat items, indicating that our questionnaire exhibited acceptable internal consistency. Test–retest reliability was not examined in this study because our questionnaire aimed to survey current body condition, which necessarily will change over time.

To test validity, we used two different methods: a comparison of scores and a test of classification accuracy. The mean CQ score in the Cold pattern group was significantly higher than that in the Heat pattern group, and the mean HQ and H-CQ scores in the Heat pattern group were significantly higher than those in the Cold pattern group. These results held true for both Group A and Group B, and we thereby confirmed that each score reflected the patient's pathologic pattern. The second method, which compared the questionnaire results with the doctors' diagnoses, showed excellent classification accuracy, reaching 92.3% and 94.9% for Groups A and B, respectively.

One of the problems with evaluating validity in these kinds of studies is that there is no “gold standard” against which to compare the questionnaire. For this reason, researchers often include only those patients whom several doctors have diagnosed as having the same pattern. In this respect, our study is limited because the subjects in each group were diagnosed by just one doctor. We cannot diagnose Cold or Heat patterns based solely on this questionnaire because the pattern needs to be diagnosed not only by subjective symptoms but also by various objective data, such as tongue and pulse condition. In addition, this questionnaire does not discriminate between Excess Cold and Deficiency Cold or between Excess Heat and Deficiency Heat. Development of another questionnaire about Excess and Deficiency may enable a more complete analysis.

Although research questions remain, the significant results in terms of the internal reliability and validity of this study suggest that this questionnaire meets certain fundamental and basic requirements. Our questionnaire might be useful as an adjunctive diagnostic tool or as a preliminary medical examination framework in clinical trials and basic physiologic research. The scores calculated by this questionnaire would add objective evidence to the subjective Cold–Heat pattern diagnosis obtained by the TCM doctor. Moving the classification line toward more extreme scores may help in collecting “gold standard” cases of the Cold or Heat pattern. A change in the scores may be useful in providing information on patients' conditions and therapeutic efficacy. In addition, we have introduced a means of objectifying and quantifying the previously subjective domain of TCM diagnosis. Our questionnaire-driven approach may also be useful to assess other subjective domains of TCM.

Acknowledgments

This work was supported by the Korea Ministry of Knowledge Economy (10028438) and a Korea Institute of Oriental Medicine (KIOM) grant funded by the Korean government (K09012).

Disclosure Statement

No competing financial interests exist.

Appendix. Cold–Heat Pattern Questionnaire

Below are conditions regarding the “Cold–Heat” pattern. Respondents were invited to answer either “Yes” or “No” based on health status during the preceding week.

References

1. Maciocia. The Foundation of Chinese Medicine. Edinburgh: Churchill Livingstone, 1989.

2. World Health Organization (WHO). WHO International Standard Terminologies on Traditional Medicine in the Western Pacific Region. Manila, Philippines: WHO, Regional Office for the Western Pacific, 2007.

3. Hammerschlag. Methodological and ethical issues in clinical trials of acupuncture. J Altern Complement Med, 1998; 4:159–171.

4. Coeytaux, Chen, Linemuth et al. Variability in the diagnosis and point selection for persons with frequent headache by Traditional Chinese Medicine acupuncturists. J Altern Complement Med, 2006; 12:863–872.

5. Alraek, Baerheim. The effect of prophylactic acupuncture treatment in women with recurrent cystitis: Kidney patients fare better. J Altern Complement Med, 2003; 9:979.

6. Hogaboom, Sherman, Cherkin. Variation in diagnosis and treatment of low back pain by Traditional Chinese Medicine acupuncturists. Complement Ther Health Med, 2001; 9:154–166.

7. Zell, Hirata, Marcus et al. Diagnosis of symptomatic postmenopausal women by Traditional Chinese medicine practitioners. Menopause, 2000; 7:129–134.

8. Kaptchuk. The Web That Has No Weaver. Understanding Chinese Medicine. Chicago: Contemporary Publishing Group, 2000.

9. Park, Park. A study on standardization of Bian Zheng by some statistical methods. J Korean Inst Orient Med Diagnosis, 2001; 5:306–330.

10. Lee, Park, Lee, Kim. Development and validation of Yin-Deficiency Questionnaire. Am J Chin Med, 2007; 35:11–20.

11. Yoo, Kim et al. Sasangin Diagnosis Questionnaire: Test of reliability. J Altern Complement Med, 2007; 13:111–122.

12. Schnyer, Taitano, Allen JJB. Prevalence of Chinese medicine defined patterns in people with major depression. Proceedings of the 9th Annual Symposium of the Society for Acupuncture Research, 2002.

13. Helene, Gary, Bonnie et al. Yin scores and yang scores: A new method for quantitative diagnostic evaluation in Traditional Chinese Medicine research. J Altern Complement Med, 2004; 10:389–395.

14. Kim, Park. Development of questionnaires for Cold–Heat patternization. J Korean Inst Orient Med Diagnosis, 2003; 7:64–75.

15. Ryu, Lee, Jang et al. A study on development of Cold–Heat pattern questionnaire. Korean J Oriental Physiol Pathol, 2008; 22:1410–1415.

16. Campbell, Fiske. Convergent and discriminant validation by the multitrait-multimethod matrix. Psychol Bull, 1959; 56:81–105.

17. Nunnally, Bernstein. Psychometric Theory. New York: McGraw-Hill, 1994.

18. Helmstadter. Principles of Psychological Measurement. New York: Appleton-Century-Crofts, 1964.

19. Hatoum, Brazier, Akhras. Comparison of the HUI3 with the SF-36 preference based SF-6D in a clinical trial setting. Value Health, 2004; 7:602–609.

20. Smith. Survey research: Instruments, validity and reliability. Research Methods in Pharmacy Practice. London: Pharmaceutical Press, 2002.