Abstract
Objectives:
The aim of this study was to evaluate the reliability between observers with regard to pulse signs that are observed by Traditional Korean Medicine (TKM) clinicians.
Methods:
A total of 658 patients with stroke who were admitted to Oriental medical university hospitals from February 2010 through December 2010 were included in this study. Each patient was seen independently by 2 experts from the same department for an examination of the pulse signs. Interobserver reliability was measured using three methods: simple percentage agreement, the κ value, and the AC1 statistic.
Results:
Among all of the subjects, the κ value indicated that the interobserver reliability in evaluating the pulse signs ranged from poor to moderate, whereas the AC1 analysis revealed that agreement between the 2 experts was generally high (with the exception of slippery pulse). Among the subjects for whom the 2 raters agreed on the pattern identification, the κ value indicated that the interobserver reliability was generally moderate to good (with the exceptions of rough pulse and sunken pulse), and the AC1 measure of agreement between the 2 experts was generally high.
Conclusions:
Pulse diagnosis is regarded as one of the most important procedures in TKM, despite its limitations. This study reveals that the interobserver reliability in making a pulse diagnosis in stroke patients is not particularly high when objectively quantified. Additional research is needed to help reduce this lack of reliability for various portions of the pulse diagnosis.
Introduction
Of the four diagnostic processes, pulse diagnosis, which belongs to the palpation diagnostic processes and has been practiced for more than 2000 years, is widely regarded as a core component of the diagnostic framework of Traditional East Asian Medicine (TEAM), including Traditional Korean Medicine (TKM). 4–6 This high regard for pulse diagnosis is rooted in the premise that the pulse assessment method is clinically reliable for formulating a diagnosis. Using pulse diagnosis, a clinician can detect pathological changes in a person's body condition and then treat the patient. 6 However, there is currently no direct evidence to either support or refute this premise. Evidence is lacking because competence in pulse diagnosis depends on the experience and knowledge of the clinician; moreover, the pulse can be affected by a variety of factors such as emotion, activity, diet, biorhythm, and season. 7 Thus, although many experimental studies have measured the pulse using mechanical tools in an attempt to obtain an accurate pulse diagnosis, and although studies have attempted to assess reliability in order to standardize and objectify the pulse diagnosis, 8–11 the majority of these studies cannot be considered scientifically and/or quantitatively reliable.
Stroke is the second most common cause of death in Korea. 12 In Korea, many stroke patients receive traditional medical care, as Korea has its own system of traditional alternative medicine, called TKM, the role of which has been emphasized in stroke management. 3 As part of the fundamental study for the Standardization and Objectification of Pattern Identification (PI) in TKM for Stroke (SOPI-Stroke), which aims to develop the Korean standard differentiation of the symptoms and signs (KSDSS) of stroke, a committee composed of experts at Oriental medicine hospitals and researchers at the brain disease research center of the Korea Institute of Oriental Medicine (KIOM) has been working since 2005. 13–15 A number of studies have shown that pulse diagnosis in particular plays important roles in both the treatment and prognosis of patients with stroke. 16–21 The reliability of TKM pulse diagnosis was planned as one of the subdivisions of the KSDSS of stroke.
In this study, the reliability of TKM pulse diagnosis in stroke patients was investigated by evaluating the interobserver reliability of pulse signs as assessed by TKM practitioners.
Methods
The data for this study were collected as part of a multicenter study for the standardization of stroke diagnosis in Korea. Stroke patients who were admitted to nine Oriental medical university hospitals participated in this study from February 2010 through December 2010. Each patient provided informed consent to undergo procedures that were approved by the respective institutions' Institutional Review Boards (IRB). Stroke patients were enrolled within 30 days of the onset of their symptoms, provided their diagnosis was confirmed by an imaging diagnosis such as computerized tomography or magnetic resonance imaging. Patients with traumatic stroke such as subarachnoid, subdural, and epidural hemorrhage were excluded from the study. This study was approved by the IRB of the KIOM and by each Oriental medical university hospital's IRB.
Each patient was seen by 2 experts from the same department within each site. A total of 18 experts who were well trained in the standard operating procedures (Appendix) participated in this study. The experts had at least 3 years of clinical experience with stroke after completing 6 years of regular college education in TKM. Each patient received an examination of the status of the pulse, including pulse location (i.e., floating pulse or sunken pulse), pulse rate (i.e., slow or rapid), pulse force (i.e., strong or weak), and pulse shape (e.g., string-like, slippery, fine, rough, or surging). The examination parameters were extracted from portions of a case report form for the standardization of stroke diagnosis that was developed by an expert committee organized by the KIOM. These assessments were conducted individually, without discussion between the 2 experts, and were made on the same day without delay to minimize the time difference between the first and the second diagnosis. Early morning was chosen as the best time for pulse-taking, and the patients were allowed to rest for at least 10 minutes to obtain a stable pulse. The grading of the severity of each variable was based on the following scores: 1=very significant, 2=significant, 3=not significant. Furthermore, as suggested by the KIOM, the clinicians were required to determine the stroke PI of each patient as the Fire-Heat pattern, the Dampness-Phlegm pattern, the Blood Stasis pattern, the Qi Deficiency pattern, or the Yin Deficiency pattern. 3,13–15
A total of 658 stroke patients were enrolled in the study. Thirty (30) patients were excluded from the analysis because one of the 2 TKM clinicians omitted the PI. For a total of 452 stroke patients, the 2 raters assigned the same PI, with the following distribution: Fire-Heat pattern (n=147), Dampness-Phlegm pattern (n=158), Yin Deficiency pattern (n=80), and Qi Deficiency pattern (n=66). The Blood Stasis pattern was excluded because the sample size for this PI was too small (n=1) (Fig. 1).

Fig. 1. Flow diagram of patients enrolled in the study. PI, pattern identification.
Interobserver reliability was measured using the following three methods: simple percentage agreement, Cohen's κ coefficient, and Gwet's AC1 statistic (with their corresponding confidence intervals). The κ value is typically used to measure the level of agreement beyond that which would be expected by chance and provides a measure of interobserver reliability. 11 Various interpretations of κ have been proposed. 22–27 However, for most purposes, a value <0.40 represents “poor” agreement, a value from 0.40 to 0.75 represents “moderate” to “good” agreement, and a value >0.75 indicates “excellent” agreement. 27 The AC1 statistic is not vulnerable to the well-known paradoxes that plague κ. 28–30 First, interobserver reliability for pulse signs among all of the subjects was calculated using simple percentage agreement, Cohen's κ coefficient, and Gwet's AC1 statistic. Next, interobserver reliability among the subjects to whom the 2 raters assigned the same PI was calculated in the same way. The data were statistically analyzed with SAS software, version 9.1.3 (SAS Institute Inc., Cary, NC).
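The three agreement measures named above can be illustrated with a minimal Python sketch. This is not the authors' analysis code (the study used SAS), and the function names and example ratings are hypothetical. The text does not state whether the three-level scores were analyzed as three categories or dichotomized, so the sketch takes the category set as a parameter. It uses the standard two-rater definitions: both coefficients are computed as (observed agreement minus chance agreement) divided by (1 minus chance agreement), and they differ only in how chance agreement is estimated.

```python
from collections import Counter


def percent_agreement(r1, r2):
    """Share of subjects to whom both raters gave the same rating."""
    if len(r1) != len(r2) or len(r1) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)


def cohen_kappa(r1, r2, categories):
    """Cohen's kappa: chance agreement from the product of each rater's marginals."""
    n = len(r1)
    pa = percent_agreement(r1, r2)
    c1, c2 = Counter(r1), Counter(r2)
    pe = sum((c1[k] / n) * (c2[k] / n) for k in categories)
    if pe == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (pa - pe) / (1 - pe)


def gwet_ac1(r1, r2, categories):
    """Gwet's AC1: chance agreement from the mean marginal proportion per category."""
    n = len(r1)
    q = len(categories)
    if q < 2:
        raise ValueError("AC1 needs at least two categories")
    pa = percent_agreement(r1, r2)
    c1, c2 = Counter(r1), Counter(r2)
    # pi_k: average of the two raters' proportions for category k
    pis = [(c1[k] + c2[k]) / (2 * n) for k in categories]
    pe = sum(p * (1 - p) for p in pis) / (q - 1)
    return (pa - pe) / (1 - pe)


if __name__ == "__main__":
    # Hypothetical ratings on the 1/2/3 scale described in the Methods.
    rater_a = [3, 3, 1, 2, 3, 3, 2, 3]
    rater_b = [3, 3, 1, 3, 3, 2, 2, 3]
    scale = (1, 2, 3)
    print(percent_agreement(rater_a, rater_b))
    print(cohen_kappa(rater_a, rater_b, scale))
    print(gwet_ac1(rater_a, rater_b, scale))
```

Confidence intervals are omitted here; the text does not say how they were obtained, so any interval method added to this sketch would be a further assumption.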
Results
The interobserver reliability results with regard to pulse signs for all of the subjects (n=628) are presented in Table 1. The κ value measure of agreement between the 2 experts ranged from “poor” (κ=0.19) to “moderate” (κ=0.49). In contrast, the AC1 measure of agreement between the 2 experts was generally high for pulse signs, ranging from 0.65 to 0.93 (with the exception of slippery pulse, which had an AC1 of 0.38). In most cases, agreement as assessed by the κ values was considerably lower than agreement as assessed by the AC1 values.
CI, 95% confidence interval.
The interobserver reliability results for the subjects to whom the 2 raters assigned the same pattern are presented in Table 2. For these subjects, the κ measure of agreement for pulse signs generally ranged from moderate to good, with κ values ranging from 0.40 to 0.49; the two exceptions were rough pulse and sunken pulse, which yielded κ values of 0.17 and 0.34, respectively. Moreover, the AC1 measure of agreement between the 2 experts was generally high for pulse signs, ranging from “moderate” (AC1=0.41) to “excellent” (AC1=0.94).
CI, 95% confidence interval.
The interobserver reliability results for the subjects of each PI are presented in Table 3. The κ measure of agreement for the subjects with the Fire-Heat pattern was generally low (κ=0.18–0.39) with regard to pulse location (i.e., floating pulse or sunken pulse) and pulse shape, with the exception of fine pulse (κ=0.55). However, the AC1 measure of agreement between the 2 experts was generally quite high for pulse signs, ranging from “moderate” (AC1=0.45) to “excellent” (AC1=0.94). In addition, the κ measure of agreement for the subjects with the Dampness-Phlegm pattern was generally low (κ=0.04–0.38), with the exception of surging pulse (κ=0.56). However, the AC1 measure of agreement between the 2 experts was generally quite high for pulse signs, ranging from “moderate” (AC1=0.46) to “excellent” (AC1=0.98). The AC1 measure of agreement for the subjects with the Qi Deficiency pattern was generally quite high (AC1=0.51–0.97), with the exception of weak pulse (AC1=0.31).
PI, pattern identification; CI, 95% confidence interval; FH, Fire-Heat pattern; DP, Dampness-Phlegm pattern; QD, Qi Deficiency pattern; YD, Yin Deficiency pattern.
Discussion
In TEAM, including TKM, it is generally believed that the wrist pulse conveys important information regarding an individual's health status, and pulse diagnosis has long been used. However, the practice of pulse diagnosis has caused confusion in the modern context because there is little clinical evidence to support it and because the historical pulse literature lacks the precision needed to serve as a reliable basis for interpreting the pulse.
Pulse diagnosis has played a prominent role in the diagnosis and subsequent treatment of stroke and has attracted increasing attention in Oriental medicine. In China, a review by Su 16 discussed the important role that pulse diagnosis plays in the diagnosis and treatment of stroke by explaining the string-like pulse, slippery pulse, and fine pulse in the diagnosis of stroke. Liu 17 and Cui 18 reported the frequency of several pulse types in patients with stroke, including string-like pulse; string-like pulse plus moderate, fine, rapid, or slippery pulse; intermittent plus bound pulse; slippery pulse; and sunken plus fine pulse. In Korea, Cho et al., 19 in seeking important factors that affect the prognosis of stroke, observed the pulse location, pulse rate, and irregularity in 132 stroke patients within 30 days of onset. Shin et al. 20 used a pulse analyzer in an attempt to objectively classify pulse signs by analyzing the pulse wave in 43 stroke patients within 7 days of onset. Lee et al. 21 analyzed the distribution of pulse indicators with regard to PI in 764 stroke patients to evaluate the value of using pulse diagnosis as an indicator for the classification of the PI in stroke patients. These results revealed a meaningful relationship between the pulse diagnosis and the PI of stroke. Kim et al. 3 attempted to standardize the Oriental medical PI for stroke patients using logistic regression. Interestingly, they found that all of the patterns in their study essentially included pulse and tongue diagnosis in their final equations.
However, traditional pulse diagnosis has many limitations. The clinical skill of pulse diagnosis depends on the clinician's experience and knowledge, and environmental factors have a large influence on the diagnostic results that a clinician obtains from the pulse, which is more heavily affected by ephemeral influences than either the tongue diagnosis or other forms of diagnosis. Specifically, the pulse diagnosis can be transiently affected by emotion, pernicious influences, acute illness, strenuous activity, medication, diet, a full bladder, an imminent or current menstrual flow, biorhythm, the season of the year, and even the time of day. Therefore, it is essential to establish an objective diagnostic standard for pulse diagnosis among clinicians. However, there is currently little agreement among clinicians with regard to pulse analysis.
Cole* asserted that the reliability and validity of pulse diagnosis are generally poor. In contrast, King et al. 10 found that when using a standardized pulse-taking procedure with clear operational definitions, the agreement between 2 practitioners was higher than 80% for 10 of 16 pulse categories. Overall, the conclusions of previous studies vary widely, ranging from a low level of reliability of pulse diagnosis through moderate agreement to extremely high agreement. In a general review, O'Brien et al. 31 suggested that as the level of complexity of pulse detection increases, the reliability of pulse diagnosis decreases. The subjects in the reviewed studies of interobserver reliability of pulse diagnosis were primarily normal groups, along with patients with hypercholesterolemia and cystic fibrosis. 11,31 However, few studies have investigated the reliability of pulse diagnosis in patients with stroke.
The data for the present analysis were collected as part of a multicenter study of the standardization of stroke diagnosis in Korea. To evaluate the interobserver reliability of pulse status as assessed by TKM clinicians in stroke patients, reliability was calculated for all subjects, and for the subjects to whom the 2 raters assigned the same pattern, as a simple percentage agreement, κ value, and AC1 statistic. When investigating agreement between observers, clinicians have long used κ and other chance-adjusted measures, together with the commonly used scale for interpreting κ that was derived by Landis and Koch in 1977. 23 However, the suitability of κ as a measure of agreement has recently been debated. 29,30 The AC1 statistic is a relatively new measure that has been suggested by Gwet to adjust for chance in agreement studies. 28,32
For interobserver agreement among all of the subjects, seven items had poor κ values, whereas four items had moderate to good values (Table 1). However, five of the seven items (floating, slow, string-like, slippery, and surging pulse) were close to a κ value of 0.4. In particular, rough pulse had an extremely poor κ value relative to the other items but did not have a poor agreement percentage or AC1 value. It was determined that many of the clinicians checked “3=not significant” because rough pulse is difficult to detect and appears infrequently. Therefore, in contrast to the κ value, the agreement percentage and AC1 value were high (93.29% and 0.93, respectively). For the subjects who were classified into the same pattern by both raters, slightly higher κ values were observed: only 2 of the 11 items had poor values, and the others had moderate to good values. However, rough pulse still had an extremely poor value (Table 2). In the analysis of each pattern, the number of the 11 items with moderate to good values was four for the Fire-Heat pattern, one for the Dampness-Phlegm pattern, six for the Qi Deficiency pattern, and five for the Yin Deficiency pattern (Table 3).
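The rough-pulse finding, in which κ is very low while percentage agreement and AC1 are high, is the kind of result that arises when almost all ratings fall into one category. The following Python sketch reproduces that behavior from a 2×2 table of hypothetical counts. These counts are not the study's data, and the function name is an assumption made for illustration. The sign is treated as dichotomous (present or absent) for simplicity.

```python
def kappa_and_ac1_from_2x2(both_present, only_rater1, only_rater2, both_absent):
    """Return (percent agreement, Cohen's kappa, Gwet's AC1) for a 2x2 table.

    both_present: both raters marked the sign as present
    only_rater1:  rater 1 present, rater 2 absent
    only_rater2:  rater 1 absent, rater 2 present
    both_absent:  both raters marked the sign as absent
    """
    n = both_present + only_rater1 + only_rater2 + both_absent
    pa = (both_present + both_absent) / n

    # Each rater's marginal proportion of "present" ratings.
    p1 = (both_present + only_rater1) / n
    p2 = (both_present + only_rater2) / n

    # Cohen's chance agreement: product of the raters' marginals, per category.
    pe_kappa = p1 * p2 + (1 - p1) * (1 - p2)

    # Gwet's chance agreement for two categories: 2 * pi * (1 - pi),
    # where pi is the mean of the two raters' "present" proportions.
    pi = (p1 + p2) / 2
    pe_ac1 = 2 * pi * (1 - pi)

    kappa = (pa - pe_kappa) / (1 - pe_kappa)
    ac1 = (pa - pe_ac1) / (1 - pe_ac1)
    return pa, kappa, ac1


if __name__ == "__main__":
    # Hypothetical rare sign: 100 patients, nearly all rated "absent" by both raters.
    pa, kappa, ac1 = kappa_and_ac1_from_2x2(1, 3, 3, 93)
    print(f"agreement={pa:.2f}, kappa={kappa:.2f}, AC1={ac1:.2f}")
```

With these hypothetical counts the raters agree on 94 of 100 patients, yet κ is about 0.22 because the chance-agreement term for κ is itself about 0.92 when one category dominates. AC1 estimates a much smaller chance agreement in the same situation and comes out at about 0.94.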
Pulse diagnosis has historically been regarded as one of the most important procedures in TKM, despite the limitations discussed above. The current study shows that the interobserver reliability of pulse diagnosis in stroke patients is poor when objectively quantified. Additional research is needed to help reduce this lack of reliability for various portions of the pulse diagnosis through more detailed criteria and better training of clinicians. The authors believe that the results of this study will be useful to clinicians in diagnosing stroke.
Conclusions
Pulse diagnosis is regarded as one of the most important procedures in TKM, despite the aforementioned limitations. In this study, to evaluate interobserver reliability in the pulse status in stroke patients who were assessed by TKM clinicians, interobserver reliability was calculated as a simple percentage agreement, κ value, and AC1 statistic. This study reveals that the interobserver reliability in making a pulse diagnosis in stroke patients is not particularly high when objectively quantified. Additional research is needed to help reduce this lack of reliability for various portions of the pulse diagnosis.
Footnotes
Acknowledgment
This research was supported by a grant from the Korea Institute of Oriental Medicine (K11131).
Disclosure Statement
No competing financial interests exist.
* Cole P. Acupuncture and pulse diagnosis in Great Britain [unpublished PhD thesis]. University of Sussex, 1975.
References
