Abstract
BACKGROUND:
In 2020, lung cancer caused 1.8 million deaths worldwide (18% of global cancer deaths), including 710,000 in China (23.8% of all cancer deaths in China).
OBJECTIVE:
To explore out-of-set association rules between lung cancer symptoms and drugs through text mining of medical cases of traditional Chinese medicine (TCM) treatment of lung cancer, and to analyze the experience of TCM syndrome differentiation in its treatment.
METHODS:
The medical records of all patients diagnosed with lung cancer in Nanjing Chest Hospital from January to December 2018 were collected, and the out-of-set association analysis was performed using the MedCase v5.2 TCM clinical scientific research auxiliary platform based on the frequent pattern growth enhanced association analysis algorithm.
RESULTS:
In terms of TCM treatment of lung cancer, the clinical symptoms with high correlation included cough, expectoration, chest distress, and white sputum; the drugs with high correlation included Pinellia ternata, licorice root, white Atractylodes rhizome, and Radix Ophiopogonis; and the prescriptions were based on the Erchen and Maimendong decoctions.
CONCLUSION:
This study analyzed medical cases of TCM treatment for lung cancer using data mining techniques, examined the out-of-set association rules between clinical symptoms and drugs, and reviewed the understanding of lung cancer in TCM. Moreover, the essence of experience in drug use was gathered, providing scientific guidance for the clinical treatment of lung cancer.
Background
According to the latest global cancer burden data for 2020 released by the International Agency for Research on Cancer of the World Health Organization [1], 2.2 million new cases of lung cancer occurred globally in 2020, including 820,000 in China. Lung cancer caused 1.8 million deaths worldwide, accounting for 18% of global cancer deaths, including 710,000 in China, accounting for 23.8% of all cancer deaths in China.
Non-small cell lung cancer accounts for 80%–85% of lung cancer. Since most patients have no obvious clinical symptoms in the early stage, about 30%–55% of patients are diagnosed at a locally advanced stage, most of whom have lost the best opportunity for surgical treatment [2]. Currently, a multidisciplinary comprehensive diagnosis and treatment model has been adopted for the treatment of malignant tumors. Combined therapy, including surgery, radiotherapy, chemotherapy, targeted therapy, PD-1/PD-L1 inhibitors (immunotherapy), anti-angiogenesis, and hyperthermia, has significantly improved the five-year survival rate for patients with a variety of tumors. However, despite its ground-breaking efficacy, the comprehensive treatment of Western medicine may also cause cumulative toxic reactions, which can be life-threatening in severe cases. Finding a way to prevent treatment-related adverse reactions during anti-tumor therapy has always been a focus for oncologists.
Lung cancer is often categorized as “lung stagnation,” “inveterate weakness,” “lung amassment,” “cough,” “hemoptysis,” or “chest pain” in traditional Chinese medicine (TCM). Based on the holistic concept and the principle of syndrome differentiation and treatment, TCM has certain advantages in adjusting physical state, alleviating symptoms, reducing the toxic side effects of Western medicine, improving the quality of life, prolonging the survival period, and inhibiting tumor metastasis [3, 4, 5, 6, 7, 8].
The rules of association between clinical symptoms of lung cancer and drugs were analyzed from the perspective of out-of-set association through systematic mining of medical cases of patients with lung cancer treated by TCM clinical experts. This produced significant guidance for the standardized treatment of lung cancer in contemporary Chinese medicine and an improvement in the survival rate and quality of life of patients.
Materials and methods
Data source and processing
The medical records of all patients diagnosed with lung cancer in Nanjing Chest Hospital from January 2018 to December 2018 were extracted. A primary Excel database was constructed to further import the data to the MedCase v5.2 TCM clinical research auxiliary platform, and 289 diagnosis and treatment protocols were extracted to establish the lung cancer clinical research database. This study was conducted in accordance with the Declaration of Helsinki and approved by the Ethics Committee of Nanjing University of Traditional Chinese Medicine. Written informed consent was obtained from all participants.
Data inclusion and exclusion criteria
The inclusion criteria were: (1) patients clearly diagnosed with “non-small cell lung cancer” in the medical records; (2) patients who received TCM treatment interventions; (3) patients with complete medical records, including clinical manifestations, diagnosis, pathogenesis and/or treatment methods, and medication.
The exclusion criteria were: (1) no clear core diagnosis in the medical records; (2) repeated visits or incomplete information in the medical records.
Data preprocessing
Because of varying descriptions of the same drug, misspellings, colloquial medical terms, and simplified notes in the medical records, it was necessary to conduct preprocessing. This included the standardization of drug names and medical terms and the correction of misspellings in the original medical record data before data mining and statistical analysis. For noise reduction and data optimization, source-tracing preprocessing was performed for the non-research data noise found in the medical records in the lung cancer database and in the extracted text data, such as obvious wrongly written characters and misspellings, and errors and omissions of units and doses for the symptoms, pathogenesis, treatment methods, medications, physical and chemical examinations, and other data sources. After the preprocessing, a logical and repeated check was undertaken to select relatively accurate and complete medical record data.
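The name standardization step described above can be illustrated with a minimal sketch. It assumes a hand-built synonym table; the table entries, the function names, and the misspelled example string are all hypothetical and are not taken from the MedCase platform or from the study data.

```python
import re

# Hypothetical synonym table: variant spelling (lowercase) -> standard name.
SYNONYMS = {
    "pinellia": "Pinellia ternata",
    "pinellia ternata": "Pinellia ternata",
    "liquorice root": "licorice root",
    "licorice": "licorice root",
    "licorice root": "licorice root",
}


def normalize_drug(raw):
    """Map a raw drug string to its standard name, or None if unknown."""
    # Collapse runs of whitespace, trim, and lowercase before the lookup.
    key = re.sub(r"\s+", " ", raw).strip().lower()
    return SYNONYMS.get(key)


def normalize_record(raw_drugs):
    """Return (set of standardized names, list of unresolved raw strings)."""
    known, unresolved = set(), []
    for raw in raw_drugs:
        name = normalize_drug(raw)
        if name is None:
            # Unknown strings are kept aside for manual source tracing.
            unresolved.append(raw)
        else:
            known.add(name)
    return known, unresolved


known, unresolved = normalize_record(
    ["  Pinellia  ", "Liquorice root", "licorice", "pinelia"]
)
```

Returning the unresolved strings separately, rather than guessing a match, mirrors the manual double-check that the text describes for misspellings.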
Data standardization
Data standardization was performed for the overall data of the preprocessed protocol study database according to the different types of research and analysis algorithms. The standardization of TCM terms was carried out in itemsets, of which the symptom set, tongue picture set, pulse manifestation set, TCM syndrome type set, and treatment set were standardized by referring to the Diagnostics of Traditional Chinese Medicine [9]. The drug set was standardized by referring to the Traditional Chinese Pharmacology [10], and the proprietary Chinese medicine set was standardized by referring to the National Essential Medicine List (2018 edition). Data standardization was performed according to the Data Standardization Standard for TCM Research of Clinical Medical Case Data Mining (QB/GL MCT 102-2019). All data were double-checked and double-marked.
Data analysis
The formatting and encoding of data analysis in this study were performed using the XMiner v1.0 TCM data mining platform [11], a subsystem of MedCase v5.2 TCM Clinical Research Auxiliary Platform, followed by the calculation of data weight according to the text features. After that, data dimension reduction, extreme value processing, standard value parameter adjustment, and mining operation analysis were performed. The visual expression of data was provided according to the Data Analysis Operation Standard For TCM Clinical Case Data Mining Research Data (QB/GL MCT 202-2019).
Data mining results
Study baseline distribution
A total of 289 diagnosis and treatment protocols, comprising 215 persons and 289 medical visits, met the inclusion criteria of this study. These comprised 121 males (56.28% of cases), who made 168 visits (58.13% of visits); and 94 females (43.72% of cases), who made 121 visits (41.87% of visits). A total of 147 types of clinical drugs were involved.
| Serial number | Clinical symptoms | Frequency | Frequency-amplitude | Rate | Serial number | Clinical symptoms | Frequency | Frequency-amplitude | Rate |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Cough | 137 | 0.6954 | 0.2518 | 16 | Vertigo | 5 | 0.0254 | 0.0092 |
| 2 | Expectoration | 61 | 0.3096 | 0.1121 | 17 | Hemoptysis | 5 | 0.0254 | 0.0092 |
| 3 | White sputum | 54 | 0.2741 | 0.0993 | 18 | Abdominal pain | 4 | 0.0203 | 0.0074 |
| 4 | Chest distress | 52 | 0.2640 | 0.0956 | 19 | Shortness of breath | 4 | 0.0203 | 0.0074 |
| 5 | Chest pain | 38 | 0.1929 | 0.0699 | 20 | Viscous sputum | 3 | 0.0152 | 0.0055 |
| 6 | Asthma | 38 | 0.1929 | 0.0699 | 21 | Less sputum | 3 | 0.0152 | 0.0055 |
| 7 | Blood-stained sputum | 21 | 0.1066 | 0.0386 | 22 | Cervical mass | 3 | 0.0152 | 0.0055 |
| 8 | Fever | 18 | 0.0914 | 0.0331 | 23 | Excessive phlegm | 2 | 0.0102 | 0.0037 |
| 9 | Yellow sputum | 15 | 0.0761 | 0.0276 | 24 | Night sweat | 2 | 0.0102 | 0.0037 |
| 10 | Dry cough | 10 | 0.0508 | 0.0184 | 25 | Joint pain or swelling | 2 | 0.0102 | 0.0037 |
| 11 | Asthenia | 9 | 0.0457 | 0.0165 | 26 | Ostealgia | 2 | 0.0102 | 0.0037 |
| 12 | Swallowing obstruction | 8 | 0.0406 | 0.0147 | 27 | Edema of both lower limbs | 2 | 0.0102 | 0.0037 |
| 13 | Hoarseness | 7 | 0.0355 | 0.0129 | 28 | Emaciation | 2 | 0.0102 | 0.0037 |
| 14 | Back pain | 6 | 0.0305 | 0.0110 | 29 | Poor appetite | 2 | 0.0102 | 0.0037 |
| 15 | Vaginal bleeding | 6 | 0.0305 | 0.0110 | 30 | Shoulder pain | 2 | 0.0102 | 0.0037 |
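
The text does not define the frequency-amplitude and rate columns of the symptom table. As a hedged reading, every row is consistent with dividing the symptom frequency by 197 and by 544, respectively. Both denominators are inferred from the tabulated values and are assumptions, not figures stated by the authors. The sketch below reproduces the two columns under that assumption.

```python
def derive_columns(count, amplitude_denominator=197, rate_denominator=544):
    """Return (frequency-amplitude, rate) for a symptom frequency.

    Both denominators are assumptions inferred from the table itself.
    """
    amplitude = round(count / amplitude_denominator, 4)
    rate = round(count / rate_denominator, 4)
    return amplitude, rate


cough = derive_columns(137)        # row 1 of the table
white_sputum = derive_columns(54)  # row 3 of the table
hemoptysis = derive_columns(5)     # row 17 of the table
```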
Distribution of TCM syndrome differentiation drugs for advanced NSCLC (frequency
Rule itemset of out-of-set association between clinical symptoms and drugs (support
After all eligible cases were included in the lung cancer database study, the symptoms with a frequency
Frequency spectrum distribution of drugs
After all eligible cases were included in the lung cancer database study, the drugs with a frequency
Out-of-set association between clinical symptoms and drugs
After all eligible cases were included in the lung cancer database study, there were 33 out-of-set association rules obtained when the support was 0.067 and the confidence was 0.3. There were 8 rules highly associated with the cough itemset: white Atractylodes rhizome, Pinellia ternata, barbed skullcap herb, dried tangerine peel, Chinese Angelica, Poria cocos, licorice root, and Radix Ophiopogonis. There were 7 rules highly associated with the expectoration itemset: white Atractylodes rhizome, Pinellia ternata, barbed skullcap herb, Chinese Angelica, licorice root, Scutellaria baicalensis, and Radix Ophiopogonis. There were 7 rules highly associated with the chest distress itemset: white Atractylodes rhizome, Pinellia ternata, barbed skullcap herb, Chinese Angelica, Poria cocos, licorice root, and Radix Ophiopogonis. There were 6 rules highly associated with the white sputum itemset: white Atractylodes rhizome, Pinellia ternata, barbed skullcap herb, Chinese Angelica, licorice root, and Radix Ophiopogonis (Table 3).
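To make the support and confidence thresholds concrete, the following is a minimal sketch of symptom-to-drug rule mining across two itemsets. It uses brute-force pair counting rather than the frequent pattern growth algorithm of the platform, it only considers single-symptom to single-drug rules, and the toy visits and the function name are hypothetical. Support is taken as the share of visits containing both the symptom and the drug, and confidence as that count divided by the number of visits with the symptom.

```python
from itertools import product


def mine_cross_set_rules(visits, min_support=0.067, min_confidence=0.3):
    """Return (symptom, drug, support, confidence) rules meeting the thresholds.

    visits: list of (symptoms, drugs) pairs, each a set of strings.
    """
    n = len(visits)
    symptom_count = {}
    pair_count = {}
    for symptoms, drugs in visits:
        for s in symptoms:
            symptom_count[s] = symptom_count.get(s, 0) + 1
        # Pair every symptom with every drug recorded at the same visit.
        for s, d in product(symptoms, drugs):
            pair_count[(s, d)] = pair_count.get((s, d), 0) + 1

    rules = []
    for (s, d), c in pair_count.items():
        support = c / n
        confidence = c / symptom_count[s]
        if support >= min_support and confidence >= min_confidence:
            rules.append((s, d, support, confidence))
    # Highest confidence first; ties broken by symptom, then drug name.
    return sorted(rules, key=lambda r: (-r[3], r[0], r[1]))


# Hypothetical toy data, not taken from the study database.
toy_visits = [
    ({"cough", "white sputum"}, {"Pinellia ternata", "licorice root"}),
    ({"cough"}, {"Pinellia ternata"}),
    ({"chest distress"}, {"licorice root"}),
    ({"cough", "chest distress"}, {"licorice root", "Radix Ophiopogonis"}),
]
rules = mine_cross_set_rules(toy_visits, min_support=0.5, min_confidence=0.6)
```

The default thresholds match the values reported above (0.067 and 0.3); the toy call uses larger values only because the example has four visits.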
Discussion
Clinical fitting analysis
In the era of big data, artificial intelligence technology has been continuously applied to the medical field, improving the efficiency and accuracy of disease diagnosis and treatment [12, 13, 14]. This study used the MedCase v5.2 TCM clinical research auxiliary platform to establish a lung cancer clinical research database. According to the symptom distribution of the lung adenocarcinoma database mining study (Table 2), after all eligible cases were included, the symptoms with a frequency
According to the drug distribution of the lung adenocarcinoma database mining study (Table 2), after all eligible cases were included, the drugs with a frequency
There were 33 out-of-set association rules between clinical symptoms and drugs (see Table 3), and the main symptoms for which the rules were generated included cough (8 rules), expectoration (7 rules), chest distress (7 rules), and white sputum (6 rules). The high-association rules were as follows: cough (licorice root, Pinellia ternata, and white Atractylodes rhizome); and expectoration, chest distress, and white sputum (licorice root and Pinellia ternata). The association rules indicated that licorice root, Pinellia ternata, and white Atractylodes rhizome showed definite efficacy on the common symptoms of lung cancer, especially licorice root and Pinellia ternata.
Licorice root, sweet in taste and neutral in nature, acts on the heart, lungs, spleen, and stomach meridians, and can invigorate the spleen and replenish qi, expel phlegm to arrest coughing, relieve spasms and pain, and coordinate the drug actions of a prescription. Acting gently on the lung meridian, it tonifies qi and moistens the lungs, and is applicable for all new or chronic coughs, regardless of indications of cold, heat, deficiency, or excess.
Pinellia ternata, acrid in taste and warm in nature, acts on the spleen, stomach, and lung meridians, and can eliminate dampness to reduce phlegm, calm the adverse-rising energy to arrest coughing, and dissolve lumps and resolve masses. It is effective for phlegm-damp and cold-phlegm syndromes, such as the accumulation of phlegm-dampness in the lungs, cough with thin or thick white sputum, cough due to cold fluid retention in the lungs, and copious clear thin phlegm. According to Collected Notes to Canon of Materia Medica, Pinellia ternata can eliminate the binding depression of phlegm-heat and qi, and coughs with dyspnea [15]. The Interpretation of Shen Nong’s Herbal Classic states that Pinellia ternata, like licorice root, can treat wind-phlegm with shortness of breath, indicating that Pinellia ternata and licorice root together have a certain efficacy on coughs and asthma. The TCM master Zhou Zhongying pointed out that the treatment of lung cancer with Pinellia ternata can eliminate airway secretion of sputum and dissipate the visible mass of phlegm-static coagulation of cancer toxins.
White Atractylodes rhizome, sweet and bitter in taste and warm in nature, acts on the spleen and stomach meridians [16]. It can invigorate the spleen and supplement qi, eliminate dampness, and alleviate water retention.
Discussion on the law of syndrome differentiation
There is no mention of lung cancer in ancient Chinese medicine books [17], but “lung retention,” “inveterate weakness,” “lung amassment,” “cough,” “hemoptysis,” and “chest pain” recorded in the ancient Chinese medicine books have similar clinical manifestations to those of lung cancer. Classic on Medical Problems in the Qin and Han Dynasties (221 BCE to 220 CE) records that “Lung retention, also called lung amassment, is located below the right thorax, as big as a cup, for a long time, causing fever with cold aversion, shortness of breath, and cough, and lung congestion.” Yuan-Fang Chao of the Sui Dynasty (581–618 CE) stated in the General Treatise on the Cause and Symptoms of Diseases that “Amassment is caused by the fight between wind evils and qi of zang-fu organs due to the disharmony of yin and yang, and organs’ weakness,” of which the pathogenesis was pathogen detention due to organs’ weakness. The Recipes for Saving Lives indicates that “Everyone has sadness, worry, anger, and joy, but excessive emotion will impair the internal organs, and cause the pulse to go against the four seasons, resulting in the stagnation of qi activity in five organs,” emphasizing the influence of the four seasons on the occurrence of tumors. During the Jin and Yuan Dynasties (1115–1368 CE), He-Jian Liu used Baizhu pills to treat “pneumonectasis, adverse circulation of vital energy in the wrong direction below the costal region, chest distress and shortness of breath, and chest pain on breathing.” Yuan-Su Zhang (ca. 1131–1234 CE) of the School of Yishui said in the Medical Enlightenment: “A healthy man has no amassment, but a weak man has it. Spleen and stomach weakness, deficiency of qi and blood, and susceptible at four seasons, all can cause accumulation and stagnation,” which is similar to the view of Dong-Yuan Li (1180–1251 CE) of the School of Invigorating the Earth, attaching importance to the spleen and stomach. Zi-He Zhang’s (ca. 1156–1228 CE) Confucians’ Duties to Their Parents records that “Accumulation and stagnation is either caused by anger, joy, grief, worry, fear, or impairment by overeating sour, bitter, sweet, pungent, and salty food, or stagnation of warm, cool, hot, cold fluid, or the six evils of wind, cold, heat, dampness, fire, dryness,” which comprehensively expounds the etiology of this disease and advocates attacking pathogens in its treatment. Danxi’s Experiential Therapy: Accumulation of Lump in the Abdomen states that “A lump in the middle abdomen is due to phlegm and retained fluid, a lump in the right abdomen is due to food retention, and a lump in the left abdomen is due to stagnation of blood; qi cannot be lumped into aggregates, lumps are tangible things, formed due to phlegm, food retention, and stagnation of blood,” so it is necessary to dispel phlegm and eliminate stagnation, promote blood circulation and remove stasis. Jing Yue’s Complete Work
Professor Ying-Jie Jia [18] had an understanding of lung cancer from the perspective of carbuncles and believed that the surface of tumors was uneven with corrosive ulcers. As pointed out in the Renzhai Zhizhi Fanglun, “Cancers, high and deep, are in the shape of rock caves, drooping in clusters, with deep toxin roots.” The doctrine of epidemic febrile disease was usually used for treatment, and the evil expelling method was established based on triple energizer and defense-qi-nutrient-blood syndrome differentiation. If the lung cancer is in the upper energizer, the lung qi of the upper energizer will be blocked with unresolved phlegm, and it is necessary to open the inhibited lung-energy to eliminate the pathogenic factors with sputum excretion. If the evil qi is in the middle energizer, resolving damp with aromatics, drying dampness with bitter-warm, and inducing diuresis with a bland drug are recommended. If evil qi dwells in the lower energizer, purifying and diminishing lung qi can eliminate pathogenic factors with stools.
Jia-Xiang Liu, a master of TCM [4, 5], strongly supported the view of the ancients that “The deficiency of vital qi will lead to the formation of tumors” and believed that lung cancer was caused by the opportunistic invasion by pathogenic factors due to the lack of vital qi, the imbalance of yin and yang, and the dysfunction of zang-fu organs. In addition, he also recommended the treatment methods of nourishing yin to clear away the lung-heat, tonifying qi and yin, replenishing qi to invigorate the spleen, warming the kidney, and nourishing yin after syndrome differentiation based on the theory of “self-elimination of stagnation by nourishing the vital qi” and the different clinical manifestations in patients.
Professor Pei-Wen Li [19] has in-depth insights into the treatment of lung cancer based on the theory of lung and kidney mutual growth. He believes that the core pathogenesis of lung cancer is the deficiency of both lung and kidney, and the accumulation of pathogenic toxins. In terms of treatment, he attaches importance to coordinating the growth of the lungs and kidneys, regulating and tonifying yin and yang, and commonly uses Baihe Gujin Tang, Qingzao Jiufei Tang, Liuwei Dihuang Tang, Suzi Jiangqi Tang, etc. In terms of lung and kidney regulation and tonifying, it is often combined with soil cultivation and liver nourishing.
Zhong-Ying Zhou, a TCM master [20], established a lung cancer differentiation and treatment system with cancerous toxin theory at its core throughout the whole process. He took cancerous toxin as the key to the onset, development, and treatment of lung cancer. Cancerous toxin and phlegm, blood stasis, heat, and other pathological factors produce and bind with each other, consuming qi and yin, which forms the basic pathogenesis of lung cancer: the binding of heat, cancerous toxin, phlegm, and blood stasis, together with deficiency of qi and yin [21]. For syndrome differentiation and treatment, ten tumor treatment methods were suggested: regulating qi and eliminating stagnation, dispelling phlegm and stasis, eliminating wind-evil and toxins, relieving internal heat and counteracting toxins, counteracting toxins and eliminating mass, dissipating dampness for diuresis, softening hardness and moistening dryness, tonifying yang for eliminating the abundance of yin, tonifying qi and yin (blood), and strengthening the spleen and stomach.
Hai-Bo Chen [22] showed in a mouse model that Xiaoai Jiedu decoction, an empirically effective prescription of Zhong-Ying Zhou, could prevent the occurrence and development of colorectal tumors by regulating the expression of Mfsd2a and Ccdc85c and reducing the infiltration of B cells in the colorectal tumor microenvironment.
Gui-Zhi Sun [23] believed that long-term lung dryness and persistent stagnation would inevitably lead to canceration. Lung cancer can be divided into three types for treatment. The modified Qingzao Jiufei decoction is for patients with impaired lung depuration and damage of body fluid by dryness-heat. The modified Qianjin Weijing decoction is for patients with phlegm-heat storage and burns of collaterals in the lungs. The modified Baihe Dihuang decoction is for patients with dryness-heat impairing the lungs and yin deficiency of the lungs and kidney.
Through the out-of-set association mining of the core rules of association between lung cancer symptoms and drugs, the common clinical symptoms of lung cancer included cough, expectoration, white sputum, chest distress, chest pain, and asthma; and the medications mainly involved licorice root, Pinellia ternata, white Atractylodes rhizome, barbed skullcap herb, and Radix Ophiopogonis. The retrospective analysis of prescriptions indicated that the addition and reduction of TCM adjuvant therapy in patients with advanced lung adenocarcinoma mainly involved the Maimendong decoctions and Erchen decoctions.
Pinellia ternata, a compatible drug (acrid in taste and warm in nature) in the Maimendong decoction, is the corrigent that can prevent the dampness stagnation of yin tonics, nourish but not cause dampness, and warm but not cause dryness. It can check the upward adverse flow of qi and relieve the symptoms of cough and asthma in patients with lung cancer. Furthermore, it can dry dampness to resolve phlegm, as stated in the New Compilation of Materia Medica: “it is the main drug for treating damp phlegm.”
Pinellia ternata and licorice root are two herbs used in the Erchen decoction. The Supplements to Danxi’s Experiential Therapy states, “In this prescription, Pinellia ternata can eliminate phlegm and dry dampness, tomentose pummelo peel can eliminate phlegm and regulate qi, Poria cocos can depress qi and excrete dampness, and licorice root can tonify the spleen and harmonize the spleen and the stomach. Tonifying the spleen to prevent dampness, drying the dampness to prevent phlegm, and regulating vital energy and depressing qi to eliminate phlegm, so it could be said to give both considerations to essence-function, and address both symptoms and root causes.”
The general principle of Dan-Xi Zhu’s treatment of phlegm syndrome is “The treatment of phlegm: tonify spleen soil and dry spleen dampness is the basis of treatment” … The key to Dan-Xi Zhu’s application of this method was that white Atractylodes rhizome, sweet and bitter in taste, warm in nature, and drying in action, could invigorate qi, strengthen the spleen, and dry dampness, and the spleen prefers dryness to dampness. Most patients with lung cancer suffered from cough, expectoration, and white sputum, mainly due to the insufficiency of splenogastric qi. White Atractylodes rhizome therefore showed a high association with all of the main symptoms.
Conclusions
The analytical study of the medical cases of TCM treatment for lung cancer was performed using data mining techniques, and the rules of out-of-set association between clinical symptoms and drugs were analyzed, thus providing scientific guidance for the clinical syndrome differentiation treatment of lung cancer. The highly associated clinical symptoms of lung cancer included cough, expectoration, chest distress, and white sputum, and the highly associated drugs included Pinellia ternata, licorice root, white Atractylodes rhizome, Radix Ophiopogonis, and barbed skullcap herb. Through data mining, this study analyzed the understanding of lung cancer in TCM, extracted the essence of TCM drug use, and discussed the clinical thinking of ancient and modern doctors, thus providing scientific guidance for the clinical treatment of lung cancer. For traditional Chinese medicine to move towards intelligent practice, it needs the support of data and evidence. This study aims to change the diagnosis and treatment mode of TCM and to achieve the quantification of TCM. It is further expected to promote clinical translation in multiple aspects and to extend into other research fields. We will also further study the dose-effect relationship of drugs and promote the application of quantitative structure-activity relationship (QSAR) research in drug design. Sequence pattern mining can be applied to the mining of TCM clinical experience to predict the development pattern of diseases from the multiple diagnosis and treatment records of patients, and thus to support the targeted prevention of certain diseases [24].
This study also has certain limitations. First, the sample size is relatively small, and the study does not consider patients with different syndrome types in different regions. Second, the study evaluates efficacy through symptom improvement but lacks relevant auxiliary examination results, making a comprehensive and systematic evaluation difficult. Third, the study analyzed the commonly used TCM drugs, drug compatibility, and associations between drugs and symptoms without considering drug dosages or drug associations at different disease stages. In future research, the collection period of medical records can be extended and the sample size enlarged. In addition, the medication characteristics of Professor Hong Mei’s treatment of non-small cell lung cancer can be summarized more comprehensively through more mature data mining techniques or TCM inheritance systems.
Funding
The study was supported by the Mechanism of HIF-a/PDK-mediated Autophagy Regulation in Radiation-induced Cognitive Impairment (YKK18163), the Six Talent Summit Projects in Jiangsu Province (RJFW-40), the “333 High-Level Talent Training Project” in Jiangsu Province (2018III-0121), the Technology Innovation Fund of Jiangsu Science and Technology Enterprises (BC2015022), and the Horizontal Project of Nanjing University of Chinese Medicine (2019035).
Competing interests
The authors have no conflicts of interest to declare.
