Abstract
Sentiment analysis is widely used to retrieve the hidden sentiments in medical discussions over Online Social Networking platforms such as Twitter, Facebook, and Instagram. People often convey their feelings about their medical problems over social media platforms. Practitioners and health care workers have started to observe these discussions to assess the impact of health-related issues among the people. This helps in providing better care to improve the quality of life. Dementia is a serious disease in western countries like the United States of America and the United Kingdom, and the respective governments are providing facilities to the affected people. There is much chatter over social media platforms concerning the patients’ care, healthy measures to be followed to avoid the disease, and checks for early indications. This chatter has to be carefully monitored to help officials take the necessary precautions for the betterment of the affected. A novel feature engineering architecture that involves feature-split for sentiment analysis of medical chatter over online social networks is proposed, together with a pipeline that can be used with any machine learning model. The proposed model uses a fuzzy membership function to refine the outputs. The sentiment score obtained from the machine learning model is subjected to fuzzification and defuzzification using the trapezoid membership function and the center of sums method, respectively. Three datasets are considered for comparing the proposed and the regular model. The proposed approach delivered better results than the regular approach and proved to be an effective approach for sentiment analysis of medical discussions over online social networks.
Keywords
Introduction
Dementia is not a specific disease. It is a group of symptoms or medical conditions that cause the cognitive functioning of the brain to deteriorate. Memory, behavior, day-to-day (social) activities, and thinking get worse over time. According to the World Health Organization (WHO), fifty million people across the globe have Dementia. Alzheimer’s disease is a significant form; it accounts for 50–60 percent of Dementia cases and affects older people. Dementia creates havoc not only for the affected person but also for the caretakers, the families of the affected, and society, with physiological, psychological, social, and economic impact at large. Various diseases and injuries that affect the brain, such as stroke or Alzheimer’s, are sources of Dementia. Symptoms and signs of Dementia are categorized into three stages: early, middle, and late. Over these stages, the Quality of Life (QoL) declines to a point where the affected person is no longer able to walk unaided and becomes entirely dependent on a caretaker.
There are many forms of Dementia, such as MCI (Mild Cognitive Impairment), Alzheimer’s, vascular, Lewy Body, Parkinson’s, Mixed, and Frontotemporal dementia. Human trials are happening worldwide to find a medication for this global problem, which is rising year by year. Although there is no cure for Alzheimer’s or any other form of Dementia, there are a few preventive measures suggested by the Food and Drug Administration (FDA). Eating nutritious food containing vitamin E and omega-3s, sleeping well, managing stress by doing yoga or having a-chai, cognitive activities, being social, maintaining vascular health, and regular checks for healthy hearing are a few among them. Western countries like the USA and England follow their own dementia policies for the public. The United States formed a particular category of clinics named “Memory Assessment Clinic” [1] to help society. In contrast, the United Kingdom follows “England’s National Dementia Strategy” [2] to help people with dementia overcome the problems it causes. Dementia is one of the neurodegenerative disorders that require more attention.
A group of social media platforms where people can exchange their views or opinions regarding various activities is called Online Social Networks (OSN). At this juncture, OSN can be used as a social sensor to monitor emerging topics and trends through tasks such as topic detection, verification of the trend (spam detection), identifying the potential influencers in society, and identifying people’s emotions towards various trends (sentiment analysis). OSN as a social sensor is reusable, modular, and manageable. OSN reduces the time required for data analysis compared with the traditional process. OSN allows us to analyze in real time; discover clusters with automatic searches; and handle text, image, video, and audio. The content can be organized by influence, time, location, and sentiment. Table 1 shows the various social networking platforms that people use to share their feelings with the community. OSN has become the only means for daily communications. Political announcements, entertainment news, sports bulletins, and many other significant announcements and decisions are circulated in OSN for a faster reach. Twitter stands in fourth position among the most commonly used social media platforms, according to Statista, one of the well-known portals for statistics [3].
Popular Social Media Platforms
Along with the regular users of OSN, health care professionals started to consider OSN as a platform for providing care in 2015. Ibbara [4] explained the opportunities and challenges involved in using OSN for patient-doctor procedures and introduced a concept named “Health 2.0” [5].
Bellander [6] discussed the issues involved in gaining knowledge over online social networks from medical consultants and patients. Discussion over OSN might help doctors or healthcare professionals understand the patients’ views and the reasons for some misinterpretations among the general public. Understanding people’s perspectives before attending to them is essential for health care professionals. Sentiment Analysis (SA) is used to summarize the opinions expressed in statements over OSN. SA is a natural language processing technique for identifying the hidden intent in text. Kumar [7] defined sentiment analysis as a quintuple $(e_i, a_{ij}, S_{ijkl}, h_l, t_l)$, where $e_i$ is the $i$-th entity, $a_{ij}$ is an aspect of that entity, $S_{ijkl}$ is the sentiment expressed on that aspect, $h_l$ is the holder who gives the opinion, and $t_l$ is the time at which it is given; together these help to identify the hidden sentiment in the text. Sentiment analysis helps medical authorities to track and help the patient community based on the content over OSN [8, 9]. Twitter is the most commonly used platform to share thoughts with friends, family, and the outside world. Twitter is being used as a social media platform to share the medical teams’ information and to get the general public’s views [10–17].
Fuzzy relations are well suited to avoiding ambiguity during classification, and [5] explains the importance of fuzzy logic in decision-making in any architecture. A large number of practical, real-world linguistic problems are solved using fuzzy logic. A machine learning architecture combined with fuzzy logic can handle high, low, and medium fuzzy values in a better way [18, 19].
The novelty of this article lies in three steps: (1) creating a proper feature dictionary out of raw data from Twitter; (2) collecting the features of the feature library and splitting them; and (3) using a new feature union to make the input ready for any machine learning approach, leading to better accuracy.
The contribution of the article is as follows. Pre-processing needs to be done according to the tweets whose sentiment is to be predicted. To do that, we designed a domain-specific pre-processing methodology for patient-authored text (text written by a patient or an acquaintance). A proper dictionary with related features was prepared to make the feature engineering process effective. Features were extracted based on the hashtags and URLs present in the textual content and processed with a Count Vectorizer, whereas regular text is processed with term frequency and inverse document frequency (TF-IDF). The different features obtained from the different feature extraction techniques are combined (feature union) by using the pipeline concept to assemble several transformers; the result can be supplied to any machine learning approach, yielding better performance than the standard machine learning algorithms. This article’s motivation is to propose a specific pre-processing framework for sentiment analysis of medical text over online social networks.
The rest of this work comprises the following sections. Section 2 provides the related work. Section 3 explains the complete architecture, which includes the two phases of feature engineering in the proposed approach: pre-processing [20], followed by feature split and feature union through a pipeline, which prepare the data for any kind of machine learning approach to obtain better results. Section 4 describes the datasets involved, the results obtained, and a discussion of them, and Section 5 concludes the work.
This section discusses the papers on sentiment analysis of medical discussion on online social networks:
Alamoodi [21–23] discussed how sentiment analysis is used effectively to handle COVID-19 situations across the country. The authors discussed the step-by-step process of using sentiment analysis to better deal with infectious diseases.
Patrick [15] explained how Twitter could be leveraged to discover the multidimensional and qualitative sides of pain. Two important points are considered: understanding the sentiment and context of tweets related to pain, and differentiating the connectivity of social networks based on Twitter. The percentage of tweets with positive sentiment was found to range from 14% in Manila to 54% in Los Angeles. Over the day, the percentage of tweets with positive sentiment ranged from 24% at 1300 to 38% at 2100, with a median of 32%. The work shows how seasonal changes and geopolitical events influence tweeters.
Daniulaite [24] developed different machine learning techniques for the eDrugTrends platform. Tweets are classified into different types based on the source of communication (retail, personal, and official) and based on the sentiment (neutral, positive, and negative). The tweets considered relate to synthetic cannabinoids and cannabis. At present, Twitter is a popular site on the internet, and the posts on Twitter reach up to one billion in just three days. This large-scale usage of Twitter provides information about temporal trends and is also used in identifying geographical trends. Manual coding was used to classify tweets related to alcohol, cannabis, and other drugs by content and sentiment. The machine learning models used include Support Vector Machines (SVM), Logistic Regression (LR), and Naïve Bayes (NB). VADER (Valence Aware Dictionary and sEntiment Reasoner) is used to compare the performance of the sentiment classifiers.
Bahadorreza [25] shows that monitoring Twitter microblogs can capture probable outbreaks by detecting emotional shifts in users’ tweets. Ebola was analyzed in London by taking the emotions from all the tweets. The emotion labels are based on Ekman’s six primary emotions, in addition to three non-emotion labels: sarcasm, news-related, and criticism.
Anorexia [8] used public sentiment in social media to understand cancer screening tests. People remain unscreened because of failure to meet the dietary requirements of specific tests such as colonoscopy. The other reasons for remaining unscreened include discomfort, embarrassment, and cost. Epidemiological studies analyzed the importance of social media concerning health attributes. Social media offers a deep awareness of health issues and surveys. The work also analyzes the need to improve screening tests and increase adherence to them. Twitter has become one of the excellent tools for creating awareness effectively. Tweets were classified as positive and negative. 20% to 30% were used for testing purposes; 75% of tweets were labeled as training and 25% as testing. The Twitter users pursuing screening tests were less than 45 years of age, in equal proportions of males and females. Still, more male users commented on colonoscopy and more female users on mammography.
Oksanen [26] identified from OSN that “eating disorder” is the common name for anorexia. Pro-anorexia (proana) is a community spread across Online Social Network (OSN) forums like Facebook, Pinterest, Flickr, Instagram, and Twitter. Proana encourages people to adopt unhealthy and harmful weight loss methods, while anti-pro-anorexia is another community that opposes proana posts and information; both communities exchange a war of words about their methods over OSN. The authors analyzed 12161 comments posted by 800+ registered users on 395 videos posted in the 50 most-viewed YouTube channels. They mainly focused on three aspects: (a) the characteristics of both proana and anti-proana video uploaders and their videos; (b) the comments associated with the video background, such as length and upload time; and (c) the strength of the emotional feedback, positive or negative, in the comments posted on the videos. The search patterns used in the work, such as “thinspiration”, “anti-thinspiration”, “thinspo”, “proana”, “pro-anorexia”, and “anti-thinspo”, cover most of the comments considered. They provided insights into the location, gender, age group, and emotional behavior behind the video comments in the sample. Each comment is categorized by three separate raters, and inter-rater agreement is checked using Cohen’s kappa. To calculate the positive and negative intent present in the videos’ comments, the SentiStrength tool is used, and it is noted that SentiStrength and the raters agreed over 95%. It is noted that the age of these video uploaders ranges from 9 to 102 with a mean of around 27, which indicates that most of the video uploaders are young. It is also observed that they do not comment on videos uploaded by others. This work intends to provide information for people who want to educate youngsters regarding the sentiments in social media entities over OSN [27, 28].
A few other related works and the focus (disease selected) of their authors are listed in Table 2.
Few related Articles: Their approach and topic of discussion considered
In this section, we present the definitions related to the proposed approach and the problem definition. Each RSS feed or tweet expresses the writer’s feelings or sentiment on a certain topic. People have started to post their current medical situations to give a heads-up to others and to seek others’ experience on a particular medical issue. Time and again, information spreads over Online Social Networks through the sharing of personal feelings on an open platform, which influences how others tackle their own situations. This phenomenon changes the perception of personal care or assistance among people who need attention to daily needs. These online texts have to be analyzed to identify users’ sentiments, which helps health care professionals such as doctors and medical officers. In this article, we consider tweets from Twitter as the source for conducting experiments. Let $T = \{t_1, t_2, t_3, \ldots, t_n\}$ be the set of tweets with medical terms. These tweets are further classified into tweets with hashtags, $T_\# = \{t^\#_1, t^\#_2, \ldots, t^\#_m\}$, and tweets with URLs, $T_U = \{t^u_1, t^u_2, t^u_3, \ldots, t^u_p\}$, where $T_\#$ and $T_U$ are subsets of $T$. We infer the sentiments $S = \{s_1, s_2, \ldots, s_n\}$, where each $s_i \in \{neg, neu, pos\}$ is the sentiment of tweet $t_i$. We believe that users incorporate hashtags and URLs in their tweets to express their sentiment.
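The split of the tweet set into its hashtag and URL subsets can be illustrated with a minimal sketch. The regular expressions, the function name `split_tweets`, and the sample tweets are assumptions made for illustration; they are not taken from the article.

```python
import re

# Assumed patterns: a hashtag is '#' followed by word characters,
# a URL is anything starting with http:// or https://.
HASHTAG_RE = re.compile(r"#\w+")
URL_RE = re.compile(r"https?://\S+")


def split_tweets(tweets):
    """Return (tweets containing a hashtag, tweets containing a URL).

    Both lists are subsets of the input; a tweet may appear in both.
    """
    with_hashtag = [t for t in tweets if HASHTAG_RE.search(t)]
    with_url = [t for t in tweets if URL_RE.search(t)]
    return with_hashtag, with_url


# Hypothetical sample tweets.
TWEETS = [
    "Memory clinic visit went well #dementia https://example.org/clinic",
    "Caring for dad is exhausting #caregiver",
    "Could not sleep at all last night",
]

T_HASH, T_URL = split_tweets(TWEETS)
```

A tweet without hashtags or URLs stays only in the full set, so the regular text features can still be extracted from it.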
Preprocessing
The text from Online Social Network platforms contains much noise, as users write colloquially to express their feelings, with internet slang, spelling mistakes, acronyms, and other noise. A special pre-processing strategy is required to remove this noise because the tweets contain medical terms. Reducing the unnecessary features makes the proposed feature engineering strategy better; this is explained in section 4.1.
Feature engineering
Feature engineering consists of two phases: (1) feature extraction and (2) feature union. Feature extraction is based on the hashtags and URLs present in $T$: features are extracted from $T$, $T_\#$, and $T_U$, which results in three different feature sets $f_T$, $f_\#$, and $f_U$ that are channelized into a pipeline to assemble $F_E$. They are subjected to cross-validation and can be supplied to any machine learning approach. In our experience, we observed a significant change in accuracy with this approach.
FFE-MS
The fuzzy-based feature engineering approach uses $T$ to extract the features $f_T$, $f_\#$, and $f_U$, which are combined into $F$; $F$ is given to any machine learning approach $M$, and the output is fuzzified to obtain a better prediction of $S$.
Fuzzy based feature engineering method for sentiment analysis of medical discussion over OSN
This section explains the proposed architecture for feature engineering to yield better results for sentiment analysis. The proposed approach collects tweets from the Twitter feed based on a keyword. The feature engineering is done in two phases: (1) removing noise from the tweets collected from the API; and (2) extracting, transforming, and combining the features using feature union, then feeding these features to any machine learning model for sentiment analysis of medical discussion over online social networks. The pre-processing model is explained in section 4.1.1. Figure 2 demonstrates the model, which includes stemming, lemmatization, normalization, CSpell, and other techniques involved in removing noise from the tweets. In our pre-processing phase, the inputs are not changed to lower case because tweets use capital letters to stress the words expressing sentiments. After removing noise from the data, we extract three features as explained in the second phase of the feature engineering.

Proposed Architecture for SA with medical text over OSN.

Phase I of Data Engineering.
This section discusses the importance of feature engineering for the machine learning model and explains the process of feature engineering in our approach. The overall architecture is shown in Fig. 1. According to Forbes [34], every data scientist spends 80% of the time on data engineering (data readiness) before building the model. The well-known techniques for feature engineering include binning, grouping operations, feature split, extracting date, scaling, log transform, outlier handling, one-hot encoding, and imputation. The main purpose of feature engineering, which precedes machine learning, is to make the dataset (input) ready so that the model performs better. In this approach, we retrieve tweets from the Twitter feed and pre-process them. The noise-free tweets are passed to the subsequent phases of the feature engineering, which use the feature-split technique.
Phase I of feature engineering
In this phase, we remove the noise or unwanted data in the tweets downloaded from the Twitter API, as shown in Fig. 2. All applications of Natural Language Processing (NLP) require pre-processing of the data. In our pre-processing approach, we do not convert the entire tweet to lower case, whereas most methods do. We believe that users write in capital letters when they mean to stress something, which helps in identifying the tweets’ sentiment. Our pre-processing activity includes two steps: basic cleaned tweets and noise-removed tweets.
We check for the existence of punctuation marks, symbols, emoticons, and extra lines in the tweets. If they exist, we remove them from the original tweet to reduce the noise. In this process, we do not change the tweet to lower case, as mentioned earlier. To achieve noise-removed tweets, we check for internet slang and stopwords and then remove them. Internet slang does not represent the tweets’ sentiment, so we remove it from the tweet. People use many acronyms in the feed, including medical abbreviations, acronyms, and synonyms. These are not present in the regular dictionary that comes with NLTK, a library widely used among researchers working with Python. Chang [35] created a medical dictionary that comprises abbreviations and acronyms related to medicine and is updated every year by MEDLINE. Stemming helps in decreasing the number of distinct word forms to reduce redundancy. Stopwords do not carry any kind of sentiment. For example, words like ‘is’, ‘in’, ‘of’, ‘but’, and ‘again’ in the input reduce accuracy as the number of vectors increases. Hence, we remove stopwords using the NLTK library. After internet slang removal, acronym expansion, stemming, and stopword removal, the basic cleaned tweets are normalized. Normalization brings the text into standard formats. Typing mistakes, typing shortcuts, or spelling mistakes can change the complete meaning of the sentiment. If the word “stomach” is written as “stmach”, there will be no meaning attached to the word. To overcome this hurdle, we used CSpell [36]. “CSpell handles non-word errors, real-words errors, word boundary infraction, punctuation errors, informal expression, and combinations of the above and result in high F1 score and real-time performance”.
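The cleaning steps described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the stopword, slang, and acronym tables are tiny hypothetical stand-ins for the NLTK stopword list and the medical dictionary of [35], hashtags and URLs are assumed to be kept intact for the later feature split, and stemming and CSpell-based spelling correction are omitted.

```python
import re

# Hypothetical stand-ins; a real run would use the NLTK stopword list,
# a slang lexicon, and the medical abbreviation dictionary.
STOPWORDS = {"is", "in", "of", "but", "again", "the", "a", "to", "my"}
SLANG = {"lol", "omg", "smh"}
ACRONYMS = {"AD": "Alzheimer's disease", "MCI": "mild cognitive impairment"}


def clean_tweet(tweet):
    """Remove symbols, emoticons, slang, and stopwords without lower-casing."""
    kept = []
    for token in tweet.split():
        # Assumption: hashtags and URLs are preserved for feature extraction.
        if token.startswith("#") or token.startswith("http"):
            kept.append(token)
            continue
        # Strip punctuation, symbols, and emoticon characters.
        word = re.sub(r"[^A-Za-z0-9']", "", token).strip("'")
        if not word:
            continue
        # Membership is tested case-insensitively, but the original casing
        # of surviving words is preserved because capitals signal stress.
        if word.lower() in SLANG or word.lower() in STOPWORDS:
            continue
        # Expand known medical acronyms (exact, case-sensitive match).
        kept.append(ACRONYMS.get(word, word))
    return " ".join(kept)
```

For instance, a capitalised word such as "SO" survives the cleaning unchanged, which keeps the emphasis available to the classifier.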
Phase II of feature-engineering
The input to this phase is the set of noise-removed tweets. This phase uses the pipeline concept to perform feature transformation and feature union. The pipeline allows easy data handling and reduces the time taken for the positive or negative classification. Real-time data requires faster processing than the static approach. Feature transformation and feature union are performed as shown in Fig. 3. Feature transformation is followed by vectorization. The vectors obtained are merged back in the feature union step. The resulting vectors are fed to the machine learning model. Our proposed model attained better results than the popular machine learning models, as shown in the results section.
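A feature split followed by a feature union can be sketched with scikit-learn as below. This is only an illustration under assumptions: the article does not name the library it used, the regular expressions and sample tweets are hypothetical, and Logistic Regression stands in for "any machine learning model". Hashtags and URLs are counted with `CountVectorizer`, the regular text is weighted with `TfidfVectorizer` (without lower-casing), and `FeatureUnion` concatenates the three vector blocks inside a `Pipeline`.

```python
import re

from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import FeatureUnion, Pipeline

HASHTAG_RE = re.compile(r"#\w+")       # assumed hashtag pattern
URL_RE = re.compile(r"https?://\S+")   # assumed URL pattern


def hashtag_tokens(tweet):
    """Feature split: keep only the hashtags of a tweet."""
    return HASHTAG_RE.findall(tweet)


def url_tokens(tweet):
    """Feature split: keep only the URLs of a tweet."""
    return URL_RE.findall(tweet)


def build_pipeline():
    # Each branch vectorizes one feature set; FeatureUnion stacks them
    # side by side into a single matrix for the classifier.
    features = FeatureUnion([
        ("hashtags", CountVectorizer(analyzer=hashtag_tokens)),
        ("urls", CountVectorizer(analyzer=url_tokens)),
        ("text", TfidfVectorizer(lowercase=False, ngram_range=(1, 1))),
    ])
    return Pipeline([
        ("features", features),
        ("clf", LogisticRegression(max_iter=1000)),
    ])


# Hypothetical toy data; the training set must contain at least one
# hashtag and one URL, otherwise a branch would have an empty vocabulary.
TWEETS = [
    "Great support at the memory clinic today #dementia #care https://example.org/clinic",
    "Feeling HOPELESS about the diagnosis #dementia",
    "New caregiver guide is really helpful #alzheimers https://example.org/guide",
    "Another TERRIBLE night, no sleep again #caregiver",
]
LABELS = ["pos", "neg", "pos", "neg"]

pipeline = build_pipeline()
pipeline.fit(TWEETS, LABELS)
```

Changing `ngram_range` to `(2, 2)` or `(3, 3)` and setting `max_features` on the text branch would correspond to the bi-gram, tri-gram, and feature-count settings compared in the experiments.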

Phase II of Data Engineering Process.
All the features obtained in the previous section are passed to a machine learning algorithm; features are the input of a machine learning algorithm. After the sentiment score is obtained, it is fuzzified with the trapezoid membership function to form the fuzzy rules used to classify each tweet $T_i$ into a sentiment value $S_i$. Each step of the proposed methodology is shown in Fig. 3.
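The fuzzification and defuzzification step can be sketched as follows. The article names the trapezoid membership function and the center of sums method but does not give the breakpoints or the rule base, so everything specific here is an assumption: the score is taken to lie in [0, 1], the three trapezoids in `SETS` are hypothetical, and each fuzzy set fires an output set of the same shape.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership with feet a, d and shoulders b, c (a <= b <= c <= d)."""
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)   # rising edge
    return (d - x) / (d - c)       # falling edge


# Hypothetical breakpoints on a [0, 1] sentiment score.
SETS = {
    "neg": (0.0, 0.0, 0.2, 0.4),
    "neu": (0.3, 0.45, 0.55, 0.7),
    "pos": (0.6, 0.8, 1.0, 1.0),
}


def fuzzify(score):
    """Degree of membership of the score in each sentiment set."""
    return {label: trapezoid(score, *params) for label, params in SETS.items()}


def center_of_sums(strengths, steps=1000):
    """Defuzzify by center of sums: each output set is clipped at its firing
    strength and the clipped sets are summed (not max-combined)."""
    numerator = denominator = 0.0
    for k in range(steps + 1):
        x = k / steps
        total = sum(min(strengths[label], trapezoid(x, *SETS[label])) for label in SETS)
        numerator += x * total
        denominator += total
    return numerator / denominator if denominator else 0.5


def classify(score):
    """Return (sentiment label, crisp value) for a model score in [0, 1]."""
    crisp = center_of_sums(fuzzify(score))
    memberships = fuzzify(crisp)
    return max(memberships, key=memberships.get), crisp
```

A score that falls in an overlap region, such as 0.35, partially fires both the negative and the neutral set, and the crisp value is pulled towards the set with the larger clipped area.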
Experiment & results
This section discusses the experimental setup, datasets used, and the results obtained through the proposed approach.
Experimental setup
Experiments are carried out using Google’s Colab, which uses Python 3.0 and a Google Compute Engine backend with 12.72 GB of RAM and 110 GB of disk space. Colab is accessed from a local host with a 2.3 GHz quad-core Intel Core i7 processor, 16 GB of RAM, and a 512 GB solid-state drive.
Datasets
Three datasets are considered in this work. Table 3 shows the description of the datasets used. Dataset A and dataset C are crawled from Twitter and are manually annotated. Dataset B is taken from the relab repository and is pre-annotated.
Dataset Description
As discussed in the earlier section, among the three datasets, datasets A and C are crawled from Twitter with specific hashtags and are manually annotated by two annotators. Annotation agreement is verified using Cohen’s kappa [37]. The best kappa achieved is 81%, and the worst is 78%. Kappa is calculated using Equation 1, where $p_o$ is the relative observed agreement and $p_e$ is the hypothetical probability of chance agreement. Dataset B is crawled from Twitter using the tweet ids provided in the relab repository. Dataset descriptions are given in Table 3.
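Cohen's kappa is conventionally defined as $\kappa = (p_o - p_e) / (1 - p_e)$. The sketch below computes it for two annotators; the function name and the sample annotations are hypothetical and only illustrate the calculation.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the annotators' marginal label rates.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    p_e = sum((counts_a[label] / n) * (counts_b[label] / n) for label in counts_a)
    if p_e == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical annotations of four tweets.
ANNOTATOR_1 = ["pos", "pos", "neg", "neg"]
ANNOTATOR_2 = ["pos", "neg", "neg", "neg"]
KAPPA = cohen_kappa(ANNOTATOR_1, ANNOTATOR_2)
```

In this toy case the annotators agree on three of four tweets ($p_o = 0.75$) while chance agreement is $p_e = 0.5$, so kappa is lower than the raw agreement rate.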
Logistic Regression (LR), SVM, RF, and MNB were applied in the proposed methodology. In this work, LR is considered for comparison with the proposed architecture. The proposed architecture is compared with the uni-gram, bi-gram, and tri-gram approaches on LR. Tables 4, 5, 6 report the Precision (P), Recall (R), and F-1 (F) scores of the proposed model and LR for all three datasets described in Table 3 with n-grams (uni-, bi-, and tri-grams), while the number of features considered ranges from 10 to 1810. Accuracy is plotted for datasets A and C in Figs. 4 and 5. Precision, Recall, F-1, and Accuracy (classification metrics) are calculated using Equations 2, 3, 4, and 5, respectively. In this article, the performance metrics considered are precision, recall, and F1.
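The four classification metrics follow their standard definitions in terms of true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN). The sketch below is illustrative only; the function name and the confusion-matrix counts are hypothetical and are not results from the article.

```python
def classification_metrics(tp, fp, fn, tn):
    """Precision, recall, F1, and accuracy from binary confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Hypothetical counts for one class treated as the positive class.
METRICS = classification_metrics(tp=8, fp=2, fn=2, tn=8)
```

For the three-class setting (neg, neu, pos), the same function can be applied per class in a one-vs-rest fashion and the results averaged; the averaging scheme is not stated in the article.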
Precision Values for Uni-, Bi-, and Tri-grams w.r.t. the Novel Proposed Approach and the Machine Learning Model
Recall Values for Uni-, Bi-, and Tri-grams w.r.t. the Novel Proposed Approach and the Machine Learning Model
F1 Measure for Uni-, Bi-, and Tri-grams w.r.t. the Novel Proposed Approach and the Machine Learning Model

Accuracy Comparison between Proposed and Machine Learning Model (LR) with Dataset A. (a) uni-gram. (b) bi-gram. (c) tri-gram.

Accuracy Comparison between Proposed and Machine Learning Model (LR) with Dataset C. (a) uni-gram. (b) bi-gram. (c) tri-gram.
With uni-gram features and 1410 features considered on dataset A (Table 3), the accuracy of the proposed framework was 84.07%, against 79.62% for the regular ML model. With bi-gram features on dataset A, the proposed approach reached 83.33% against 81.11% for the LR model. When 1210 features were considered, the LR model trailed the proposed model by 4%; when 810 features were considered, the LR model reached an accuracy of 78.88% and the proposed model outperformed it with 82.89%. With tri-gram features, even with fewer features considered, the proposed model showed a steady growth in accuracy as the dataset size increased. Accuracy is computed, along with P, R, and F1, from the confusion matrix obtained after classification. The accuracy achieved by the proposed model over Logistic Regression can be observed in Figs. 4 and 5 for Dataset A and Dataset C, respectively. The Precision, Recall, and F-1 scores of the proposed model and Logistic Regression are reported in Tables 4, 5, and 6.
The proposed framework achieves sentiment analysis of medical discussion over online social networks by deploying a specific pre-processing methodology and a novel feature engineering technique. The key findings obtained in this approach are as follows. (1) Considering a larger number of the best features yields higher accuracy: as mentioned in section 6 and Figs. 3, 4, 5, when the number of features considered is high, higher accuracy is attained. (2) Creating a special dictionary for the vectorizers increased the precision in predicting the sentiment score correctly, as mentioned in section 4.1.2 and depicted in Table 4. (3) As the size of the dataset increases, the proposed model shows a steady increase in all the performance metrics, as per Tables 4, 5.
Our approach yielded better results because the proposed feature engineering approach is applied before the data is passed as input to the machine learning model. Domain-specific pre-processing was done by using a proper dictionary with medical terms and by considering hashtags and URLs as special features. These features are vectorized with a Count Vectorizer (CV), whereas the regular text is vectorized with Term Frequency and Inverse Document Frequency (TF-IDF). The features obtained from CV and TF-IDF are combined using the feature union mechanism with the help of the pipeline, which enabled faster data reusability during the model’s training phase than the regular approach, with better accuracy, precision, and recall. The results proved the benefits of the proposed approach over the regular machine learning approach for sentiment analysis of medical discussion or chatter on online social networks. Logistic Regression showed lower accuracy than the other machine learning approaches; the proposed data engineering framework and fuzzy classification led to better performance.
Conclusion
Sentiment analysis in the medical domain is widely used in patient care because people have started to use social media as a platform to discuss their medical problems with others who have the same problems. Individuals also publish negative or wrong statements without knowing the severity of what they write, and people might get carried away by such statements. It is challenging for healthcare providers to track this kind of false information and act on it accordingly, which calls for a novel sentiment analysis approach that uses a specific dictionary to provide better features for the machine learning classifier. Sentiment analysis is used to analyze the hidden intent of the writer. This work proposed a fuzzy-based novel feature engineering architecture for sentiment analysis of medical discussions across social networks, which yielded accuracy at least 3% to 4% higher than the machine learning approach. The proposed approach is faster than the current machine learning approach because of pipelining, which delivers the text sentiment faster in real-time applications. The results proved that the proposed method achieves better results than the existing machine learning approaches. In this approach, we used the general-purpose text vectorizer TF-IDF, which treats all terms as general English terms; biomedical text vectors may produce more efficient results. Bio text vectors like BioSenVec, BioWordVec, and Bio-GloVe with deep learning models can be studied in the future.
