Sentiment classification using hybrid feature selection and ensemble classifier

Abstract

This paper presents a Hybrid Feature Selection Technique for Sentiment Classification. We use a Genetic Algorithm together with a combination of existing Feature Selection methods, namely Information Gain (IG), CHI Square (CHI), and GINI Index (GINI). First, we obtain features from the three selection approaches mentioned above and then perform the UNION Set Operation to extract the reduced feature set. Then, a Genetic Algorithm is applied to optimize the feature set further. This paper also presents an Ensemble Approach based on the error rates obtained on datasets from different domains. To test our proposed Hybrid Feature Selection and Ensemble Classification approach, we consider four Support Vector Machine (SVM) classifier variants. We use UCI ML Datasets from three domains, namely IMDB Movie Reviews, Amazon Product Reviews and Yelp Restaurant Reviews. The experimental results show that our proposed approach performed best on all three domain datasets. Further, we present a T-Test for statistical significance between classifiers, and a comparison based on Precision, Recall, F1-Score, AUC and model execution time.

Keywords

Classification, sentiment analysis, genetic algorithm, support vector machine, machine learning

1 Introduction

Sentiment Classification is the method of extracting information from textual data and classifying it into its respective polarity (positive or negative). Sentiment Analysis (SA) plays a vital role in different aspects of our lives. Even in the US Presidential Elections of the year 2016, SA played a crucial role in developing the election campaign strategy [1]. SA is also a common technique to analyze the customer’s reaction to a product, and in the business sector, companies tend to change their strategy based on customer feedback, which eventually increases their sales. The two most popular ways to carry out the Sentiment Classification task are Dictionary-based (DB) and Corpus-based (CB) methods [2]. The DB approach employs prebuilt dictionaries of words such as WordNet [3], and CB methods use the semantic relation between the words that appear in the corpus dataset.

Apart from these methods, Machine Learning (ML) classification algorithms have also been used to classify the text into different polarities. Recent research shows the use of the Hybrid approach [4] in Sentiment Classification.

In this paper, we propose a Hybrid Feature Selection Method using the UNION Set Operation [5] and a Genetic Algorithm [6]. We consider the UCI ML Dataset [7] for three domains: Movies, Products, and Restaurants. Each dataset contains 1000 reviews, equally distributed between positive and negative reviews. All three datasets are labeled, and hence we classify them using Supervised ML Classifiers. The key contributions of the work carried out in this paper are:

We propose a Hybrid Feature Selection Method (FSM) using pre-existing techniques and Optimization by Genetic Algorithm.

We analyzed the proposed FSM using four variants of Support Vector Machine (SVM), namely: Linear SVM (LSVM), Quadratic SVM (QSVM), Fine Gaussian SVM (FGSVM), and Medium Gaussian SVM (MGSVM).

We also propose an Ensemble Classification Approach considering all three domain datasets. It uses only the two SVM variants that showed the minimum error rate for the individual polarity classes.

The rest of the paper is structured as follows: Section 2 shows related work in Sentiment Analysis domain. Section 3 gives a comprehensive summary of the proposed Hybrid Feature Selection methodology. Experimental setup information is provided in Section 4 and Section 5 describes results and discussion. Finally, Section 6 concludes the paper with future work.

2 Related work

This section gives a brief overview of recent research works built on Supervised Machine Learning, Sentiment Classification Algorithms, and Feature Selection techniques used in Sentiment Analysis [8]. Pang et al. [9] were among the pioneers in using ML for Sentiment Classification and introduced techniques such as N-Gram and Bag of Words (BOW). In [10], the authors used Term Frequency Inverse Document Frequency (TF-IDF) to convert a text file into a numerical vector space, and classifiers were executed using the N-Gram approach. The authors in [11] conducted a comprehensive survey on Sentiment Analysis using various classifiers. The survey highlighted recent applications and improvements in Sentiment Analysis using Transfer Learning, Resource Building, and Emotion Detection.

The authors in [12] considered word features and POS Tagging to classify reviews into their respective polarity using a Naïve Bayes Classifier. Geetika et al. in [13] used the unigram approach to extract adjectives and used both (POS Tagging and Word Feature) as the final feature set. In [14], Naïve Bayes Multinomial (NBM), SVM, and Maximum-Entropy (ME) classifiers were used for sentiment classification with Unigram, Bigram, and Hybrid N-Gram feature sets. In [15], Dave et al. used Bigram and Trigram feature sets and trained the model using SVM and NB classifiers on the CNET and Amazon Reviews datasets.

In [16], the authors proposed a novel sentence-level sentiment classification using a Rule-Based Approach. Sentiment classification can also be applied in various other fields such as Sarcasm Detection, and the authors in [17] worked on this application area. They considered slang and emojis present in sentences to classify offensive content. Their results show that including slang and emojis in the feature set increases sarcasm detection accuracy. In [18], Taboada et al. proposed the Semantic Orientation Calculator (SO-CAL), which uses dictionaries and other factors such as POS, Negations, etc. to find the orientation of the sentiment.

Melville et al. in [19] extracted features using Lexicon methods and used a Pooling Multinomial classifier to classify text into the respective classes. Nowadays, Twitter is one of the leading data sources for Sentiment Analysis, and the authors in [20] used the Lexicon and ML approaches to perform Twitter Sentiment Analysis. Their results show that using a combination of features such as N-Gram, Lexicon, Punctuations, etc. improves efficiency. Agarwal et al. in [21] also worked on Twitter Sentiment Analysis. They used POS features and a tree kernel approach to classify the tweets and showed that their approach was better than existing baseline classification approaches. Table 1 summarizes past research works done on different domain datasets for sentiment classification.

Table 1
Summary of recent work on sentiment classification

Reference	Dataset	Approach	Classifier	Accuracy (%)
[6]	IMDB	Genetic Algorithm based feature selection for sentiment classification using supervised algorithms	NB	75.8
	Amazon			77.9
	YELP			74.0
[22]	IMDB	Unigram + overall opinion polarity (OvOp) concept	NB	79.5
[23]	IMDB	Unigram + Linearly combinable paired feature	NB	65.9
[24]	Amazon	Proposed SAIG FSM	NB	75.7
	IMDB			60.0
[25]	IMDB	Feature Selection and Feature Weighing using CHI2 and TFIDF	SVM	67.9
[26]	IMDB	Word interaction, context and position information is encoded into a set of sentiment-oriented word interaction vectors	PFM	78.9
[27]	IMDB	Information of both user and product is incorporated into neural network	UPNN	43.5

3 Methodology

This paper proposes a novel Feature Selection approach using Feature Union from IG, CHI and Gini Index, followed by a Genetic Algorithm. Our Feature Selection approach runs in two phases. In the first phase, features are selected from the Review Dataset using existing feature selection techniques, namely Information Gain (IG), Chi Square (CHI) and Gini Index (GI), and the selected features are combined with the Union Set Operation. In the second phase, a Genetic Algorithm (GA) is applied to choose the best possible features for the Sentiment Classification task. Four variants of SVM, namely Linear SVM (LSVM), Quadratic SVM (QSVM), Fine Gaussian SVM (FGSVM) and Medium Gaussian SVM (MGSVM), are trained and tested on datasets from the UCI ML repository. To check the effectiveness of our Hybrid Feature Selection approach, the IMDB Movie Review, Amazon Product Review and Yelp Restaurant Review datasets are used. The process flow of the proposed feature selection approach is shown in Fig. 1 and Algorithm 1. Our methodology has five steps: Data Collection, Data Preprocessing, Feature Selection using the proposed Hybrid approach, Optimizing Feature Selection using the Genetic Algorithm, and Classification using SVM variants.

Fig. 1

Process Flow Diagram.

3.1 Data collection and preprocessing

3.1.1 Collection of Data

In this paper, UCI ML Labelled datasets are used to test the proposed FSM using four different SVM classifiers. Before the proposed FSM is applied to the datasets, we apply various preprocessing techniques, as discussed in the next subsection.

3.1.2 Data Preprocessing

The following data preprocessing techniques are applied to the datasets to remove irrelevant and noisy entities.

Stop Words Removal

Stemming using the Lovins Stemmer

Tokenizing Sentence (TF-IDF Calculation)

Sentiment Score Calculation using Vader API
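The stop-word removal and TF-IDF steps listed above can be illustrated with a minimal sketch. It is not the authors' pipeline: the stop-word list, the `tokenize` and `tf_idf` helpers, the sample reviews and the particular TF-IDF variant (relative term frequency times log(N / df)) are all assumptions chosen for illustration, and the Lovins stemming and Vader scoring steps are omitted.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (assumption; the paper does not give its list).
STOP_WORDS = {"the", "a", "an", "is", "was", "and", "it", "this", "of", "to"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z']+", text.lower()) if t not in STOP_WORDS]


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical reviews, not taken from the UCI datasets.
reviews = ["The food was great", "The service was slow", "Great phone and great battery"]
vectors = tf_idf(reviews)
```

In this toy corpus "food" occurs in one review and "great" in two, so "food" receives the larger weight in the first review.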

3.2 Combining features using union set operation

Sentiment Classification requires a precise FSM to increase the accuracy of the process. The authors in [5] proposed the use of a feature combination technique. In our proposed approach, we use the IG, CHI and GI FSMs to extract feature subsets, and the UNION Set operation to merge the extracted subsets into a combined feature set. Let $F = (f_1, f_2, \ldots, f_n)$ be the original feature set extracted after data preprocessing from Review Dataset D. IG, CHI and GI are then applied to $F$ to extract the feature subsets $F_{s1}$, $F_{s2}$ and $F_{s3}$ respectively, which are combined as shown in Equation 1. $F_{set} = F_{s1} \cup F_{s2} \cup F_{s3}$ (1)

Then the combined Feature Set F_SET is passed to Genetic Algorithm to select the optimized features for Sentiment Classification task.
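The union in Equation 1 can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used in the paper: features are binary term-presence indicators, each method keeps its top-k terms (the paper does not state its cut-off), the Gini criterion is taken as the decrease in Gini impurity (one of several variants in the literature), and the helper names and toy documents are hypothetical.

```python
import math


def counts(docs, labels, term):
    """2x2 contingency counts: present&pos, present&neg, absent&pos, absent&neg."""
    a = b = c = d = 0
    for tokens, y in zip(docs, labels):
        if term in tokens:
            if y == 1:
                a += 1
            else:
                b += 1
        elif y == 1:
            c += 1
        else:
            d += 1
    return a, b, c, d


def entropy(pos, neg):
    total = pos + neg
    return -sum((k / total) * math.log2(k / total) for k in (pos, neg) if k)


def gini(pos, neg):
    total = pos + neg
    return 1 - (pos / total) ** 2 - (neg / total) ** 2 if total else 0.0


def info_gain(a, b, c, d):
    n = a + b + c + d
    return entropy(a + c, b + d) - (a + b) / n * entropy(a, b) - (c + d) / n * entropy(c, d)


def chi_square(a, b, c, d):
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


def gini_gain(a, b, c, d):
    n = a + b + c + d
    return gini(a + c, b + d) - (a + b) / n * gini(a, b) - (c + d) / n * gini(c, d)


def top_k(docs, labels, vocab, scorer, k):
    """Keep the k highest-scoring terms; ties are broken alphabetically."""
    scores = {t: scorer(*counts(docs, labels, t)) for t in vocab}
    return set(sorted(vocab, key=lambda t: (-scores[t], t))[:k])


def hybrid_union(docs, labels, k):
    """F_set = F_s1 U F_s2 U F_s3 for the IG, CHI and Gini selections."""
    vocab = sorted(set().union(*docs))
    subsets = [top_k(docs, labels, vocab, s, k) for s in (info_gain, chi_square, gini_gain)]
    return set().union(*subsets)


# Hypothetical toy corpus: token sets with labels 1 = positive, 0 = negative.
docs = [{"great", "food"}, {"great", "phone"}, {"bad", "food"}, {"bad", "service"}]
labels = [1, 1, 0, 0]
f_set = hybrid_union(docs, labels, k=2)
```

Because the union keeps every term that any one criterion ranks highly, the combined set is never smaller than the largest individual subset, which is why a further optimization step is applied afterwards.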

3.3 Optimize feature selection using genetic algorithm

We selected the GA technique to optimize the FSM because of its evolutionary nature, which is appropriate for non-polynomial time problems. The reduced feature set obtained in the previous step is the input to the Genetic Algorithm step. The previous step reduces the number of features considerably, but scalability remains a problem for large datasets. The use of a Genetic Algorithm addresses this scalability issue to a large extent. In Equation (1), F_SET represents the words of the corpus retained as features, and we need to optimize it using GA. The major steps followed to optimize the feature set are as follows:

3.3.1 Population Initialization

The Collection of randomly generated n strings is known as population in GA. In this paper, we have selected population size as 50 and Pi value is set to 0.1.

3.3.2 Selection

In this step, classification accuracy is used as the fitness function to evaluate the quality of each generated solution. We use the Tournament method as the selection scheme, with the Tournament size set to 0.05.

3.3.3 Crossover

This step helps in the production of new off-springs with information exchange process. Crossover is performed between two chosen individuals based on Crossover Probability Pc which is set as 0.6 in this study. After Crossover, the Mutation process is carried out with Mutation Probability Pm, which is set as 0.01 in this work.
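The three GA steps above can be sketched as a small feature-mask optimizer. The population size (50), Pc (0.6) and Pm (0.01) come from the text; everything else is an assumption made for illustration: Pi is read as the probability that a feature is switched on in an initial individual, the tournament size 0.05 is read as a fraction of the population (with at least two contestants), the number of generations is arbitrary, and `toy_fitness` is a hypothetical stand-in for the classification accuracy used as fitness in the paper.

```python
import random

POP_SIZE = 50       # population size (from the text)
P_INIT = 0.1        # "Pi"; assumed to be the initial bit-on probability
TOURNAMENT = 0.05   # tournament size; assumed to be a fraction of the population
P_CROSSOVER = 0.6   # Pc (from the text)
P_MUTATION = 0.01   # Pm (from the text)

INFORMATIVE = {0, 3, 7}  # hypothetical indices of useful features


def toy_fitness(mask):
    """Stand-in for classifier accuracy: reward useful features, penalise size."""
    hits = sum(1 for i in INFORMATIVE if mask[i])
    return hits - 0.1 * sum(mask)


def tournament_select(pop, fitness, rng):
    """Pick the fittest of a random subset of the population."""
    k = max(2, int(TOURNAMENT * len(pop)))
    return max(rng.sample(pop, k), key=fitness)


def crossover(p1, p2, rng):
    """Single-point crossover applied with probability Pc."""
    if rng.random() < P_CROSSOVER:
        cut = rng.randrange(1, len(p1))
        return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
    return p1[:], p2[:]


def mutate(mask, rng):
    """Flip each bit independently with probability Pm."""
    return [bit ^ 1 if rng.random() < P_MUTATION else bit for bit in mask]


def run_ga(n_features, fitness, generations=30, seed=0):
    rng = random.Random(seed)
    pop = [[1 if rng.random() < P_INIT else 0 for _ in range(n_features)]
           for _ in range(POP_SIZE)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        nxt = []
        while len(nxt) < POP_SIZE:
            c1, c2 = crossover(tournament_select(pop, fitness, rng),
                               tournament_select(pop, fitness, rng), rng)
            nxt.extend([mutate(c1, rng), mutate(c2, rng)])
        pop = nxt[:POP_SIZE]
        best = max(pop + [best], key=fitness)  # keep the best mask seen so far
    return best


best_mask = run_ga(n_features=20, fitness=toy_fitness)
```

Each individual is a 0/1 mask over the features in the combined set, so the returned mask directly identifies the optimized feature subset. In a real run the fitness call would train and evaluate a classifier, which dominates the cost.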

4 Experimental work

To perform the experimental work, the entire process is divided into three phases, namely Optimized Feature Selection, Training of SVM Classifiers, and Testing. The first phase includes Data Collection from the UCI ML Repository, preprocessing of data, and reducing features using the proposed FSM. We use datasets from a variety of domains (IMDB, AMAZON, YELP) to check the effectiveness of our proposed approach. Before applying the FSM, the data is preprocessed by Tokenizing, Removal of Stop Words, Stemming with the Lovins Stemmer, Generation of SentiScore from the Vader API, and TF-IDF creation. Once the data is converted into TF-IDF form, the Hybrid Feature Selection approach is applied to obtain the reduced feature set. The experiments are carried out on the three UCI ML repository datasets using tenfold (k = 10) cross validation. In 10-fold cross validation, the dataset is partitioned into 10 folds; in each round, 9 folds (k − 1) are used for training the model and the remaining fold for testing it.
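As an illustration of the cross-validation protocol, the sketch below generates index splits for a dataset of 1000 reviews. It is a plain, non-stratified split written for clarity; whether the original experiments used stratified folds is not stated, and the function name and seed are hypothetical.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs; every sample is tested exactly once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint folds of near-equal size
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


splits = list(k_fold_indices(1000, k=10))
```

With 1000 reviews and k = 10, each round trains on 900 reviews and tests on the remaining 100, and the reported metrics are aggregated over the 10 rounds.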

Algorithm 1: HYBRID_FEATURE_SELECTION
Input: UCI ML Dataset with Tokens T = (t1, t2, …, tn) and labelled Sentiment value S.
Output: Optimized Feature Set (F_o)
F_s1 = IG (T)
F_s2 = CHI (T)
F_s3 = GI (T)
F_set = F_s1 ∪ F_s2 ∪ F_s3
Let P be the random initial seeded population and g be the number of generations
noofgenerations ← g
count ← 0
while count < noofgenerations loop
P ← GenerateNextGeneration (P, F_set, S)
count ← count + 1
end loop
F_o ← fittest feature subset in P
return F_o

4.1 Evaluation parameters used

In this paper, the four SVM classifiers are evaluated using the Confusion Matrix, from which we compute the Accuracy, Precision, Recall and F1 Score metrics. The four quantities used to calculate the evaluation metrics are: True-Positive (TP), False-Positive (FP), True-Negative (TN), and False-Negative (FN).

4.1.1 Accuracy (A)

It is described as the portion of testing dataset that is accurately classified by the classifier. $A = \frac{TP + TN}{TP + TN + FP + FN}$ (2)

4.1.2 Precision (P)

It is defined as the proportion of correctly predicted positive observations to the total predicted positive observations. $P = \frac{TP}{TP + FP}$ (3)

4.1.3 Recall (R)

It is defined as the proportion of correctly predicted positive observations to all the observations that actually belong to the positive class. $R = \frac{TP}{TP + FN}$ (4)

4.1.4 F1 Score (F1)

It is defined as the harmonic mean of Precision and Recall. $F1 = 2 \cdot \frac{P \cdot R}{P + R}$ (5)
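Equations 2 to 5 can be checked on a small worked example. The confusion-matrix counts below are hypothetical and are not taken from the experiments; the sketch also assumes that every denominator is non-zero.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, Precision, Recall and F1 as defined in Equations 2 to 5."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


# Hypothetical counts for a 100-review test set.
acc, prec, rec, f1 = classification_metrics(tp=40, fp=10, tn=35, fn=15)
```

For these counts the accuracy is 75/100 = 0.75, the precision is 40/50 = 0.80, the recall is 40/55 ≈ 0.727, and the F1 Score is 80/105 ≈ 0.762.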

4.1.5 ROC(AUC)

It is a plot that presents the classification model performance on all thresholds. X axis of the ROC curve contains False Positive Rate and Y axis contains True Positive Rate. Our classification problem is an example of Binary classification, where higher Area Under the Curve (AUC) value means better classification.
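The AUC value can be computed without drawing the curve, using its equivalent interpretation as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below uses this pairwise formulation; the scores and labels are hypothetical, and the function assumes at least one example of each class.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores (higher means "more positive") and true labels.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0]
auc = roc_auc(scores, labels)
```

Here 8 of the 9 positive-negative pairs are ordered correctly, giving an AUC of about 0.889; a classifier that separates the classes perfectly scores 1.0 and a random one about 0.5.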

5 Results and discussions

This section gives an in-depth analysis of the proposed Hybrid FSM using SVM Classifier variants: LSVM, QSVM, FGSVM, and MGSVM. The experiment was conducted on three different domain datasets and evaluation results are shown in Table 2. The following subsection discusses the results obtained for each dataset in detail.

Table 2
Evaluation metrics comparison

Dataset	ML Classifier	Accuracy (%)	Precision	Recall	F1 Score	Execution Time (s)
Amazon	LSVM	81.2	0.7740	0.8377	0.8046	3.82
	QSVM	79.1	0.7600	0.8102	0.7843	104.32
	FGSVM	77.8	0.8220	0.7555	0.7874	2.84
	MGSVM	80.9	0.7720	0.8337	0.8017	3.00
IMDB	LSVM	78.6	0.7600	0.8017	0.7803	50.95
	QSVM	70.5	0.5700	0.7808	0.6590	476.62
	FGSVM	74.0	0.7540	0.7335	0.7436	3.42
	MGSVM	76.4	0.7480	0.7727	0.7602	3.21
YELP	LSVM	77.5	0.7540	0.7871	0.7702	103.89
	QSVM	62.0	0.3620	0.7479	0.4879	487.60
	FGSVM	74.1	0.8020	0.7148	0.7559	3.82
	MGSVM	77.4	0.7480	0.7890	0.7680	3.12

5.1 Results obtained from Amazon dataset

For the Amazon Product Review Dataset, the confusion matrix is shown in Fig. 2 and the ROC plot in Fig. 3. Table 2 and Fig. 3 show that LSVM achieves the maximum Accuracy, Recall and F1 Score of 81.2%, 0.8377 and 0.8046 respectively. For Precision, however, the best result is obtained by FGSVM with a score of 0.8220. Figure 4 depicts the error per class: FGSVM has the lowest error of 17.8% for the Positive class, and LSVM the lowest error of 15% for the Negative class.

Fig. 2

Confusion Matrix for Amazon Dataset.

Fig. 3

ROC Curve for Amazon Dataset.

Fig. 4

Error per Class for Amazon Dataset.

5.2 Results obtained from IMDB dataset

For the IMDB Movie Review Dataset, Table 2 shows that LSVM obtained the maximum Accuracy, Precision, Recall and F1 Score of 78.6%, 0.7600, 0.8017 and 0.7803 respectively. The confusion matrix is shown in Fig. 5, which helps in identifying the accuracy of each SVM classifier for both classes, i.e. Positive and Negative. ROC curves and error rates for each class are shown in Figs. 6 and 7 respectively. Figure 7 shows that the minimum error rate for the Positive class is achieved by LSVM with 24% error, while QSVM achieves the lowest error rate of 16% for the Negative class.

Fig. 5

Confusion Matrix for IMDB Dataset.

Fig. 6

ROC Curve for IMDB Dataset.

Fig. 7

Error Per Class for IMDB Dataset.

5.3 Results obtained from Yelp Dataset

For the Yelp Restaurant Review dataset, Table 2 shows that LSVM achieved the maximum Accuracy and F1 Score of 77.5% and 0.7702 respectively. For Precision, however, the best result is obtained by FGSVM with a score of 0.8020, and the MGSVM Classifier obtains the best Recall of 0.7890. The confusion matrix is shown in Fig. 8, which helps in identifying the accuracy of each SVM classifier for both classes, i.e. Positive and Negative. Figure 9 shows the ROC curve, and the error rate for each class is shown in Fig. 10: the minimum error rate for the Positive class is achieved by FGSVM with 19.8% error, and the lowest error rate for the Negative class by QSVM with 12.2%.

Fig. 8

Confusion Matrix for Yelp Dataset.

Fig. 9

ROC Curve for Yelp Dataset.

Fig. 10

Error per Class for Yelp Dataset.

We also compared the execution time of each ML Classifier and found that the Gaussian Kernel based SVM Classifiers are faster than LSVM and QSVM. The FGSVM Classifier has the best execution time for the Amazon Dataset with 2.84 s, while MGSVM is the fastest for the IMDB and Yelp Datasets with 3.21 s and 3.12 s respectively. The results are presented in Fig. 11 and Table 2.

Fig. 11

Execution Time Comparison.

Further, we carry out a statistical significance test using the T-Test between the classifiers. The null and alternative hypotheses are:

H0 (null): Both classifiers perform similarly.

H1 (alternative): One of the classifiers performs differently.

Let $a_1$ and $a_2$ be the accuracies obtained by two classifiers $c_1$ and $c_2$ respectively, and let $n$ be the number of samples present in the dataset. To perform the T-Test, we need the total number of correctly identified instances. Let $y_1$ and $y_2$ be the numbers of instances correctly identified by $c_1$ and $c_2$. $a_1 = \frac{y_1}{n} \quad \text{and} \quad a_2 = \frac{y_2}{n}$ (6)

The T-Test statistic is given by the following formula: $Z = \frac{a_1 - a_2}{\sqrt{\frac{2 a (1 - a)}{n}}}, \quad \text{where } a = \frac{y_1 + y_2}{2 n}$ (7)

To compare the two classifiers, the rejection region is selected as Z < −Zα′, where Zα′ is obtained from the Standard Normal Distribution at a significance level of α′ = 0.05. The chosen value of α′ helps in identifying the statistical significance of one classifier over the other. For α′ = 0.05, if Z < –1.645 then it can be said with 95% confidence that the second classifier is more efficient than the first classifier.
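As a worked check of the test statistic, the sketch below evaluates it for LSVM versus QSVM on the Amazon dataset, using the accuracies from Table 2 and n = 1000 reviews. The function name is hypothetical; the critical value 1.645 is the one quoted in the text.

```python
import math


def z_statistic(a1, a2, n):
    """Test statistic for two accuracies measured on the same n samples."""
    a = (a1 + a2) / 2  # pooled accuracy, equal to (y1 + y2) / (2 n)
    return (a1 - a2) / math.sqrt(2 * a * (1 - a) / n)


# Amazon dataset: classifier A = LSVM (81.2 %), classifier B = QSVM (79.1 %).
z = z_statistic(0.812, 0.791, 1000)
second_is_better = z < -1.645  # rejection region from the text
```

The result is approximately 1.177, in line with the 1.1772 reported for this pair in Table 3. Since it does not fall below –1.645, the test gives no evidence that QSVM is better than LSVM on this dataset.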

Table 3 shows the T-Test results for each pair of classifiers on each dataset. The T-Test results indicate that LSVM works better than the other SVM classifiers on all datasets. For the Amazon dataset, we observed that MGSVM works better than FGSVM with a confidence of 95% under the standard normal distribution considered in this study. The table also shows that MGSVM and FGSVM are better than QSVM for both the IMDB and Yelp datasets.

Table 3

T-Test hypothesis comparison

Classifier B
Dataset	ML Classifier(A)	LSVM	QSVM	FGSVM	MGSVM
AMAZON	LSVM		1.1772	1.8832	0.1711
	QSVM			0.7069	–1.0062
	FGSVM				–1.712^*
IMDB	LSVM		4.1581	2.4188	1.1780
	QSVM			–1.7478^*	–2.9875^*
	FGSVM				–1.2426
YELP	LSVM		7.5453	1.7750	0.0535
	QSVM			–5.8025^*	–7.4932^*
	FGSVM				–1.7216^*

5.4 Results obtained using proposed ensemble classifier of FGSVM and QSVM

This section presents the results obtained from the proposed Voting Ensemble Classifier approach for classifying text into sentiment polarity, as shown in Fig. 12. FGSVM and QSVM are used in the proposed approach because they obtained the minimum average error rates of 20.73% and 15.33% for the Positive and Negative class respectively, as shown in Table 4. LSVM showed the best results with our proposed hybrid FSM and is therefore chosen as the baseline classifier against which the Voting Ensemble Approach using FGSVM and QSVM is compared. Table 6 summarizes the results obtained using the proposed Ensemble classifier. Figure 13 shows the accuracy of our proposed ensemble approach alongside the baseline classifier LSVM. The results show that our approach gives better results than LSVM. Figures 14, 15 and 16 show the ROC comparison. We have also compared our proposed approach with LSVM using the statistical significance T-Test, which is shown in Table 5. The statistical test shows that the proposed Ensemble Approach works better than the baseline classifier LSVM for the IMDB and Yelp datasets. Comparing our proposed Hybrid FSM with Voting Ensemble Classification (QSVM, FGSVM) against some of the earlier work in Sentiment Classification (Table 1) shows a significant improvement in accuracy.

Fig. 12

Proposed Ensemble Classifier.

Fig. 13

Accuracy Comparison between LSVM and Ensemble Approach.

Fig. 14

Amazon - ROC Comparison of LSVM and Ensemble Approach.

Fig. 15

IMDB - ROC Comparison of LSVM and Ensemble Approach.

Fig. 16

Yelp - ROC Comparison of LSVM and Ensemble Approach.

Table 4

Average Error Rate

ML classifier	Average Error Rate (%)
	Positive class	Negative class
LSVM	23.73	18.06
QSVM	43.6	15.33
FGSVM	20.73	28.66
MGSVM	24.4	19.13
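The choice of ensemble members from Table 4 amounts to picking, for each polarity class, the classifier with the lowest average error rate. A minimal sketch of that selection step is given below; it covers only the selection, not the voting scheme of Fig. 12, and the function name is hypothetical.

```python
# Average error rate (%) per class, copied from Table 4.
avg_error = {
    "LSVM": {"positive": 23.73, "negative": 18.06},
    "QSVM": {"positive": 43.6, "negative": 15.33},
    "FGSVM": {"positive": 20.73, "negative": 28.66},
    "MGSVM": {"positive": 24.4, "negative": 19.13},
}


def best_per_class(errors):
    """Return, for each polarity class, the classifier with the lowest error."""
    return {
        cls: min(errors, key=lambda name: errors[name][cls])
        for cls in ("positive", "negative")
    }


members = best_per_class(avg_error)
```

The selection yields FGSVM for the Positive class and QSVM for the Negative class, matching the two classifiers used in the proposed ensemble.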

Table 5

T-Test Hypothesis Comparison with Proposed Ensemble Approach

		Classifier B
Dataset	Classifier(A)	Proposed Ensemble Classifier
AMAZON	LSVM	–1.1094
IMDB	LSVM	–0.4945^*
YELP	LSVM	–4.9839^*

Table 6

Results obtained using Proposed Ensemble Classifier (FGSVM, QSVM)

Dataset	Accuracy (%)	Precision	Recall	F1-Score
Amazon	83.1	0.8071	0.8700	0.8373
IMDB	79.5	0.8186	0.7580	0.7871
Yelp	86.1	0.8932	0.8200	0.8551

6 Conclusion

This paper aims to improve Sentiment Classification’s efficiency by proposing the Optimized Sentiment Classification Model using Novel Hybrid Feature Selection Method and Ensemble Classifier. This study explores the combination of three FSM namely, IG, CHI, and GINI. Features selected from each FSM are combined using UNION Set Operation. We further optimized the features chosen by using Genetic Algorithm. To test the efficiency of sentiment classification, four SVM variants: LSVM, QSVM, FGSVM, and MGSVM, are used in this paper. The results show that LSVM outperforms all other classifiers with an accuracy of 81.2%, 78.6%, and 77.5% for Amazon, IMDB and Yelp dataset respectively. Execution Time comparison shows that Gaussian Kernel SVM is faster than LSVM and QSVM. The execution time of FGSVM Classifier is best for Amazon dataset with 2.84 s. MGSVM executes only in 3.21 s and 3.12 s for IMDB and Yelp dataset respectively.

We further improved the accuracy of the sentiment classification task by proposing an Ensemble Classification approach. Our classification approach using FGSVM and QSVM outperforms LSVM, which is selected as the baseline classifier, with a relative increase in accuracy of 2.33%, 1.14%, and 11.09% for the Amazon, IMDB and Yelp datasets respectively.
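The improvement figures can be reproduced from the accuracies in Table 2 (LSVM) and Table 6 (ensemble) if they are read as relative increases over the LSVM accuracy, which is an interpretation assumed here rather than stated in the text.

```python
lsvm = {"Amazon": 81.2, "IMDB": 78.6, "Yelp": 77.5}      # Table 2, accuracy (%)
ensemble = {"Amazon": 83.1, "IMDB": 79.5, "Yelp": 86.1}  # Table 6, accuracy (%)

# Relative increase of the ensemble accuracy over the LSVM accuracy, in percent.
relative_gain = {
    name: 100 * (ensemble[name] - lsvm[name]) / lsvm[name]
    for name in lsvm
}
```

The computed values agree with the reported ones up to the last digit, which suggests that the reported figures were truncated rather than rounded.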

References

1. Ahmad, Pervaiz, Mannan and Zaffar, Aspect Based Sentiment Analysis for Large Documents with Applications to US Presidential Elections 2016, Social Technical and Social Inclusion Issues (SIGSI) (2017), 13.

2. Liu, Sentiment analysis and opinion mining, Synthesis Lectures on Human Language Technologies 5(1) (2012), 1–167.

3. Miller G.A., WordNet: An electronic lexical database. MIT press. (1998).

4. Govindarajan, Sentiment analysis of movie reviews using hybrid method of naive bayes and genetic algorithm, International Journal of Advanced Computer Research 3(4) (2013), 139.

5. Ghosh and Sanyal, An ensemble approach to stabilize the features for multi-domain sentiment analysis using supervised machine learning, Journal of Big Data 5(1) (2018), 44.

6. Iqbal, Hashmi J.M., Fung B.C., Batool, Khattak A.M., Aleem and Hung P.C., A hybrid framework for sentiment analysis using genetic algorithm based feature reduction, IEEE Access 7 (2019), 14637–14652.

7. (2015). UCI ML Repository Sentiment Analysis Dataset. Accessed: Dec. 8, 2019. [Online]. Available: http://archive.ics.uci.edu/ml/datasets/Sentiment+Labelled+Sentences.

8. Catal and Nangir, A sentiment classification model based on multiple classifiers, Applied Soft Computing 50 (2017), 135–141.

9. Pang, Lee and Vaithyanathan, Thumbs up? Sentiment classification using machine learning techniques, arXiv preprint cs/0205070. (2002).

10. Tripathy, Agrawal and Rath S.K., Classification of sentiment reviews using n-gram machine learning approach, Expert Systems with Applications 57 (2016), 117–126.

11. Medhat, Hassan and Korashy, Sentiment analysis algorithms and applications: a survey, Ain Shams Eng J 5(4) (2014), 1093–1113.

12. Mubarok M.S., Adiwijaya and Aldhi M.D., Aspect-based sentiment analysis to review products using Naïve Bayes. In AIP Conference Proceedings (Vol. 1867, No. 1, p. 020060). AIP Publishing LLC. (2017).

13. Gautam and Yadav, Sentiment analysis of twitter data using machine learning approaches and semantic analysis, In 2014 Seventh International Conference on Contemporary Computing (IC3) (pp. 437–442). IEEE. (2014).

14. Boiy, Hens, Deschacht and Moens M.F., Automatic Sentiment Analysis in On-line Text. In ELPUB (pp. 349–360). (2007).

15. Dave, Lawrence and Pennock D.M., Mining the peanut gallery: Opinion extraction and semantic classification of product reviews. In Proceedings of the 12th international conference on World Wide Web (pp. 519–528). (2003).

16. Khan, Baharudin and Khan, Sentiment Classification Using Sentence-level Lexical Based, Trends in Applied Sciences Research 6(10) (2011), 1141–1157.

17. Prasad A.G., Sanjana, Bhat S.M. and Harish B.S., Sentiment analysis for sarcasm detection on streaming short text data. In 2017 2nd International Conference on Knowledge Engineering and Applications (ICKEA) (pp. 1–5). IEEE. (2017).

18. Taboada, Brooke, Tofiloski, Voll and Stede, Lexicon-based methods for sentiment analysis, Computational Linguistics 37(2) (2011), 267–307.

19. Melville, Gryc and Lawrence R.D., Sentiment analysis of blogs by combining lexical knowledge with text classification. In Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 1275–1284). (2009).

20. Kolchyna, Souza T.T., Treleaven and Aste, Twitter sentiment analysis: Lexicon method, machine learning method and their combination, arXiv preprint arXiv:1507.00955. (2015).

21. Agarwal, Xie, Vovsha, Rambow and Passonneau R.J., Sentiment analysis of twitter data. In Proceedings of the workshop on language in social media (LSM 2011) (pp. 30–38). (2011).

22. Salvetti, Lewis and Reichenbach, Automatic opinion polarity classification of movie reviews, Colorado Research in Linguistics (2004), 17.

23. Beineke, Hastie and Vaithyanathan, The sentimental factor: Improving review classification via human-provided information, In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04) (pp. 263–270). (2004).

24. Ong B.Y., Goh S.W. and Xu, Sparsity adjusted information gain for feature selection in sentiment analysis, In 2015 IEEE International Conference on Big Data (Big Data) (pp. 2122–2128). IEEE. (2015).

25. Larasati U.I., Much Aziz Muslim I.U., Riza Arifudin I.U. and Alamsyah I.U., Improve the Accuracy of Support Vector Machine Using Chi Square Statistic and Term Frequency Inverse Document Frequency on Movie Review Sentiment Analysis, Scientific Journal of Informatics 6(1) (2019), 138–149.

26. Wang, Zhou, Fei, Chang and Liu, Contextual and position-aware factorization machines for sentiment classification, arXiv preprint arXiv:1801.06172. (2018).

27. Tang, Qin and Liu, Learning semantic representations of users and products for document level sentiment classification. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) (pp. 1014–1023). (2015).