Aspect-based sentiment analysis of mobile reviews

Abstract

E-commerce websites provide an easy platform for users to put forth their viewpoints on different topics, ranging from a news item to any product in the market. Such online content encourages authors to express opinions on various aspects of an entity. Aspect-based sentiment analysis deals with analyzing this textual content to locate the aspects under discussion and then the sentiment-bearing words associated with them. This paper describes an integrated system that generates opinionated aspect-based graphical and extractive summaries from a large set of mobile reviews. The system focuses on three tasks: (a) identification of aspects in a given field, (b) computation of the sentiment polarity of each aspect, and (c) generation of opinionated aspect-based graphical and extractive summaries. The system has been evaluated on three mobile-review datasets and obtains better precision and recall than the baseline approach. The system generates summaries from reviews without any training.

Keywords

Aspect-based sentiment analysis, extractive summary, sentiment summarization

1. Introduction

With the advancement of social media platforms (such as Twitter and Facebook) and e-commerce sites (such as Amazon and Flipkart), the generation of unstructured data has increased rapidly. According to an International Data Corporation (IDC) assessment, 90% of the data generated today is unstructured. The availability of such massive volumes of unstructured data makes it a major challenge for industries and organizations to analyze it and extract meaningful insights from it. A large share of unstructured data is human generated, in the form of reviews, blogs, forums, pictures, videos, and other media that do not follow any structure. Reviews cover a wide range of items, from products and movies to holiday trips and hotel services. The growth of reviews, ratings, and other online expressions of opinion has turned them into a kind of virtual currency for businesses that want to sell their products and services, identify new flaws, and maintain their reputation in the market. Many firms and enterprises are now looking into the field of sentiment analysis to understand the voice of the customer, to automate the process of filtering noise, and to identify the appropriate content for making organizational decisions.

Sentiment analysis is a natural language processing (NLP) task that uses an algorithmic formulation to classify a subjective piece of text as positive, negative, or neutral. Sentiment analysis is performed at three levels: (a) sentence level (to label the sentiment polarity of a sentence), (b) document level (to assign a sentiment label to the whole document), and (c) aspect level (to compute the sentiment polarity of a particular aspect). The task becomes more challenging when a product review contains opinions about multiple aspects of that product. Buyers are swayed more by previous consumers' opinions of the various aspects than by a glance at the product's star rating. Earlier, a large volume of sentiment analysis research was carried out at the document and sentence levels. More recently, researchers have been drawn towards the aspect level because it is more granular. Aspect-level sentiment analysis guides prospective buyers along each specific dimension that the product in question might have. A reviewer might express an outlook on multiple aspects of an item, being positive about one and negative about another. For instance, a customer may be very happy with the service quality of a restaurant but unhappy with its food quality.

However, analyzing sentiments at this granular level is not a straightforward process, and aspect-level sentiment analysis presents several challenges. Firstly, there may be comparative opinions, for example A is ‘more good’ than B, or C is ‘better than’ D. The comparison words must be found before the sentiment can be analyzed. Secondly, opinion sentences may be conditional. The user might be offering a suggestion such as “if A would have something like this, then it would be really useful”. The challenge is to identify whether the reviewer is positive about the aspect or is putting some condition forward. Tracing negation is yet another challenge. Lay users generally do not express their views via antonyms; instead they combine negative words with positive ones, as in ‘not good’ or ‘not up to the mark’. Finally, aggregating the sentiments is a useful and important case. Providing an aspect-level rating or profile is still not completely subjective. A typical user who reads and writes reviews searches for answers intuitively. This is where text summarization comes in, helping to provide the gist of the sentiments. Text summarization is the process of generating a concise form of a particular text that presents valuable information to a user in a brief manner. Generally, there are two techniques for text summarization: (a) abstractive summarization (understanding and processing the text using advanced natural language techniques to generate the summary) and (b) extractive summarization (detecting important fragments of the text to generate the summary). Sentiment summarization is quite similar to text summarization; the only difference is that it requires opinionated text to generate a summary.

In this paper, we work on aspect-based sentiment analysis of mobile reviews and generate an opinionated aspect-based extractive summary for a mobile phone. The mobile reviews are crawled from Amazon. An integrated system has been implemented that takes a review as input, preprocesses it, identifies the different aspects, computes the sentiment polarity of each aspect, and generates graphical and extractive opinionated aspect-based summaries of the mobile.

The rest of the paper is organized as follows: Section 2 discusses the existing work in the literature. Section 3 presents the problem statement. Section 4 describes the proposed system architecture. Section 5 presents the experimental work. The conclusion and future work are presented in Section 6.

2. Related work

Overall, the combined approaches to detecting aspects and the corresponding sentiment-bearing words can be categorized as supervised, unsupervised, hybrid, and rule-based. Each of these approaches has its own pros and cons, and some of them are briefly discussed in this section. Various machine learning (ML) and natural language processing (NLP) approaches exist in the literature for aspect identification. Hu and Liu [1, 2] proposed an unsupervised approach that employs an association rule mining algorithm to extract explicit aspects. Supervised methods such as Conditional Random Fields (CRF) and Hidden Markov Models [3, 4], Naïve Bayes [5], and Maximum Entropy (ME) [6] have also been used for aspect identification. A two-phase co-occurrence association rule mining method [7] has been applied for matching implicit aspects with explicit aspects. The most common NLP technique used commercially for aspect detection is to detect all noun phrases (NP) in a review and select those whose occurrence frequency is greater than a predefined threshold [8]. Agarwal et al. [9] combined this concept with the popular pointwise mutual information (PMI) approach proposed by Turney [10]. The idea is to identify all the NPs in the text and then, for each NP, compute PMI values against a specified reference set of phrases closely associated with the product domain. Only those aspects whose PMI value is greater than a predefined threshold are selected [11].

The next task is to identify opinion-bearing words in a piece of text and compute their sentiment polarity. Various lexical dictionaries are available for calculating the sentiment score of words. The most popular is SentiWordNet (SWN) [12], which is publicly available. SWN associates three numerical scores (positive, negative, and objective) with each term occurring in WordNet. Several linguistic approaches have been implemented for computing sentiment scores based on SWN [13, 14]. Venumbaka combined SWN with the Stanford dependency parser to find the sentiment orientation of words [15]. A new lexicon [16] has been developed by merging SWN and the MPQA subjectivity lexicon for computing the sentiment orientation of words. Singh et al. proposed two SWN-based schemes: adverb-adjective combination (AAC) and adverb-adjective adverb-verb combination (AAAVC) [17]. Khan et al. implemented a framework that extracts mutual information from SentiMI, which was developed from SWN [18]. Tweet smileys and hashtags can also be used as sentiment labels to reduce manual annotation work; Davidov et al. used this idea to classify tweets into either the positive or the negative class [19].

Text summarization is a natural language processing task that generates a summary of an entity from a set of documents associated with that entity. Some past work takes an integrated approach to developing a summarization system [20, 21]. Gu and Kim developed an integrated summarization system that classifies reviews aspect-wise and then generates a summary [22]. Topic modeling can also be used for summarization [23–25]. Sentiment summarization is quite similar to text summarization; the only difference is that the set of documents holds opinions about the entity. Dabholkar et al. [26] proposed a framework that detects and extracts important sentences from a document, computes the sentiment of the extracted sentences, and generates a coherent summary. Li et al. used user credibility and sentiment quality factors to generate a sentiment summary [27]. Various machine learning techniques have also been employed for event-based summarization of soccer matches [28].

This paper proposes an integrated system that extracts reviews from the web, identifies the features/aspects of the product, and performs sentiment analysis at the aspect level. The system focuses on aspect-based sentiment analysis and opinionated aspect-based extractive summary generation, and it involves three subparts: (a) identification of aspects in a piece of text, (b) computation of the sentiment polarity of each identified aspect, and (c) generation of an opinionated extractive summary for each aspect from the opinionated sentences. Our proposed work differs from existing work in several respects. First of all, an ontology has been created from GSMArena, which is then combined with an aspect vector to detect the aspects in a given piece of text. The resulting system takes a review and a mobile name as input, crawls the mobile's metadata if it does not already exist in the system, preprocesses the review, identifies the aspects using the aspect vector and the mobile metadata, computes the sentiment score, and generates graphical and opinionated aspect-based extractive summaries of that mobile.

3. Problem statement

This paper focuses on aspect-based sentiment analysis of mobile reviews and the generation of an opinionated aspect-based extractive summary of a mobile. The problem can be described as follows:

For each mobile, R = {r₁, r₂, …, r_n}, is the set of reviews.

Each review may contain opinion about different aspects and different aspects can be denoted/described by various terms or phrases in the review text.

For this purpose, we have considered AC_k = {at_1l, at_2l, at_3l, …, at_ml} is the set of aspect terms and AC = {AC₁, AC₂, …, AC_l} is the set of aspect categories.

For instance, a reviewer might mention the word ‘screen’, ‘size’ or ‘touchscreen’ while actually intending to talk about the display of the mobile.

So, ‘display’ can be taken as the aspect category, and the words ‘screen’, ‘size’ and ‘touchscreen’ are aspect terms associated with the display category.
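The relation between aspect categories and aspect terms can be pictured as a simple lookup table. The following is a minimal sketch only: it uses a small illustrative subset of the terms listed later in Table 1, and the names `ASPECT_VECTOR`, `TERM_TO_CATEGORY` and `aspect_category` are hypothetical, not taken from the system described here.

```python
# Illustrative subset of the aspect vector (see Table 1); not the full term list.
ASPECT_VECTOR = {
    "DISPLAY": {"screen", "size", "touchscreen", "display", "resolution"},
    "CAMERA": {"cam", "photo", "photos", "video", "flash"},
    "BATTERY": {"battery", "charger", "charging", "battery life"},
}

# Invert the mapping so a term found in a review resolves to its category.
TERM_TO_CATEGORY = {
    term: category
    for category, terms in ASPECT_VECTOR.items()
    for term in terms
}


def aspect_category(term):
    """Return the aspect category of a term, or None if the term is unknown."""
    return TERM_TO_CATEGORY.get(term.lower().strip())


print(aspect_category("Touchscreen"))  # DISPLAY
```

With such a table, the words ‘screen’, ‘size’ and ‘touchscreen’ all resolve to the same display category.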

The problem is to produce an aspect-based opinion polarity summary from the reviews. This summary can be generated by aggregating the aspect-based sentiment score (SS) of each review. The SS of an aspect category AC_l (the sentiment score at the aspect level) in a review is defined as:

$SS({AC}_{l}) = \sum_{j = 1}^{ml} p_{{at}_{j}}, \quad \text{where } p_{{at}_{j}} \in [-1, +1]$ (1)

The aspect-level sentiment summary (ASS) for a mobile can be defined by aggregating SS(AC_l) over the set of reviews R:

${ASS}_{p}({AC}_{l}) = \frac{1}{n} \sum_{i = 1}^{n} g_{p}(SS({AC}_{li})), \quad \text{where } g_{p}(x) = \begin{cases} 1, & x > 0 \\ 0, & \text{otherwise} \end{cases}$ (2)

and,

${ASS}_{n}({AC}_{l}) = \frac{1}{n} \sum_{i = 1}^{n} g_{n}(SS({AC}_{li})), \quad \text{where } g_{n}(x) = \begin{cases} -1, & x < 0 \\ 0, & \text{otherwise} \end{cases}$ (3)
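As a worked illustration of Eqs. (1) to (3), the sketch below computes the aspect-level score of one category in each review and then the positive and negative summaries over all reviews. The function names and the sample polarities are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def aspect_score(term_polarities):
    """Eq. (1): sum the polarities of the aspect terms of one category."""
    for p in term_polarities:
        if not -1.0 <= p <= 1.0:
            raise ValueError("polarity must lie in [-1, +1]")
    return sum(term_polarities)


def g_p(x):
    """Indicator used in Eq. (2): 1 for a positive score, else 0."""
    return 1 if x > 0 else 0


def g_n(x):
    """Indicator used in Eq. (3): -1 for a negative score, else 0."""
    return -1 if x < 0 else 0


def aspect_summary(scores_per_review):
    """Eqs. (2)-(3): return (ASS_p, ASS_n) for one category over n reviews."""
    n = len(scores_per_review)
    if n == 0:
        raise ValueError("at least one review is required")
    ass_p = sum(g_p(s) for s in scores_per_review) / n
    ass_n = sum(g_n(s) for s in scores_per_review) / n
    return ass_p, ass_n


# Four hypothetical reviews: positive, negative, no mention, and a cancelled score.
scores = [
    aspect_score([0.5, 0.25]),
    aspect_score([-0.75]),
    aspect_score([]),
    aspect_score([0.5, -0.5]),
]
print(aspect_summary(scores))  # (0.25, -0.25)
```

Reviews whose score for the category is exactly zero contribute to neither summary, which is why both values here have magnitude 1/4.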

For each review, a sentiment score can be computed by aggregating the polarities of all aspect categories found in that review. The sentiment score of a review r_i is defined as:

$SS (r_{i}) = \sum_{k = 1}^{11} SS ({AC}_{k})$ (4)

The review level sentiment summary (RSS) for a mobile can be generated as:

${RSS}_{p} = \frac{1}{n} \sum_{i = 1}^{n} g_{p}(SS(r_{i})), \quad \text{where } g_{p}(x) = \begin{cases} 1, & x > 0 \\ 0, & \text{otherwise} \end{cases}$ (5)

and,

${RSS}_{n} = \frac{1}{n} \sum_{i = 1}^{n} g_{n}(SS(r_{i})), \quad \text{where } g_{n}(x) = \begin{cases} -1, & x < 0 \\ 0, & \text{otherwise} \end{cases}$ (6)
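The review-level quantities in Eqs. (4) to (6) can be sketched in the same way. This is an illustration under one stated assumption: g_n is taken to mirror its definition in Eq. (3), that is, -1 for a negative score and 0 otherwise. The function names and sample scores are hypothetical.

```python
def review_score(category_scores):
    """Eq. (4): sum the aspect-level scores SS(AC_k) found in one review."""
    return sum(category_scores.values())


def review_summary(reviews):
    """Eqs. (5)-(6): return (RSS_p, RSS_n) over a list of reviews.

    Each review is a dict mapping an aspect category to its score SS(AC_k).
    Assumption: g_n(x) is -1 for x < 0 and 0 otherwise, as in Eq. (3).
    """
    n = len(reviews)
    if n == 0:
        raise ValueError("at least one review is required")
    totals = [review_score(r) for r in reviews]
    rss_p = sum(1 for t in totals if t > 0) / n
    rss_n = sum(-1 for t in totals if t < 0) / n
    return rss_p, rss_n


reviews = [
    {"CAMERA": 1.0, "BATTERY": -0.5},   # total  0.5  -> positive
    {"DISPLAY": -0.75},                 # total -0.75 -> negative
    {"SOUND": 0.5, "BODY": -0.5},       # total  0.0  -> neither
]
rss_p, rss_n = review_summary(reviews)
```

In this example one review out of three is positive and one is negative, so the two summaries evaluate to 1/3 and -1/3.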

The objective is to produce an aspect-based opinionated graphical summary. An integrated system has been implemented that takes a review as input, preprocesses it, identifies the different aspects, computes the sentiment polarity of each aspect, and generates graphical and extractive opinionated aspect-based summaries of the mobile. The graphical summary is created by aggregating the sentiment polarity of each aspect present in all reviews.

4. Proposed system architecture

Figure 1 shows the architectural block diagram of the proposed system. The system takes the mobile name and reviews as input, crawls the metadata from the GSMArena site if it does not already exist in the system, and preprocesses each review by removing repeated punctuation marks and breaking the review into sentences. The POS tagging module tags the sentences with parts of speech and passes them to the aspect identification module, which identifies the aspects in the sentences using the mobile metadata and the aspect vector. The final step is to compute the sentiment polarity of each identified aspect and generate the opinionated aspect-based extractive summary of that mobile.

Fig.1

Architectural block diagram.
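The preprocessing step described above (removing repeated punctuation marks and breaking a review into sentences) could look roughly like the sketch below. The regular expressions and function names are assumptions made for illustration; the splitter is deliberately naive and is not claimed to be the one used in the system.

```python
import re


def collapse_repeated_punctuation(text):
    """Reduce runs of the same punctuation mark ("!!!", "??", "...") to one."""
    return re.sub(r"([!?.,;:])\1+", r"\1", text)


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def preprocess(review):
    """Clean a raw review and return its list of sentences."""
    return split_sentences(collapse_repeated_punctuation(review))


print(preprocess("Camera is awesome!!! Battery drains fast... Not good??"))
# ['Camera is awesome!', 'Battery drains fast.', 'Not good?']
```

A production system would likely need a more careful splitter, since abbreviations and decimal numbers also contain full stops.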

4.1. Construction of aspect vector

As our work focuses on the mobile domain, we first identified the aspect categories of a mobile by examining the metadata/specifications of various mobiles on GSMArena. We examined around 500 mobile specifications. The identified aspect categories are: network, body, display, platform, performance, memory, camera, sound, communications, features, battery, overall and accessories. Next, the words that indicate mobile aspects were added as aspect terms to the corresponding aspect category. These aspect terms were gathered from a dataset of 1000 mobile reviews. Table 1 shows the final aspect vector. We have also created ontologies from the mobile metadata/specifications. The ontologies contain the name of the mobile, the brand of the mobile, and processor names.

Table 1
Aspect Vector

Aspect Category	Aspect Terms
NETWORK	GSM / CDMA / HSPA / EVDO / LTE, Network Call Quality, connectivity, network signals, call recording
BODY	Dimensions, dimension, build, body, design, designed, weight, build quality, weighs, weighted, iphone design, model, build quality
DISPLAY	Type, size, resolution, multitouch, glass, screen, display, touchscreen, touch screen, LED, LCD, touchpad, touch pad, display quality, touch, screen size, touch phone, HD, UI, screen display quality, screen quality
PLATFORM	Chipset, os, operating system, CPU, GPU, ios, processing, android
PERFORMANCE	Processor, clock speed, cache, startup, bootup, boot up, start up, boots up, starts up, performing, Operating System, performance, run, runs, perform, performs, speed, respond, responds, response, use, keyboard, navigate, navigation, battery performance, phone performance, touch response, processor speed, speed processor, processing, speed performance
MEMORY	Card slot, internal, external, RAM, ROM, GB, microsd
CAMERA	Primary Camera, Secondary Camera, cam, video, image, images, pic, pic, photo, photos, flash, HDR, panorama, front camera, back camera, cameras, night mode, camera quality, photo quality, iphone photos, phone pics, pictures, dual camera, photography, video recording quality, recording quality, video quality, slow-motion recording, photographs, front camera quality, video quality, picture quality, mobile camera, camera features, portrait mode, portrait
SOUND	Alert types, loudspeaker, speaker, speakers, 3.5mm jack, Vibration, proprietary ringtones, stereo, mic, headphone, headphones, audio, sound, voice, microphone, sound quality, audio quality, phone mic, speakers quality, loudspeaker sound, sound effect
COMMS	WLAN, Bluetooth, Wifi, Hotspot, AirDrop, 3G, 4G, LTE, VoLTE, GSM, 2G, WCDMA, TD, FDD, NFC, Radio, USB, GPS, hotspot, Bluetooth connectivity
FEATURES	Sensors, messaging, browser, Dual SIM, Fingerprint sensor, Facial Unlock, Digital compass, Ambient light sensor, Accelerometer, Status indicator, E-mail, features, finger print, sensor, finger print sensor, fingerprint sensor
BATTERY	Stand-by, standby, backup, back up, power, charger, battery, battery life, adapter, battery backup, battery backup, battery quality, charging, charges, battery charges
Overall	Colors, price, budget, price, cost, money, Model, mobile, phone, cellphone, cell phone, system, appearance, update, updates, windows, product, shopping, handset, looks, look, device, smartphone, mobile phone, looking, smart phone, pricing, Lenovo
Accessories	Pouch, back cover, flip cover, screen guard, tempered glass, cover, ear piece, packaging

4.2. Aspect identification

For aspect identification, the review is first preprocessed by removing repeated punctuation marks and then broken into sentences. Algorithm 1 is then used to identify the aspects in each sentence. The pseudocode of the algorithm is given below:

1. For each sentence s ∈ S
1.1. AspectList = {}
1.2. PS = s parsed through the Stanford POS Tagger
1.3. For each word in PS
1.3.1. If (word is found in the Aspect Vector (AV))
1.3.1.1. AspectList.add(word, position)
1.3.2. If (word is a noun)
1.3.2.1. Extract the noun phrase (NP)
1.3.2.2. If ((NP is found in AV) && (NP is not in AspectList))
1.3.2.2.1. AspectList.add(NP, position)
1.3.2.3. Else if ((NP is identified as an aspect through OD) && (NP is not in AspectList))
1.3.2.3.1. AspectList.add(NP, position)
1.4. Extract CDP through the Stanford Parser
1.5. For each cdp ∈ CDP
1.5.1. If ((cdp is found in AV) && (cdp is not in AspectList))
1.5.1.1. AspectList.add(cdp, position)

Here S, OD and CDP represent the set of sentences, the online dictionary, and the compound dependency relations respectively. TechTerms is used as the online dictionary, and the compound dependency relations are obtained from the Stanford dependency parser.
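The control flow of Algorithm 1 can be sketched in Python as follows. This is a simplified illustration under several assumptions: the sentence arrives already tagged (as pairs of word and Penn Treebank tag) rather than being passed to the Stanford tools, a noun phrase is approximated by the maximal run of adjacent nouns, and the online dictionary and compound relations are supplied as plain collections. The function name and all sample data are hypothetical.

```python
def identify_aspects(tagged, aspect_vector, online_dictionary=frozenset(), compounds=()):
    """Return (aspect, position) pairs for one POS-tagged sentence.

    tagged: list of (word, tag) pairs; compounds: (phrase, position) pairs
    assumed to come from a dependency parser's compound relations.
    """
    aspects = []  # (aspect, position) pairs in the order they are found

    def listed(term):
        return any(term == found for found, _ in aspects)

    n = len(tagged)
    for i, (word, tag) in enumerate(tagged):
        w = word.lower()
        if w in aspect_vector:                        # step 1.3.1
            aspects.append((w, i))
        if tag.startswith("NN"):                      # step 1.3.2
            # Simplified noun phrase: maximal run of adjacent nouns around i.
            start, end = i, i
            while start > 0 and tagged[start - 1][1].startswith("NN"):
                start -= 1
            while end + 1 < n and tagged[end + 1][1].startswith("NN"):
                end += 1
            np = " ".join(t.lower() for t, _ in tagged[start:end + 1])
            if not listed(np) and (np in aspect_vector or np in online_dictionary):
                aspects.append((np, start))           # steps 1.3.2.2 and 1.3.2.3
    for phrase, position in compounds:                # steps 1.4 and 1.5
        p = phrase.lower()
        if p in aspect_vector and not listed(p):
            aspects.append((p, position))
    return aspects


sentence = [("The", "DT"), ("battery", "NN"), ("life", "NN"), ("is", "VBZ"), ("great", "JJ")]
vector = {"battery", "battery life", "screen"}
print(identify_aspects(sentence, vector))
# [('battery', 1), ('battery life', 1)]
```

As in the pseudocode, both the single word and the longer noun phrase are recorded when each of them appears in the aspect vector.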

4.3. Sentiment polarity computation of aspects

The main objective is to compute the sentiment score of each aspect in the review. Algorithm 2 is used to perform this task. The pseudocode of Algorithm 2 is given below:

1.	For each sentence s ∈ S
1.1.	Identify the aspects and locate their positions (using Algorithm 1)
1.2.	If s contains one aspect
1.2.1.	Traverse five words to the left of the aspect and five words to the right of the aspect in search of an opinion word
1.2.2.	If the encountered word is an adjective, the AAC scheme [17] is used to compute the sentiment score of the aspect. For example, consider two sentences: (1) “Camera is awesome.” and (2) “Touch is very smooth”. Sentence 1 contains only the adjective “awesome”, whereas sentence 2 contains the adverb “very” and the adjective “smooth”. Sentence 1 talks about the “camera” aspect, whereas sentence 2 talks about “touch”. Thus, this rule is applied to both sentences to compute the sentiment polarity of the aspects.
1.2.3.	Else if the encountered word is a verb, the AVC scheme is used to compute the sentiment score of the aspect. For example, consider the sentence “This phone is highly recommended”. This sentence contains the adverb “highly” and the verb “recommended”. So, this rule is applied to compute the sentiment polarity of the aspect “phone”.
1.2.4.	Else if the encountered word is an adverb, the sentiment score is computed from the sentiment lexicon. For example, consider “There is no space for memory card”. This sentence contains only the adverb “no”. So, this rule is applied to compute the sentiment polarity of the aspect “memory card”.
1.3.	If s contains more than one aspect
1.3.1.	For each aspect Ai
1.3.1.1.	Follow steps 1.2.1 to 1.2.4

We have used the generic lexicon [16] derived from the MPQA subjectivity lexicon and SentiWordNet.
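The window search of Algorithm 2 can be illustrated with the sketch below. It is not the AAC or AVC scheme of [17], whose details are not given here; as a stand-in, an adverb directly before an adjective or verb simply scales the lexicon score. The tiny `LEXICON`, the `INTENSIFIERS` table, their values, and the function name are all hypothetical, and the sentence is assumed to be POS-tagged already.

```python
# Hypothetical toy lexicon and intensifier weights, for illustration only.
LEXICON = {"awesome": 0.875, "smooth": 0.5, "recommended": 0.75, "no": -0.5}
INTENSIFIERS = {"very": 1.5, "highly": 1.5}


def aspect_polarity(tagged, position, window=5):
    """Score the aspect at `position` from the nearest opinion word in the window."""
    lo = max(0, position - window)
    hi = min(len(tagged), position + window + 1)
    nearest_first = sorted(
        (i for i in range(lo, hi) if i != position),
        key=lambda i: abs(i - position),
    )
    for i in nearest_first:
        word, tag = tagged[i]
        w = word.lower()
        if w not in LEXICON:
            continue
        score = LEXICON[w]
        if tag.startswith(("JJ", "VB")):
            # Stand-in for the adverb combinations: scale by a preceding adverb.
            if i > 0 and tagged[i - 1][1].startswith("RB"):
                score *= INTENSIFIERS.get(tagged[i - 1][0].lower(), 1.0)
        elif not tag.startswith("RB"):
            continue  # only adjectives, verbs and adverbs act as opinion words
        return max(-1.0, min(1.0, score))
    return 0.0  # no opinion word within the window


touch = [("Touch", "NN"), ("is", "VBZ"), ("very", "RB"), ("smooth", "JJ")]
print(aspect_polarity(touch, 0))  # 0.75
```

For a sentence with several aspects, the same function would be called once per aspect position, matching step 1.3 of the pseudocode.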

4.4. Extractive sentiment summary generation

To evaluate the opinionated extractive summaries, we manually created five positive and five negative reference summaries for the camera and battery aspects. The ROUGE-L [33] algorithm has been used to evaluate the summaries. Table 5 shows the opinionated aspect-based extractive summary results. As the table shows, we achieved F-measure values of 0.435 and 0.415 for the camera and battery aspects respectively.
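ROUGE-L scores a candidate summary against a reference through their longest common subsequence (LCS). The sketch below shows that core computation for a single candidate and reference pair; it assumes whitespace tokenization, lowercasing, and an equally weighted F-measure, none of which is specified in this paper, and the function names are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def rouge_l(candidate, reference):
    """Return (recall, precision, f_measure) based on the LCS of the two texts."""
    c = candidate.lower().split()
    r = reference.lower().split()
    if not c or not r:
        return 0.0, 0.0, 0.0
    lcs = lcs_length(c, r)
    if lcs == 0:
        return 0.0, 0.0, 0.0
    recall = lcs / len(r)
    precision = lcs / len(c)
    f_measure = 2 * precision * recall / (precision + recall)
    return recall, precision, f_measure


recall, precision, f_measure = rouge_l("camera is good", "the camera is very good")
```

Here the LCS is "camera is good" (3 tokens), so recall is 3/5, precision is 3/3, and the F-measure is 0.75.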

For extractive sentiment summary generation, the sentences are first clustered aspect-wise. For each aspect, the clustering separates positive sentences into one cluster and negative sentences into another. After clustering, the sentences are ranked using the LexRank [29] algorithm. For each aspect, a positive and a negative summary are generated from the top 5 ranked sentences. An example of a positive summary for the camera aspect is given below:

“I will suggest this phone to anyone who wants a great camera and performance phone under 15K budget. Camera is good. Best camera to take photos in high quality. Great display and best camera quality in the range. Looks very nice, Smooth performance and Camera Quality also Superb.”
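The clustering and ranking steps of this section can be sketched as below. This is a simplified illustration of the LexRank idea, not the exact configuration used here: it assumes raw term-frequency vectors (no IDF weighting), the continuous variant without a similarity threshold, and a damping factor of 0.85. The function names and sample sentences are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def lexrank(sentences, damping=0.85, iterations=50):
    """Return one centrality score per sentence via power iteration."""
    n = len(sentences)
    if n == 0:
        return []
    vectors = [Counter(s.lower().split()) for s in sentences]
    rows = []
    for i in range(n):
        row = [cosine(vectors[i], vectors[j]) for j in range(n)]
        total = sum(row)
        # Row-normalise so each row is a probability distribution.
        rows.append([v / total for v in row] if total else [1.0 / n] * n)
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n + damping * sum(scores[i] * rows[i][j] for i in range(n))
            for j in range(n)
        ]
    return scores


def summarize(scored_sentences, top_k=5):
    """Build positive and negative summaries for ONE aspect.

    scored_sentences: (sentence, polarity) pairs already assigned to the aspect.
    """
    clusters = {"positive": [], "negative": []}
    for sentence, polarity in scored_sentences:
        if polarity > 0:
            clusters["positive"].append(sentence)
        elif polarity < 0:
            clusters["negative"].append(sentence)
    summary = {}
    for label, sents in clusters.items():
        ranked = sorted(zip(sents, lexrank(sents)), key=lambda p: p[1], reverse=True)
        summary[label] = " ".join(s for s, _ in ranked[:top_k])
    return summary


camera = [("Camera is good.", 0.5), ("Camera is bad.", -0.5), ("Best camera quality.", 0.75)]
result = summarize(camera)
```

Each summary is the concatenation of the highest-ranked sentences in its cluster, which is how an extractive summary like the camera example above is assembled.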

5. Experimental work

5.1. Dataset

We carried out our experimental work on three mobile datasets, created from mobile reviews crawled from Amazon. Three independent annotators were engaged to manually annotate the entries of each dataset, and the annotations were finally reviewed by two annotators. The annotators were asked to annotate each aspect identified in the dataset with its sentiment polarity. The quality of the annotation has been measured by two standard agreement parameters: Inter-Indexer Consistency (IIC) [30] and Cohen’s Kappa [31]. The datasets are named D1, D2, and D3 respectively. Table 2 shows the dataset details.

Table 2
Dataset detail

S. No.	Dataset	#Sentences	#Aspect	IIC	Cohen’s Kappa
1	Apple iPhone Plus Black 128GB (D1)	350	492	96.07%	86.78%
2	Lenovo-Venom-Black-System (D2)	500	738	96.38%	91.96%
3	Honor Blue (D3)	500	670	98%	90.21%
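For readers unfamiliar with the agreement measures reported in Table 2, the sketch below shows how Cohen's Kappa is computed for two annotators from the standard definition (observed agreement corrected for chance agreement). The function name and the sample labels are hypothetical and unrelated to the actual annotations.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's Kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    # Chance agreement: probability that both pick the same label independently.
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


annotator_1 = ["pos", "pos", "neg", "neg"]
annotator_2 = ["pos", "pos", "neg", "pos"]
print(cohens_kappa(annotator_1, annotator_2))  # 0.5
```

In this toy case the annotators agree on 3 of 4 items (0.75), chance agreement is 0.5, and Kappa is therefore (0.75 - 0.5) / (1 - 0.5) = 0.5.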

5.2. Evaluation

The performance of the system is evaluated through standard performance measures: precision (P), recall (R), and F-measure. The macro-averaging method has been used for the computation of precision. Table 3 shows the aspect-level results. The proposed aspect identification algorithm achieves at least about 77% recall, 78% precision, and 0.79 F-measure on all three datasets, outperforming the baseline approach in every case. The baseline approach takes a sentence as input and identifies the noun phrases in the sentence as aspects [32]. Table 4 shows the sentiment-level results. For the D1 and D2 datasets, the sentiment computation algorithm achieves nearly 80% precision, 80% recall, and 0.81 F-measure; on D3, precision (71.05%) and F-measure (0.769) are lower, although recall (83.75%) is higher. In Table 4, Z refers to the total number of aspects whose sentiment polarity is found correctly by the sentiment computation algorithm. Figures 2–4 show the graphical aspect-level summaries of datasets D1, D2, and D3 respectively. The red bars on the left depict the negative reviews, while the green bars on the right depict the positive reviews. Figure 2 shows that reviewers were completely satisfied with the platform, i.e. the operating system (iOS), of the mobile in dataset D1. On the other hand, they were not at all satisfied with the accessories that the vendor offered. Graphical summaries of this kind are easier to comprehend than reading all the reviews to reach a conclusion.

Table 3
Accuracy of aspect detection

S. No.	Dataset	Approaches	#Aspect (X)	#Observed Aspect	Correctly Detected Aspects (Y)	P (%)	R (%)	F-measure
1	D1	Baseline (Noun Phrases)	492	576	329	57.12	66.87	0.616
		Proposed Approach	492	473	438	92.60	89.02	0.908
2	D2	Baseline (Noun Phrases)	738	765	413	53.99	55.96	0.550
		Proposed Approach	738	697	567	81.35	76.83	0.790
3	D3	Baseline (Noun Phrases)	670	789	421	53.36	62.84	0.577
		Proposed Approach	670	727	566	77.85	84.48	0.810
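The figures in Table 3 appear consistent with precision computed as correctly detected aspects over observed aspects and recall as correctly detected aspects over annotated aspects (X). This reading is inferred from the reported numbers rather than stated explicitly, so the sketch below should be taken as a check of the arithmetic only; the function name is hypothetical.

```python
def detection_metrics(annotated, observed, correct):
    """Return (precision, recall, f_measure) for aspect detection counts."""
    precision = correct / observed
    recall = correct / annotated
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Proposed approach on D1 (Table 3): X = 492, observed = 473, Y = 438.
p, r, f = detection_metrics(annotated=492, observed=473, correct=438)
print(round(p * 100, 2), round(r * 100, 2), round(f, 3))
# 92.6 89.02 0.908
```

The same function reproduces the baseline row for D1 (57.12% precision and 66.87% recall from 576 observed and 329 correct aspects).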

Table 4

Sentiment level result

S. No.	Dataset	(Y)	(Z)	Confusion Matrix		Correctly Detected Polarity (Z)		Total	P (%)	(R = Z/Y) (%)	F-measure
						P	N
1	D1	438	359	Actual	P	287	20	307	79.97	81.96	0.810
					N	15	37	52
2	D2	567	453	Actual	P	295	30	325	82.97	79.89	0.814
					N	33	95	128
3	D3	566	474	Actual	P	417	25	442	71.05	83.75	0.769
					N	10	20	35
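The precision values in Table 4 can be reproduced by macro-averaging the per-class precision of each confusion matrix, assuming rows are the actual classes and columns are the predicted classes. That layout is an inference from the reported numbers, and the function name below is hypothetical.

```python
def macro_precision(confusion):
    """Macro-averaged precision for a square confusion matrix.

    confusion[i][j] = number of items of actual class i predicted as class j.
    """
    k = len(confusion)
    per_class = []
    for j in range(k):
        predicted_j = sum(confusion[i][j] for i in range(k))
        per_class.append(confusion[j][j] / predicted_j if predicted_j else 0.0)
    return sum(per_class) / k


d1 = [[287, 20], [15, 37]]   # D1 in Table 4: rows actual P/N, columns predicted P/N
d2 = [[295, 30], [33, 95]]   # D2 in Table 4
print(round(macro_precision(d1) * 100, 2))  # 79.97
print(round(macro_precision(d2) * 100, 2))  # 82.97
```

Macro-averaging gives the smaller negative class the same weight as the positive class, which is why the reported precision is noticeably lower than the precision of the positive class alone.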

Table 5

Rouge – L aspect level extractive summary result

S. No.	Aspect	Recall	Precision	F-Score
1	Camera	0.486	0.435	0.435
2	Battery	0.367	0.630	0.415

Fig.2

Graphical aspect level summary for dataset D1.

Fig.3

Graphical aspect level summary for dataset D2.

Fig.4

Graphical aspect level summary for dataset D3.

6. Conclusion

Aspect-based sentiment analysis provides an analysis of users’ opinions at a more granular level. In this paper, an integrated system has been developed that generates opinionated aspect-based graphical and extractive summaries from a large set of mobile reviews. The system extracts aspects in a given field, computes the sentiment polarity of each aspect, and generates opinionated visual summaries. The system has been evaluated on three mobile-review datasets, and it generates summaries from reviews without any training. The main objective was to develop a system that fetches real-time user reviews and deciphers aspect-level sentiments from them. However, the system does not currently deal with opinion shills that might be part of the collected reviews. This work can be extended by collecting more data about the products to strengthen the sentiment summary. For instance, data about product sales across all territories can be combined with the sentiment analysis. This would help to reconcile the sentiment summary results with the actual performance of the product in the market. Similarly, expert opinions about the fidelity of the product from leading technocrats in the field may be gathered from technical reports released about that product. This supplementary information can make the system more comprehensive.

Table 2. Dataset details

S. No.  Dataset                               #Sentences  #Aspects  IIC     Cohen’s Kappa
1       Apple iPhone Plus Black 128GB (D1)    350         492       96.07%  86.78%
2       Lenovo-Venom-Black-System (D2)        500         738       96.38%  91.96%
3       Honor Blue (D3)                       500         670       98%     90.21%
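
The Cohen’s Kappa column in Table 2 is an inter-annotator agreement measure. As a minimal sketch of how such a figure can be computed, the snippet below implements the standard two-annotator formula, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. The function name, the label names, and the toy annotations are hypothetical and are not taken from the datasets above; how the annotations were actually structured for D1 to D3 is not specified here.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    p_e = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one and the same label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical annotations: is each candidate term an aspect or not?
annotator_1 = ["aspect", "aspect", "other", "aspect",
               "other", "aspect", "other", "aspect"]
annotator_2 = ["aspect", "aspect", "other", "other",
               "other", "aspect", "other", "aspect"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.2f}")
```

In this toy case the annotators agree on 7 of 8 items (p_o = 0.875) and the chance agreement is 0.5, so kappa is 0.75. Kappa is lower than raw agreement because it discounts the matches that two annotators would produce by chance alone.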