Abstract
Probabilistic topic models are statistical methods that aim to discover the latent structure in a large collection of documents. The intuition behind topic models is that, by treating documents as generated from latent topics, the word distribution for each topic can be modelled and the prior distribution over topics learned. In this paper we apply this concept to the aspect detection problem in review documents by modelling the topics of sentences, in order to improve sentiment analysis systems. Aspect detection in sentiment analysis helps customers navigate effectively to detailed information about the features that interest them. The proposed approach assumes that the aspects of words in a sentence form a Markov chain. The novelty of the model is that it extracts multiword aspects from text data while relaxing the bag-of-words assumption. Experimental results show that the model performs the task significantly better than standard topic models.
1. Introduction
Sentiment analysis is the computational study of people’s opinions, attitudes, emotions or appraisals pertaining to topics, objects, products, services, organizations, individuals and events or their attributes. In the past few years, sentiment analysis for online customer reviews has attracted a great deal of attention from researchers in the fields of data mining and natural language processing [1–22]. From an application perspective, the interest in the topic can be explained by the information needs of both potential customers and commercial parties. For example, potential customers want to know the general opinion of other users before they use a service or buy a product, while companies want to track and monitor the way their products are commented on in order to adjust their offers and/or the marketing of their products. As the number of customer reviews expands, it becomes harder to obtain a comprehensive view of the opinions of previous customers about various aspects of services or products through manual analysis. Proper analysis and summarization of customer reviews can further support potential users and enable companies to check previous positive and negative opinions about specific features or aspects [1]. Therefore it is highly desirable to produce an automatic analysis or summary of customer reviews. One of the main challenges in sentiment analysis is to detect the precise feature or attribute of an object for which an opinion is expressed. In other words, the field requires methods for finer-grained sentiment analysis, which is commonly referred to as aspect-based sentiment analysis [1, 2].
For example, the words ‘disappointing’ and ‘expensive’ in the sentences ‘I found shots with this camera very disappointing’ and ‘It was too expensive for the shots that I got’ carry a negative sentiment orientation. The sentence ‘I did a good month’s worth of research before buying this rather than similar priced digital cameras’ is an un-opinionated sentence and its sentiment orientation is neutral.
The core of this task is the extraction of pairs of aspect and sentiment. Figures 1 and 2 illustrate two examples of aspect-based sentiment summaries, each summarizing all the reviews of a particular cellphone. For each example we can see which aspects have been taken into account, together with the number of positive and negative review sentences. In Figure 1 we have also highlighted individual review sentences.

Example 1 of an aspect-based sentiment summary.

Example 2 of an aspect-based sentiment summary.
Aspect-based or fine-grained sentiment analysis techniques typically solve the task in two phases [1–3, 7, 8, 10, 11, 14, 15, 17, 18, 20]. The first phase attempts to detect the aspects of an object and the second phase classifies and summarizes sentiment over each of these aspects. In this paper we focus on the improvement of a model for the first phase: aspect detection from customer reviews.
Existing aspect detection methods can be broadly classified into two major approaches: supervised and unsupervised [1, 2]. Supervised aspect detection approaches require a set of pre-labelled training data and, although they can achieve reasonable effectiveness, building sufficient labelled data is often expensive and needs much human labour. Since labelled data are not usually available, it is desirable to develop a model that works with unlabelled data. Additionally, owing to the wide range of products and services being reviewed on the internet, supervised, domain-specific or language-dependent models are often not practical. Therefore the framework for aspect detection must be robust and easily transferable between domains or languages. Unsupervised topic modelling approaches [17–22] such as the Probabilistic Latent Semantic Indexing (pLSI) model and Latent Dirichlet Allocation (LDA) [23–28] have enjoyed considerable popularity as a way to model latent aspects and topics in textual data. The basic idea in topic modelling is that documents are represented as random mixtures over latent topics, where each topic is characterized by a distribution over words [25]. Although topic models seem to benefit from the correlation between words and topics, the assumption in these models that the order of words in a sentence can be ignored is an oversimplification [29–31]. Relaxing the bag-of-words assumption is expected to produce better models for inferring latent aspects, in particular when the structure of sentences in a review document is taken into account. In this paper, sentence structure is covered by combining the information on word order and the semantic relations between words (e.g. co-occurrence patterns), and by attributing more importance to multiword aspects.
In this paper, we present a novel unsupervised approach based on topic modelling which addresses the core tasks necessary to detect aspects from review sentences in a sentiment analysis system. The proposed model is a generative topic model which incorporates the structure of review sentences for detecting aspects. The proposed Aspect Detection Model is based on Latent Dirichlet Allocation (ADM-LDA). However, unlike LDA, it eliminates the ‘bag-of-words’ assumption and differs from existing techniques in that it requires no labelled training data. As it is an unsupervised approach, the ADM-LDA model can easily be transferred between domains or languages.
In the remainder of this paper, existing work on aspect detection is discussed in detail in Section 2. Section 3 proceeds by reviewing the formalism of LDA. Section 4 describes the proposed ADM-LDA model, including the overall architecture and specific design aspects. Subsequently we describe an empirical evaluation and discuss the major experimental results in Section 5. Finally we conclude with a summary and some future research directions in Section 6.
2. Related work
Various approaches have been proposed for aspect detection from textual data [2–4, 10, 11, 13–15, 17, 18, 32, 33]. Previously proposed work is based on double propagation [8], unsupervised aspect detection [11, 14, 15] and supervised learning methods [4]. These approaches have the limitation that they do not group semantically related aspect expressions together [1]. Supervised methods, additionally, are often not practical owing to the fact that creating sufficient volumes of labelled data is often expensive and needs much human labour. In contrast, unsupervised topic modelling approaches for identifying aspect words have been shown to be effective [17–20]. Probabilistic topic models consist of a suite of algorithms whose aim is to extract latent structure from large collections of documents. These models all share the idea that documents are mixtures of topics and each topic is a distribution over words. In this paper we propose a new model for identifying aspect words that builds on the ideas of unsupervised topic modelling approaches.
Current topic modelling approaches are computationally efficient and also seem to capture the correlations between words and topics, but they have two main limitations. The first limitation is that they assume that words are generated independently of each other. This is known as the bag-of-words assumption. In other words, topic models only extract unigrams for topics in a corpus. The second limitation of current topic modelling approaches is the assumption that the order of words can be ignored. This is an unrealistic simplification. In the past few years several researchers have studied how to overcome these two limitations. Wallach [30] developed a bigram topic model on the basis of the hierarchical Dirichlet language model, using a hierarchical Bayesian model that integrates bigram-based and topic-based techniques for document modelling. Wallach’s model does not consider unigram aspects and always generates bigrams, while our proposed model extracts unigram aspects along with bigrams and n-gram phrases. Steyvers and Griffiths [26] proposed a model named the LDA Collocation model. It introduces a new set of parameters to decide whether to generate a unigram or a bigram aspect, although their model does not always generate a reasonable topic for a word or phrase. Compared with the LDA Collocation model, the model proposed in this paper has the advantage of using the structure of sentences in a document, that is, the co-occurrences and order of words. Wang et al. [31] improved the LDA Collocation model by presenting the Topical n-grams model (TNG), which makes it possible to decide whether to form an n-gram for consecutive words, depending on the nearby context and co-occurrences. Our proposed model improves on this model by working with sentences rather than documents, and by extracting multiword aspects through considering n-grams and assigning the same topic to consecutive words in a sentence.
All of these models assume that the words in documents are generated by a latent topic assignment with regard to the n previous words in the document. Gruber et al. [29] follow the same line of research, as well as assuming Markovian relations between latent aspects. They propose a hidden topic Markov model in which all words in the same sentence have the same topic and successive sentences are more likely to have the same topics. Although the hidden topic Markov model successfully extracts topics by using the order of words, it does not work well for detecting n-gram aspects. We follow this promising line of research by extending existing topic models for aspect detection in sentiment analysis. We believe that a topic model that considers unigrams and phrases is more realistic and would be more useful in applications. The commonly used approaches for topic modelling assume that the subsequent words in a document or a sentence have different aspects, which is not a well-founded assumption. In our model, in addition to extracting unigrams and phrases for aspects, we assume that the aspects of words in a sentence form a Markov chain and that subsequent words are more likely to have the same aspect. Therefore, in this paper we propose a new topic modelling approach that can automatically extract aspects using the structure of sentences in review documents. The proposed model captures multiword aspects from text data, and relaxes the ‘bag-of-words’ assumption of topic modelling. Our work on aspect detection is designed as an unsupervised model, so as to make it transferable across different domains, as well as across languages.
3. Latent Dirichlet Allocation
Topic models are based upon the assumption that documents are mixtures of topics, where a topic is a probability distribution over words. A topic model is a generative model for documents: it specifies a probabilistic procedure, based on probabilistic sampling rules, that describes how words in documents might be generated on the basis of random variables. Hofmann [27] introduced the probabilistic topic model pLSI to represent documents. The pLSI model is a useful probabilistic model of text, but it is incomplete in that it provides no probabilistic model at the level of documents [23]. The issue is that, in pLSI, each document is represented as a list of numbers, and there is no generative probabilistic model for these numbers. This leads to several problems, such as the number of parameters in the model increasing linearly with the size of the corpus, which causes serious problems with overfitting. Also, it is not clear in the pLSI procedure how to assign a probability to a document outside of the training set [24]. One other issue with the pLSI model is that it does not make any assumptions about how the mixture weights are generated, making it difficult to test the generalizability of the model to new documents [26]. Blei et al. [23] extended this model by introducing Dirichlet priors on the parameters of the model, calling the resulting generative probabilistic model Latent Dirichlet Allocation. LDA is a well-defined generative model that generalizes easily to new documents and overcomes the issues of pLSI by treating the topic mixture weights as hidden random variables [23, 24]. LDA is one of the most popular topic models; its probabilistic procedure connects the parameters of documents via a hierarchical generative model. Hence, this section describes LDA from the principles of generative probabilistic models as a basic model for the proposed aspect detection model for sentiment analysis.
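The parameter-growth issue noted above for pLSI can be made concrete with the parameter counts usually quoted for the two models. The symbols here are assumptions for illustration: k is the number of topics, V the vocabulary size and D the number of training documents.

```latex
\begin{align*}
\text{pLSI:}\quad & kV + kD && \text{(grows linearly with the number of training documents } D\text{)}\\
\text{LDA:}\quad  & k + kV  && \text{(independent of } D\text{)}
\end{align*}
```

Because LDA draws each document's topic mixture from a shared Dirichlet prior instead of fitting it as a free parameter, adding documents does not add parameters.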
Figure 3 shows the graphical model of the LDA. In this graphical notation, nodes are random variables and edges indicate conditional dependencies between variables. Shaded and unshaded variables indicate observed and latent (i.e. unobserved, hidden) variables, respectively, while plates refer to repetitions of sampling steps with the variable in the lower-right corner referring to the number of samples [23–26].

The graphical model for the LDA.
Given a corpus with a collection of D documents, LDA assumes that each word w is associated with a latent topic z. Each of these topics

Definition of generative process in LDA.
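The generative story of LDA can be sketched in a few lines of code. This is a minimal illustration of the standard formulation with symmetric Dirichlet priors, not the authors' implementation; the function names, the toy vocabulary and the fixed document length are hypothetical choices made for the example.

```python
import random


def sample_dirichlet(concentration, size, rng):
    """Draw from a symmetric Dirichlet via normalised Gamma variates."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]


def generate_lda_corpus(num_docs, doc_len, num_topics, vocab, alpha, beta, seed=0):
    """Generate documents as lists of (word, topic) pairs under plain LDA."""
    rng = random.Random(seed)
    # One word distribution per topic: phi_k ~ Dirichlet(beta).
    phi = [sample_dirichlet(beta, len(vocab), rng) for _ in range(num_topics)]
    corpus = []
    for _ in range(num_docs):
        # One topic mixture per document: theta_d ~ Dirichlet(alpha).
        theta = sample_dirichlet(alpha, num_topics, rng)
        doc = []
        for _ in range(doc_len):
            # Topic for this word position, then a word from that topic.
            z = rng.choices(range(num_topics), weights=theta)[0]
            w = rng.choices(vocab, weights=phi[z])[0]
            doc.append((w, z))
        corpus.append(doc)
    return corpus
```

Calling `generate_lda_corpus(3, 5, 2, ["zoom", "lens", "battery", "price"], 0.5, 0.5)` yields three five-word documents. Every word is drawn independently given its topic, which is exactly the bag-of-words assumption that ADM-LDA relaxes.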
Each document in the corpus is a sequence of Nd words and the total corpus length is N. The procedure in Figure 4 implies a joint distribution over the random variables (w, z, φ, θ), which is given by:
where
The goal of LDA is to find a set of model parameters, namely the topic proportions and the topic-word distributions. Standard statistical techniques can be used to invert the generative process of LDA, thus inferring the set of topics that were responsible for generating a collection of documents. Exact inference in LDA is generally intractable, therefore approximate inference algorithms are needed for posterior estimation. The most common approaches used for approximate inference are Expectation-Maximization, Gibbs sampling and variational methods [1, 2, 23–34].
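For reference, in the standard formulation of LDA [23] the joint distribution over (w, z, φ, θ) factorizes as shown below. The indexing is an assumption made here for illustration (K topics, D documents, N_d words in document d, symmetric priors α and β) and may differ from the notation of the original equation.

```latex
\begin{equation*}
p(\mathbf{w}, \mathbf{z}, \theta, \varphi \mid \alpha, \beta)
  = \prod_{k=1}^{K} p(\varphi_k \mid \beta)
    \prod_{d=1}^{D} \Bigl[ p(\theta_d \mid \alpha)
    \prod_{n=1}^{N_d} p(z_{d,n} \mid \theta_d)\,
                      p(w_{d,n} \mid \varphi_{z_{d,n}}) \Bigr]
\end{equation*}
```

Posterior inference targets p(z, θ, φ | w, α, β), whose normalizing constant requires summing over all topic assignments; this is the source of the intractability mentioned above.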
4. ADM-LDA: aspect detection model for sentiment analysis
4.1. Model description
ADM-LDA is a novel generative topic model that aims to extract aspects from online reviews. ADM-LDA is an extension of LDA that incorporates the structure of review sentences for mining aspects from the corpus. This model extracts latent aspects from reviews by making use of document-level information as well as the order of words in each document. The ADM-LDA model is similar to the LDA model in tying together the parameters of different documents via a hierarchical generative model, but unlike the LDA model it does not assume that documents are a ‘bag-of-words’. In LDA, the positions of individual words are neglected for topic inference [23]. ADM-LDA instead assumes that the topics of words in a document form a Markov chain, and that subsequent words are more likely to have the same topic. A Markov chain is a random process that undergoes transitions from one state to another on a state space according to transition probabilities. It is a collection of random variables

The proposed model for aspect detection for sentiment analysis.
ADM-LDA, in addition to the two sets of random variables z and w, introduces a new set of variables x to detect aspects from the review. ADM-LDA assumes that the topics in a sentence form a Markov chain with a transition probability that depends on θ, a distribution
With the new set of variables x defined, it is assumed that we have a corpus of D review documents denoted by
Following that, one chooses an aspect

Formal definition of the generative process in ADM-LDA.
The hyperparameters α and β in ADM-LDA can be treated as the prior observation counts for the number of times an aspect j is sampled in a document, and the number of times words are sampled from single-word aspect j, respectively, before having observed any actual words from that document. Similarly, the hyperparameters τ and δ can be interpreted as the prior observation counts for the number of times a word forms a multiword aspect j with its previous word and the number of times words are sampled from a multiword aspect j, respectively, before any word from the corpus is observed.
The notation of the ADM-LDA model is explained in Table 1. Based on the procedure in Figure 6, after the model parameters have been determined and given a document d, the posterior probability of the latent topic and status variable of each word in document d is defined as:
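One possible reading of the generative description above can be sketched as follows. This is an assumption-laden illustration, not the authors' exact procedure: it supposes that the status variable x is binary, that x = 1 makes a word continue a multiword aspect with its predecessor (inheriting its aspect and drawing from σ conditioned on the previous word), and that x = 0 draws a fresh aspect from θ and a word from φ. All function names and the toy parameter shapes are hypothetical.

```python
import random


def dirichlet(concentration, size, rng):
    """Symmetric Dirichlet draw via normalised Gamma variates."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]


def build_parameters(num_aspects, vocab_size, alpha, beta, tau, delta, rng):
    """Draw toy parameters; priors follow the roles given to alpha, beta, tau, delta."""
    theta = dirichlet(alpha, num_aspects, rng)                      # aspect mixture of one document
    phi = [dirichlet(beta, vocab_size, rng) for _ in range(num_aspects)]  # single-word aspects
    sigma = [[dirichlet(delta, vocab_size, rng) for _ in range(vocab_size)]
             for _ in range(num_aspects)]                           # multiword: sigma[z][prev_word]
    psi = [[rng.betavariate(tau, tau) for _ in range(vocab_size)]
           for _ in range(num_aspects)]                             # psi[z][prev_word] = P(x = 1)
    return theta, phi, sigma, psi


def generate_sentence(length, theta, phi, sigma, psi, vocab, rng):
    """Generate one sentence as (word, aspect, status) triples."""
    sentence = []
    prev_w = prev_z = None
    for i in range(length):
        # The first word cannot continue a phrase, so its status is 0.
        x = 1 if i > 0 and rng.random() < psi[prev_z][prev_w] else 0
        if x == 1:
            z = prev_z  # same aspect as the previous word
            w = rng.choices(range(len(vocab)), weights=sigma[z][prev_w])[0]
        else:
            z = rng.choices(range(len(theta)), weights=theta)[0]
            w = rng.choices(range(len(vocab)), weights=phi[z])[0]
        sentence.append((vocab[w], z, x))
        prev_w, prev_z = w, z
    return sentence
```

Under this reading, the aspect sequence within a sentence is a Markov chain: each aspect depends on the past only through the previous aspect and word, and runs of x = 1 correspond to multiword aspects.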
Notation used in this paper.
There are four sets of latent variables that we need to infer in ADM-LDA, that is, the per-document aspect distribution θ, the per-corpus single-word aspect-word distribution φ, the per-corpus multiword aspect-word distribution σ and the per-word distribution of status variables ψ with regard to previous aspect and word.
4.2. Model inference
In order to estimate the distributions θ, φ, σ and ψ, we first use Gibbs sampling to estimate the posterior distribution over z and x. In Gibbs sampling, each latent variable is drawn in turn from a probability distribution conditioned on the current assignments of all other latent variables and the observed data [33]. Specifically, in ADM-LDA, the aspect assignment
Letting the subscript
and
The pseudo-code for the Gibbs sampling procedure of ADM-LDA model is shown in Figure 7.

Gibbs sampling procedure of ADM-LDA model.
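As a simpler point of reference, a collapsed Gibbs sampler for plain LDA can be sketched as below. It is not the ADM-LDA sampler: it omits the status variables x and the multiword distributions σ and ψ, and it uses the standard LDA conditional. The function name and the toy input format (documents as lists of integer word ids) are assumptions for the example.

```python
import random


def lda_gibbs(docs, num_topics, vocab_size, alpha, beta, iterations, seed=0):
    """Collapsed Gibbs sampling for plain LDA; docs are lists of word ids."""
    rng = random.Random(seed)
    n_dz = [[0] * num_topics for _ in docs]               # topic counts per document
    n_zw = [[0] * vocab_size for _ in range(num_topics)]  # word counts per topic
    n_z = [0] * num_topics                                # total words per topic
    assign = []
    # Random initial topic assignment for every word position.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            n_dz[d][z] += 1
            n_zw[z][w] += 1
            n_z[z] += 1
        assign.append(zs)
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment from the counts.
                z = assign[d][i]
                n_dz[d][z] -= 1
                n_zw[z][w] -= 1
                n_z[z] -= 1
                # Conditional for this word given all other assignments.
                weights = [
                    (n_dz[d][k] + alpha) * (n_zw[k][w] + beta) / (n_z[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                n_dz[d][z] += 1
                n_zw[z][w] += 1
                n_z[z] += 1
    # Point estimates from the final sample (smoothed relative frequencies).
    theta = [[(n_dz[d][k] + alpha) / (len(doc) + num_topics * alpha)
              for k in range(num_topics)] for d, doc in enumerate(docs)]
    phi = [[(n_zw[k][w] + beta) / (n_z[k] + vocab_size * beta)
            for w in range(vocab_size)] for k in range(num_topics)]
    return theta, phi, assign
```

The final two list comprehensions show how count-based estimates of the per-document topic proportions and the topic-word distributions are read off a sample once the chain has been run long enough.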
The Gibbs sampling procedure can be run until a stationary state of the Markov chain has been reached [25, 26, 33]. Markov chain samples are then used to approximate ADM-LDA model parameters. The approximate probability of aspect z in review document d,
The approximate probability of word w in a single-word aspect z is:
The approximate probability of word v in a multiword aspect z given the previous word w is:
The approximate probability of the status variable x = k given the previous word w and the aspect z is:
5. Experimental results
In this section, we describe the evaluation of the proposed ADM-LDA model in a variety of settings, and compare the model with the original LDA model both qualitatively and quantitatively. For the qualitative analysis we present a number of aspects generated by both the LDA topic model and our new ADM-LDA model to show that the ADM-LDA aspects are more informative, coherent and better correlated with the features of an object. For the quantitative analysis we show that the aspects generated by the ADM-LDA topic model significantly improve performance over the standard LDA topic model. In the following, the data collection, the evaluation measure and the main experimental results are discussed.
5.1. Data collection
We employed datasets of Customer Reviews for five products with aspect annotations for the purposes of our evaluation [3]. This dataset focuses on different domains of electronic products – Apex AD2600 Progressive-scan DVD player, Canon G3, Creative Labs Nomad Jukebox Zen Xtra 40 GB, Nikon Coolpix 4300, and Nokia 6610 – and has been widely used by researchers for opinion mining. Table 2 shows the number of reviews, the number of review sentences and the number of manually tagged product aspects for each product in this dataset. Since these five datasets are small for aspect detection in review mining, we crawled many other product reviews from Amazon.com and cnet.com. The details of each dataset are given in Table 3. Newly extracted product reviews are from the same domain as Table 2; the difference is that they are not from the same specific product but from similar series of the product [11].
Summary of customer review dataset.
Detailed information of the four review datasets.
Since the product features in the customer review datasets in Table 2 have already been annotated by human annotators, these annotated product features are grouped manually to form a gold standard for the corresponding domain and we use them as reference values for each dataset. For example, for the cellular phone dataset, the aspects published by Google products are adopted, and all the features are grouped into nine different categories for aspects.
5.2. Evaluation measure
The performance of the ADM-LDA model is evaluated using the Rand Index, which has been used by several researchers [22, 35]. In data clustering, the Rand Index is a measure of the similarity between two clusterings. Here it calculates the agreement between the review aspect categories constructed by a model and the annotated groups of features. Therefore, for each model, the accuracy of the detected aspects is evaluated using the Rand Index, a standard measure of clustering similarity:
This metric allows for a measure of agreement between two partitions,
For quantitative analysis of the experimented models we compute the Rand Index of a word clustering for a topic based on that analysis with respect to the gold standard. A Rand Index of 1 indicates that the clustering is identical to the gold standard and lower indices indicate worse agreement.
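The Rand Index described above can be computed directly from its usual definition: the fraction of item pairs on which two partitions agree, that is, pairs placed together in both or apart in both. The sketch below is a minimal illustration; the function name and the label-list input format are assumptions for the example.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Rand Index between two clusterings given as parallel label lists."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same items")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two items")
    # A pair agrees when both clusterings put it together, or both keep it apart.
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)
```

The measure depends only on the grouping, not on the label names, so `rand_index([0, 0, 1, 1], [1, 1, 0, 0])` is 1.0. Multiplying the returned value by 100 gives the percentage form in which results are reported in Section 5.3.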
5.3. Accuracy of aspect detection
In our experiments, after preprocessing and extracting the sentences from the textual datasets, and after stop word removal, we used the Gibbs sampling algorithm both for ADM-LDA and LDA and ran the chain for 1000 iterations to produce a sample of latent variables for each of the experiments. Previous studies have shown that topic models are not sensitive to hyperparameters and can produce reasonable results with a simple symmetric Dirichlet prior [18, 26, 32]. During the Gibbs sampling we used empirical values for the symmetric priors
For the qualitative analysis, the top extracted words of the models are presented in Table 4, with five randomly selected aspects, shown as one aspect per line. This table shows the results of aspects for the reviews of digital cameras as an example. We manually assigned labels to coherent aspects to reflect our interpretation of their meaning.
Top words from ADM-LDA and LDA aspects for digital camera reviews.
All top words in the five different aspects from the results of the ADM-LDA model seem to correspond to rateable aspects, while the results of LDA are of relatively low quality. For example, for the aspect ‘appearance’, words such as ‘feels’, ‘instances’, ‘recommend’ and ‘love’ are not closely related or coherent. Also, LDA could not find aspects such as ‘shutter’, ‘optical zoom’, ‘manual function’ and ‘battery life’. These aspects are specific details that people evaluate about digital cameras. Hence, the aspects found by LDA tend to be more general and less coherent. This difference comes from ADM-LDA’s assumptions, which relax the bag-of-words representation of a document and suppose that each sentence exhibits one aspect. Therefore the aspects of ADM-LDA are more informative, coherent and better correlated with the features of the object, whereas the aspects of LDA either do not seem semantically coherent or do not represent the digital camera’s features or attributes.
For the quantitative analysis, we compared four models: LDA, ADM-LDA with only unigram aspects, ADM-LDA with only bigram aspects, and ADM-LDA with unigram, bigram and trigram aspects, which we call ADM-LDA with n-grams. All the models were evaluated using the Rand Index to measure the agreement between predicted results and manually labelled data. The models were run using different numbers of aspects (20, 40, 60, 80, 100 and 120) in four domains.
Figure 8 shows the Rand Index results of the ADM-LDA models and the LDA model for the cellular phone, digital camera, DVD player and MP3 player reviews. As can be seen from the figure, ADM-LDA with n-grams shows a significant improvement over all other models in all of the datasets. ADM-LDA considering only bigrams has the worst results and ADM-LDA with the full combination of features has the best outcomes. Specifically, as an example, in Figure 8a, the results for cellular phone reviews for the LDA model range from 71.36 to 76.61%, whereas they range from 75.43 to 76.96% for the ADM-LDA model considering only unigram aspects. For ADM-LDA considering only bigram aspects, the results range from 56.74 to 58.09%. Finally, for ADM-LDA with n-grams the Rand Index ranges from 78.17 to 85.85%. Figure 8 shows that, in almost all of the settings for smaller numbers of aspects, the improvements were more than 2% for the digital camera reviews, around 2% for the cellular phone corpus, around 1% for the DVD player corpus and around 3% for the MP3 player dataset. With more aspects, the improvements were slightly less, but still 3% for ADM-LDA with n-grams for all of the datasets. Beyond 100 aspects, the performance of the models dropped slightly. Therefore for almost all cases, 80–100 aspects seem to be optimal.

The test results of Rand Index measure of LDA model and ADM-LDA with three conditions of using unigrams, bigrams and n-grams in four domains of reviews: cellular phone, digital camera, DVD player and MP3 player.
When comparing the models, three observations can be made. Firstly, ADM-LDA with n-gram aspects outperforms the other models in all review datasets across the aspect settings. Secondly, the performance of ADM-LDA when including only unigrams is better than that of LDA. This stems from the assumption in ADM-LDA that the order of words is important for lexical meaning. Finally, LDA performs better than ADM-LDA with bigrams in all of the datasets. This suggests that using a model with unigrams is more realistic than a bigram topic model which always generates bigrams. After all, unigrams are the major components of a review document.
From Figure 8, the best Rand Index results of the different settings for the ADM-LDA topic models and the LDA model in the four domains can be determined; they are listed in Table 5. This table shows that the aspects generated by the ADM-LDA topic model significantly improve performance over the standard LDA topic model.
Best Rand Index results of aspect detection for the model compared with LDA model.
According to Table 5, comparing the datasets, the highest performance among the models compared is found for the MP3 player review dataset. It is notable that all models perform better when the size of the training dataset is larger. In Table 5 the lowest Rand Index is achieved by ADM-LDA with bigrams and the highest score by ADM-LDA with n-grams. Overall, from the results it can be concluded that the complete form of the ADM-LDA model outperforms the other methods.
5.4. Comparing the model with standard topic models
In order to compare the proposed model with previous standard topic models for the aspect detection problem in sentiment analysis, we used the Rand Index results of the TNG model, the LDA Collocation model and Wallach’s model. Wallach’s model is a bigram topic model based on LDA that incorporates the concept of aspect into bigram models. The LDA Collocation model develops an n-gram topic model on the basis of LDA which tries to improve Wallach’s work by extracting unigrams and bigrams. The TNG model extends Wallach’s model and the LDA Collocation model. The TNG model can extract both unigram and bigram aspects, making it possible to decide whether to form a bigram for the same two consecutive words depending on their co-occurrences. The ADM-LDA model is an improvement over TNG in extracting unigrams, bigrams and n-grams using the structure of review sentences.
Figure 9 illustrates the comparative evaluation results when we set empirical values for the symmetric priors

The test results of Rand Index measure of ADM-LDA, TNG, LDA Collocation and Wallach’s model in four domains of reviews: cellular phone, digital camera, DVD player and MP3 player.
Figure 9 indicates that the models using n-gram aspects are more effective than Wallach’s model, which uses only bigrams. The best result for Wallach’s bigram model is 60.11%, while the best results for the n-gram models are 87.58, 82.83 and 80.07% for the proposed ADM-LDA model, the TNG model and the LDA Collocation model, respectively.
It is interesting to note from Figure 9 that, when the number of extracted aspects is higher, the results are better. As can be seen from this figure, the worst Rand Index results are for extracting 20 aspects and the best results are gained by extracting around 100 aspects. Table 6 shows the Rand Index results of aspect detection for different models in extracting 100 aspects. From Table 6 we observe that the best accuracies are for the proposed ADM-LDA model, which is an improvement over other topic model approaches.
Rand Index of aspect detection for different models in extracting 100 aspects.
From the Rand Index results in Figure 9 and Table 6, it can be observed that using more training data in the domain of customer textual reviews significantly affects the performance of the methods. These results show that the best results among the four domains are for the MP3 player reviews, with a best result of 87.58%. The DVD player dataset has the lowest results among the domains; its corpus also contains many un-opinionated review sentences.
These comparative evaluations show that the proposed model, ADM-LDA, is more effective for aspect detection in a sentiment analysis system. The existing LDA topic model assumes that documents are bags of words and can be represented as random mixtures of topics. This assumption causes LDA to capture more global aspects from review sentences. ADM-LDA relaxes this assumption by using co-occurrences of the words within sentence boundaries, assuming that each sentence represents an aspect and that consecutive words may form a multiword aspect. For example, our model is effective in detecting aspects such as ‘shutter’, ‘digital zoom’ and ‘battery charging system’, while the probabilistic LDA model failed to extract these aspect-denoting phrases. Additionally, we can tune the parameters in our model to extract aspects with fewer or more words; for example, the aspect ‘canon power shot g3’ can be found by the model. Hence, the results show that a completely unsupervised approach to aspect detection in sentiment analysis can achieve promising performance. Finally, the results for the proposed ADM-LDA model suggest that using the structure of review sentences, that is, the combination of word order, inter-relation information between words and the attribution of more importance to multiword aspects, outperforms the bag-of-words models.
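The way consecutive words may form a multiword aspect can be illustrated with a small post-processing sketch. It assumes, for illustration only, that each word carries a binary status flag where 1 means the word attaches to the previous word; the function name and this encoding are hypothetical and are not taken from the model description.

```python
def merge_multiword_aspects(words, status):
    """Join each word whose status flag is 1 onto the phrase that precedes it."""
    if len(words) != len(status):
        raise ValueError("words and status must have the same length")
    phrases = []
    for word, x in zip(words, status):
        if x == 1 and phrases:
            # Continue the current phrase.
            phrases[-1] = phrases[-1] + " " + word
        else:
            # Start a new single-word (or first-word) aspect candidate.
            phrases.append(word)
    return phrases
```

For instance, `merge_multiword_aspects(["battery", "charging", "system", "works"], [0, 1, 1, 0])` groups the first three words into the phrase 'battery charging system' and leaves 'works' on its own.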
6. Conclusions
In this paper we have studied a model for the aspect detection problem in sentiment analysis for review documents. When mining reviews, it is often expensive to produce labelled data, and so it is desirable to develop a model that does not require labelled training data. Therefore in this paper we proposed an unsupervised probabilistic topic model for detecting aspects from reviews. The proposed model, named ADM-LDA, is an extension of the LDA topic model. The ADM-LDA model detects aspects from review documents by considering the underlying sentence structure of a document. In other words, ADM-LDA models the aspect distributions with a Markov chain and relaxes the assumption that the aspect distribution within a document is conditionally independent. The proposed model is able to deal with three major bottlenecks: the need for labelled data, using the structure of reviews and identifying coherent aspects. Our experimental results indicate that the ADM-LDA model is effective in performing the task and outperforms the basic LDA topic model.
There are several ways in which the research described here can be extended. One direction is to further improve and refine the proposed model for aspect detection so that it extracts opinion words and identifies their orientations. In other words, the model can be developed to extract aspect and sentiment jointly from review documents. Another direction is to evaluate sentiment analysis systems in online social media domains such as Twitter to investigate their performance for large-scale data processing. Finally, we aim to build a language-independent sentiment summary system.
