Abstract
In this paper we propose a Hidden Markov Model (HMM) to predict the sentiment of soccer fans based on information about match results. The model was constructed from data collected from a social network where fans of a soccer team periodically express their feelings towards their team. We show that the choice of an HMM is justified because the change in a fan’s sentiment is analogous to a Markovian process of state change over time. A comparative evaluation is performed between variations of the proposed models, and between the most accurate of them and classification algorithms. A second order HMM that considers the match results and fans’ gambling information is the most accurate model, even though the models are constructed from results of different kinds of championships.
Introduction
The interactions performed in digital media, which characterize the era of Web 2.0, allow for previously unimaginable studies of human social behavior. The extraction of patterns from social networks, for instance, provides affordable and reliable means to analyze crowd sentiment. Not surprisingly, researchers explore this subject from multiple fields such as Information Retrieval [21], Natural Language Processing [6,17,23] and Machine Learning [19], among others.
Our work falls into this general context of learning crowd sentiment from data provided by users of a given social network. Three differences, however, must be noted. The first relates to our research goal itself, which is to construct a predictive model rather than to extract sentiments from the interactions of a group of people. It is common to find research that studies the impact of a crowd’s mood on events in society. A recent example is [4], which studied the influence of the sentiment expressed by Twitter users on the protests that took place in Egypt in 2011. A correlation between the strength of the riots and the number of negative tweets on a particular day was demonstrated. In another context, it has been shown that a change in crowd sentiment can also affect the acceptance of a product on the market. In fact, the financial success of a product is related to the sentiment people express about it. For example, [16] shows that positive or negative blog reviews of movies immediately before their release affect their financial success. Despite their benefits, these approaches are limited because people’s posts on social networks are not always made in real time or immediately after the event affecting their mood. Soccer fans, for instance, participate in social networks with varying intensity according to their team’s victories and defeats. Inferring their sentiment right after a match, even when the final result is known, might not be a viable option, and this reduces inference power. It is difficult for law enforcement authorities, for instance, to use information about the mood of fans to prevent crowd violence after a sporting event.
The second difference relates to our basic research hypothesis. We assume that people’s mood changes continually over time in response to events that affect them. This may be modeled as an evolutionary process of change of state over time. Unlike most sentiment analysis work, which generally uses classification models, we argue that the sentiment felt at a given time is influenced by sentiments previously held.
Lastly, the third difference, from a practical standpoint, is that we build our model from social network data without applying Natural Language Processing techniques to treat semi-structured data. These techniques are unnecessary because we use a database that originates from a social network called FootyCrowd, where fans select their sentiment explicitly.
The data collected from FootyCrowd and our basic research hypothesis have led us to propose a model that represents the sentiment of a group of fans, which evolves over time and is influenced by the results of matches in official championships. This structure naturally led us to choose Hidden Markov Models (HMM) as the modeling instrument. In the Markov model, the fans’ sentiment is represented by latent variables, while the results of matches and the characteristics of the participating teams are the observations, or visible variables. The results obtained with several HMM variations, both in the number of states and in the model’s order, show that a second order HMM that considers the match results and fans’ gambling information is more accurate than the other variants. Comparisons with classification algorithms also demonstrate the advantages of using HMM.
The rest of this article will be structured as follows. First, we will present a brief review of Hidden Markov Models and a few of the algorithms we will use to explore the sentiment model created for the fans. Afterward, we will describe how HMM was used to model the sentiment of soccer fans, covering modeling options of varying complexity depending on the amount of information collected about the matches.
A comparative evaluation will be performed between variations of the proposed models and also between the most accurate of them and classification algorithms.
Hidden Markov model
“A HMM is a doubly stochastic process with an underlying stochastic process that is not observable (it is hidden), but can only be observed through another set of stochastic processes that produce the sequence of observed symbols.” [22]
Formally, we have a set O of observations
Figure 1 presents a first order Markov chain. In order to estimate a following state,

Markov chain.
From both variables, state and observation, the Markov model is formed by a transition model,
In short, a Markov model is composed of the following:
A set of states
A set of observations
A vector π with initial probabilities for each state
A transition matrix A, with a transition model
An observation matrix B, representing the observation model
Such a model supports probabilistic inference through algorithms such as Baum–Welch [22], which adjusts the model’s matrices to an observation sequence, and the Viterbi algorithm [24], which infers the most probable sequence of states.
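To make the decoding step concrete, the following is a minimal sketch of the Viterbi algorithm for a first order HMM. It assumes NumPy, integer-coded states and observations, and the names `pi`, `A` and `B` for the three parameters listed above; it is an illustration, not the implementation used in the experiments.

```python
import numpy as np

def viterbi(pi, A, B, observations):
    """Return the most probable state sequence for a first order HMM.

    pi: (N,) initial state probabilities
    A:  (N, N) transition matrix, A[i, j] = P(state j at t+1 | state i at t)
    B:  (N, M) observation matrix, B[i, k] = P(observation k | state i)
    observations: non-empty sequence of observation indices in range(M)
    """
    pi, A, B = (np.asarray(x, dtype=float) for x in (pi, A, B))
    n_states, T = len(pi), len(observations)
    # Work in log space to avoid numerical underflow on long sequences.
    with np.errstate(divide="ignore"):
        log_pi, log_A, log_B = np.log(pi), np.log(A), np.log(B)

    delta = np.empty((T, n_states))               # best log-probability ending in each state
    backptr = np.zeros((T, n_states), dtype=int)  # best predecessor of each state
    delta[0] = log_pi + log_B[:, observations[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A    # scores[i, j]: arrive in j from i
        backptr[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[:, observations[t]]

    # Trace the best path backwards from the most probable final state.
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(backptr[t, path[-1]]))
    return path[::-1]
```

In the setting of this paper the states would be the fan sentiments and the observations the encoded match results.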
Related work
Sentiment extraction from social network logs is an important research area and an important tool for social scientists. Different practical applications are emerging, as has been shown by [18]. The authors connect measures of public opinion, which are collected from polls, with sentiment estimated from tweets, and highlight the potential of tweets as a complement to traditional polling. Based on sentiment extraction from Twitter, [2] indicate that the accuracy of stock market predictions can be improved by the inclusion of specific public mood dimensions. These works basically identify the polarity of tweets using a dataset of terms with their corresponding polarity. Several methods in Natural Language Processing use a similar strategy [13,20].
Sentiment classification has been used to study readers’ emotions [12]. Emoticons (emotion labels) have also been used for sentiment classification: [25] built a system called MoodLens, in which 95 emoticons are mapped into four categories of sentiment for Chinese tweets on Weibo. [10] designed an approach for classifying headline emotion based on information collected from the World Wide Web. Two works associated with topic models [8,15] share some similarities with ours. The first studies the change of sentiment over time in written documents using Topic Sentiment Change Analysis. The second proposes a probabilistic mixture model, Topic-Sentiment Mixture, to extract topics and sentiment from weblogs. It uses a Hidden Markov Model to extract topic life cycles and sentiment dynamics from the documents.
More closely related to our approach, since it tries to model emotions with respect to time spans, [26] inserted an emotional layer into Latent Dirichlet Allocation. It uses topic models to analyze the correlation between crowd sentiment and topics in news comments.
Some works, while not exactly performing sentiment analysis or opinion mining, are related to ours. An example is [9], which seeks to understand what makes viewers of baseball games in Korea interact with other fans in an online chat during a match. The factors analyzed are pre-game factors, the team’s statistics in previous matches and during the game, and factors that indicate how enjoyable the game is for a fan. In addition to studying data about fans and games, that work resembles ours in that, to define the in-game factors, the authors used a Hidden Markov Model to predict the progress of the “half-innings.” With this information they try to estimate the number of chat messages using regression models. Another related work is [1], which proposes an assistant based on a fuzzy neural network that retrieves relevant information about NBA players from the Internet in order to help NBA scouting agents.
None of these works studies the relation between people’s mood, which changes continually over time, and the events that affect them, as we intend to do in this research.
FootyCrowd
The social network FootyCrowd (FC) is a digital space for interaction among soccer fans. It includes an area for interaction among fans of the same team, called a team page, and an area for interaction among fans of different teams. On the team page, each fan is asked weekly about the sentiment he or she is experiencing towards the team at that moment. This sentiment can be expressed in six distinct forms: great, good, worry, sad, bad and terrible. Each user has a voting period, which is not necessarily the same as that of other users. FootyCrowd defines the sentiment of a team’s fans at a given time as the most voted sentiment registered by fans in the seven previous days.
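The aggregation rule described above can be sketched as follows. The function name, the input layout (a list of date and sentiment pairs), the exact window boundaries and the tie-breaking rule are assumptions made for illustration; the text only states that the most voted sentiment of the previous seven days is used.

```python
from collections import Counter
from datetime import date, timedelta

SENTIMENTS = ("great", "good", "worry", "sad", "bad", "terrible")

def crowd_sentiment(votes, on_day, window_days=7):
    """Most voted sentiment among the votes cast in the window before `on_day`.

    votes: iterable of (vote_date, sentiment) pairs for one team.
    Returns None when no valid vote falls inside the window.
    """
    start = on_day - timedelta(days=window_days)
    counts = Counter(
        s for d, s in votes if start <= d < on_day and s in SENTIMENTS
    )
    if not counts:
        return None
    # Assumption: ties are broken by the order of SENTIMENTS.
    return max(SENTIMENTS, key=lambda s: counts[s])
```

For example, three votes inside the window, two of them "good" and one "sad", yield "good" as the sentiment of the crowd for that day.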
On FC, fans participate in competitions and are encouraged to win virtual prizes through gamification strategies. The main competition is a race in which one can earn points and badges through bets placed on the results of matches from Brazilian soccer championships. The bets are made with virtual money and are of two kinds: bets on the result (victory, tie, defeat) or on the match’s exact final score. The social network has more than 200,000 subscribed users, 10% of whom are considered active.
Figure 2 shows the distribution of the 34,764 votes on sentiment by fans from 2012 to 2014. It is possible to observe that most votes are toward positive sentiments (great and good).

Distribution of votes by sentiment per year.
We use results of matches from several championships that take place in Brazil during a yearly season, 896 matches in all. A total of 113,820 bets were placed on these matches, an average of 127 bets per match. We used only bets in which the user chooses a win, a loss or a draw.
First order basic models
In order to model the sentiment of fans, we initially developed three variations of a first order Markov model, each adding more information. Our hypothesis is that the larger the amount of information in the model, the better it will represent reality. The models are the following:
M1 – It was set up considering the 6 phases that describe fans’ sentiment, s, toward a team in FootyCrowd. Each phase was modeled as a state in the Markov model. Thus
M2 – In this model we added the concept of a rout. The expectation is that a defeat by a difference of several goals has a greater effect on fans’ mood. A rout is a match in which the goal difference between the two teams is equal to or greater than three. The model in this case has five visible states: winning by rout, winning, tying, losing and losing by rout.
M3 – This model adds an extra feature, which consists of the bets placed on matches and available in the social network. From the number of bets placed, it is possible to know which team is the favorite and which is the non-favorite in each game. The belief is that if a team considered a favorite to win ends up losing the game, its fans will be more dissatisfied, whereas if it does win, the victory does not have a great impact on fans’ happiness. Therefore, a model was defined with 15 observations (Table 1) and the same states as M2.
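As an illustration of how a final score maps to the five M2 observations, the sketch below encodes a match from the point of view of the analyzed team. The function name and the observation labels are hypothetical; the default threshold of three goals follows the definition of a rout given for M2.

```python
def m2_observation(goals_for, goals_against, rout_margin=3):
    """Encode a final score as one of the five M2 observations."""
    diff = goals_for - goals_against
    if diff >= rout_margin:
        return "win_by_rout"
    if diff > 0:
        return "win"
    if diff == 0:
        return "tie"
    if diff > -rout_margin:
        return "loss"
    return "loss_by_rout"
```

Keeping the threshold as a parameter makes it easy to try other definitions of a rout, for example a margin of four goals.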
Possible values of the observations from the model that considers bets placed on matches
The definition of a level of favoritism of a team during a match required a more elaborate strategy to adapt data coming from FootyCrowd. We began by computing the difference, Δ, between the number of bets favorable to the analyzed team (

Curve illustrating the number of matches for each difference in comparison to a Power Law.
Observing that the number of matches per difference follows a Power Law distribution, we decided to divide the data into four equal parts using quartiles. The three quartiles divide a distribution into four equal parts. The values obtained for the first, second and third quartiles were 10, 40 and 90. Assuming the existence of more matches with more balanced votes, we defined that the first two quartiles would represent matches with neither a favorite nor a non-favorite; hence, any match with a difference between 0 and 40 has observations with the values winning, losing or tying. 54% of the matches do not have favorites.
Differences over 40 were analyzed separately and divided again into four equal groups, which gave new quartile values of 60, 90 and 120. We then assigned the observations favorite and non-favorite to all matches in the first quartile, between 40 and 60, and the observations most favorite and most non-favorite to all matches with a difference greater than 60. Of the matches in which the difference is greater than 40, 43% have a favorite or a dark horse (non-favorite) and 57% have a strong favorite or a strong dark horse (most favorite or most non-favorite).
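The thresholds above can be turned into an observation encoder along the following lines. This is a sketch under stated assumptions: Δ is taken as the bets for the analyzed team minus the bets for its opponent, the boundaries at 40 and 60 are treated as inclusive on the lower band, and the function names and labels are hypothetical.

```python
def favoritism_level(delta, balanced_limit=40, strong_limit=60):
    """Map a bet difference to one of five favoritism levels.

    delta: assumed to be bets for the team minus bets for its opponent.
    """
    size = abs(delta)
    if size <= balanced_limit:
        return "neither"
    prefix = "" if size <= strong_limit else "most "
    side = "favorite" if delta > 0 else "non-favorite"
    return prefix + side

def m3_observation(result, delta):
    """Combine the match result (win, tie, loss) with the favoritism level."""
    return (favoritism_level(delta), result)
```

Five favoritism levels combined with three possible results give the 15 observations of M3.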
The Markov model includes as its parameters a vector of initial probabilities, a transition matrix between states and an observation matrix.
The initial probabilities represent the probability of each model state at the beginning of the process. Thus, in our initial model the vector of initial probabilities has a value for each of the states great, good, worry, bad, sad and terrible.
The transition matrix represents the probability of transition between a specific state in a time t into any other state in a subsequent time. To illustrate the transition matrix we initially defined
The observation matrix stores the probabilities of occurrence of the observations. For instance, in our basic model (M1) they represent the possibility of winning, losing or tying given some state of sentiment from the fans, i.e. the probability of an observation given a state
In our model the probability of an observation
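In standard HMM notation (assumed here; the symbols are textbook conventions and may differ from those used elsewhere in this paper), with q_t the hidden sentiment at time t, o_t the observation at time t, N states s_1, ..., s_N and M observation symbols v_1, ..., v_M, the three parameters can be written as:

```latex
\begin{aligned}
\pi_i  &= P(q_1 = s_i),                     & \sum_{i=1}^{N} \pi_i  &= 1,\\
a_{ij} &= P(q_{t+1} = s_j \mid q_t = s_i),  & \sum_{j=1}^{N} a_{ij} &= 1,\\
b_j(k) &= P(o_t = v_k \mid q_t = s_j),      & \sum_{k=1}^{M} b_j(k) &= 1.
\end{aligned}
```

For M1 this gives N = 6 states and M = 3 observations (winning, losing, tying), so A is a 6 × 6 matrix and B is a 6 × 3 matrix.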
Markov models of higher order
Our hypothesis is that a fan’s sentiment depends not only on its state at the immediately preceding moment, but also on what the sentiment was further in the past. We decided to represent this dependence through a second order Markov model.
We use the same strategy of [14] specifying the second-order Markov chain by a 3-D matrix
The probability of the state sequence
Each second-order Markov model has an equivalent first-order model on the twofold product space. For instance, Fig. 4 shows an HMM2 (top) and its equivalent HMM1 (bottom).

Example of a HMM2 represented as a HMM1.
The extension of the Viterbi algorithm to HMM2 is straightforward. Instead of referring to a state in the state space S, one must refer to an element of the twofold product space
Considering that
The second order models were generated only for the model with the greatest amount of information (M3).
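The equivalence between a second order chain and a first order chain on the product space can be sketched as below. The sketch assumes NumPy and a 3-D array `A2` indexed as A2[i, j, k] = P(state k at t | state i at t-2, state j at t-1); the function name and this index convention are assumptions for illustration.

```python
import numpy as np
from itertools import product

def second_to_first_order(A2):
    """Expand a second order transition tensor into a first order matrix.

    Returns the list of composite states (i, j) and the (N*N, N*N) matrix
    over those composite states.
    """
    A2 = np.asarray(A2, dtype=float)
    n = A2.shape[0]
    pairs = list(product(range(n), repeat=2))       # composite states (i, j)
    index = {pair: row for row, pair in enumerate(pairs)}
    A1 = np.zeros((n * n, n * n))
    for i, j in pairs:
        for k in range(n):
            # A composite state (i, j) can only move to a pair starting with j.
            A1[index[(i, j)], index[(j, k)]] = A2[i, j, k]
    return pairs, A1
```

With the six sentiments as states, the product space has 36 composite states, and the first order Viterbi algorithm can be run on it unchanged.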
To prevent any probability from being zero and to improve the performance of the HMM, we used Laplacian Smoothing (LS). Smoothing is a mathematical method that removes excess data variability while keeping the same expressiveness [3].
Equation (7) shows how to compute a probability p using LS.
There is a change in the definition of the model probabilities. The probability of sentiment changing,
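A minimal sketch of a smoothed transition estimate is given below. It uses the common add-one form of Laplace smoothing, p = (count + α) / (total + α · N) with α = 1, which is an assumption here and may differ in detail from the form used in Equation (7); the function name and the integer coding of states are also hypothetical.

```python
import numpy as np

def smoothed_transition_matrix(state_sequences, n_states, alpha=1.0):
    """Estimate a transition matrix from state sequences with add-alpha smoothing."""
    counts = np.zeros((n_states, n_states))
    for seq in state_sequences:
        for prev, curr in zip(seq, seq[1:]):
            counts[prev, curr] += 1
    # Every cell receives alpha pseudo-counts, so no probability is zero.
    totals = counts.sum(axis=1, keepdims=True)
    return (counts + alpha) / (totals + alpha * n_states)
```

A state that never appears as a predecessor in the data ends up with a uniform row, and every row still sums to one.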
Empirical evaluation
The trials used the Baum–Welch algorithm to train the generated Markov models and the Viterbi algorithm to discover which sequence of states was most plausible for an observation sequence.
Our first evaluation compared models with varying amounts of information represented in the visible states. We compared these models with a baseline algorithm based on the intuition that whenever a team wins a match, fans’ sentiment rises and, whenever it loses, the sentiment falls. For instance, if a team is in a good phase and wins, it is upgraded to a great phase. If the team wins again in the next game, it remains in the great phase, since that is the upper limit. If a team in a good phase loses, fans move it from good to worry, and if the team ties, it stays in its current phase.

Impact of observations in transitions between states in the baseline system.
Figure 5 presents a diagram of states explaining how, in this baseline algorithm, the observations (arcs) impact on states (circles).
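The baseline rule can be sketched in a few lines. The ordering of the six sentiments from worst to best is an assumption (the relative order of sad and bad is taken from the list in the FootyCrowd section), and the function names are hypothetical.

```python
# Assumed ordering from worst to best.
SENTIMENT_SCALE = ["terrible", "bad", "sad", "worry", "good", "great"]

def baseline_step(sentiment, result):
    """Move one step up on a win, one step down on a loss, stay on a tie."""
    i = SENTIMENT_SCALE.index(sentiment)
    if result == "win":
        i = min(i + 1, len(SENTIMENT_SCALE) - 1)
    elif result == "loss":
        i = max(i - 1, 0)
    elif result != "tie":
        raise ValueError(f"unknown result: {result!r}")
    return SENTIMENT_SCALE[i]

def baseline_predict(initial_sentiment, results):
    """Apply the baseline rule to a sequence of match results."""
    predictions, current = [], initial_sentiment
    for result in results:
        current = baseline_step(current, result)
        predictions.append(current)
    return predictions
```

Starting from good, the sequence win, win, loss, tie produces great, great, good, good, which matches the behavior described for the upper limit.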
To assemble the initial model, which we will refer to as the training model, we estimated the probabilities of the transition and observation matrices from FootyCrowd data, which captured fan sentiment. This data consists of votes cast during the 2012 and 2013 seasons by fans of 8 of the largest Brazilian teams (in terms of number of fans), which are also the top teams in FootyCrowd’s fan ranking. The teams are Corinthians, Palmeiras, Santos, São Paulo, Botafogo, Vasco, Flamengo and Fluminense. In addition to the sentiment database, we used data from matches from 2012 and 2013. Altogether, 533 match results were used. Assuming that the model will attempt to estimate states for soccer matches of teams that do not have registered sentiments, the initial probabilities in our model were the same for all sentiments,
The results of those teams’ matches in the 2014 season were used to refine the model, which, after applying Baum–Welch, presented new probabilities. With this refined model, it is possible to apply Viterbi to infer fan sentiment for each week after a match and compare it with the fan sentiment captured by FC. Table 2 presents the number of matches for each team in the 2014 season.
Number of matches for each team in the test database
The inference of fan sentiment was made individually for each team. For example, to test the accuracy of the algorithm in estimating the sentiment of Flamengo’s supporters, for each of the 50 matches in the test database we compared the sentiment inferred by the Viterbi algorithm with the crowd sentiment represented in FootyCrowd just after that match. Using M1, the success rate was 29.16%, meaning that for 14 matches the inference made via Viterbi was correct. Table 3 presents, per team, the percentage of inferences by the Viterbi algorithm that matched the fan sentiment on FootyCrowd at that moment. The table shows results for the different model types (M1, M2, M3, all first order) as well as results obtained with the baseline system previously described.
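The per-team success rate is a simple match count between the inferred and the recorded sentiment sequences. A minimal sketch, with a hypothetical function name and toy data rather than values from the test database:

```python
def success_rate(inferred, recorded):
    """Percentage of matches where the inferred sentiment equals the recorded one."""
    if len(inferred) != len(recorded):
        raise ValueError("sequences must have the same length")
    if not inferred:
        raise ValueError("no matches to evaluate")
    hits = sum(a == b for a, b in zip(inferred, recorded))
    return 100.0 * hits / len(inferred)

# Toy example: three of four inferences agree with the recorded sentiment.
rate = success_rate(["good", "great", "sad", "good"],
                    ["good", "good", "sad", "good"])
```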
Results referred to the 3 first order models and the baseline system
It is possible to notice an improvement in the results of almost all models as more information is included in the observations of the Markov model. For instance, using M1, the Viterbi algorithm was accurate in 25% of inferences. This accuracy increased to 29.78% when M2 was used and to 44.68% with M3. However, the models do not show a clear advantage over the baseline system.
In the previous section, in the M2 model, we introduced the concept of a rout. This concept makes it possible to measure how strongly fans are affected by the information that their team lost or won by a large goal difference. A rout was defined as a difference of 3 or more goals (M2-Dif.3) at the end of the match. We also ran the model with the boundary set to a difference of 4 or more goals (M2-Dif.4). Table 4 shows these values.
Comparison with the results of the M2 model for different definitions of rout
In general the results are similar and without large variations. However, the significant differences in the results for Palmeiras and Fluminense indicate that the perception of what a rout is, and its impact on crowd sentiment, varies from team to team. Further studies need to be conducted to validate this.
We compared the results of the best model (M3) with its extended version of order 2. Table 5 presents the results of the tests conducted for second order HMM (HMM2) and compares them to first order HMM (HMM1).
Results from sentiment inference by First (HMM1) and second order HMM (HMM2)
There is a significant improvement in the inferences made by HMM2 compared to HMM1. Only the inferences for the sentiment of Palmeiras’ and Botafogo’s fans did not improve (we elaborate on this matter later). We realized that, as the order of the Markov model increases, the number of examples (with respect to transitions between states and observations) available to build the model decreases. In the second order model, some transitions in the transition matrix and some probabilities in the observation matrix remain null. Further tests with more data are necessary to reach definitive conclusions on whether increasing the order will always increase the accuracy of the model.
The accuracy of HMM was compared with two classification algorithms: Support Vector Machine [5] and Naïve Bayes [11]. The classification algorithms were executed from the Weka framework [7]. The training and testing base was set up in the following manner: the classes consisted of the six sentiments, which represent the vote of team supporters on their feelings toward their team. The set of attributes that compose an example is:
Result (or results) of the match (or matches) which assumes values lost, tied or won;
If the team is favorite or non-favorite in the match;
The level of favoritism or of non-favoritism of a team, which may consist of favorite, non-favorite, most favorite, most non-favorite or neither.
Different tests were conducted with SVM and Naïve Bayes, in which we varied the number of matches in each training example between one and two. Our aim was to allow for a fair comparison between first and second order HMM.
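One possible way to lay out the training examples with one or two matches is sketched below. The dictionary keys, the function name and the flat feature layout are assumptions for illustration; the sketch only builds the examples and does not reproduce the Weka setup.

```python
def build_examples(matches, sentiments, n_matches=1):
    """Turn a team's chronological matches into (features, label) pairs.

    matches: list of dicts with the hypothetical keys "result", "side", "level".
    sentiments: crowd sentiment recorded after each match (same length).
    n_matches: number of consecutive matches that feed one example.
    """
    if len(matches) != len(sentiments):
        raise ValueError("matches and sentiments must have the same length")
    examples = []
    for t in range(n_matches - 1, len(matches)):
        window = matches[t - n_matches + 1 : t + 1]
        features = tuple(m[key] for m in window for key in ("result", "side", "level"))
        # The class is the sentiment recorded after the last match in the window.
        examples.append((features, sentiments[t]))
    return examples
```

With `n_matches=2` each example carries the attributes of the current and the previous match, which mirrors the extra history available to a second order HMM.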
Results for sentiment inference using SVM with one and two matches (SVM1, SVM2), first and second order HMM (HMM1, HMM2), Naïve Bayes with one and two matches (NB1, NB2) and the baseline (BL) algorithm previously described
Table 6 shows the results of all models. The variations of SVM and Naïve Bayes did not present significant differences.
Among the tests performed with HMM, SVM, Naïve Bayes and the baseline algorithm, the most accurate results were obtained with the HMM2 model. This indicates that the historical information of results and the temporal evolution of sentiment are important to obtain better inferences in this scenario. However, the inferences made for the sentiment of Palmeiras’ and Botafogo’s fans did not improve when compared to HMM1, which required a more thorough analysis of the FC data. The reason for this is discussed in the next section.
Bias towards good moments
From the data we observed that FC fans use the functionality “Fan Sentiment” more frequently when their team has a positive result. This means that, when a team loses, the fan does not express frustration with the same frequency as he expresses happiness. This is probably due to the fact that FC is a social network, and interacting with friends and acquaintances from the community after a bad result is not particularly pleasant.
Percentage of votes of fans in FootyCrowd after a match
Table 7 shows that for all 8 teams considered, except Vasco, fans voted the most when their team attained better results in the matches. Even so, the values for Vasco for victory and defeat were very close. For Corinthians, for example, 41% of votes occurred after a victory, 35% after a tie and 23% after a defeat. In some cases, the difference between the number of votes after a victory and after a defeat is fairly large (e.g. Corinthians, Palmeiras and São Paulo).
These observations show that the learned models tend to favor transitions between positive sentiments, assigning larger probabilities to positive phases. We counted the votes for positive phases (great and good) and negative phases (worry, sad, bad and terrible) and found more than twice as many votes for positive sentiments (27,579) as for negative ones (11,787).
As previously explained, by increasing the order of the Markov model the amount of examples to develop the model decreases and, since the states with more transitions are the ones which involve positive sentiments, it is presumed that the second order model could be even more biased towards positive sentiment. Therefore, predictions over sentiment of fans whose team’s data from the test contain more negative sentiments will have lower accuracy.
This assumption made us investigate more closely the relation between positive/negative sentiment and the inference results achieved with HMM2. Palmeiras and Botafogo, the teams with the lowest accuracy in inferences made with the second order model, are the only teams that presented more negative than positive sentiment across the observations in the test dataset. Table 8 shows those numbers for all teams.
Amount of positive and negative sentiments for each team
Of the 42 Palmeiras matches from FC’s database that were tested with Viterbi on the HMM2 model, 20 are marked with positive sentiment and 22 with negative sentiment. Botafogo has 14 positive votes and 32 negative ones. In contrast, for Corinthians, for which the HMM2 model had the highest accuracy rate, 43 matches were tested as Viterbi observations, 38 of them marked with positive sentiment and only 5 with negative sentiment.
We also performed tests in which, when constructing the initial model, we removed the sentiment and match results of the team whose sentiment would be inferred for the 2014 season. This means that, in order to infer Corinthians fans’ sentiment for a given period, the initial model with data from 2012 and 2013 was created without considering a single sentiment vote made by Corinthians fans. The goal was to verify how much the inferences obtained from the model depended on sentiment data provided by the fans themselves.
Results from sentiment inference for models with (W) and without (WO) influence of the team with inferred sentiment
Table 9 makes clear that there was no decline in results. This indicates that the Markov model might be applied to teams not included in the training model in order to infer the crowd sentiment of soccer team fans.
So far the sentiment of fans was analyzed based on the entire dataset of matches available on FootyCrowd. We considered the results from the National Championship (Serie A), the Brazilian Cup and two regional championships (one for the region of Rio de Janeiro and another for the region of São Paulo). In these tournaments, teams qualify in different ways. The Brazilian Cup is a single-elimination knockout tournament featuring two-legged ties played by 86 teams, representing all 26 Brazilian states plus the Federal District. In the first two rounds, if the away team wins the first match by 2 or more goals, it progresses straight to the next round, avoiding the second leg. The Brazilian Cup uses the away goals rule, which states that the team that has scored more goals “away from home” wins if scores are otherwise equal.
It is worth noting that in this kind of tournament advancing to the next phase is more important than a simple victory. Consequently, fans may be happy even after a defeat.
The National Championship Serie A, or Brasileirão, has 20 clubs. During the course of a season (from May to December) each club plays the others twice (a double round-robin system), once at its home stadium and once at that of its opponent, for a total of 38 games. Teams receive three points for a win and one point for a draw. No points are awarded for a loss. Teams are ranked by total points, victories, goal difference and goals scored. At the end of each season, the club with the most points is crowned champion. A system of promotion and relegation exists between Serie A and Serie B. The four lowest placed teams in Serie A are relegated to Serie B, and the top four teams from Serie B are promoted to Serie A. Also, the top four of Serie A qualify for the Libertadores Cup (a continental tournament). Football fans created their own way of following the league standings. They consider that there are three regions: the G4, involving those disputing the top four places; the Z4, involving those who want to leave the last four positions; and the intermediary region, involving teams that are between the two zones.
At the state level, several independent championships exist. The most important ones are those of São Paulo (typically with 20 clubs) and Rio (with only 16 clubs). They may include obscure formats or experiment with proposed innovations in rules, which can influence the perception of fans. Typically, the Brazilian Cup, the regional tournaments and the Libertadores Cup are played simultaneously. Tracking the sentiment of fans in this context is challenging because performance in different championships with different classification rules depends on many hidden variables that are not present in the data. We hypothesize that this might be one of the reasons for the high variability of the results of our models. We then decided to investigate the fans’ sentiment considering only the results of Serie A matches, the longest championship in the country (8 months long). We also decided to represent, in the model, information about the position of the club in the general classification.
A second-order Markov Model (named here HMM2Z) was created with observations that consider the three previously mentioned zones. We created three categories. The first groups the clubs that are at least three points away from G4. The second represents those that are three points away from Z4, and the last involves the rest of the teams (intermediary zone). Table 10 presents the observations of the HMM2Z model.
Observations from the combination of favoritism and the zones of Serie A
The dataset used to create the models contains matches from June to December of 2012, 2013 and 2014. The test set refers to the same period of 2015. Table 11 shows the comparison of the results of HMM2Z and SVM using the information about the three classification zones. HMM2 without this information is also displayed.
Results of HMM2 compared with HMM2Z and with SVM after adding the feature that represents the team's classification in the championship
In terms of accuracy, the HMM2Z model did not improve on the results of HMM2, which does not take into account the classification of the teams in the league. The most striking result was that SVM reached levels of accuracy similar to those of the HMM. Our interpretation is that the team's classification in the championship carries information that correlates with the sentiment of the fans. That is, if a team is in the Z4 zone, it is because it has suffered several defeats and its fans already held a negative sentiment. The same holds for the G4 zone with wins and positive feelings. This history is exactly what is inherent in the HMM and was not previously available to the classification algorithms. In other words, classification seems to be competitive in terms of accuracy if some historical information is represented in the examples.
To confirm our intuition, we ran another test, this time considering only matches played in the first half of the year, when many championships take place concurrently and no historical information is available. The tests took into account favoritism and match results, using the HMM2 model described above. However, we used a dataset of matches played between January 1st and May 31st of 2012, 2013 and 2014. The test dataset covers the same period of 2015.
The results (see Table 12) confirm our intuition. Comparatively, the accuracy of SVM decreased. On the other hand, the variability of the results increased. HMM2 seems to be more robust because it is less sensitive to situations in which historical information is unavailable.
HMM2 compared with SVM using data from first-semester results
This research investigated a database of sentiments expressed by fans of soccer teams in Brazil. We proposed a predictive model based on Hidden Markov Models because of their stochastic characteristics, which describe a process that operates over a long period of time. The hypothesis that a sentiment is formed over time, and not by a single observation alone, is supported by comparing the results of the HMM with those of classifiers such as Support Vector Machine and Naive Bayes. A significant improvement in the results was also observed when a second-order model was used, which reinforces the hypothesis that current sentiments depend on previous states.
Variations of the Markov models show that the accuracy rate improves when observations are capable of expressing wins or losses as well as the favoritism of the opponents. Another important finding is that the evolution of sentiment over time also needs to be represented.
We found indications that the feeling of the fans behaves differently over the year. In the first half, when championships take place in parallel and follow different rules, inferences show greater variability. In the second half, when a long-term championship takes place, the variability is lower. In both periods the HMM seems to represent the feeling of the fans well, although in the second half classification using the team's ranking in the championship also proves to be an appropriate model.
The limitations of our approach guide us towards future investigations. In all our tests, model validation was based on comparing the feeling with the most votes in FootyCrowd with the feeling that has the highest probability of being inferred by the Viterbi algorithm. Investigating new forms of assessment is part of our future work. In particular, we believe that the comparison can be made over the whole probability distribution. That is why we are investigating how to represent this problem as an HMM with continuous states. Instead of inferring the sentiment represented by the majority of the votes, we want to infer a histogram that represents the distribution of the fans' sentiment.
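The difference between the two forms of assessment can be sketched as follows. The sentiment labels, the function names and the choice of total variation distance are assumptions made for illustration; the text above does not commit to a particular distance between distributions.

```python
from collections import Counter

# Hypothetical sentiment labels; the real set comes from the FootyCrowd data.
SENTIMENTS = ("bad", "neutral", "good")


def majority_label(votes):
    """Sentiment with the most votes (ties resolve to the first one seen)."""
    return Counter(votes).most_common(1)[0][0]


def vote_histogram(votes, labels=SENTIMENTS):
    """Normalised distribution of the votes over the sentiment labels."""
    counts = Counter(votes)
    total = len(votes)
    return [counts[label] / total for label in labels]


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


votes = ["good"] * 6 + ["bad"] * 3 + ["neutral"] * 1
predicted = [0.2, 0.2, 0.6]  # hypothetical model output over SENTIMENTS

# Current validation: a single label is compared with the model's best state.
hit = majority_label(votes) == "good"

# Distribution-level validation: the whole histogram is compared.
distance = total_variation(vote_histogram(votes), predicted)
```

A majority-vote comparison scores this example as a hit regardless of how the remaining votes are spread, while the distance also reflects the mismatch on the minority sentiments.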
The bias towards good moments observed in the dataset is also an important direction for future work. We think that this problem is similar to class imbalance in supervised learning methods. One alternative is to apply under- or oversampling procedures to the data in order to obtain more accurate model estimates.
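A minimal sketch of random oversampling is given below, assuming examples and labels are held in plain Python lists. The function name and the seed parameter are hypothetical. Note that for sequence models such as HMMs, resampling individual matches would break their temporal order, so in practice whole sequences would probably have to be resampled instead.

```python
import random
from collections import defaultdict


def random_oversample(examples, labels, seed=0):
    """Duplicate minority-class examples at random until classes are balanced."""
    rng = random.Random(seed)

    # Group the examples by their class label.
    by_class = defaultdict(list)
    for x, y in zip(examples, labels):
        by_class[y].append(x)

    # Every class is grown to the size of the largest one.
    target = max(len(xs) for xs in by_class.values())

    out_x, out_y = [], []
    for y, xs in by_class.items():
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y
```

Undersampling works the other way around: examples of the majority class are discarded until every class has the size of the smallest one, at the cost of losing data.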
