On Predicting Sociodemographic Traits and Emotions from Communications in Social Networks and Their Implications to Online Self-Disclosure

Abstract

Social media services such as Twitter and Facebook are virtual environments where people express their thoughts, emotions, and opinions and where they reveal themselves to their peers. We analyze a sample of 123,000 Twitter users and 25 million of their tweets to investigate the relation between the opinions and emotions that users express and their predicted psychodemographic traits. We show that the emotions that we express on online social networks reveal deep insights about ourselves. Our methodology is based on building machine learning models for inferring coarse-grained emotions and psychodemographic profiles from user-generated content. We examine several user attributes, including gender, income, political views, age, education, optimism, and life satisfaction. We correlate these predicted demographics with the emotional profiles emanating from user tweets, as captured by Ekman's emotion classification. We find that some users tend to express significantly more joy and significantly less sadness in their tweets, such as those predicted to be in a relationship, with children, or with a higher than average annual income or educational level. Users predicted to be women tend to be more opinionated, whereas those predicted to be men tend to be more neutral. Finally, users predicted to be younger and liberal tend to project more negative opinions and emotions.

We discuss the implications of our findings to online privacy concerns and self-disclosure behavior.

Introduction

People express their thoughts, emotions, and preferences in many settings. The ideas and emotions we express depend not only on what we experience but also on our environment. Self-disclosure, the act by which a person reveals themselves to another during communication, has been a prominent research area for many years. Although self-disclosure has been widely investigated in the offline world,^1–3 less work has examined it in online settings. In particular, online social networks (OSNs) are a relatively recent phenomenon, so how people self-disclose in them remains comparatively understudied.

Twitter and Facebook are prominent OSNs, used regularly by over one-seventh of the world's population. Researchers have used the massive volumes of data produced by OSNs to study how users present themselves⁴ and the language they use,⁵ to predict user psychodemographic profiles,^6,7 user happiness, and well-being,⁸ and to perform other types of social network analysis.⁹ Researchers have also studied the relation between privacy concerns and information disclosure in OSNs,^10–12 noting that users are cautious about engaging in self-disclosure because they are concerned that what they share may have negative consequences,¹³ and highlighting privacy risks associated with information revelation in public forums.¹⁴

Earlier work has uncovered interesting correlations between various demographic traits and the emotions that people express in many contexts. Researchers have noted relations between gender and anger,^15–17 disgust and fear,^18–20 and between income and emotional well-being,^21,22 ethnicity, age and anger,²³ and correlations between a person's relationship status and expressions of anger.²⁴

OSNs differ from the offline world in ways that have a high impact on self-disclosure. First, they allow for digital communication based mostly on text and images, stored for long periods of time, and allow people to interact without being present in the same place at the same time.²⁵ Second, unless restrictive privacy settings are used, information from OSNs is typically disclosed to many followers in an open forum, rather than to a specific individual.²⁶ Third, messages can be sent at any time, giving people more control over the pace of the conversation and the impression that they have on their peers.²⁷ Initial results on online self-disclosure indicate that users who communicate online display a different degree of self-disclosure and that reciprocity in self-disclosure in such settings is different than in offline settings.^28–31

Researchers have noted how individual differences affect self-disclosure in the offline world. There are known gender differences in self-disclosure,^32–34 and personality traits such as self-monitoring are correlated with self-disclosure.^35,36 The mood and the size of the audience also correlate with certain self-disclosure behaviors.^37,38

One influential framework regarding self-disclosure is social penetration theory,³⁹ stipulating that as relationships develop, interpersonal communication shifts from superficial and nonintimate levels (breadth dimension) to deeper, more revealing, and more intimate levels (depth dimension).

Against this background work on self-disclosure and social penetration theory in the offline world, we examine the role of emotions and demographics in self-disclosure in OSNs. As opposed to earlier work on self-disclosure in OSNs, we perform a large-scale (123,000 users and 25 million messages) study on the relation between emotions and automatically inferred sociodemographic traits, using language-based statistical models to automatically infer latent user demographics^40,41 in OSNs.

We first examine whether individual differences, known to affect self-disclosure behaviors in the offline world, still have similar effects in the online world. In particular, we show that various sociodemographic factors are correlated with people's propensity to express different emotions in OSNs.

Furthermore, we address social penetration theory in the online world. We show that by examining a person's emotional tone regarding a wide variety of topics, as expressed in their public profile, we can infer deep insights about them, including their education, income, and life satisfaction. This may indicate that people are not aware of how revealing the information they share on OSNs can be.

To the best of our knowledge, this is the first study which analyzes user communications in a social network on a large scale (25 million tweets and 123,513 user profiles), examining a range of automatically detected emotions (Ekman's six basic emotions⁴²) and a variety of demographic traits. Carrying out such an analysis requires us to use a large data set consisting of many users, described by their demographic attributes, and a large pool of text generated by each such user, annotated with the emotion or sentiment expressed in the text. Generating such a large data set is costly; it requires obtaining a large sample of social network users along with the pieces of text they produce; the users should then be tagged with their demographic attributes, which are not available or hidden due to privacy settings; finally, each piece of text should be examined to determine the emotion expressed in it.

Focusing on Twitter, we use crowdsourcing to get demographic labels for a medium-sized sample of U^L = 5,000 users and train machine learning classifiers to predict these demographic traits from the textual content generated by these users. We then use the trained classifiers to get labels for a much larger sample of U = 123,513 users. We use a similar method for labeling the emotions expressed in user text: we train an emotion classifier on an initial sample of T^L = 52,925 tweets and then use the classifier to get emotion labels for a much larger sample of T = 24,919,528 tweets. However, rather than obtaining the tags for the initial sample through crowdsourcing, we use tweets annotated with emotional hashtags such as #disgust or #anger, each identifying a specific emotion. To perform a reliable analysis of the differences in emotion expressed by different user groups, our demographic and emotion predictions must be highly accurate. We show that our models outperform existing state-of-the-art systems as described in the Results section. Our high-level methodology is shown in Figure 1.

FIG. 1.

Our approach for predicting user demographics, emotions, and opinions on Twitter relies on two machine learning components. The first component is a user-level demographic classifier $\Phi_A(u)$, which is learned from U^L = 5,000 user profiles annotated with 12 demographic attributes using crowdsourcing; it can examine a set of tweets produced by any Twitter user and output a set of predicted demographic traits for that user, for example, gender and income. The second component relies on tweet-level emotion $\Phi_E(t)$ and sentiment $\Phi_S(t)$ classifiers learned from $T_E^L = 52{,}925$ tweets annotated with emotions using distant supervision and $T_S^L = 19{,}555$ tweets labeled with sentiments; it can examine any tweet to predict the emotion and sentiment expressed in the tweet.

The role of machine learning

We rely on machine learning methods to achieve a large-scale sample for correlating demographics and emotions. We used machine learning for two reasons. First, machine learning methods pick up many subtle hints regarding a user and aggregate them into a high-quality prediction regarding the user's profile. For example, we show that the propensity of people to use various words in OSNs is predictive of their sociodemographic traits. Although any single word in isolation does not provide us with much information, machine learning methods can use data regarding the prevalence of many such words and distill them into high-quality predictions.

The second reason is scalability. Using machine learning, we start with a moderate size data set, where people tag tweets with the emotion expressed in them, and build modules that accurately determine the emotion expressed in any tweet. This allows us to use many more tweets and profiles in our study (a sample of over 100,000 users), as processing each new tweet only requires computation, rather than human labeling effort. An alternative methodology that does not rely on machine learning is manually obtaining demographic tags and emotional tags for users, for example, by using crowdsourced workers. However, using human labor to tag specific tweets with the emotion expressed in them is extremely costly.

Practical implications and applications

Our research has several noteworthy implications to the study of self-disclosure in online settings and several practical applications.

First, with regard to self-disclosure, we note that many people make a conscious effort to project a certain image online. Our results indicate that the emotions displayed on social media are predictive of sociodemographic traits. Thus, it might be possible for people to project a desired impression by controlling the emotions they display online. This highlights the role that emotions play in self-disclosure even in the online environment. Furthermore, our results indicate that people may disclose more than they intend to, as raw information regarding emotions can be distilled into much deeper and more intimate insights. This stands in contrast to the progression of social penetration theory in offline settings, where records of emotions are not stored for long periods of time.

Second, regarding potential applications, we show that one can predict sociodemographic traits by observing the emotions expressed online. One application of such methods is advertising, as advertisers can use demographic traits to better target a desired audience. Some online advertising forms do support advertising targeted based on gender or age; however, our results show that far deeper insights can be gained in relation to demographics, such as income, education level, or family and relationship status. Furthermore, our tools enable the identification of emotions displayed on OSNs, enabling real-time healthcare analytics, such as automatically identifying patterns of depression or mental illness from OSN communications.

Materials and Methods

We now describe how we collected the Twitter sample for our analysis and how we built models for user-level (attributes) and tweet-level (emotion and sentiment) predictions. We then discuss how we evaluated emotional differences between users of different demographic types.

Sampling Twitter data

The median number of tweets produced by a random Twitter user per day is small.⁴¹ Thus, to exclude users who tweet very often (such as bots or celebrities) or only occasionally, we first estimate each user's tweeting frequency from the 1 percent Twitter feed. We then randomly sample users who post between 4 and 10 tweets per day. Using the Twitter API (a code library for accessing Twitter data), we extract data only for users who (a) tweet exclusively in English (verified by the “language” field), (b) are in US and Canadian time zones (verified by the “time zone” field), and (c) have at least 10 followers. We collected 10,741 user profiles with their 200 most recent tweets.
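The selection criteria above amount to a simple per-user predicate. The following is a minimal sketch rather than the pipeline actually used: the `UserRecord` fields, the `"en"` language code, and the time zone labels are hypothetical stand-ins for whatever the profile data contained.

```python
from dataclasses import dataclass


@dataclass
class UserRecord:
    """Hypothetical, simplified user record; field names are illustrative only."""
    tweets_per_day: float  # frequency estimated from the 1 percent feed
    language: str          # value of the profile "language" field
    time_zone: str         # value of the profile "time zone" field
    followers: int


# Assumed example labels for US and Canadian time zones (not taken from the paper).
NORTH_AMERICAN_ZONES = {
    "Eastern Time (US & Canada)",
    "Central Time (US & Canada)",
    "Mountain Time (US & Canada)",
    "Pacific Time (US & Canada)",
}


def is_eligible(user: UserRecord) -> bool:
    """Keep users with 4-10 tweets/day, English, US/Canada zone, >= 10 followers."""
    return (
        4 <= user.tweets_per_day <= 10
        and user.language == "en"
        and user.time_zone in NORTH_AMERICAN_ZONES
        and user.followers >= 10
    )
```

The frequency bounds do the work of removing both very prolific accounts and nearly inactive ones before any content is downloaded.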

We then crawled the set of users that each target user had followed (their “friends”) using the snowball sampling strategy.⁴³ We randomly sample 10 friends per user, extracting their profile information and their 200 most recent tweets. Similarly, for all 10,741 users, we extracted a collection of user names that the target user has mentioned or retweeted in their most recent 200 tweets.
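The friend-sampling step can be sketched as follows. This is an illustrative assumption about the mechanics, not the crawler itself: `friends_by_user` is a hypothetical mapping from each seed user to the accounts they follow, and the fixed seed is there only to make the sketch reproducible.

```python
import random


def sample_friends(friends_by_user, k=10, seed=0):
    """For each seed user, draw up to k friends uniformly at random, without replacement."""
    rng = random.Random(seed)
    sampled = {}
    for user, friends in friends_by_user.items():
        pool = sorted(set(friends))  # deduplicate and fix the order for reproducibility
        sampled[user] = rng.sample(pool, min(k, len(pool)))
    return sampled
```

Users who follow fewer than `k` accounts simply contribute all of their friends.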

Thus, our random sample contains U = 123,513 users and T = 24,919,528 tweets, achieving a sample size comparable to recent large-scale studies.^6,7,44

Crowdsourcing demographic annotations

We take an unconnected set of 5,000 random Twitter users and ask people sourced from Amazon's Mechanical Turk service to examine their profiles and judge their demographics. Crowdsourcing annotations is a common practice, successfully used for a variety of predictive analytics.⁴⁵ In Table 1, we present 12 demographic attributes with the corresponding values and the number of annotated profiles for each attribute value used to train the models. We also report the interannotator agreement as Cohen's Kappa measured on a random 2 percent sample of the annotated data. Although the original annotations are more fine grained, we binarize all attributes into two categories that can be easily disambiguated, such as male versus female or single versus in a relationship.

Table 1.

Personal Attribute Annotation Distributions for 5,000 Random Twitter Users Collected Using Crowdsourcing

Attribute	Attribute values (binarized)	Kappa
Gender	Male: 2,124; female: 2,874	0.68
Age	Below 25: 2,511; above 25: 1,372	0.35
Political	Conservative: 595; liberal: 1,903	0.13
Ethnicity	African American: 1,705; Caucasian: 2,409	0.71
Religion	Christian: 3,388; unaffiliated: 1,610	0.08
Education	High school: 3,423; college degree: 1,575	0.30
Relationship	Single: 3,615; in a relationship: 1,383	0.03
Children	Yes: 797; no: 4,203	0.40
Income	Under $35K: 3,324; over $35K: 1,675	0.28
Intelligence	Average and below: 4,087; above average: 911	0.07
Optimism	Pessimist: 907; optimist: 2,655	0.30
Life satisfaction	Dissatisfied: 840; satisfied: 2,949	0.30

The interannotator agreement, measured as Cohen's kappa, is substantial for gender and ethnicity; fair for age, optimism, children, income, education, and life satisfaction; and slight for the most difficult attributes: political preference, religion, intelligence, and relationship status.
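For readers who want to reproduce this kind of agreement figure, the sketch below computes Cohen's kappa for two annotators and maps it onto the commonly used Landis and Koch bands (slight, fair, moderate, substantial). It is an illustration under assumptions: the function names are hypothetical, and the paper does not state which implementation was used.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)


def agreement_band(kappa):
    """Landis and Koch style verbal band for a kappa value."""
    if kappa < 0:
        return "poor"
    for upper, name in [(0.20, "slight"), (0.40, "fair"),
                        (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return name
    return "almost perfect"
```

Applied to the values in Table 1, `agreement_band(0.68)` gives "substantial" for gender and `agreement_band(0.13)` gives "slight" for political preference, matching the wording above.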

Constructing emotion data set

Similar to other work that bootstraps noisy data annotated with emotions, we rely on the fact that people use hashtags such as #sadness or emoticons such as :-( to indicate that they are sad. We collected tweets from the 1 percent Twitter feed over the past 4 years with hashtags corresponding to the six emotions identified by Ekman⁴²: #joy, #anger, #fear, #sadness, #disgust, and #surprise. In addition, we compile a synonym list using WordNet-Affect, Google Synonyms, and Roget's Thesaurus. We thus expand our initial emotion hashtag set to 360 emotional hashtags and collected more tweets containing them.

We exclude tweets with fewer than three words, filter out non-English tweets, remove retweets, and remove tweets where the hashtags do not appear at the end of the tweet.⁴⁶ Finally, we collect 28,656 tweets for the original six emotion hashtags and 24,269 tweets for the emotion synonym hashtags, a total of T^L = 52,925 tweets annotated with emotions. We present the distribution of tweets for each emotion in Table 1. Our hashtag emotion data set is three times larger than the recently released Hashtag Emotion Corpus,⁴⁷ but smaller than a prior bootstrapped corpus.⁴⁸ Nonetheless, we show that we outperform all existing emotion prediction models in the resulting prediction accuracy.
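The hashtag-based labeling and filtering can be sketched as below. Several details are assumptions made for illustration: the label is read only from the final token, retweets are detected by a leading "RT", the word count excludes the label hashtag, and the label hashtag is stripped from the returned text. Language filtering and the expanded list of 360 hashtags are omitted.

```python
# Only the six seed hashtags are listed; the expanded synonym set is not reproduced here.
EMOTION_HASHTAGS = {
    "#joy": "joy",
    "#anger": "anger",
    "#fear": "fear",
    "#sadness": "sadness",
    "#disgust": "disgust",
    "#surprise": "surprise",
}


def label_from_trailing_hashtag(text):
    """Return (emotion, text_without_label), or None if the tweet is filtered out."""
    tokens = text.split()
    if not tokens or tokens[0].upper() == "RT":
        return None  # empty tweet or retweet (assumed "RT" prefix convention)
    emotion = EMOTION_HASHTAGS.get(tokens[-1].lower())
    if emotion is None:
        return None  # no emotion hashtag in final position
    body = tokens[:-1]
    if len(body) < 3:
        return None  # fewer than three words besides the label (assumed counting rule)
    return emotion, " ".join(body)
```

Requiring the hashtag in final position is what makes it plausible as a label for the whole message rather than as an ordinary word inside it.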

Collecting external sentiment data sets

We use existing publicly available resources for sentiment classification on Twitter.⁴⁹ We rely on several public sentiment analysis data sets: Stanford, Sanders, SemEval-2013, JHU, SentiStrength, the Obama-McCain Debate dataset, and the HealthCare Reform dataset. In total, we aggregated T_S^L = 19,555 tweets labeled with positive (35 percent), negative (30 percent), and neutral (35 percent) sentiment over seven data sets.

Emotion detection in social media

Emotion analysis⁵⁰ has been applied to e-mails, blogs, and chats, but was only recently investigated for OSNs. Ekman⁴² proposed an emotion classification framework capturing six high-level emotions, including joy, sadness, fear, disgust, surprise, and anger, and researchers have used supervised learning models to determine which emotions are expressed in various texts.^47,48,51,52 Due to the lack of OSN data annotated with emotions and opinions, this line of work bootstraps noisy labeled data for sentiment⁵³ and bases emotion prediction training on hashtags. We used a similar technique to take our emotion data set and build models for automatic emotion prediction, significantly outperforming previous models.

Building classifiers for tweet-level (emotions and opinions) predictions

We assume a set of independent tweets T = {t_i}. A tweet is labeled if we know the value of the sentiment function S(t): T → {positive, negative, neutral} and the emotion function E(t): T → {joy, anger, disgust, fear, surprise, sadness}.

We define two tweet-based supervised models Φ_E(t) and Φ_S(t) for emotion and sentiment classification, learned from the independent sets of tweets labeled with emotion $T^L_E$ and sentiment $T^L_S$. We represent each tweet t_i as a feature vector over the words used in a tweet and add other stylistic and syntactic features described below. Then, the log-linear function⁵⁴ (or logistic regression⁵⁵) is learned from a labeled feature vector representation to map each unlabeled tweet to the most likely variable assignment using a parameter vector $\vec{\theta}_E$, as shown for the emotion attribute below:

$$\begin{aligned} \Phi_E(t) &= \arg\max\nolimits_e P(E(t) = e \mid t, \vec{\theta}_E), \\ P(E(t) = e \mid t, \vec{\theta}_E) &= \left[ 1 + \exp\big( -\vec{\theta}_E \cdot \vec{f}(t) \big) \right]^{-1}. \end{aligned} \tag{1}$$
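The prediction rule in Eq. 1 can be illustrated with a small scoring function. This sketch assumes a one-vs-rest reading in which each emotion has its own weight vector; the equation writes a single parameter vector, so that reading, the sparse dictionary representation, and all names here are illustrative assumptions rather than the trained model.

```python
import math


def logistic(z):
    """Numerically stable evaluation of [1 + exp(-z)]^-1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def dot(weights, features):
    """Sparse dot product; both arguments map feature name -> value."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def predict_emotion(features, weights_by_emotion):
    """Return the highest-scoring emotion and the per-emotion probabilities."""
    probs = {
        emotion: logistic(dot(weights, features))
        for emotion, weights in weights_by_emotion.items()
    }
    return max(probs, key=probs.get), probs
```

With toy weights such as `{"joy": {"happy": 2.0}, "sadness": {"happy": -1.0}}`, a tweet whose only feature is `happy` is assigned "joy".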

Inferring user demographics in social media

Earlier work on predicting user attributes based on Twitter data has used supervised Support Vector Machine (SVM) models with lexical bag-of-word features for classifying four demographic attributes: gender, age, political preferences, and ethnicity.^56–62 Other methods characterize Twitter users by applying some network structure information.^41,63–65 We focus on previously unexplored personal attributes, including relationship status, parental status (having children), religious beliefs, education level, intelligence, annual income, optimism, and life satisfaction.⁶² We demonstrate that these attributes can be predicted using word unigrams, that is, linguistic features. We investigate the relation between emotions, opinions, and these different demographic attributes. Some previous work has briefly studied such correlations on Twitter such as the relation between gender and sentiment^66,67 and personality and emotions.⁴⁰

Building classifiers for user-level (psychodemographics) predictions

We assume a set of independent users U. A user $u \in U$ is labeled if we know the value of the attribute function, for example, for gender $A(u): U \rightarrow \{a_0 = \mathrm{Male}, a_1 = \mathrm{Female}\}$.

We define 12 user-based supervised models $\Phi_A(u)$ for classifying the 12 user attributes presented in Table 1. These log-linear models⁵⁴ (logistic regression⁵⁵) are learned from user self-authored content, 200 tweets per user $T^{(u)}$. We construct a feature vector representation $f(T^{(u)})$ from a word distribution $w_1^{(u)}, \cdots, w_n^{(u)}$ over the 200 tweets $T^{(u)}$ per user u. The models then map each unlabeled user represented as $f(T^{(u)})$ to the most likely variable assignment using a parameter vector $\vec{\theta}_A$, as shown below:

$$\begin{aligned} \Phi_A(u) &= \arg\max\nolimits_a P(A(u) = a \mid T^{(u)}, \vec{\theta}_A), \\ P(A(u) = a \mid T^{(u)}, \vec{\theta}_A) &= \left[ 1 + \exp\big( -\vec{\theta}_A \cdot \vec{f}(T^{(u)}) \big) \right]^{-1}. \end{aligned} \tag{2}$$
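The user-level representation $f(T^{(u)})$ is a word distribution over the user's tweets. A minimal sketch of one way to build it follows; lowercasing and whitespace tokenization are assumptions made for illustration, since the exact preprocessing is not specified here.

```python
from collections import Counter


def unigram_features(tweets):
    """Relative frequency of each lowercased whitespace token across a user's tweets."""
    counts = Counter(token for tweet in tweets for token in tweet.lower().split())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: n / total for word, n in counts.items()}
```

Normalizing by the total token count keeps users with long and short tweets on a comparable scale before the logistic model in Eq. 2 is applied.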

Measuring user emotions

Given a set of tweets T^(u) with predicted emotions, we estimate the proportion or normalized frequency of each emotion e per user. An example of the emotion distribution for a random Twitter user is shown in Figure 2.

FIG. 2.

Emotion and sentiment distributions for a random Twitter user. Distributions are the proportions of six predicted emotions or three sentiments aggregated over 200 tweets per user.

Given the normalized distribution of emotions for each user, we estimate the user's positive emotion score $E^+(u)$. For that we subtract the scores of four negative emotions, anger, sadness, fear, and disgust, from the score of the positive emotion, joy:

$$E^+(u) = e_{joy} - e_{anger} - e_{sad} - e_{disg} - e_{fear}. \tag{3}$$

We exclude the “surprise” emotion from the $E^+(u)$ score as it can be both positive and negative and it is hard to evaluate out of context. However, the remaining five emotions can be easily disambiguated.
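As a worked illustration of Eq. 3, the sketch below turns a list of predicted per-tweet emotion labels into proportions and then into the positive emotion score. The function names and the label strings are hypothetical; only the arithmetic follows the definition above.

```python
from collections import Counter

EMOTIONS = ("joy", "anger", "sadness", "disgust", "fear", "surprise")


def emotion_proportions(predicted_labels):
    """Normalized frequency of each of the six emotions over a user's tweets."""
    counts = Counter(predicted_labels)
    n = len(predicted_labels)
    return {e: (counts[e] / n if n else 0.0) for e in EMOTIONS}


def positive_emotion_score(props):
    """Eq. 3: joy minus the four negative emotions; surprise is ignored."""
    return (props["joy"] - props["anger"] - props["sadness"]
            - props["disgust"] - props["fear"])
```

For a user with 5 joy, 2 anger, 1 sadness, and 2 surprise tweets out of 10, the proportions are 0.5, 0.2, 0.1, and 0.2, so the score is 0.5 - 0.2 - 0.1 = 0.2. Because surprise still counts toward the denominator, a user who mostly expresses surprise ends up with a score close to zero.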

Measuring user sentiment

We estimate the proportion or normalized frequency of sentiment expressed by each user. The example sentiment distribution for a random user is shown in Figure 2. Given the proportion of sentiment per user, we estimate the user's positive sentiment score $S^+(u)$. For that we subtract the proportion of negative opinions from positive opinions as shown below:

$$S^+(u) = S_{pos} - S_{neg}. \tag{4}$$

We exclude neutral sentiment from $S^+(u)$ because we are only interested in measuring the difference between the most extreme sentiment polarities.
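Eq. 4 can be sketched in a few lines. The label strings are assumed for illustration, and the proportions are taken over all of a user's tweets, which is our reading of the normalized distribution described above.

```python
def positive_sentiment_score(predicted_sentiments):
    """Eq. 4: proportion of positive tweets minus proportion of negative tweets."""
    n = len(predicted_sentiments)
    if n == 0:
        return 0.0
    pos = predicted_sentiments.count("positive") / n
    neg = predicted_sentiments.count("negative") / n
    return pos - neg
```

Under this reading the score lies in [-1, 1], and neutral tweets do not enter the difference but do shrink its magnitude through the denominator.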

Estimating emotion and opinion differences

We first evaluate whether Twitter users of contrasting attribute values (e.g., $a_0 = \mathrm{Male}$ vs. $a_1 = \mathrm{Female}$) differ in the emotions and sentiments they tend to express. We examine the distributions of the emotions $e \in E$ and the sentiments $s \in S$ for the contrasting user populations a₀ and a₁, as well as the emotion and sentiment scores $E^+(u)$ and $S^+(u)$, calculated using Eqs. 3 and 4. We apply a nonparametric Mann-Whitney U statistical test. Our null hypothesis H₀ is that the mean emotion and sentiment scores for the population a₀ are the same as for a₁; the alternative hypothesis H_a is that these two user populations have different mean scores.

We then quantitatively measure emotion and opinion differences between a₀ and a₁ users. For that, we estimate the average scores for emotions and opinions for the populations a₀ and a₁: $\mu_k^{(a_0)} = \frac{\sum\nolimits_{a_0} |e_k|}{U}$ and $\mu_k^{(a_1)} = \frac{\sum\nolimits_{a_1} |e_k|}{U}$, $e_k \in E$ (similarly, $\mu_l^{(a_0)}$ and $\mu_l^{(a_1)}$, $s_l \in S$). We then take the difference between the mean values estimated over the groups of a₀ and a₁ users for every emotion, $\Delta\mu_k = \mu_k^{(a_0)} - \mu_k^{(a_1)}$, and every opinion, $\Delta\mu_l = \mu_l^{(a_0)} - \mu_l^{(a_1)}$.
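The two steps of this procedure, the significance test and the difference of group means, can be illustrated with a minimal sketch. It is not the implementation used in this work: it relies only on the Python standard library, uses the normal approximation to the U statistic without tie or continuity corrections, and the per-user joy scores are invented for illustration. In practice a library routine such as `scipy.stats.mannwhitneyu` would normally be preferred.

```python
import math
from statistics import mean


def rank_with_ties(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, no corrections).

    Returns the U statistic of sample x and an approximate p-value.
    """
    n1, n2 = len(x), len(y)
    ranks = rank_with_ties(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the normal
    return u1, p


# Hypothetical per-user joy scores for two contrasting groups (not real data).
joy_a0 = [0.10, 0.12, 0.15, 0.11, 0.09, 0.14, 0.13, 0.08]
joy_a1 = [0.20, 0.22, 0.19, 0.25, 0.21, 0.18, 0.24, 0.23]

u_stat, p_value = mann_whitney_u(joy_a0, joy_a1)
delta_mu = mean(joy_a0) - mean(joy_a1)  # difference of group means
print(f"U = {u_stat}, p = {p_value:.4f}, delta_mu = {delta_mu:+.3f}")
```

A negative difference here means that the a₁ group scores higher on the emotion in question, matching the sign convention of $\Delta\mu_k$.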

Experimental Setup and Results

Our experiment consists of three stages: (a) building emotion, sentiment, and personal attribute classifiers to predict 12 attributes for 123,000 users and label their 25 million tweets with emotions, for example, joy, sadness, anger, disgust, surprise, or fear and sentiment, for example, positive, negative, or neutral; (b) studying diversities between users of different demographic types and the emotions and sentiment they express; (c) measuring the strength of the correlation between demographics and emotions by fitting a regression model to infer demographic attributes from emotions and opinions.

Our classifiers are the log-linear models $\Phi_A(u)$ defined in Eq. 2, trained using the scikit-learn toolkit.⁶⁸ We prefer logistic regression⁶⁹ over reasonable alternatives such as SVM or a perceptron, following previous work on predictive analytics and text classification in social media.^{41,54,58,62,67}

Predicting user emotions, sentiment, and demographics from tweets

For opinion and emotion classification, we train the $\Phi_E(t)$ and $\Phi_S(t)$ models based on Eq. 1 using combined features:

• Lexical: binary unigram bag-of-word features (using higher order ngrams or normalized frequency count features does not improve classification results).

• Stylistic: elongated words, for example, Yaaay, noooo; capitalization, for example, COOL, MAD; positive and negative emoticons; punctuation, for example, !!!!, ????; and number of hashtags.

• Negation: append a _NEG suffix to every word appearing between a negation and a clause-level punctuation mark.⁷⁰

• Lexicon: presence and score for unigram features from the Emotion Lexicon.⁷¹

• POSTags: part-of-speech tags obtained using Twitter POS tagger.⁷²
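The lexical, stylistic, and negation features listed above can be sketched as follows. This is an illustrative approximation and not the feature extractor used in this work: the tokenizer, the `NEGATION_WORDS` list, and the feature names are assumptions, and the emoticon, lexicon, and part-of-speech features are omitted.

```python
import re

# Illustrative negation cues; the actual list used is not given in the text.
NEGATION_WORDS = {"not", "no", "never", "cannot"}
CLAUSE_PUNCT = re.compile(r"^[.,:;!?]+$")
# Simple tokenizer: words (optionally hashtags or contractions) or punctuation runs.
TOKEN = re.compile(r"#?\w+(?:'\w+)?|[.,:;!?]+")


def mark_negation(tokens):
    """Append _NEG to every token between a negation and clause punctuation."""
    marked, in_scope = [], False
    for tok in tokens:
        if CLAUSE_PUNCT.match(tok):
            in_scope = False
            marked.append(tok)
        elif in_scope:
            marked.append(tok + "_NEG")
        else:
            marked.append(tok)
            if tok.lower() in NEGATION_WORDS or tok.lower().endswith("n't"):
                in_scope = True
    return marked


def extract_features(tweet):
    """Return binary unigram features plus a few stylistic counts."""
    tokens = TOKEN.findall(tweet)
    features = {}
    # Lexical + negation: binary presence of each (possibly _NEG-marked) unigram.
    for tok in mark_negation([t.lower() for t in tokens]):
        if not CLAUSE_PUNCT.match(tok):
            features["w=" + tok] = 1
    # Stylistic counts, computed on the original (cased) tokens.
    features["n_elongated"] = sum(
        1 for t in tokens if re.search(r"(\w)\1{2,}", t))
    features["n_all_caps"] = sum(
        1 for t in tokens if len(t) > 1 and t.isupper())
    features["n_repeated_punct"] = sum(
        1 for t in tokens if re.fullmatch(r"[!?]{2,}", t))
    features["n_hashtags"] = sum(1 for t in tokens if t.startswith("#"))
    return features


example = extract_features("I don't like this movie, but LOVE the cast!!!! #yaaay")
print(sorted(example.items()))
```

In this example the words between "don't" and the comma receive the _NEG suffix, while the words after the comma are left unmarked.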

Table 2 presents emotion classification results obtained using lexical, stylistic, and negation features (using lexicon and part-of-speech tag features does not improve performance). We compare our models to the results reported by Mohammad⁴⁷ and Wang.⁴⁸ Compared to Mohammad, our emotion predictors yield a significantly higher performance for all emotions when trained on our and their combined data, using 10-fold cross-validation ($\Delta F_{m+T}$: +0.23 F1 improvement in a six-way classification). Moreover, getting more data and adding stylistic features dramatically improves classification for individual emotions: disgust +0.60 F1 and anger +0.43 F1. Furthermore, even though our models are learnt with a far smaller data set (the alternative data set is 40 times larger), we significantly outperform Wang's models,⁴⁸ with an F1 improvement ranging between +0.08 and +0.33 across the different emotions (except sadness).

Table 2.

Emotion Detection Results Using Our Emotion Classifier Compared to Mohammad's Models⁴⁷ ($\Delta F_M$) and Wang's Models⁴⁸ ($\Delta F_W$)

Emotions	Wang⁴⁸ (size)	Wang⁴⁸ (F1)	Mohammad⁴⁷ (size)	Mohammad⁴⁷ (F1)	This work (size)	This work (F1)	$\Delta F_W$	$\Delta F_M$	$\Delta F_M'$
Anger	457,972	0.72	1,555	0.28	4,963	0.80	0.08	0.52	0.43
Disgust	—	—	761	0.19	12,948	0.92	—	0.73	0.60
Fear	11,156	0.44	2,816	0.51	9,097	0.77	0.33	0.26	0.21
Joy	567,487	0.72	8,240	0.62	15,559	0.79	0.07	0.17	0.13
Sadness	489,831	0.65	3,830	0.39	4,232	0.62	−0.03	0.23	0.10
Surprise	1,991	0.14	3,849	0.45	8,244	0.64	0.50	0.19	0.15
All	1,991,184	—	21,051	0.49	52,925	0.78	—	+0.29	+0.23

$\Delta F_M'$ values show the absolute improvements for each emotion over Mohammad's results when we combine their and our data sets to train a joint model. Evaluations are done using 10-fold cross-validation.
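As a reading aid for Table 2, the $\Delta F_W$ and $\Delta F_M$ columns appear to be plain differences between the F1 score of this work and the F1 score of the respective baseline. The short sketch below recomputes them from the F1 values in the table under that assumption. The $\Delta F_M'$ column cannot be recomputed this way because the F1 scores of the joint model are not listed.

```python
# F1 scores copied from Table 2: (Wang, Mohammad, this work).
# None marks a value that is not reported in the table.
f1_scores = {
    "anger": (0.72, 0.28, 0.80),
    "disgust": (None, 0.19, 0.92),
    "fear": (0.44, 0.51, 0.77),
    "joy": (0.72, 0.62, 0.79),
    "sadness": (0.65, 0.39, 0.62),
    "surprise": (0.14, 0.45, 0.64),
}


def improvement(ours, baseline):
    """Absolute F1 improvement over a baseline, or None if it is missing."""
    return None if baseline is None else round(ours - baseline, 2)


# Map each emotion to (delta F_W, delta F_M).
deltas = {
    emotion: (improvement(ours, wang), improvement(ours, mohammad))
    for emotion, (wang, mohammad, ours) in f1_scores.items()
}

for emotion, (d_wang, d_mohammad) in deltas.items():
    print(f"{emotion:9s} dF_W={d_wang} dF_M={d_mohammad}")
```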

We train our sentiment classifier using the same features we used for our emotion classifier. We train our model on 19,555 tweets and test on the 3,223 tweets from the official SemEval-2013 test set. Our sentiment classifier yields F1 = 0.66 (three classes), which is comparable to the top-ranked system in the SemEval-2013 official ranking⁷³ (F1 = 0.69) and to the current state-of-the-art system, which uses other advanced features⁷¹ (F1 = 0.70).

For personal attribute prediction from tweets, we rely on binary word unigram features. Figure 3 presents classification results in terms of the area under the receiver operating characteristic (ROC) curve for the 12 attributes outlined in Table 1. Our results for gender and ethnicity prediction demonstrate a significantly higher performance compared to previous work.^56,57,63,74 In Figure 4, we present the most predictive lexical markers learned by the classifier for the income and relationship status attributes.

FIG. 3.

Demographic attribute classification performance in terms of the area under the ROC curve (ROC AUC is equivalent to the probability of correctly ranking two randomly selected users, one from each class, e.g., single vs. in a relationship). Models are learned using binary word unigram features from 5,000 Twitter user profiles annotated through crowdsourcing. Evaluations are performed using 10-fold cross-validation.
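The pairwise interpretation of ROC AUC mentioned in the caption can be made concrete with a small sketch. The function below is a direct, quadratic-time illustration of that definition, not the evaluation code used in this work, and the classifier scores are invented.

```python
def roc_auc(scores_pos, scores_neg):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    A pair counts as 1 if the positive example scores higher, and as 0.5
    if the two scores are tied.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical classifier scores for users in a relationship (positive class)
# and single users (negative class).
in_relationship = [0.9, 0.8, 0.4]
single = [0.7, 0.3, 0.2]

auc = roc_auc(in_relationship, single)
print(f"ROC AUC = {auc:.3f}")
```

Of the nine possible pairs in this toy example, eight are ranked correctly, so the AUC is 8/9.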

FIG. 4.

Predictive lexical markers (“features”) learned by the classifier for discriminating users with annual income below $35K versus users with above $35K and single users versus users in a relationship.

Analyzing emotion and opinion differences for contrastive demographics

We now analyze emotional differences between users with contrasting attributes (denoted a₀ and a₁) and present the results in Figure 5. We present opinion differences in Table 3. Our key findings are discussed in detail below. All the differences in emotions and opinions between groups of users with contrasting demographics are statistically significant (p-value < 0.001, using the Mann-Whitney U test).

FIG. 5.

Emotion differences $\Delta\mu_k$ (x-axis, percent) for Twitter users of different demographic types, for example, education, life satisfaction, and relationship status. For instance, users with children express 3.6 percent more joy and 1.2 percent more fear compared to the users without kids.

Table 3.

Sentiment Differences $\Delta\mu_l$ Estimated Over Groups of Users with Contrastive Demographics, for Example, Male Versus Female

Attributes	S_pos	S_neut	S_neg	S⁺
Male+, female−	−3.7	+7.2	−3.5	−0.3
Below 25+, above 25−	−1.6	+5.3	+6.9	−8.4
Single+, relationship−	+1.1	+0.7	+1.8	−3.0
Kids+, no kids−	+0.8	+3.3	−4.1	+4.9
Degree+, school−	+3.3	+4.1	−7.5	+10.8
≤$35K+, >$35K−	−3.0	−4.5	+7.5	−10.6
Liberal+, conservative−	−1.1	−3.4	+4.5	−5.6
≤Average+, >average−	−1.9	−4.5	+6.4	−8.2
African American+, Caucasian−	−5.3	+1.2	+4.1	−9.4
Christian+, unaffiliated−	+1.2	+0.0	−1.2	+2.3
Optimist+, pessimist−	+5.3	+2.3	−7.6	+12.8
Satisfied+, dissatisfied−	+5.2	+2.6	−7.8	+13.0

S⁺ denotes the difference in the positive sentiment score as defined in Eq. 4. The most pronounced differences are highlighted in bold.

For brevity, we refer to a user predicted to be male simply as a male and a tweet predicted to contain surprise as simply containing surprise. However, it is important to recall that a major contribution of this work is that these results are based on automatically predicted properties rather than on ground truth. We argue that while such automatically predicted annotations may not be perfect at the individual user level or the tweet level (as shown by the area under the receiver operating characteristic curve (ROC AUC) in Fig. 3), they provide for meaningful analysis when reviewed in aggregate.

• Gender: Female users generate more happy and sad tweets, while male users produce more surprise and fear tweets. Female users have higher positive emotion scores. In line with previous work, our results further confirm that female users are more emotional online.^66,67

• Age: Older (over 25 years old) users are 7.5 percent more positive and generate 4 percent more joyful tweets and 4 percent fewer sad tweets. Younger users produce more disgust, anger, and surprise tweets. Our results are in line with the recently explored “aging positivity effect” in social media that states that older people are happier than younger people.⁷⁵

• Relationship status: Users in a relationship produce 4 percent more positive emotions and generate 2.5 percent more joyful tweets and 1.4 percent fewer sad tweets compared to single users. Single users produce more surprise, anger, and disgust tweets.

• Children: Users without children produce 3.5 percent more sad tweets and 3.6 percent fewer joyful tweets. Users with children have higher positive emotion scores (by 6.4 percent). They produce fewer disgust, anger, and surprise tweets but more fearful tweets.

• Education: Users with a college degree produce 4.7 percent more joyful tweets and have higher positive emotion scores (by 8.7 percent). In contrast, users with only high school education generate 4.4 percent more sad, disgust, and angry tweets.

• Political preferences: Conservative users produce 3 percent more joyful tweets and 1.8 percent more fearful tweets. Liberal users generate 2.6 percent more sad tweets and have a 7 percent lower positive emotion score.

• Income: Users with higher income (above $35K) generate 4.9 percent more joyful tweets and have a higher positivity score (by 8.9 percent). Users with a lower income tweet 4.5 percent more sad tweets and roughly 1 percent more angry and disgusted tweets.

• Intelligence: Users with above average intelligence generate 3.9 percent more joyful tweets and 3.7 percent fewer sad tweets. In contrast, users with average or below average intelligence have a lower positive emotion score (by 7.1 percent).

• Ethnicity: Caucasian users produce 4.2 percent more joyful tweets and have a higher positive emotion score (by 7.5 percent). African American users generate 2 percent more sad tweets, 1 percent more disgusted and surprised tweets, and roughly 1 percent more fearful and angry tweets.

• Religion: Christian users produce more joyful tweets and have a higher positive emotion score compared to unaffiliated users who produce more disgusted and angry tweets.

• Optimism: Optimists produce 7 percent more joyful tweets and have a higher positive emotion score (by 13 percent) compared with pessimists, who generate 3.4 percent more sad tweets and roughly 1 percent more angry and disgusted tweets.

• Life Satisfaction: Users satisfied with life produce 6 percent more joyful and 3 percent fewer sad tweets and have a higher positive emotion score (by 11 percent).

Sentiment divergence results appear in Table 3, demonstrating that several user groups produce significantly more positive opinions than their contrasting group: women, those older than 25 years, users in a relationship, users with children, users with a degree, users earning more than $35,000 a year, conservative users, users with above average intelligence, and Caucasian and Christian users.

We have also examined differences in social media activity between demographic groups using a random sample of 10,500 users from our data set. The average number of daily tweets (DT) for men is DT = 0.54 and for women is DT = 0.58. This is a statistically significant difference (p < 0.001, using a Mann-Whitney U test). Our findings are consistent with other work stating that women generate more content in social networks such as Facebook.⁷⁶ However, we did not find significant differences in the quantity of user generated content between younger and older users as has been previously reported for Facebook.^76,77

We also found that users predicted to be liberal are more active on Twitter (DT = 0.58) than conservative users (DT = 0.48) and that users predicted to be Caucasian are less active (DT = 0.50) than African American users (DT = 0.70).⁷⁶

Inferring user demographics exclusively from emotions and opinions

Figure 6 presents demographic classification results using only emotion and opinion distributions as features. We show that some emotions and opinions are predictive of one attribute value (red), some of an opposite value (blue). For instance, negative sentiment and sadness are predictive of users with no children and users less than 25 years old; angry tweets are predictive of non-Christian users; finally, surprised tweets are predictive of single users. We show a dendrogram for attributes (rows) and emotions (columns), grouping data based on row and column similarities using a hierarchical clustering algorithm.

FIG. 6.

Predicting hidden attributes from user emotions and opinions. Colors represent regression coefficients for each feature, for example, white stands for male, satisfied, optimist, single, nonreligious, liberal, no kids, below 25 years old, over $35K, Caucasian, college degree, and above average intelligence, and black stands for the opposite attribute values.
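The idea of reading attribute associations off regression coefficients can be sketched on toy data. The example below fits a logistic regression by plain gradient descent on invented per-user emotion proportions and inspects the coefficient signs. It is an assumption-laden illustration: the data, the binary label, the learning rate, and the number of epochs are all hypothetical, the models in this work were trained with scikit-learn, and the hierarchical clustering step is not shown.

```python
import math

FEATURES = ["joy", "sadness", "anger"]

# Synthetic per-user emotion proportions (not real data).
# Label 1 = hypothetical "has children", label 0 = "no children".
X = [
    [0.50, 0.10, 0.05], [0.45, 0.12, 0.08],
    [0.55, 0.08, 0.06], [0.48, 0.11, 0.07],
    [0.30, 0.25, 0.10], [0.28, 0.30, 0.12],
    [0.35, 0.22, 0.09], [0.32, 0.27, 0.11],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, b, x):
    """Probability of label 1 under the logistic model."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


w, b = fit_logistic(X, y)
for name, coef in zip(FEATURES, w):
    direction = "label 1" if coef > 0 else "label 0"
    print(f"{name:8s} coefficient {coef:+.2f} -> predictive of {direction}")
```

A positive coefficient marks an emotion that pushes the prediction toward one attribute value and a negative coefficient toward the opposite value, which is the information that the colors in Figure 6 encode.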

Finally, we compare the predictive performance of models using emotions and opinions versus models using lexical features only. We observe that some attributes are more linguistically expressed and, therefore, are better predicted using lexical features, including gender (−0.16), ethnicity (−0.19), relationship status (−0.05), having children (−0.04), political beliefs (−0.05), and religion (−0.05). However, some attributes are better predicted from emotions and opinions extracted from tweets, including age (+0.9), income (+0.05), education (+0.02), optimism (+0.06), and life satisfaction (+0.06).

Discussion and Conclusions

We demonstrated that users of different demographic types project different emotions in social media.

Some of our results are stereotypical and are similar to known correlations in offline environments. For example: (a) users with high income produce significantly fewer sad tweets and users with lower income express more negative emotions;^21,22,8 (b) female users are more emotional and opinionated than male users^40,41; (c) older users express more joy and less sadness than younger users⁷⁸; (d) optimism and life satisfaction correlate with positive emotions^79,80; and (e) people in a relationship or who have children express more positive emotions.⁸¹

Our results highlight a key difference between OSNs and offline environments with regard to self-disclosure: OSNs keep track of our thoughts and emotions for long periods of time, making it possible to easily infer deep and intimate knowledge about us. Thus, people using OSNs may reveal more about themselves than they intended.⁸² Despite these unintended self-disclosure and privacy concerns, our results also indicate various applications of our emotions and demographic data mining.⁸³

Footnotes

Author Disclosure Statement

No competing financial interests exist.

References

1. Cozby. Self-disclosure: a literature review. Psychological Bulletin, 1973; 79:73.

2. Derlaga, Berg. (1987) Self-disclosure: theory, research and therapy. Springer Science & Business Media. New York: Plenum Press.

3. Hill, Knox. Self-disclosure. Psychotherapy: Theory, Research, Practice, Training, 2001; 38:413.

4. Bazarova, Yoon. Self-disclosure in social media: extending the functional approach to disclosure motivations and characteristics on social network sites. Journal of Communication, 2014; 64:635–657.

5. Bamman, Eisenstein, Schnoebelen. Gender identity and lexical variation in social media. Journal of Sociolinguistics, 2014; 18:135–160.

6. Bachrach, Kosinski, Graepel, et al. (2012) Personality and patterns of Facebook usage. In Proceedings of WebSci, pp. 24–32. Evanston, IL, USA.

7. Kosinski, Stillwell, Graepel. Private traits and attributes are predictable from digital records of human behavior. Proceedings of the National Academy of Sciences, 2013; 110(15):5802–5805.

8. Moira, Marlow, Lento. (2010) Social network activity and social well-being. In Proceedings of ACM SIGCHI, pp. 1909–1912. Atlanta, GA, USA.

9. Scott. (2012) Social network analysis. Sage Publications Inc., Thousand Oaks, CA.

10. Tufekci. Can you see me now? Audience and disclosure regulation in online social network sites. Bulletin of Science, Technology & Society, 2008; 28(1):20–36.

11. Qian, Scott. Anonymity and self-disclosure on weblogs. Journal of Computer-Mediated Communication, 2007; 12:1428–1451.

12. Li-Barber. Self-disclosure and student satisfaction with Facebook. Computers in Human Behavior, 2012; 28:624–630.

13. Velasco-Martin. (2011) Self-disclosure in social media. In Proceedings of CHI. ACM. pp. 1057–1060. Vancouver, BC, Canada.

14. Acquisti, Ralph. Predicting social security numbers from public data. Proceedings of the National Academy of Sciences U S A, 2009; 106:10975–10980.

15. Wolf. Emotional expression online: gender differences in emoticon use. Cyberpsychology & Behavior, 2000; 3:827–833.

16. Nunn, Thomas. The angry male and the passive female: the role of gender and self-esteem in anger expression. Social Behavior and Personality, 1999; 27:145–153.

17. Newman, Gray, Dale. Sex differences in the relationship of anger and depression: an empirical study. Journal of Counseling and Development, 1999; 77:198–203.

18. Eisler, Skidmore, Ward. Masculine gender-role stress: predictor of anger, anxiety, and health-risk behaviors. Journal of Personality Assessment, 1988; 52:133–141.

19. Schienle, Schafer, Rudolf, et al. Gender differences in the processing of disgust- and fear-inducing pictures: an fMRI study. Neuroreport, 2005; 16:277–280.

20. Haidt, McCauley, Rozin. Individual differences in sensitivity to disgust: a scale sampling seven domains of disgust elicitors. Personality and Individual Differences, 1994; 16:701–713.

21. Kahneman, Deaton. High income improves evaluation of life but not emotional well-being. Proceedings of the National Academy of Sciences U S A, 2010; 107:16489–16493.

22. Kushlev, Elizabeth, Lucas. Higher income is associated with less daily sadness but not more daily happiness. Social Psychological and Personality Science, 2015. DOI: 10.1177/1948550614568161

23. Reyes, Meininger, Liehr, et al. Anger in adolescents: sex, ethnicity, age differences, and psychometric properties. Nursing Research, 2003; 52:2–11.

24. Ross, Van Willigen. Gender, parenthood, and anger. Journal of Marriage and the Family, 1996; 572–584.

25. Yum, Kazuya. Computer-mediated relationship development: a cross-cultural comparison. Journal of Computer-Mediated Communication, 2005; 11:133–152.

26. McCarthy. Social penetration theory, social networking and Facebook. 2009.

27. McKenna, Bargh. Plan 9 from cyberspace: the implications of the Internet for personality and social psychology. Personality and Social Psychology Review, 2000; 4:57–75.

28. Sheldon. I'll poke you. You'll poke me! Self-disclosure, social attraction, predictability and trust as important predictors of Facebook relationships. Cyberpsychology: Journal of Psychosocial Research on Cyberspace, 2009; 3(2):67–75.

29. Krasnova, Kolesnikova, Guenther. (2009) It won't happen to me!: self-disclosure in online social networks. Proceedings of AMCIS, p. 343. San Francisco, CA, USA.

30. Chen, Sharma. Self-disclosure at social networking sites: an exploration through relational capitals. Information Systems Frontiers, 2013; 15(2):269–278.

31. Trepte, Reinecke. The reciprocal effects of social network site use and the disposition for self-disclosure: a longitudinal study. Computers in Human Behavior, 2013; 29:1102–1112.

32. Dindia, Allen. Sex differences in self-disclosure: a meta-analysis. Psychological Bulletin, 1992; 112:106.

33. Farber. (2006) Self-disclosure in psychotherapy. Guilford Press. New York, NY.

34. Ignatius, Kokkonen. Factors contributing to verbal self-disclosure. Nordic Psychology, 2007; 59:362–391.

35. Snyder. Self-monitoring of expressive behavior. Journal of Personality and Social Psychology, 1974; 30:526.

36. Shaffer, Smith, Tomarelli. Self-monitoring as a determinant of self-disclosure reciprocity during the acquaintance process. Journal of Personality and Social Psychology, 1982; 43:163.

37. Solano, Dunnam. Two's company: self-disclosure and reciprocity in triads versus dyads. Social Psychology Quarterly, 1985; 48(2):183–187.

38. Forgas. Affective influences on self-disclosure: mood effects on the intimacy and reciprocity of disclosing personal information. Journal of Personality and Social Psychology, 2011; 100:449.

39. Altman, Taylor. (1973) Social penetration theory. New York: Holt, Rinehart & Winston.

40. Mohammad, Kiritchenko. Using nuances of emotion to identify personality. arXiv preprint arXiv:1309.6352.

41. Volkova, Coppersmith, Van Durme. (2014) Inferring user political preferences from streaming communications. In Proceedings of ACL, pp. 186–196. Baltimore, MD, USA.

42. Ekman. An argument for basic emotions. Cognition & Emotion, 1992; 6:169–200.

43. Goodman. Snowball sampling. The Annals of Mathematical Statistics, 1961; 148–170.

44. Coviello, Sohn, Kramer, et al. Detecting emotional contagion in massive social networks. PLoS One, 2014; 9:e90315.

45. Callison-Burch. (2009) Fast, cheap, and creative: evaluating translation quality using Amazon's Mechanical Turk. In Proceedings of EMNLP, pp. 286–295. Singapore.

46. González-Ibáñez, Muresan, Wacholder. (2011) Identifying sarcasm in Twitter: a closer look. In Proceedings of ACL HLT, pp. 581–586. Portland, OR, USA.

47. Mohammad, Kiritchenko. Using hashtags to capture fine emotion categories from tweets. Computational Intelligence, 2015; 31(2):301–326.

48. Wang, Chen, Thirunarayan, et al. (2012) Harnessing Twitter “big data” for automatic emotion identification. In Proceedings of SocialCom, pp. 587–592. Amsterdam, The Netherlands.

49. Hassan, Fernandez, Alani. (2013) Evaluation datasets for Twitter sentiment analysis: a survey and a new dataset, the STS-Gold. First ESSEM Workshop. Turin, Italy.

50. Desmet. (2002) Designing emotion. Delft University of Technology. Department of Industrial Design.

51. Roberts, Roach, Johnson, et al. (2012) Empatweet: annotating and detecting emotions on Twitter. In Proceedings of LREC. Istanbul, Turkey.

52. Kim, Bak, Oh. (2012) Do you feel what I feel? Social aspects of emotions in Twitter conversations. In Proceedings of ICWSM. Dublin, Ireland.

53. Pak, Paroubek. (2010) Twitter as a corpus for sentiment analysis and opinion mining. In Proceedings of LREC. Malta.

54. Smith. (2004) Tutorial: log-linear models.

55. Bishop. (2006) Pattern recognition and machine learning (information science and statistics). Secaucus, NJ: Springer-Verlag New York, Inc.

56. Rao, Yarowsky, Shreevats, et al. (2010) Classifying latent user attributes in Twitter. In Proceedings of SMUC, pp. 37–44. Toronto, Canada.

57. Burger, Henderson, Kim, et al. (2011) Discriminating gender on Twitter. In Proceedings of EMNLP, pp. 1301–1309. Edinburgh, Scotland, UK.

58. Van Durme. (2012) Streaming analysis of discourse participants. In Proceedings of EMNLP, pp. 48–58. Jeju Island, Korea.

59. Ciot, Sonderegger, Ruths. (2013) Gender inference of Twitter users in non-English contexts. In Proceedings of EMNLP, pp. 1136–1145. Seattle, WA, USA.

60. Nguyen, Gravel, Trieschnigg, et al. (2013) How old do you think I am? A study of language and age in Twitter. In Proceedings of ICWSM, pp. 439–448. Boston, MA, USA.

61. Cohen, Ruths. (2013) Classifying political orientation on Twitter: it's not easy! In Proceedings of ICWSM. Boston, MA, USA.

62. Volkova, Bachrach, Armstrong, et al. (2015) Inferring latent user properties from texts published in social media. In Proceedings of AAAI. Austin, TX, USA.

63. Pennacchiotti, Popescu. (2011) A machine learning approach to Twitter user classification. In Proceedings of ICWSM, pp. 281–288. Barcelona, Spain.

64. Conover, Ratkiewicz, Francisco, et al. (2011) Political polarization on Twitter. In Proceedings of ICWSM, pp. 89–96. Barcelona, Spain.

65. Zamal, Liu, Ruths. (2012) Homophily and latent attribute inference: inferring latent attributes of Twitter users from neighbors. In Proceedings of ICWSM, pp. 387–390. Dublin, Ireland.

66. Mohammad, Yang. (2011) Tracking sentiment in mail: how genders differ on emotional axes. In Proceedings of WASSA, pp. 70–79. Portland, OR, USA.

67. Volkova, Wilson, Yarowsky. (2013) Exploring demographic language variations to improve sentiment analysis in social media. In Proceedings of EMNLP. Seattle, WA, USA.

68. Pedregosa, Varoquaux, Gramfort, et al. Scikit-learn: machine learning in Python. JMLR, 2011; 12:2825–2830.

69. Logistic regression in scikit-learn: http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html

70. Pang, Lee, Vaithyanathan. (2002) Thumbs up?: sentiment classification using machine learning techniques. In Proceedings of EMNLP, pp. 79–86. Philadelphia, PA, USA.

71. Mohammad, Kiritchenko, Zhu. (2013) NRC-Canada: building the state-of-the-art in sentiment analysis of tweets. In Proceedings of SemEval, pp. 321–327. Atlanta, GA, USA.

72. Owoputi, O'Connor, Dyer, et al. (2013) Improved part-of-speech tagging for online conversational text with word clusters. In Proceedings of NAACL, pp. 380–390. Montreal, Canada.

73. Nakov, Rosenthal, Kozareva, et al. (2013) Semeval-2013 Task 2: sentiment analysis in Twitter. In Proceedings of SemEval, pp. 312–320. Atlanta, GA, USA.

74. Bergsma, Dredze, Van Durme, et al. (2013) Broadly improving user classification via communication-based name and location clustering on Twitter. In Proceedings of NAACL-HLT, pp. 1010–1019. Atlanta, GA, USA.

75. Preoţiuc-Pietro, Volkova, Lampos, et al. Studying user income through language, behaviour and affect in social media. PLoS One, 2015; 10:e0138717.

76. Maeve, Brenner. (2012) The demographics of social media users, Vol. 14. Washington, DC: Pew Research Center's Internet & American Life Project, 2013.

77. Correa, Amber, Homero. Who interacts on the Web?: The intersection of users' personality and social media use. Computers in Human Behavior, 2010; 26:247–253.

78. Goode. Happiness may grow with aging, study finds. www.nytimes.com/1998/10/27/science/happiness-may-grow-with-aging-study-finds.html (accessed Oct. 27, 1998).

79. Kern, Eichstaedt, Schwartz, et al. From sooo excited!!! to so proud: using language to study development. Developmental Psychology, 2014; 50:178.

80. Schütz, Sailer, Nima, et al. The affective profiles in the USA: happiness, depression, life satisfaction, and happiness-increasing strategies. Peer J, 2013; 1:e156.

81. Grover. Relationships and happiness. www.facebook.com/notes/facebook-data-team/relationships-and-happiness/304457453858 (accessed Feb. 4, 2010).

82. Open Science Collaboration. Estimating the reproducibility of psychological science. Science, 2015; 349:aac4716.