Cross-platform personality exploration system for online social networks: Facebook vs. Twitter

Abstract

Social networking sites (SNS) are a rich source of latent information about individual characteristics. Crawling and analyzing this content provides a new approach for enterprises to personalize services and put forward product recommendations. In the past few years, commercial brands have gradually appeared on social media platforms for advertising, customer support and public relations purposes, and by now such a presence has become a necessity across all industries. This online identity can be represented as a brand personality that reflects how a brand is perceived by its customers. We exploited recent research in text analysis and personality detection to build an automatic brand personality prediction model on top of the Five-Factor Model (FFM) and Linguistic Inquiry and Word Count (LIWC) features extracted from publicly available benchmarks. Predictive evaluation on brands’ accounts reveals that the Facebook platform provides a slight advantage over the Twitter platform in offering users more self-disclosure to express their emotions, especially their demographic and psychological traits. Results also confirm the wider perspective that the same social media account carries quite similar and comparable personality scores across different social media platforms. To evaluate our prediction results on actual brands’ accounts, we crawled the Facebook API and Twitter API, respectively, for 100k posts from the pages of the most valuable brands in the USA; we visualize exemplars of the comparison results and present suggestions for future directions.

Keywords

Big Five model, personality prediction, brand personality, machine learning, social media analysis

1. Introduction

Social networking has become a big part of our everyday life, and users are increasingly free to choose where they interact. In 2017, more than half of the global population used the internet, and there were more than 2.7 billion active social media accounts worldwide [27]. It is therefore no surprise that social media plays a big role in individuals’ social interaction. Every social media user leaves a digital footprint by writing posts, liking pages, providing content or just browsing social media sites.

Previous research in psychology has suggested that an individual’s behaviour can be explained by psychological constructs called personality traits [21]. Different personality models are built on top of this concept. The best-known personality model is the Five-Factor Model (FFM) introduced by [8], also referred to as the Big Five personality traits. This model is based on the association between words and human personality and defines five global factors: Openness, Conscientiousness, Extraversion, Agreeableness and Neuroticism.

Knowledge of an individual’s emotions and personality allows predictions of users’ interests and preferences across different contexts and environments [17,23]. This can be used to align advertisements [4], distinguish sales managers’ skills [18], identify malicious behaviours [5], optimize product and page recommendations [19], and study human diseases such as Parkinson’s and Alzheimer’s disease [3,9].

The traditional approach to measuring individual personality traits requires users to fill out long questionnaires. An example is the revised NEO Personality Inventory questionnaire, consisting of 20 to 360 personality-related questions [7]. It is very unlikely that web users will fill out such time-consuming and impractical questionnaires merely to personalize services such as search results [28] or product recommendations. However, scientists can automatically infer users’ personality from each individual’s digital footprint on various social media platforms. This can be even more accurate than assessments made by friends and family, as pointed out in [16]. Different features from social media can serve as the digital footprint of a user. Marketers can utilize a user’s liked pages, social network attributes, demographic information or media content liked or posted by the user to better target purchasing preferences without his/her consent. This study focuses on the textual language used in social media posts, because the way people use words is reliable over time and internally consistent with traditional measures of personality [6].

A huge part of the 2.7 billion social media accounts is used by individuals, brands, communities and public figures. Brands therefore also leave online footprints through their activities, just as normal users do. Each brand represents a digital personality, and its footprint is referred to as brand personality [1]. Knowledge of its own personality could be used to improve the brand’s marketing and public relations in general. In our study, we focus on the hypothesis that brand personalities can be predicted based on engineered features from user personalities. We also aim to examine whether brand personalities can be predicted by models trained on user personalities. In addition, we test the hypothesis that users’ language on Facebook offers more self-disclosure than on Twitter, and we visualize how this affects the final personality prediction score.

This article is organized as follows. Section 2 reviews the previous literature on individual personality and brand personality. Section 3 discusses issues concerning data acquisition, feature selection and implementation criteria. In Section 4, we present the results of different experiments with various machine learning algorithms, while in Section 5 we visualize insights from the evaluation method. Finally, in Section 6 and Section 7, we summarize the final results and give directions and suggestions for future work on automating personality detection.

2. Related work

In contrast to traditional methods of determining users’ personality, leveraging social media footprints for predicting personality promises direct insights. Farnadi et al. [10] compared a variety of univariate and multivariate regression methods on datasets from Facebook, Twitter, and YouTube. The multivariate models often outperformed univariate ones, but the differences were not significant. They found that no common features could be identified that perform well on all social media datasets. Even expanding a model with training samples from another social network did not improve their regressors. Farnadi et al. concluded that the context of the data plays a major role in learning. Their dataset from YouTube was labeled by impressions, whereas their Facebook and Twitter labels were self-reported through psychometric questionnaires.

Hall et al. [12] examined the effects of Facebook users’ self-representation on studies of social phenomena within social media networks. Users of social media platforms consciously or subconsciously represent themselves in a way that is appropriate for their audience. The lack of appropriate methods to identify and control these effects restricts research findings. They therefore conducted a case study involving 509 paid Amazon Mechanical Turk workers, who provided psychometric survey results and Facebook footprints to the researchers. This data was used to predict users’ personality according to the FFM using LIWC-only features. The study pointed out that self-representation is an existing phenomenon in social media and that personality is still detectable even when self-representation is present.

Both research efforts ([10] and [12]) used supervised machine learning approaches to predict users’ personality according to the FFM. [20] proposed a new approach using linear semi-supervised regression to improve prediction results. Their study is based on data from 1792 users collected from Sina Microblog, the most popular social platform in mainland China. They stated that their experimental results support their thesis that unlabeled data can improve prediction results.

Also, [2] proposed an approach using multi-task regression and incremental regression to predict the Big Five personality from online behaviors on the Sina microblogging platform. Their study is based on survey data of 444 users and indicates that the correlation factors are significant between different personality dimensions. They stated that their training data set is reliable enough and that multi-task regression performs better than other modeling algorithms.

The research of [14] focused on comparing self-disclosure on Facebook versus Twitter. They collected and processed social media data for the same users on both platforms, which enabled them to perform a comparative analysis under a proper scientific setup. The results indicate that users prefer to self-disclose more on Facebook than on Twitter, as platform affordances play a big role in determining users’ self-disclosure behavior.

In regard to brand personality research, two papers from the business domain are relevant. [1] developed a theoretical framework of the brand personality construct by determining the nature of the dimensions of brand personality (Sincerity, Excitement, Competence, Sophistication, and Ruggedness). Geuens et al. [11] developed a new brand personality measure whose dimensions map to FFM personality items (Responsibility = Conscientiousness, Activity = Extraversion, Aggressiveness = Agreeableness, Simplicity = Openness, Emotionality = Neuroticism), in contrast to other models [1].

3. Implementation

This section is divided into several sub-sections. First, we retrieve the training and evaluation datasets from the MyPersonality project and the Facebook API (Section 3.1). The next sub-sections cover feature creation with the LIWC tool and selection of the most significant features (Section 3.2), which are used to train different regression models (Section 3.3) on the FFM traits. Section 3.4 introduces our final evaluation metric.

3.1. Datasets

Because labeling data with valid personality scores is a very time-consuming and difficult task, only a few gold-standard datasets from social media platforms are available for personality prediction tasks. One of the well-known datasets is MyPersonality’s Facebook dataset, which we used as the only source of labeled training data for Facebook users to train our prediction models.

Also, there is no ground truth dataset from social media platforms for the task of brand personality prediction. Therefore, we crawled our own 100k posts dataset for brands from the Facebook API and Twitter API, respectively. This dataset is not labeled and therefore cannot be used to train supervised machine learning algorithms. The rest of this section describes the datasets we used in this research in more detail.

3.1.1. MyPersonality Facebook dataset

MyPersonality was a popular Facebook application introduced by [24] in 2007. It allowed Facebook users to participate in various psychometric tests, including an FFM questionnaire comparable to the revised NEO Personality Inventory from [7]. Roughly 30 percent of participating users decided to let the application collect data from their Facebook profile and donate it to research [15]. The MyPersonality database consists of more than six million psychometric test results and more than four million distinct Facebook profiles.

We used the following datasets from MyPersonality project for our research:

Demographic Details:

It contains demographic details for over 4 million Facebook users and consists of unique user identifier, gender, birthday, age, relationship status, Interested In information, language, number of friends and timezone of the user.

BIG5 Personality Scores:

Contains the Five-Factor model personality scores for more than three million users. Scores are represented in the range of $[1, 5]$ and annotated with the questionnaire size.

Facebook Status Updates:

Facebook posts, 25 million status post texts from 22 million unique Facebook users.

The dataset contains LIWC annotations for 153617 Facebook users. They were calculated by running the Linguistic Inquiry and Word Count application [22] on each user’s aggregated status posts. The LIWC analysis reflects the different emotions, thinking styles, social concerns and parts of speech in free text. Each annotation, also called a word category, is represented as the percentage of words in all of a user’s status posts that fall into that category. However, the sum over all word categories can exceed 100%, because some words fall into multiple categories. As our goal is to infer the FFM personality scores for an English-speaking user, we joined both filtered tables, FFM scores and LIWC scores, with the latter serving as status post features. Table 1 provides details about the final training dataset. A user is now represented by the extracted linguistic features of his or her status posts and annotated with the FFM score for each personality trait.
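To illustrate why the category percentages can sum to more than 100%, the following minimal sketch computes LIWC-style percentages with a tiny, made-up lexicon. The names `TOY_CATEGORIES` and `category_percentages` are hypothetical, and the word lists are not taken from the real LIWC dictionary, which is far larger.

```python
import re

# Hypothetical toy lexicon for illustration; NOT the real LIWC dictionary.
TOY_CATEGORIES = {
    "posemo": {"happy", "love", "great"},
    "affect": {"happy", "love", "great", "sad"},  # posemo words also count as affect
    "family": {"mom", "dad"},
}


def category_percentages(text):
    """Return, per category, the percentage of tokens that belong to it."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return {name: 0.0 for name in TOY_CATEGORIES}
    return {
        name: 100.0 * sum(token in words for token in tokens) / len(tokens)
        for name, words in TOY_CATEGORIES.items()
    }


# Aggregated status text of one imaginary user.
scores = category_percentages("Happy mom love dad")
total = sum(scores.values())
```

Because "happy" and "love" are counted in both the positive-emotion and the affect category, the total over all categories exceeds 100 for this toy user.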

Table 1
Characteristics of training dataset

Characteristics
# Samples = # Users	108547
# Male	44844
# Female	63245
# Features	93
# Labels	5
Avg. Age	27

Labels	Mean	Standard deviation
Openness	3.8435	0.6759
Conscientiousness	3.4631	0.7358
Extraversion	3.5068	0.8135
Agreeableness	3.5659	0.7070
Neuroticism	2.7334	0.8003

Table 2

Correlation of trait labels in the training dataset

Traits	Ope	Con	Ext	Agr	Neu
Openness	1.00	0.03	0.13	0.04	−0.05
Conscientiousness	0.03	1.00	0.19	0.18	−0.30
Extraversion	0.13	0.19	1.00	0.17	−0.34
Agreeableness	0.04	0.18	0.17	1.00	−0.33
Neuroticism	−0.05	−0.30	−0.34	−0.33	1.00

In Table 2, we correlate the trait labels to check whether each trait can be considered individually. The correlations are calculated with the Pearson product-moment correlation, and all of them are significant ( $p < 0.05$ ). Furthermore, there are clear dependencies among the different personality traits. This indicates that multivariate regression could lead to promising results, as it takes dependencies between target variables into consideration.
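As a minimal sketch of the statistic behind Table 2, the Pearson product-moment correlation of two label vectors can be computed as follows. The function name `pearson` and the four toy users are assumptions made for illustration; the values are invented and are not taken from the training dataset.

```python
import math


def pearson(xs, ys):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up trait scores for four users, for illustration only.
toy_labels = {
    "Extraversion": [2.0, 3.0, 4.0, 5.0],
    "Neuroticism": [4.5, 3.5, 3.0, 1.5],
}

r = pearson(toy_labels["Extraversion"], toy_labels["Neuroticism"])
self_r = pearson(toy_labels["Extraversion"], toy_labels["Extraversion"])
```

A trait correlated with itself yields 1, which corresponds to the diagonal of Table 2, while the two toy traits are negatively correlated.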

All three datasets from MyPersonality contain an anonymized user ID that can only be used to match users between tables. We used the demographic details table to filter out users who had not set their locale or used a language other than English. Figure 1 shows the distribution of Openness trait scores in the training dataset. Figure 18 in the Appendix contains the distributions of the remaining scores for all personality labels.

Fig. 1.

Distribution of openness personality scores in the training dataset.

3.1.2. Crawled Facebook dataset

As the MyPersonality dataset has no data for brands’ Facebook pages, we decided to crawl our own dataset. Using the Facebook Graph API, we crawled status updates from 46 popular brands appearing in the Top 50 of Forbes’ The World’s Most Valuable Brands list. Four brands are missing in our dataset because they had no presence on Facebook. Altogether we collected 85347 status updates covering the whole lifetime of the brands’ Facebook pages until January 2018. Some brands have only very few status updates, so we decided to consider only brands with at least 1000 posts. This improves feature extraction with LIWC. Table 3 gives a more detailed view of the crawled dataset.

Table 3
Statistics for the crawled public posts for top brands from Facebook social platform

Characteristics
# Samples = #Brands	32
# Features	93
Avg. #Posts	2460

Brand	#Posts	Period	Brand	#Posts	Period
ESPN	4817	3199	CVS	2348	2697
Cisco	4596	3619	Home Depot	2247	3151
Accenture	3960	2913	Wells Fargo	2173	2700
Amazon	3912	3582	UPS	2004	2682
Mercedes Benz	3501	2320	Verizon	1753	2704
Toyota	3421	2970	Google	1706	3097
HP	3389	2923	Siemens	1595	1951
Disney	3292	3167	H&M	1538	3958
GE	3110	2475	Microsoft	1493	1911
Intel	3034	3473	SAP	1482	3705
Gucci	2942	2532	Audi	1470	1981
AT&T	2700	3515	IBM	1330	2268
Ford	2659	3331	Nescafe	1329	3107
Walmart	2548	3008	Frito-Lay	1266	2574
Oracle	2518	3073	L’Oreal	1193	2228
BMW	2394	1943	Pampers	1003	3199

We did not look further into comments on brand pages’ posts. The data found in the comments was noisy, with a lot of links to other Facebook profiles. This could be tied to the many posts made in the context of sweepstakes. Manual inspection of the remaining posts revealed a lot of spam.

3.1.3. Crawled Twitter dataset

Using the Twitter API, we crawled tweets for the same list of brands as for Facebook. Altogether we collected 103053 tweets. Table 4 provides statistics about the crawled Twitter pages.

Table 4
Statistics for the crawled public tweets for top brands from Twitter micro blogging platform

Characteristics
# Samples = #Brands	32
# Features	81
Avg. #Posts	3220

Brand	#Posts	Period	Brand	#Posts	Period
ESPN	3248	3199	CVS	3231	2697
Cisco	3224	3619	Home Depot	3234	3151
Accenture	3227	2913	Wells Fargo	3228	2700
Amazon	3204	3582	UPS	3208	2682
Mercedes Benz	3201	2320	Verizon	3224	2704
Toyota	3220	2970	Google	3201	3097
HP	3211	2923	Siemens	3224	1951
Disney	3248	3167	H&M	3224	3958
GE	3202	2475	Microsoft	3238	1911
Intel	3215	3473	SAP	3216	3705
Gucci	3216	2532	Audi	3201	1981
AT&T	3227	3515	IBM	3208	2268
Ford	3247	3331	Nescafe	3213	3107
Walmart	3198	3008	Frito-Lay	3242	2574
Oracle	3209	3073	L’Oreal	3235	2228
BMW	3221	1943	Pampers	3208	3199

3.2. Feature selection

As mentioned in Section 3.1, we use the combined datasets from MyPersonality [15] to train our models. LIWC features are word categories, and some categories are contained in other categories, which means they depend on each other. Appendix B includes statistics about the LIWC features in the training dataset.

Hence, it is reasonable to select a subset of relevant features for training. In general, we will build one predictor per personality trait (see Section 3.3). We will consider three different feature sets per model:

Own_Features:

These features are important to a specific personality trait. This means we will have five different feature sets, one for each trait.

Common_Features:

A feature set which contains important features for all personality traits. This set will be the same for all five traits.

Union_Features:

This is the union of the Own and Common feature sets.

We investigated two different approaches to select features for the mentioned feature sets: An approach based on Pearson’s correlation coefficient and an approach based on boosted decision trees’ feature significance. They are described in the following two sections.

We will refer to the used feature sets in the following way: X-S-M. X is the affected trait (e.g. O for Openness and C for Conscientiousness). This field is optional if the set is independent from the trait. S is one of the introduced feature sets. M defines the used method. P for Pearson and B for gradient boosting. As an example O-own-P describes the feature set own defined with the Pearson approach for the trait Openness.
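The naming scheme and the union set can be sketched in a few lines. The helper names `feature_set_name` and `union_features` are hypothetical, and the example set memberships are invented for illustration; only the X-S-M convention itself comes from the text.

```python
TRAITS = {
    "O": "Openness",
    "C": "Conscientiousness",
    "E": "Extraversion",
    "A": "Agreeableness",
    "N": "Neuroticism",
}


def feature_set_name(feature_set, method, trait=None):
    """Build an X-S-M label; the trait prefix is omitted for trait-independent sets."""
    if feature_set not in {"own", "common", "union"}:
        raise ValueError(f"unknown feature set: {feature_set}")
    if method not in {"P", "B"}:
        raise ValueError(f"unknown selection method: {method}")
    if trait is None:
        return f"{feature_set}-{method}"
    if trait not in TRAITS:
        raise ValueError(f"unknown trait: {trait}")
    return f"{trait}-{feature_set}-{method}"


def union_features(own, common):
    """Union of a trait-specific set and the common set."""
    return set(own) | set(common)


# Invented memberships, for illustration only.
toy_own = {"Apostro", "Article"}
toy_common = {"Article", "Tone"}
toy_union = union_features(toy_own, toy_common)

label_own = feature_set_name("own", "P", trait="O")
label_common = feature_set_name("common", "B")
```

Here `label_own` evaluates to `O-own-P`, matching the example in the text, and `label_common` to `common-B` because the common set does not depend on a trait.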

3.2.1. Pearson correlation

Fig. 2.

Pearson correlation coefficient heatmap between LIWC features and personality traits. Red = high coefficient.

We performed pairwise correlation analysis between all 93 features and the FFM personality scores using the Pearson product-moment correlation. This leads to $m = 5$ correlations with the same feature set. The more inferences are made, the more likely Type-I errors are to occur. To address this multiple comparison problem, we apply the Bonferroni correction to our global significance level of $α = 0.05$ to determine our local significance level: $α^{*} = \frac{α}{m} = \frac{0.05}{5} = 0.01$ .

Correlation results and significance levels between features and all five traits can be found in the heatmaps in Fig. 2 and Fig. 3. Unsurprisingly, not all features are correlated with the personality scores. Examples are the word categories Dash, QMark or Period. These punctuation marks are consistently rarely used in status updates and are hence poor discriminators. Features with overall high relative correlation coefficients are e.g. tone, negemo, netspeak and Apostro. Correlations with the trait Neuroticism are harder to find than with the other traits. Table 5 shows the correlations between the Openness personality trait and the extracted LIWC features. Appendix C contains all extracted LIWC features pairwise correlated with all personality dimensions.

Fig. 3.

Pearson correlation significance heatmap between LIWC features and personality traits. Red = low significance.

Table 5

LIWC features correlations with openness personality trait

LIWC Features	Coefficient	Significance
Apostro	0.088365	1.87E−199
Article	0.080803	5.22E−167
Insight	0.077391	2.49E−153
Sixltr	0.071569	2.30E−131
Death	0.057287	8.67E−85
Percept	0.05433	1.94E−76
Affiliation	−0.052079	2.23E−70
Drives	−0.052251	7.83E−71
Reward	−0.056552	1.13E−82
Time	−0.058103	3.59E−87
Tone	−0.069253	4.17E−123
Affect	−0.074599	1.40E−142
Family	−0.075725	7.19E−147
Posemo	−0.085454	1.28E−186
Informal	−0.086264	3.82E−190
Netspeak	−0.102049	1.19E−265

To obtain the features important to a trait (X-own-P feature sets), we only consider those features with a significant correlation ( $p < α^{*} = 0.01$ ) to the selected trait X. We reduce the number of features even further by only selecting features with high correlations ( $| r | > 0.05$ ). The common-P feature set was obtained by selecting features that are significantly correlated with all trait labels at $p < 0.01$ . Table 11 in Appendix D shows all resulting feature sets O-own-P, C-own-P, E-own-P, A-own-P, N-own-P and common-P.
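The selection rule for the X-own-P sets can be sketched as a simple filter over (feature, coefficient, significance) rows. The first three rows below are copied from Table 5; the two rows prefixed with `toy_` are invented to show both rejection cases, and the function name `select_own_features` is hypothetical.

```python
ALPHA = 0.05                        # global significance level
NUM_TRAITS = 5                      # m: one correlation per trait
ALPHA_LOCAL = ALPHA / NUM_TRAITS    # Bonferroni-corrected level (0.01)
MIN_ABS_R = 0.05                    # minimum absolute correlation

# (feature, r, p): first three rows from Table 5, last two rows made up.
candidates = [
    ("Apostro", 0.088365, 1.87e-199),
    ("Netspeak", -0.102049, 1.19e-265),
    ("Affiliation", -0.052079, 2.23e-70),
    ("toy_weak", 0.012, 1.0e-4),    # hypothetical: significant, but |r| too small
    ("toy_noise", 0.060, 0.20),     # hypothetical: |r| large enough, not significant
]


def select_own_features(rows, alpha_local=ALPHA_LOCAL, min_abs_r=MIN_ABS_R):
    """Keep features that are significant after correction and strongly correlated."""
    return [name for name, r, p in rows if p < alpha_local and abs(r) > min_abs_r]


selected = select_own_features(candidates)
```

Both conditions must hold at the same time, so a very small p-value alone, which is common with more than 100,000 samples, does not suffice for a feature to be selected.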

As Pearson coefficient can only measure linear correlations, we use an additional approach for selecting features: Feature importance of boosted decision trees.

3.2.2. Gradient boosting

Fig. 4.

Significance and relative importance of features for the Openness trait: the diagram contains all features with a relative importance higher than 0.011.

Another approach to obtain a well suited subset of features is to use the relative importance of features within a gradient boosted regression tree [13]. The idea behind this approach is to train a model on all available features. The resulting predictor will not be good at predicting values, but the model implicitly contains the importance of each feature for making a decision. We train one model per personality trait, each with all features as input, to get the own feature sets. There are two approaches to obtain the common set. The first option is an intersection of all own sets. The second is to boost one multivariate regression tree and extract the significant features in the same way as for the own sets. We used the former approach. Figure 4 shows the resulting importance graph of the Openness trait. The selected features can be found in Table 11 in the Appendix as well. They are called O-own-B, C-own-B, E-own-B, A-own-B, N-own-B and common-B.
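The step from per-trait importances to the own and common sets can be sketched as follows. The importance values are invented, only two traits are shown, and using 0.011 as the selection threshold is an assumption borrowed from the caption of Fig. 4; the function names are hypothetical.

```python
def own_features(importances, threshold):
    """Keep the features whose relative importance exceeds the threshold."""
    return {name for name, score in importances.items() if score > threshold}


def common_features(own_sets):
    """Intersect the per-trait own sets to obtain the common set."""
    sets = list(own_sets.values())
    return set.intersection(*sets) if sets else set()


# Made-up relative importances for two traits, for illustration only.
toy_importances = {
    "O": {"Apostro": 0.040, "Netspeak": 0.035, "Dash": 0.002},
    "N": {"Apostro": 0.015, "Netspeak": 0.050, "Tone": 0.030},
}
THRESHOLD = 0.011  # assumed cut-off, taken from the caption of Fig. 4

own = {
    trait: own_features(scores, THRESHOLD)
    for trait, scores in toy_importances.items()
}
common = common_features(own)
```

Since a feature must pass the threshold for every trait to survive the intersection, the common set shrinks quickly as more traits are added.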

All significantly correlated features are listed in Appendix C and all final selected features in Appendix D. Both approaches clearly select similar feature sets, e.g. the feature sets for the trait Openness share 12 out of 17 possible common features. The Pearson approach is more selective than the boosted approach. This is explainable by the fact that the Pearson approach only considers linear relationships between features and scores. Generally, the number of features was reduced to about $40\%$ . Due to the selection method used to find common features for all traits in the boosted approach, only 9 common features are found. Compared to the other feature sets, this is a small number and could lead to greater errors, as the resulting model becomes too general.

Extraversion and Neuroticism are hard to predict, because their feature sets (E-own-P and N-own-P) contain only a few significantly correlated features. The gradient boosting approach, however, finds many important features for these traits. This leads to the assumption that there could be non-linear relationships between the features and the personality traits Extraversion and Neuroticism.

3.3. Machine learning algorithms

As the Five-Factor Model’s personality scores are continuous values ranging from 1 to 5, predicting a user’s personality score is a regression task. Regression models approximate a mapping function from the feature vector to a continuous output variable. Based on our training data described in Section 3.1, we trained three different machine learning algorithms: support vector regression, boosted regression trees and neural nets. They are described in the following sections.

The FFM depicts an individual’s personality via five personality scores; therefore, we decided to train five models, one for each personality trait. Each algorithm is trained on three different feature sets, selected with two approaches for all five traits (see Section 3.2). This means we will train $3 \cdot 2 \cdot 5 = 30$ models per algorithm.
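The model count can be checked by enumerating all combinations. The short labels used below for feature sets, methods, traits and algorithms are only illustrative identifiers; in particular, `NN` is a label chosen here for the neural net.

```python
from itertools import product

FEATURE_SETS = ("own", "common", "union")
METHODS = ("P", "B")                    # Pearson, gradient boosting
TRAITS = ("O", "C", "E", "A", "N")
ALGORITHMS = ("SVR", "XGB", "NN")       # illustrative labels for the three algorithms

# One model per (feature set, selection method, trait) for a single algorithm.
configs_per_algorithm = list(product(FEATURE_SETS, METHODS, TRAITS))

# The same grid repeated for each of the three algorithms.
all_configs = list(product(ALGORITHMS, FEATURE_SETS, METHODS, TRAITS))
```

The first list has 30 entries and the second 90, which corresponds to 30 models per algorithm across three algorithms.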

All three selected algorithms require hyperparameters to be set before training. We perform a grid search over selected hyperparameters combined with 3-fold cross-validation for each model to find the best performing parameter combination for the feature set used. The root mean squared error (RMSE) is used to compare the grid search folds; this measure is described in Section 3.4. The dataset used to build the models has 108,547 samples. We used a random split to hold out $20\%$ of the data as testing data. Grid search and 3-fold cross-validation are done solely on the remaining training samples. The best performing parameter combination is used to train the final model on all training samples. All 90 models can then be compared by calculating their performance on the $20\%$ test samples (see Section 4).
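The protocol (hold-out split, grid search with 3-fold cross-validation on the training part only, refit on all training samples, final evaluation on the test part) can be sketched with the standard library alone. The one-feature ridge regression below is a toy stand-in for the real algorithms, and the synthetic data, the grid values and all function names are assumptions made for illustration.

```python
import math
import random


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def fit_ridge_1d(xs, ys, lam):
    """Toy stand-in model: one-feature ridge regression through the origin."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def cv_rmse(xs, ys, lam, k=3):
    """Mean validation RMSE of one hyperparameter value over k folds."""
    errors = []
    for fold in range(k):
        val = list(range(fold, len(xs), k))      # every k-th sample forms a fold
        val_set = set(val)
        train = [i for i in range(len(xs)) if i not in val_set]
        weight = fit_ridge_1d([xs[i] for i in train], [ys[i] for i in train], lam)
        errors.append(rmse([ys[i] for i in val], [weight * xs[i] for i in val]))
    return sum(errors) / k


# Synthetic data: y = 2x plus noise (illustration only).
rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(300)]
ys = [2.0 * x + rng.gauss(0.0, 0.1) for x in xs]

# Random 80/20 split into training and test samples.
indices = list(range(len(xs)))
rng.shuffle(indices)
cut = len(indices) * 8 // 10
x_train = [xs[i] for i in indices[:cut]]
y_train = [ys[i] for i in indices[:cut]]
x_test = [xs[i] for i in indices[cut:]]
y_test = [ys[i] for i in indices[cut:]]

# Grid search uses the training samples only.
grid = [0.01, 10.0, 1000.0]
best_lam = min(grid, key=lambda lam: cv_rmse(x_train, y_train, lam))

# Refit on all training samples, then evaluate once on the held-out test set.
final_weight = fit_ridge_1d(x_train, y_train, best_lam)
test_rmse = rmse(y_test, [final_weight * x for x in x_test])
```

Keeping the test samples out of the grid search ensures that the reported test error is not influenced by the hyperparameter choice.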

3.3.1. Support vector regression

Support Vector Machines can also be used as a regression method (support vector regression, SVR). In doing so, the main features of Support Vector Machines are maintained.

SVRs, like other Support Vector Machines, allow using kernels to transform data into a higher dimensional feature space for non-linear base data. As experiments with a linear kernel showed bad results for our data, we decided to use a Gaussian Radial Basis Function (rbf) kernel. It projects the input vectors into an infinite dimensional vector space and is defined as [26]: $\begin{aligned} K_{RBF} (x, x^{'}) &= ⟨ ϕ (x), ϕ (x^{'}) ⟩ && (1) \\ &= \exp \left( - \frac{‖ x - x^{'} ‖^{2}}{2 σ^{2}} \right) && (2) \end{aligned}$

It measures the similarity of two feature vectors $x$ and $x^{'}$ in the input space. The Kernel $K_{RBF} (x, x^{'})$ is large if the Euclidean distance between the two feature vectors $‖ x - x^{'} ‖$ is small. The rbf-kernel has one free parameter σ. Together with the regularization parameter C of SVR two hyperparameters can thus be adjusted in the training phase.
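Equation (2) can be evaluated directly, as in the following minimal sketch. The function name `rbf_kernel` and the example vectors are chosen for illustration. Note that some libraries parametrize the same kernel with $γ = \frac{1}{2 σ^{2}}$ instead of σ.

```python
import math


def rbf_kernel(x, x_prime, sigma):
    """Gaussian RBF kernel: exp(-||x - x'||^2 / (2 * sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, x_prime))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


same = rbf_kernel([1.0, 2.0], [1.0, 2.0], sigma=1.0)   # identical vectors
near = rbf_kernel([1.0, 2.0], [1.0, 2.5], sigma=1.0)   # small Euclidean distance
far = rbf_kernel([1.0, 2.0], [4.0, 6.0], sigma=1.0)    # large Euclidean distance
```

Identical vectors give a kernel value of exactly 1, and the value decays towards 0 as the distance grows; a larger σ makes this decay slower.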

Rbf-kernels have a high computational complexity and do not scale well with the number of training samples used. With our training set of roughly 100000 samples, training a single model already took multiple hours. We suggest using an approximation of the rbf-kernel for future training runs with even more data.

3.3.2. Gradient boosting

The same approach we use in Section 3.2.2 for feature selection can be used to learn a model based on regression trees. We used the selected feature sets as input and trained the model. This approach results in a stair-like function. Each leaf has a scalar as output. As the tree has hard decision boundaries, the output can take only a fixed number of values.
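The stair-like shape can be seen in a hand-written boosting loop with depth-one trees (stumps) on one feature. This is a simplified sketch under squared loss, not the boosting library or configuration used for the experiments; the data, the number of rounds and the learning rate are made up.

```python
def fit_stump(xs, residuals):
    """Find the threshold split that minimises the squared error of the residuals."""
    best = None
    for threshold in sorted(set(xs))[:-1]:      # skip the maximum so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def boost(xs, ys, rounds=20, learning_rate=0.5):
    """Fit stumps one after another, each on the residuals of the current ensemble."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(xs)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_value, right_value = fit_stump(xs, residuals)
        stumps.append((threshold, left_value, right_value))
        predictions = [
            p + learning_rate * (left_value if x <= threshold else right_value)
            for x, p in zip(xs, predictions)
        ]
    return base, stumps, learning_rate


def predict(model, x):
    """Sum the scaled leaf values of all stumps on top of the base prediction."""
    base, stumps, learning_rate = model
    return base + learning_rate * sum(
        left if x <= threshold else right for threshold, left, right in stumps
    )


# Made-up one-dimensional data, for illustration only.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 1.2, 2.9, 3.1, 4.8, 5.0]
model = boost(xs, ys)
```

Because every split threshold coincides with a training value, all inputs between two neighbouring training points, for instance 2.2 and 2.8, receive exactly the same prediction.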

3.3.3. Neural nets

We applied a feed-forward Neural Network. This means the output of every perceptron is connected as input to every perceptron of the next layer. We deploy four hidden layers with 1024, 512, 256 and 128 perceptrons as presented in Fig. 9.

Right after the input layer there is a layer to normalize the input variables. This normalization is also learned during the training of the Neural Net. There are two more layers, one in the beginning and one right before the output. These layers are dropout layers. They will cut a specific rate of connections between the perceptrons during the training. This can prevent overfitting and supports generalization of the model. The rate is a hyperparameter and gets optimized together with other parameters.

3.4. Quality measures

There are many metrics to estimate the skill of a regression model as an error in its predictions. We evaluate our regression models based on the popular root mean squared error (RMSE) measure, which quantifies the difference between the values predicted by the model and the observed ones. It is defined by the following formula: $RMSE = \sqrt{\frac{\sum_{t = 1}^{n} {(y_{t, act} - y_{t, pred})}^{2}}{n}} \qquad (3)$

With a sample size of n, we identify instances by their number $t = 1, \dots, n$ . $y_{t, act}$ stands for the actual observed values and $y_{t, pred}$ stands for the values predicted by the model. $RMSE$ ranges from 0 to ∞ and lower values imply a lower error rate and therefore a better model.
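Equation (3) translates directly into code. The sketch below also evaluates a predictor that always outputs the mean score; the five toy scores are invented, and the function name `rmse` is chosen for illustration.

```python
import math
import statistics


def rmse(actual, predicted):
    """Root mean squared error as defined in Equation (3)."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


# Made-up trait scores in the range [1, 5], for illustration only.
toy_scores = [3.0, 4.5, 2.5, 4.0, 3.5]

# A constant predictor that always outputs the mean score.
mean_score = statistics.fmean(toy_scores)
mean_predictor_rmse = rmse(toy_scores, [mean_score] * len(toy_scores))
```

When the mean is computed on the same samples it is evaluated on, the RMSE of such a constant predictor equals the population standard deviation of the scores, which gives a sense of scale for the error values.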

4. Results

In this section, we demonstrate the performance of the models of the three applied algorithms on various feature sets. We use the mean trait scores as the baseline predictor. At the end of this section, we compare the best performing models of each algorithm type with each other.

4.1. Support vector regression

Figure 5 shows the results of the SVR models trained on the feature sets selected by Pearson correlation. The blue bars indicate the mean baseline performance for all traits. The baseline error for the traits Neuroticism and Extraversion is considerably higher than for the other traits. This indicates that predicting these personality scores correctly is harder, which was already observed during feature selection in Section 3.2.

Fig. 5.

Errors of different Pearson feature sets with SVR.

The SVR models trained on trait-specific features (own) and on common features (common) clearly outperform the baseline predictor for all personality traits. The combined feature set (union), although it uses the same features as own and common together, does not achieve good results. Models trained with the union feature set could not beat the mean baseline for the traits Neuroticism, Openness, Extraversion and Conscientiousness.

For the feature sets selected by Pearson correlation, the best performing models use the common-P feature set. It consists of more features than the trait-specific sets and therefore presumably represents the original feature space better.

Fig. 6.

Errors of different boosted feature sets with SVR.

Figure 6 displays the (RMSE) of the models trained on the feature sets selected by gradient boosting. Similar to the previous models, the models trained on X-own-B and common-B clearly outperform the mean baseline. The models trained on union features have a considerably higher (RMSE) than the other models, except for the model for the trait Agreeableness.

In contrast to the Pearson feature sets, the own features perform best across all traits on the test data set. This is likely due to the small number of features in the common-B feature set: it consists of only 9 features, which are too few to successfully represent the original feature space.

Table 6

Comparison of (RMSE)s of best feature sets on Pearson and boosted with SVR. Bold values indicate lowest error for this trait

Trait (X)	Baseline	common-P	X-own-B
Neuroticism	0.7963	0.7696	0.7689
Openness	0.6779	0.6402	0.6382
Extraversion	0.8079	0.7686	0.7672
Conscientiousness	0.7381	0.6991	0.7003
Agreeableness	0.7051	0.6774	0.6786

In Table 6, we compare the best feature sets of both selection approaches to the baseline. The best performing models of the Pearson correlation feature selection approach used the common-P feature set, whereas the best performing models of the gradient boosting approach used the trait-specific (X-own-B) feature sets.

Both approaches produce similar results regarding (RMSE), and both outperform the baseline for all traits. For the traits Neuroticism, Openness and Extraversion, the models trained on X-own-B perform slightly better than the models trained on common-P. Conscientiousness and Agreeableness are better predicted by the models trained on common-P. Neither approach performs significantly better than the other.
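The per-trait comparison can be reproduced from the values in Table 6 with a short Python sketch. The dictionary layout and variable names are illustrative assumptions; the numbers are copied from the table.

```python
# RMSE values copied from Table 6.
BASELINE = {
    "Neuroticism": 0.7963, "Openness": 0.6779, "Extraversion": 0.8079,
    "Conscientiousness": 0.7381, "Agreeableness": 0.7051,
}
TABLE_6 = {
    "Neuroticism":       {"common-P": 0.7696, "X-own-B": 0.7689},
    "Openness":          {"common-P": 0.6402, "X-own-B": 0.6382},
    "Extraversion":      {"common-P": 0.7686, "X-own-B": 0.7672},
    "Conscientiousness": {"common-P": 0.6991, "X-own-B": 0.7003},
    "Agreeableness":     {"common-P": 0.6774, "X-own-B": 0.6786},
}

# Feature set with the lowest error for each trait.
best_set = {trait: min(scores, key=scores.get) for trait, scores in TABLE_6.items()}

# Check that every model beats the mean baseline for its trait.
all_beat_baseline = all(
    rmse < BASELINE[trait]
    for trait, scores in TABLE_6.items()
    for rmse in scores.values()
)

# Largest absolute RMSE difference between the two feature sets.
largest_gap = max(abs(s["common-P"] - s["X-own-B"]) for s in TABLE_6.values())

for trait, name in best_set.items():
    print(f"{trait}: {name}")
print(all_beat_baseline, round(largest_gap, 4))
```

The largest difference between the two feature sets is about 0.002 (Openness), which is small compared with the gap of each model to the baseline.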

4.2. Gradient boosting

Figure 7 shows the results of the (XGB) models trained on the feature sets selected by Pearson correlation. All our models outperform the mean baseline and have very similar results. The models trained on X-own-P have smaller errors than all other models for their traits Openness, Agreeableness and Conscientiousness. Only for the traits Neuroticism and Extraversion they could not beat the models trained on common-P and X-own-P. On average the models trained on X-own-P have the lowest errors.

Fig. 7.

Errors of different Pearson feature sets with (XGB).

For the feature sets selected by gradient boosting, there is more variance in the resulting (RMSE). Figure 8 shows the results of the different models. The feature set common-B performs the worst for all traits compared to the other boosted feature sets. Models trained on X-union-B clearly outperform all other models for their trait.

Fig. 8.

Errors of different boosted feature sets with (XGB).

The best feature sets of the two feature selection approaches are compared in Table 7. The models trained on the X-union-B feature sets are slightly better than the models trained on X-own-P for all traits. In contrast to the SVR models, the difference between the two selection approaches is more pronounced.

Table 7

Comparison of (RMSE)s of best feature sets on Pearson and boosted with (XGB). Bold values indicate lowest error for this trait

Trait (X)	Baseline	X-own-P	X-union-B
Neuroticism	0.7963	0.7890	0.7635
Openness	0.6779	0.6416	0.6309
Extraversion	0.8079	0.7845	0.7655
Conscientiousness	0.7381	0.7045	0.6934
Agreeableness	0.7051	0.6864	0.6823

4.3. Neural networks

As the results of the SVR approach had shown that the union feature set of both selection methods is not an optimal feature set, and the results from (XGB) were not yet available, we reduced the number of models for the Neural Nets. Figure 9 shows the feed-forward Neural Network architecture used.

For instance, we successfully trained an Agreeableness model on the own sets of the Pearson and boosting selections. The error of the boosting-based model is 0.7069 and therefore slightly worse than the baseline prediction. The model based on the Pearson set achieves 0.6927 and is thus better than the baseline. However, these results do not allow any generalization.

Fig. 9.

The utilized neural net architecture without normalization and dropout layers.

Fig. 10.

Loss function of the Neural Net based on A-own-B starting with epoch 100.

More interesting in this case is the loss function of the trained models, as it gives some insight into the suitability of the designed models for the given task. Figure 10 shows the loss function of the Agreeableness model based on the A-own-B features selected by boosting trees, while Fig. 11 shows the loss function for the same personality trait but for the model with features selected by Pearson correlation. First of all, the training loss has very little noise, which suggests that the chosen batch size of 50000 samples is large enough. Comparing the test loss to the training loss, we can see that the model generalizes well on the data and does not overfit. The loss functions of the other trained models are quite similar to the ones shown here.

Fig. 11.

Loss function of the Neural Net based on A-own-P starting with initial epoch.

4.4. Comparison

As Section 4.2 points out, the gradient boosting models achieved the best results when trained on the combined feature set selected by the boosting approach (X-union-B). The SVR models performed well on both the feature set selected by Pearson correlation (common-P) and the trait-specific feature set selected by the boosting approach (X-own-B), but performed poorly on the combined feature set (union). For SVR we chose X-own-B for the comparison.

Table 8 compares the (RMSE) of the best performing models of each algorithm. Only the Neural Net model trained for the trait Agreeableness cannot beat the mean baseline. All other models outperform the baseline. The SVR and (XGB) models perform very similarly although they use different feature sets.

Table 8
Comparison of (RMSE) values of the SVR and NN models trained on X-own-B features and the (XGB) models trained on X-union-B features. Bold values indicate the lowest error for the associated trait

Trait	Baseline	SVR	Boosting	NN
Neuroticism	0.7963	0.7689	0.7635	0.7611
Openness	0.6779	0.6382	0.6309	0.6525
Extraversion	0.8079	0.7672	0.7655	0.7877
Conscientiousness	0.7381	0.7003	0.6934	0.7279
Agreeableness	0.7051	0.6786	0.6823	0.7069

Overall, the (XGB) models trained on X-union-B perform the best. The greatest improvement compared to the baseline is achieved for the trait Openness with about 6.93%. The overall improvement of the (XGB) models is about 5.12%.
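The reported percentages can be recomputed from Table 8 with the sketch below. It assumes that the improvement is the relative RMSE reduction with respect to the baseline and that the overall figure is the unweighted mean over the five traits; under these assumptions the values in the text are reproduced. The variable names are illustrative.

```python
# Baseline and (XGB) RMSE values copied from Table 8.
BASELINE = {
    "Neuroticism": 0.7963, "Openness": 0.6779, "Extraversion": 0.8079,
    "Conscientiousness": 0.7381, "Agreeableness": 0.7051,
}
XGB = {
    "Neuroticism": 0.7635, "Openness": 0.6309, "Extraversion": 0.7655,
    "Conscientiousness": 0.6934, "Agreeableness": 0.6823,
}

# Relative RMSE reduction in percent for each trait.
improvement = {
    trait: 100.0 * (BASELINE[trait] - XGB[trait]) / BASELINE[trait]
    for trait in BASELINE
}

# Assumption: overall improvement = unweighted mean of the per-trait values.
mean_improvement = sum(improvement.values()) / len(improvement)
best_trait = max(improvement, key=improvement.get)

for trait, value in improvement.items():
    print(f"{trait}: {value:.2f}%")
print(f"mean: {mean_improvement:.2f}% (largest gain: {best_trait})")
```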

5. Evaluation

We used the Facebook and Twitter brand pages described earlier (in Section 3.1) to predict the (FFM) traits with the proposed SVR model. To check whether our general personality prediction is accurate on brand data, we used the API from ApplyMagicSauce [25] to predict the traits on the same data. ApplyMagicSauce is a research project from the University of Cambridge that uses not only datasets like the MyPersonality project, but also questionnaires, tweets, browsing data and open text to identify different psychological parameters. The (FFM) scores are given as percentiles relative to the average of each trait in the whole dataset.

Fig. 12.

Personality scores prediction for brand called (CVS Health) at ApplyMagicSauce API versus the proposed SVR prediction model using their public Facebook posts.

As the radar diagrams in Fig. 12 show, our model is able to detect the five personality traits of brand pages on Facebook, and it showed significant improvements in detecting specific personality traits compared with the other approach by extracting and engineering several textual features from publicly available social fingerprints. Figures 13 and 14 show the personality scores that our proposed models predict for the same brand page from its public posts and public tweets, and illustrate how the feature selection approach used in the training phase (Pearson versus boosting trees) can affect the final predicted results.

Fig. 13.

Personality prediction for brand (CVS Health) using the proposed SVR model with features extracted by Pearson-Correlation. Prediction is made based on Facebook public posts (Blue) VS Twitter public Tweets (Orange).

Fig. 14.

Personality prediction for brand (CVS Health) using the proposed SVR model with features defined by Gradient Boosted Regression Tree. Prediction is made based on Facebook public posts (Blue) VS Twitter public Tweets (Orange).

We evaluate how well supervised language models trained on Facebook users can detect personality traits of Twitter users. The results show that Facebook users tend to use more words from psycho-linguistic emotion categories than Twitter users, which leads to better personality prediction on the Facebook platform. The results are comparable to the state-of-the-art language models provided by [14], who conclude that Facebook users prefer to use the Facebook platform for posting content about their personal relationships and personal concerns, whereas Twitter users tend to use the Twitter microblogging platform for posting about their psychological needs and drives.

To this extent, the lack of restrictions on post length on the Facebook platform can be considered a major factor in Facebook’s superiority over Twitter in predicting personality dimensions. Figures 15, 16 and 17 show how brand personalities vary when well-established language models are used to predict brand personality traits from published Facebook posts and Twitter tweets.

Fig. 15.

Personality prediction for brand (SAP) using the proposed SVR model with features defined by Gradient Boosted Regression Tree and Pearson. Prediction is made based on Facebook public posts VS Twitter public Tweets.

Fig. 16.

Personality prediction for brand (General Electric) using the proposed SVR model with features defined by Gradient Boosted Regression Tree and Pearson. Prediction is made based on Facebook public posts VS Twitter public Tweets.

Fig. 17.

Personality prediction for brand (Cisco) using the proposed SVR model with features defined by Gradient Boosted Regression Tree and Pearson. Prediction is made based on Facebook public posts vs Twitter public Tweets.

6. Future work

The evaluation of brand personality in the online space is not an easy task. Creating a gold standard for brand personalities by conducting interviews and questionnaires with employees as well as marketing and enterprise managers would considerably advance the research. A point worth investigating is whether the followers’ personality matches the brand personality. A brand could take advantage of this and apply reverse psychology in marketing campaigns to attract similar or even completely contrary personality types. This knowledge would also greatly help public relations to identify the target audience of a brand across various social media networks. The same analysis is conceivable for employee personalities in comparison to the brand and could support human resources in a company or help new applicants to find an appropriate job position.

7. Conclusion

This paper aims to predict brand personality from online social fingerprints with machine learning algorithms trained on labeled data from users’ self-reported personality tests, on both the Facebook and Twitter platforms. It uses two different approaches to select feature sets and evaluates three different types of machine learning algorithms. The final model is able to properly distinguish between the personality dimensions of Facebook and Twitter pages by investigating a wide set of combinations of the extracted features with state-of-the-art machine learning algorithms. In terms of the implications for the machine learning domain, our experiments suggest that the source of the language samples can greatly affect the ability to capture users’ personality. In general, language models trained on Facebook data to predict personality dimensions can clearly be transferred to the Twitter platform, but not vice versa.

Footnotes

Distribution of the Big Five personality label scores in the training dataset

Exploratory feature analysis on the training dataset

Table 9

Statistics for LIWC features extracted from the MyPersonality dataset

LIWC features	Count	Mean	std	Min	25%	50%	75%	Max	IQR
Analytic	115822	59.421852	19.351168	1	45.9	59.64	73.52	99	27.62
Clout	115822	55.601547	17.641694	1	44.18	54.71	67.12	99	22.94
Authentic	115822	50.215093	26.728161	1	31.29	54.62	71.13	99	39.84
Tone	115822	67.301046	28.058909	1	46.3925	73.76	92.98	99	46.5875
WPS	115822	14.907625	23.822601	0.5	9	11.5	15.55	3834	6.55
Sixltr	115822	13.995215	4.521729	0	11.65	13.59	15.74	100	4.09
Dic	115822	76.968739	17.780135	0	75.93	83.12	86.94	100	11.01
Function.	115822	42.557772	11.886974	0	39.89	46.11	49.9	100	10.01
Pronoun	115822	13.131699	4.8014	0	10.7	13.65	16.15	100	5.45
Ppron	115822	8.991791	3.654331	0	6.93	9.16	11.22	100	4.29
I	115822	5.113042	2.74385	0	3.3	5.08	6.79	100	3.49
We	115822	0.559721	0.689054	0	0.17	0.44	0.75	50	0.58
You	115822	2.294037	1.749893	0	1.16	2.02	3.11	50	1.95
SheHe	115822	0.648692	0.780837	0	0.16	0.49	0.9	60	0.74
They	115822	0.378993	0.442825	0	0.09	0.31	0.53	18.18	0.44
Ipron	115822	4.13145	2.048509	0	3.21	4.3	5.18	100	1.97
Article	115822	4.796804	1.969031	0	3.82	4.95	5.8975	37.5	2.0775
Prep	115822	9.976873	3.477478	0	8.71	10.68	12.04	66.67	3.33
Auxverb	115822	7.502806	2.830535	0	6.41	7.98	9.14	50	2.73
Adverb	115822	4.581186	2.109951	0	3.67	4.8	5.71	50	2.04
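The IQR column of Table 9 is the difference between the 75% and 25% quantiles. The short sketch below recomputes it for four rows; the quartile values are copied from the table, while the dictionary name and the rounding to four decimals are choices made for this example.

```python
# (25% quantile, 75% quantile) pairs copied from Table 9.
QUARTILES = {
    "Analytic": (45.9, 73.52),
    "Clout": (44.18, 67.12),
    "Tone": (46.3925, 92.98),
    "WPS": (9.0, 15.55),
}

# Interquartile range: spread of the middle 50% of the values.
iqr = {name: round(q3 - q1, 4) for name, (q1, q3) in QUARTILES.items()}

for name, value in iqr.items():
    print(f"{name}: IQR = {value}")
```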

Pairwise correlation between features and personality traits

Table 10

Features that significantly correlate with all trait labels ( $P < 0.01$ ) via Pearson product-moment correlation

Features	Openness	Conscientiousness	Extraversion	Agreeableness	Neuroticism
WC	0.068494	0.009237	0.074461	0.008496	0.052487
Clout	−0.049436	0.103304	0.049392	0.073409	−0.049931
Tone	−0.069253	0.172775	0.135431	0.18298	−0.092732
Sixltr	0.071569	0.014526	−0.061646	−0.015693	−0.013553
Ppron	0.014514	−0.013042	0.046791	0.023391	0.041792
You	−0.023257	0.015831	0.04325	0.024644	0.009767
Article	0.080803	0.060756	−0.017511	0.034252	−0.042799
Auxverb	0.046716	0.023749	−0.007983	0.044148	0.022498
Conj	0.042369	0.04841	0.019684	0.056052	0.01743
Negate	0.013816	−0.023289	−0.016496	−0.026628	0.041096
Affect	−0.074599	0.014773	0.050155	0.046865	0.018472
Posemo	−0.085454	0.079127	0.074458	0.101012	−0.028261
Anger	0.014434	−0.107697	−0.017511	−0.113296	0.047073
Friend	−0.027461	0.019418	0.033898	0.028197	−0.008801
Female	−0.027568	0.027472	0.036498	0.014373	0.032973
Male	−0.025863	0.019266	0.021924	0.013261	−0.02197
Cogproc	0.049599	0.020891	−0.027528	0.034595	0.032331
Percept	0.05433	−0.033162	−0.008349	0.020174	0.017156
Bio	0.013998	−0.031997	0.032891	−0.017096	0.029259
Body	0.02723	−0.061689	0.009341	−0.038108	0.030724
Sexual	0.01805	−0.083953	0.014281	−0.075323	0.017293
Drives	−0.052251	0.091857	0.049295	0.070159	−0.0463
Affiliation	−0.052079	0.061489	0.055885	0.062503	−0.02759
Achieve	−0.025544	0.062448	0.009735	0.033299	−0.039728
Reward	−0.056552	0.058267	0.043904	0.048076	−0.036966
Focuspresent	0.010417	0.025491	0.012839	0.036017	0.023151
Relativ	−0.023329	0.091913	0.026936	0.071985	−0.034793
Motion	−0.019126	0.048417	0.028863	0.045586	−0.024184
Space	0.034205	0.050738	0.017932	0.034359	−0.031808
Time	−0.058103	0.092945	0.019274	0.073315	−0.024624
Work	−0.009032	0.056185	−0.035491	0.025133	−0.028369
Leisure	−0.014062	0.014116	0.026138	0.028533	−0.046717
Death	0.057287	−0.051633	−0.044626	−0.049209	0.020564
Informal	−0.086264	−0.079306	0.073246	−0.031772	0.008127
Swear	0.008354	−0.10041	0.018342	−0.106396	0.021728
Apostro	0.088365	−0.038703	−0.054195	−0.013173	0.048340
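The coefficients in Table 10 are Pearson product-moment correlations between a feature and a trait score. A minimal pure-Python sketch of the coefficient is given below; the function name and the sample vectors are hypothetical, and the significance test ($P < 0.01$) used for the selection is not part of this sketch.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical feature values and trait scores, for illustration only.
feature = [1.0, 2.0, 3.0, 4.0, 5.0]
trait = [2.0, 1.0, 4.0, 3.0, 5.0]
r = pearson_r(feature, trait)
print(f"r = {r:.2f}")
```

The coefficient lies between -1 and 1; values close to 0, such as most entries of Table 10, indicate a weak linear relationship even when they are statistically significant on a large sample.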

Final feature sets extracted by two approaches: Pearson and gradient boosting

Table 11

Feature sets selected by Pearson correlation coefficient and gradient boosting feature importances. The bold printed features for the boosting approach are common features (feature set common-B). There are 9 features in this set

Set	Count	Features
Pearson
O-own-P	17	Apostro, Sixltr, Tone, WC, affect, affiliation, article, death, drives, family, informal, insight, netspeak, percept, posemo, reward, time
C-own-P	30	Clout, Dic, Tone, achieve, affiliation, anger, article, body, death, drives, family, focusfuture, function., i, informal, negemo, netspeak, posemo, prep, quant, relativ, relig, reward, sexual, social, space, swear, time, we, work
E-own-P	9	Apostro, Sixltr, Tone, WC, affect, affiliation, informal, netspeak, posemo
A-own-P	19	Authentic, Clout, Dic, Tone, affiliation, anger, conj, drives, focusfuture, function., negemo, posemo, prep, relativ, sexual, social, swear, time, we
N-own-P	5	Analytic, Tone, WC, i, negemo
common-P	37	Analytic, Apostro, Clout, Sixltr, Tone, WC, achieve, affect, affiliation, anger, article, auxverb, bio, body, cogproc, conj, death, drives, female, focuspresent, friend, informal, leisure, male, motion, negate, percept, posemo, ppron, relativ, reward, sexual, space, swear, time, work, you
Boosting
O-own-B	39	AllPunc, Apostro, Clout, Comma, Dic, Exclam , OtherP, Period , QMark, Sixltr , Tone , WPS , article, assent, certain, conj, death , drives, family , focusfuture, focuspast, focuspresent, home, i, informal, insight, negate, negemo , netspeak, number, power, relig, reward, sad, sexual, social, space, time, you
C-own-B	38	AllPunc, Colon, Comma, Dic, Exclam , OtherP, Period , QMark, Sixltr , Tone , WPS , adverb, anger, article, assent, conj, death , drives, family , function., hear, i, informal, ipron, leisure, money, motion, negemo , prep, quant, relativ, sexual, swear, tentat, they, time, we, work
E-own-B	41	AllPunc, Apostro, Clout, Colon, Comma, Exclam , OtherP, Parenth, Period , Sixltr , Tone , WPS , adverb, affiliation, bio, certain, conj, death , discrep, drives, family , female, focusfuture, focuspast, friend, function., home, informal, leisure, motion, negemo , netspeak, nonflu, sad, sexual, social, space, tentat, they, we, work
A-own-B	35	Analytic, Apostro, Colon, Dic, Exclam , Period , QMark, Sixltr , Tone , WPS , affiliation, anger, article, bio, death , family , female, focuspast, function., home, ingest, ipron, male, money, motion, negate, negemo , number, power, relativ, relig, sexual, swear, they, time
N-own-B	38	Apostro, Clout, Exclam , OtherP, Period , QMark, Sixltr , Tone , WPS , anx, article, assent, compare, death , discrep, drives, family , female, focusfuture, health, home, informal, ingest, interrog, ipron, leisure, male, motion, negemo , number, reward, sad, see, shehe, space, swear, verb, you

References

1. Aaker, Dimensions of brand personality, Journal of Marketing Research 34 (1997), 347–356. doi:10.1177/002224379703400304.

2. Bai, Yuan, Hao and Zhu, Predicting personality traits of microblog users, Web Intelligence and Agent Systems: An International Journal 12(3) (2014), 249–265. doi:10.3233/WIA-140295.

3. Balconi, Siri, Meucci, Pezzoli and Angioletti, Personality traits and cortical activity affect gambling behavior in Parkinson’s disease, Journal of Parkinson’s Disease (2018), 1–12, Preprint.

4. Bin Tareaf, Berger, Hennig, Jung and Meinel, Identifying audience attributes: Predicting age, gender and personality for enhanced article writing, in: Proceedings of the 2017 International Conference on Cloud and Big Data Computing, ACM, 2017, pp. 79–88.

5. Bin Tareaf, Berger, Hennig and Meinel, Malicious behaviour identification in online social networks, in: Book IFIP International Conference on Distributed Applications and Interoperable Systems, Springer, 2018, pp. 18–25.

6. Boyd and Pennebaker, Language-based personality: A new approach to personality in a digital world, Current Opinion in Behavioral Sciences 18 (2017), 63–68. doi:10.1016/j.cobeha.2017.07.017.

7. Costa and McCrae, The revised NEO personality inventory (NEO-PI-R), The SAGE handbook of personality theory and assessment 2.2, 2008, 179–198.

8. Digman, Personality structure: Emergence of the five-factor model, Annual Review of Psychology 41(1) (1990), 417–440. doi:10.1146/annurev.ps.41.020190.002221.

9. D’Iorio, Garramone, Piscopo, Baiano, Raimo and Santangelo, Meta-analysis of personality traits in Alzheimer’s disease: A comparison with healthy subjects, Journal of Alzheimer’s Disease 62(2) (2018), 773–787. doi:10.3233/JAD-170901.

10. Farnadi, Sitaraman, Sushmita, Celli, Kosinski, Stillwell, Davalos and Moens, Computational personality recognition in social media, in: User modeling and user-adapted interaction 26.2-3, 2016, pp. 109–142.

11. Geuens, Weijters and De Wulf, A new measure of brand personality, International Journal of Research in Marketing 26(2) (2009), 97–107. doi:10.1016/j.ijresmar.2008.12.002.

12. Hall and Caton, Am I who I say I am? Unobtrusive self-representation and personality recognition on Facebook, PloS one 12(9) (2017), e0184417. doi:10.1371/journal.pone.0184417.

13. Hastie, Tibshirani and Friedman, Unsupervised learning, in: The Elements of Statistical Learning, Springer, New York, NY, 2009, pp. 485–585. doi:10.1007/978-0-387-84858-7_14.

14. Jaidka, S.C. Guntuku and L.H. Ungar, Facebook versus Twitter: Differences in self disclosure and trait prediction, in: Twelfth International AAAI Conference on Web and Social Media, 2018.

15. Kosinski, Matz, Gosling, Popov and Stillwell, Facebook as a research tool for the social sciences: Opportunities, challenges, ethical considerations, and practical guidelines, American Psychologist 70(6) (2015), 543. doi:10.1037/a0039210.

16. Kosinski, Stillwell and Graepel, Private traits and attributes are predictable from digital records of human behavior, in: Proceedings of the National Academy of Sciences, Vol. 110, 2013.

17. Lambiotte and Kosinski, Tracking the digital footprints of personality, Proceedings of the IEEE 102(12) (2014), 1934–1939. doi:10.1109/JPROC.2014.2359054.

18. J.W. Lounsbury, N.A. Foster, J.J. Levy and L.W. Gibson, Key personality traits of sales managers, IOSPress, Work 48(2) (2014), 239–253. doi:10.3233/WOR-131615.

19. Mulyanegara, Tsarenko and Anderson, The Big Five and brand personality: Investigating the impact of consumer personality on preferences towards particular brand personality, Journal of Brand Management 16 (2009), 234–247. doi:10.1057/palgrave.bm.2550093.

20. Nie, Guan, Hao, Bai and Zhu, Predicting Personality on Social Media with Semi-Supervised Learning, Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT), 2014 IEEE/WIC/ACM International Joint, Vol. 2, IEEE, 2014.

21. Ozer and Benet-Martinez, Personality and the prediction of consequential outcomes, Annu. Rev. Psychol. 57 (2006), 401–421. doi:10.1146/annurev.psych.57.102904.190127.

22. Pennebaker, Francis and Booth, Linguistic inquiry and word count: LIWC 2001, in: Mahway: Lawrence Erlbaum Associates Journal, Vol. 71, 2001.

23. B.T. Raad, Philipp, Patrick and Christoph, ASEDS: Towards automatic social emotion detection system using Facebook reactions, in: 2018 IEEE 20th International Conference on High Performance Computing and Communications; IEEE 16th International Conference on Smart City; IEEE 4th International Conference on Data Science and Systems (HPCC/SmartCity/DSS), IEEE, 2018, pp. 860–866.

24. Stillwell and Kosinski, myPersonality Project website, 2016-11-25, http://mypersonality.org/.

25. University Of Cambridge, ApplyMagicSauce, 2018-03-01, available at: https://applymagicsauce.com/demo.html.

26. Vert, Tsuda and Scholkopf, A primer on kernel methods, 2004-07-27, available at: http://members.cbio.mines-paristech.fr/~jvert/publi/04kmcbbook/kernelprimer.pdf.

27. We are Social and Hootsuite, Digital in 2017 Global Overview, In slides 2018-7-7, available at: https://www.slideshare.net/wearesocialsg/digital-in-2017-global-overview.

28. Ye, Chua and Kei, Clustering web pages about persons and organizations, in: Web Intelligence and Agent Systems, Vol. 3, IOS Press, 2005, pp. 203–216.