Abstract
The idea that words used by people are indicative of their personality has been established in several studies. In this study, we ask whether song lyrics of different music genres (e.g., punk) are associated with different personalities. We tested the hypotheses that (1) personalities are associated with lyrics of various music genres and that (2) personality, as expressed in song lyrics, may be used for the classification of songs into genres. We used a database of 17,495 songs categorized into music genres and 2468 essays written by students whose personalities have been assessed through the five factor model of personality. The method combines tf-idf word ranking with vector-space measures of semantic similarity drawn from Natural Language Processing. The research hypotheses were confirmed: it is possible to predict participants’ personality based on the similarity of their writing style to the style of various music genres, and songs can be automatically classified into music genres based on the similarity of those songs to personality factors. We therefore conclude that the lyrics of music genres can be used to predict personalities and, conversely, that personalities can be used for genre identification and classification.
The idea that a text produced by a person represents his/her personality appears recurrently in the psychological literature from Freud (1997) to modern computational personality analysis (e.g. Mairesse, Walker, Mehl, & Moore, 2007; Neuman & Cohen, 2014). The rise of popular music and the availability of various music genres (e.g. pop, jazz, punk) to a wide audience through mass media make the lyrics of these songs a natural object of scientific analysis, and it is interesting to ask to what extent the lyrics characterizing these genres are associated with different personality types. While the idea that the words used in a text may represent the personality of the author has become accepted, the idea that distinct personality traits are reflected in the lyrics of various music genres is less trivial. The relation between the lyrics of various music genres and personality has not been studied, but there is a wealth of studies that have associated musical preferences with personality traits.
Musical preferences and personality
Rentfrow, Goldberg, and Levitin (2011), who analyzed the structure of music preferences, found that preferences can be reduced to five factors:
A Mellow factor comprising smooth and relaxing music (e.g. soft rock);
An Unpretentious factor comprising a variety of country and singer-songwriter music (e.g. mainstream country music);
A Sophisticated factor covering complex, intelligent, and inspiring styles (e.g. Classical music);
An Intense factor defined by loud, forceful, and energetic music (e.g. punk); and
A Contemporary factor defined by rhythmic and percussive music (e.g. rap).
Based on this typology of music preferences, one can start thinking about the association between personality and music preferences. For example, it is possible to hypothesize that participants who have a clear preference for intense music may be energetic individuals defined as extraverts. Indeed, it was found by Langmeyer, Guglhör-Rudan, and Tarnai (2012) that music preferences are indicative of an individual’s personality. In their study they used the Five Factor Model of personality, also known as the Big Five model.
The Five Factor Model of personality (FFM) (John & Srivastava, 1999; McCrae, 2009) suggests that the taxonomy of personality can be described through five major traits:
Extraversion (E) involves an “energetic approach” to the social and material world and includes traits such as sociability, activity, and positive emotionality.
Agreeableness (A) involves a pro-social and communal orientation and includes traits such as altruism, tender-mindedness, trust, and modesty.
Conscientiousness (C) describes socially prescribed impulse control and goal-directed behavior.
Neuroticism (N) involves negative emotionality and feeling anxious, sad, and tense. This factor is sometimes referred to through its opposite pole: Emotional stability.
Openness to experience (O) describes the breadth, depth, and originality of the person’s mental and experiential life.
Rentfrow and Gosling (2003) identified four general dimensions of music:
Reflective & Complex (covering blues, jazz, classical, and folk music)
Intense & Rebellious (rock, alternative, and heavy metal music)
Upbeat & Conventional (country, soundtracks, religious, and pop music)
Energetic & Rhythmic (rap/hip-hop, soul/funk, and electronic/dance music).
Using the Five Factor Model of personality, Langmeyer et al. (2012) found that individuals open to experience prefer reflective and complex music and intense and rebellious music, whereas they dislike upbeat and conventional types of music. Extraverts, on the other hand, were found to prefer upbeat and conventional, and energetic and rhythmic types of music.
The findings on the relations between music preferences and personality are far from consistent. For instance, in a recent study, Bodner and Bensimon (2014) studied the relation between the preference for “problem music”, defined as “musical genres such as heavy metal, punk, alternative rock, hip-hop and rap” (p. 2), and the characteristics of its fans. They found that fans of problem music reported less conservatism than non-fans. This finding can be interpreted as a positive correlation between Openness to experience and preference for problem music. However, the researchers found no difference between fans and non-fans on measures of the Five Factor Model of personality. In contrast, Cleridou and Furnham (2014) found that Conscientiousness had a negative correlation with preferences for intense styles (e.g. punk), Neuroticism was negatively correlated with preferences for intense and unpretentious (e.g. folk music) styles and that Openness was positively correlated with sophisticated (e.g. jazz), intense and mellow (e.g. soul) styles.
The abovementioned results suggest that the relation between music preferences and personality is not straightforward. For instance, the relation between music preferences and personality traits may be mediated by preferences for different types of mode and tempo. Dobrota and Ercegovac (2015) point to the importance of tempo and argue that “tempo is the most important music characteristic in modulating affect” (p. 236). Slow tempo is related to low arousal and sad music and fast tempo to high arousal (Schellenberg, Krysciak, & Campbell, 2000). This proposal points to the possibility that the relation between music preference and personality may be partially mediated by tempo and affect. For example, as extraverts are characterized by high levels of arousal it seems reasonable to expect them to prefer intense music, such as rock, that better represents and expresses their emotion, arousal, and tempo. Indeed, Kopacz (2005) found that Extraversion was related to tempo. Along the same lines, Dobrota and Ercegovac (2015) found significant correlations between preferences for major key and fast tempo music and Emotional Stability, Conscientiousness, and Agreeableness.
The rationale of the current study
The abovementioned results and many others suggest that different personalities are associated with different music preferences, but the idea that the lyrics of different music genres (e.g. punk) reflect different personality dimensions has never been studied. This provides the rationale for studying the relation between personality and the lyrics used in various music genres. However, it is important to realize that the relation between music preferences and the lyrics of the preferred genres is not simple. Music preferences are probably influenced by several factors (e.g. musical and textual), including the lyrics characterizing the genre. For example, punk music has been described as highly confrontational with anti-establishment themes (Laing, 1985). As such, the lyrics of punk music as well as its intense characteristics may attract listeners who score lower in Conscientiousness, meaning that they have less interest in social desirability and in adhering to social norms. Accordingly, one may hypothesize that the lyrics of punk songs express lower levels of Conscientiousness, as they represent an anti-establishment and rebellious state of mind.
The first study reported here aims to test the hypothesis that different personality types are associated with different language use in the lyrics of music genres. In contrast with previous studies that examined the association between self-reported personality factors and preferences of music genres, we examined whether the expression of different music genres’ lyrics in a participant’s written essay may be indicative of his/her personality. In other words, we tested the hypothesis that while writing an expository essay, a person uses certain words that are indicative of his/her personality. These words are also more or less similar to the lyrics used by different music genres. As such, the degree to which the lyrics of various music genres are represented in the essay may be indicative of the person’s personality and vice versa.
Study 1
This study focused on a very specific research question: Do personality types differ in terms of their essays’ similarity to the lyrics of various music genres? More specifically and based on the abovementioned findings, we hypothesized that:
Hypothesis no. 1. With respect to the personality factor of Extraversion, in comparison with non-extraverts, essays written by Extravert participants will be characterized by words that are more similar to those characterizing energetic music.
Hypothesis no. 2. With respect to the personality factor of Neuroticism, in comparison with non-neurotic participants, essays written by neurotic participants will be characterized by words that are more similar to those characterizing reflective music. The justification is as follows. Neuroticism is associated with Introversion (the opposite pole of Extraversion). In our personality dataset (see below), most of the participants who are neurotic are also introverts (56%) and most of those who are not neurotic are extraverts (60%). The association between Neuroticism and Extraversion is statistically significant (χ² = 63.59, p < .001). Reflective music (e.g. jazz) is a genre associated with introversion as it involves introspection (i.e. reflection), and therefore it was reasonable to assume that Neuroticism would be associated with a reflective style of writing.
Hypothesis no. 3. With respect to the personality factor of Conscientiousness, in comparison with non-organized, non-conscientious participants, essays written by conscientious participants will be characterized by words that are more similar to those characterizing conventional music.
To test these hypotheses we have used two datasets and a novel methodology for automatic text-based personality analysis.
Methods
Datasets: The song lyrics data set
The first dataset is the musiXmatch dataset (Bertin-Mahieux, Ellis, Whitman, & Lamere, 2011), a collection of lyrics from 237,662 songs. The lyrics of these songs come in a bag-of-words format and are stemmed. Each song is described as the word-counts for a dictionary of the top 5000 words across the set. Each song is tagged according to one of 10 music genres such as classic pop and rock, classical music, and so on. Matching the song lyrics to each of 10 types of tagged genres, we have identified in the dataset 17,495 songs for the analysis. The distribution of the songs across the genres is shown in Table 1.
Table 1. Distribution of songs per genre (N = 17,495).
Datasets: The essays and personality dataset
The second dataset we used is the Essays dataset provided to the participants of the “Workshop on computational personality recognition: Shared task” (Celli, Pianesi, Stillwell, & Kosinski, 2013). This is a corpus of 2468 stream-of-consciousness essays; each essay was labeled with the Big Five personality type of the author. In other words, for each participant we have (1) a written expository essay and (2) his/her scores on each of the five personality dimensions of the Five Factor Model of personality. The personality scores obtained from the Five Factor Model of personality test had been normalized by Mairesse et al. (2007) and turned into nominal classes by the organizers of the workshop (Celli et al., 2013). The labels in the dataset are provided as categorical variables with a balanced frequency of around 50% in each category, to give approximately 50% extraverts and 50% non-extraverts and so on. It must be emphasized that each participant was scored on each of the five personality factors. For instance, he/she could have been a non-extravert, neurotic, conscientious, agreeable, and open to experience. In addition, it is important to note that while personality dimensions are usually measured on a continuous scale, the dataset used in this study contains only the categories rather than continuous scores.
Preprocessing
We analyzed the songs in each genre separately. First, we calculated the tf-idf measure (Manning & Schütze, 1999) for each word in a genre. The tf-idf measure is a statistic that is intended to represent how important a word is to a document in a collection or corpus. It is the product of two terms. The first term is the normalized Term Frequency (tf). In general, it is the number of times a word appears in a document, divided by the total number of words in that document; in our study, it is the number of times a word appears in the corpus of songs comprising a specific genre divided by the number of words in that corpus. The second term is the Inverse Document Frequency (idf), computed as the logarithm of the number of documents in the corpus divided by the number of documents in which the specific term appears.
The result of calculating the tf-idf measure for each word in the corpus of each song’s genre is the ranking of the words according to their level of importance in characterizing the genre. For instance, scoring the words used in the punk genre, we find die as one of the 10 highest ranked words. This word is not scored among the top-10 ranked words of the jazz genre. In sum, through the abovementioned procedure, it is possible to identify the words that best represent the lyrics of the different music genres.
In practice, we ranked the words in each genre according to their score and, hence, their importance. Next, we selected the top 100 ranked words for each genre. Each genre’s list of words (N = 100) was used as a vector representing the genre (the idea of using the words as a vector is explained in the next section). Similarly, we applied part-of-speech (POS) tagging to each student’s essay, retained only nouns, verbs, adjectives, and adverbs for further analysis, and turned each essay into a vector of words drawn from these four part-of-speech categories only.
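The ranking step can be illustrated with a minimal sketch. It assumes that each genre's pooled lyrics are treated as a single document for both tf and idf, and that the natural logarithm is used; neither choice is specified in the text. The function names and the toy word counts are hypothetical.

```python
import math
from collections import Counter

def tfidf_by_genre(genre_counts):
    """genre_counts: {genre: Counter of word counts pooled over the genre's songs}."""
    n_docs = len(genre_counts)
    # Document frequency: in how many genre corpora does each word occur?
    df = Counter()
    for counts in genre_counts.values():
        df.update(counts.keys())
    scores = {}
    for genre, counts in genre_counts.items():
        total = sum(counts.values())
        scores[genre] = {
            # tf (relative frequency in the genre) times idf (assumed natural log)
            word: (count / total) * math.log(n_docs / df[word])
            for word, count in counts.items()
        }
    return scores

def top_words(word_scores, n=100):
    """Return the n highest-scoring words, breaking ties alphabetically."""
    ranked = sorted(word_scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:n]]

# Toy counts, invented for illustration only.
toy = {
    "punk": Counter({"die": 5, "love": 2, "fight": 3}),
    "jazz": Counter({"love": 6, "moon": 4}),
    "pop": Counter({"love": 8, "dance": 2}),
}
scores = tfidf_by_genre(toy)
punk_top = top_words(scores["punk"], n=2)
```

In this toy example a word that occurs in every genre (love) receives an idf of zero and drops out of the ranking, which is the property that makes the top-ranked words characteristic of a genre.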
Next, applying the idea of vectorial semantics (Turney & Pantel, 2010), we automatically measured the similarity between each genre’s vector of words and each of the students’ essays, by using Turney’s similarity matrix (Turney, Neuman, Assaf, & Cohen, 2011), which is a tool that allows us to measure the semantic distance between words. This procedure is explained and detailed below.
Automatic text analysis
The proposed methodology for automatic text analysis has been introduced and validated elsewhere (Neuman & Cohen, 2014). The methodology is based on vector space models of semantics (Turney & Pantel, 2010), which suggest that the meaning of a target word can be identified by analyzing the words co-occurring with it in a given text. For example, we can imagine a situation in which a researcher is interested in understanding the meaning of the word sad. Instead of using dictionaries, vector semantics identifies the words that co-occur with sad. For example, it is possible to search for the adjectives that appear with sad in the same lexical context, that is, the neighboring words of sad. First, we search for the word sad; second, we identify the adjectives that appear to the right/left of our target word in a given window (e.g. three words to the right/left of the target word); and third, we identify the words that appear with sad above a certain threshold of statistical significance. One may find, for instance, that the adjectives most often co-located with sad are lonely, depressed, and bitter. For the sake of simplicity it is possible to focus on the first two words, lonely and depressed, and assume that in the linguistic corpus sad appears with lonely six times and with depressed three times.
The idea of vectorial semantics suggests that lonely and depressed are represented as two dimensions defining the meaning of sad. According to this idea, the meaning of sad is represented as a vector in a two-dimensional space defined by lonely and depressed, as shown in Figure 1.

Figure 1. A vectorial representation of sad.
This is of course a simplified representation of sad. In the above figure, we used only two dimensions to represent the meaning of sad, but it is possible to use more dimensions and build a high-dimensional representation of sad in which the dimensions are words co-located with it, and in which the meaning of sad is represented as a vector in the high-dimensional space. This approach provides a way of measuring the semantic distance between words and between texts that may be considered as collections of words. For instance, if one wanted to measure the degree of sadness in a given text, one could first choose words for defining the vector of sad such as lonely and depressed, and then represent the text as a vector according to the words that it contains. At this point, it is possible to measure the distance between the two vectors. The closer the vectors are, the higher the expressed degree of sadness in the text.
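The comparison of a concept vector with a text vector can be sketched as follows. The text speaks only of the "distance" between vectors; cosine similarity is used here as an assumption, being a common choice in vector space models. The vector for sad uses the counts from the example above (lonely: 6, depressed: 3); the two text vectors are invented.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v.get(k, 0.0) for k in u)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty vector is treated as unrelated
    return dot / (norm_u * norm_v)

# Concept vector from the running example.
sad = {"lonely": 6, "depressed": 3}

# Hypothetical text vectors on the same dimensions.
text_a = {"lonely": 4, "depressed": 2}  # same direction as sad
text_b = {"happy": 5}                   # shares no dimension with sad

sim_a = cosine(sad, text_a)
sim_b = cosine(sad, text_b)
```

Because cosine similarity depends on direction rather than length, text_a is maximally similar to sad even though its counts are smaller, while text_b has a similarity of zero.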
In sum, in measuring the degree to which a certain concept appears in a text, the first step in the vectorial semantics approach is to identify the words that best represent the notion, and the second is to measure the distance between the vector representing this notion and the target text. For instance, in order to understand the degree to which the genre of energetic music appears in a written essay, we identify the 100 most important words characterizing this genre, represent them as a vector and measure the distance (as described below) between this vector and the vector of words extracted from the essay. This is precisely what we have done when we identified the 100 most important words in each music genre by using the tf-idf measure. The words we have chosen are considered to be the vector of the genre.
To measure a distance between a given essay and a genre, we should turn the text into a vector of words and measure the distance between the vectors. To accomplish this task, we need to define a semantic space that allows researchers to measure the distances between the words or the texts. For this reason we use Turney’s similarity matrix that has been found highly effective in several contexts for measuring semantic distance. This matrix is a word-context frequency matrix computed as follows: each row vector in the matrix corresponds to a word from a long list of common words, and the columns correspond to contexts (the words to the left and right of a given word in a given text) in which the term appeared. The final matrix includes 114,501 rows and 139,246 columns.
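The row-by-context layout of such a matrix can be illustrated with raw co-occurrence counts. This is a simplified sketch only: the construction of the actual matrix (weighting, smoothing, corpus) is not described in the text and is not reproduced here. The window size, the function name, and the toy sentence are assumptions.

```python
from collections import defaultdict

def cooccurrence(tokens, window=3):
    """Count, for every word, the words appearing within `window` positions of it."""
    matrix = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                matrix[word][tokens[j]] += 1  # row = word, column = context word
    return matrix

# Toy sentence, invented for illustration.
tokens = "i feel sad and lonely when sad and depressed".split()
m = cooccurrence(tokens, window=2)
sad_row = dict(m["sad"])
```

Each row of the resulting structure is the context vector of one word, so two words can be compared by comparing their rows.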
Applying the abovementioned methodologies and tools, we generated a similarity score for each of the essays with each of the music genres. The similarity score is indicative of the degree to which the student’s essay is semantically similar to each of the music genres. The analysis produced for each participant generates 10 similarity scores, corresponding to the essay’s similarity to each of the 10 music genres.
To gain a better understanding of the degree to which each essay is similar to a music genre vector, each similarity score was transformed as follows. First, the similarity scores were converted to z-scores (i.e. Z-Genre) that were used for all other calculations. Z-scores are converted scores that indicate how far the original score is from the mean in terms of standard deviations. This transformation was used in order to normalize the results and allow comparison across genres. Next, we defined the similarity score of the essay to a given music genre as its similarity score to the genre minus its average similarity scores to the other genres. This procedure is similar to the one used by Turney and Littman (2003) for identifying semantic orientation and aims to better identify the essay’s most important similarity to a single music genre.
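The two transformations can be sketched as follows. The sketch assumes that z-scores are computed per genre across all essays and that the population standard deviation is used; the text does not state either detail. The essay identifiers and raw scores are invented.

```python
from statistics import mean, pstdev

def zscore_columns(scores):
    """scores: {essay_id: {genre: raw similarity}} -> same shape, z-scored per genre."""
    genres = list(next(iter(scores.values())))
    z = {essay: {} for essay in scores}
    for g in genres:
        column = [scores[e][g] for e in scores]
        mu, sigma = mean(column), pstdev(column)
        for e in scores:
            # A constant column carries no information; map it to 0.
            z[e][g] = (scores[e][g] - mu) / sigma if sigma else 0.0
    return z

def relative_similarity(z_row):
    """Each genre's z-score minus the mean z-score of the other genres."""
    out = {}
    for g, value in z_row.items():
        others = [v for k, v in z_row.items() if k != g]
        out[g] = value - mean(others)
    return out

# Toy raw similarities for two essays and three genres.
raw = {
    "essay1": {"punk": 0.30, "jazz": 0.10, "pop": 0.20},
    "essay2": {"punk": 0.10, "jazz": 0.30, "pop": 0.20},
}
z = zscore_columns(raw)
rel = relative_similarity(z["essay1"])
```

Subtracting the mean of the other genres highlights the genre to which an essay is distinctively similar, rather than essays that are simply similar to all lyrics.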
Next, based on Rentfrow and Gosling’s (2003) factor analysis of music preferences, we grouped the similarity scores gained for each subject into the genre categories/dimensions. Each similarity score of these categories has been defined as the average of the essay’s similarity scores to the relevant music genres. For example, the Rebellious category is actually the average of the essay’s similarities to the metal and punk genres. The procedure applied so far is summarized in Figure 2.

Figure 2. The summary of the processing procedure.
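The grouping of genre scores into categories can be sketched as a simple average. Only the Rebellious mapping (metal and punk) is stated explicitly in the text; the Energetic mapping below is inferred from the later mention of dance and hip-hop and should be read as an assumption, and the remaining categories are omitted. The scores are invented.

```python
from statistics import mean

CATEGORY_GENRES = {
    "Rebellious": ["metal", "punk"],    # stated in the text
    "Energetic": ["dance", "hip-hop"],  # assumed from the results discussion
}

def category_scores(genre_scores, mapping=CATEGORY_GENRES):
    """Average an essay's genre-level scores within each category."""
    return {
        category: mean(genre_scores[g] for g in genres)
        for category, genres in mapping.items()
    }

# Hypothetical transformed similarity scores for one essay.
essay = {"metal": 0.8, "punk": 0.4, "dance": -0.2, "hip-hop": -0.6}
cats = category_scores(essay)
```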
Analysis and results
For all of the statistical analysis tests we have used IBM SPSS Statistics 21. To test the hypotheses, we compared the abovementioned scores (e.g., Reflective) across participants’ personality categories. For instance, we compared extravert vs. non-extravert participants on their Rebellious score, testing the difference between extravert and non-extravert participants in terms of their essay’s similarity to the Rebellious music genre.
For these comparisons, we used the Mann-Whitney Test with a Monte Carlo simulation of 10,000 samples for each analysis. To avoid inflation of Type I errors as a result of multiple comparisons on the same sample, we report only results whose significance level is .01 or below, a threshold slightly stricter than the Bonferroni-corrected level of .0125 (.05 divided by 4).
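A group comparison of this kind can be sketched in a few lines. Note that the sketch swaps in the large-sample normal approximation for the Monte Carlo procedure run in SPSS, and that it omits the tie correction of the variance; both are simplifications. The two samples are invented and do not come from the study.

```python
import math

def rank_with_ties(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_z(a, b):
    """Return (U for sample a, z, two-sided p) under the normal approximation."""
    n1, n2 = len(a), len(b)
    ranks = rank_with_ties(list(a) + list(b))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction
    z = (u1 - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))
    return u1, z, p

# Hypothetical Energetic scores for two small groups.
extraverts = [1.2, 0.8, 1.5, 0.9, 1.1]
non_extraverts = [0.1, 0.4, 0.3, 0.7, 0.2]

alpha = 0.05 / 4  # Bonferroni correction for four music categories
u, z, p = mann_whitney_z(extraverts, non_extraverts)
significant = p < alpha
```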
With respect to extravert vs. non-extravert participants, extraverts were found to be more Energetic (z = 2.9, p = .003), which means that the extraverts’ essays were significantly more similar to the vector of the Energetic songs. Put differently, the words extraverts use in their essays are significantly more similar than those used by non-extraverts to the words found in dance and hip-hop songs.
With respect to neurotic vs. non-neurotic participants, neurotic participants were found to be less Energetic (z = -2.9, p = .004) and Conventional (z = -5, p < .001) but more Reflective (z = -4.46, p < .001) than non-neurotic participants. That is, the words used in the essays of neurotic participants are more similar to the words used in the Reflective music category, such as in the jazz genre. In addition, those participants who are emotionally stable presented in their essays higher levels of similarity with the Energetic and Conventional genres (e.g., pop music).
With respect to conscientious vs. non-conscientious participants, conscientious participants were found to be more Conventional (z = -3, p < .001) and less Rebellious (z = -4.9, p < .001).
These results confirm the three research hypotheses. In addition, we analyzed the results with respect to the other two personality types: Agreeableness and Openness to Experience.
With respect to agreeable vs. non-agreeable participants, agreeable participants were found to be more Conventional (z = -4.85, p < .001) and less Rebellious (z = -6.63, p < .001) than non-agreeable participants.
With respect to open vs. non-open participants, participants who are open to experience were found to be less Energetic (z = -3.8, p < .001) and Conventional (z = -4.3, p < .001) and more Rebellious (z = -7.7, p < .001).
These results confirm our hypotheses, and are in line with the understanding of the five personality types. They can be summarized as follows. Rebellious music, such as punk, is associated with people who are open to experience and less with people who are organized (conscientious participants) or friendly (agreeable participants). Indeed, the lyrics of punk, which is a rebellious genre, seem to be echoed in the essays of those who are less oriented toward organization (i.e. non-conscientious participants) or social harmony (i.e. non-agreeable participants) and more oriented toward the experiential aspect of life (open participants).
Essays similar in their language to Reflective music are associated with the Neurotic type. Essays similar in their language to Conventional music are associated with the friendly agreeable type of personality, and less associated with the open to experience and neurotic types. In other words, people who are emotionally stable, adhere to social harmony, and are less open to various experiences show in their essays greater similarity to the lyrics in Conventional music genres such as pop music.
Energetic song lyrics are more powerfully evident in the essays of those who are extraverts and less among those who are neurotic or open to experience. These findings indicate that different personality types differ in their essays’ similarity to the four categories of music genres’ lyrics.
Predicting personality based on the genre style of the essay
Based on the above results, we hypothesized that the participant’s personality can be predicted based on their essay’s similarity scores to the music genres’ lyrics. This hypothesis suggests that personality type can be predicted based on the extent to which the participant’s written text represents various genres of music.
To test this hypothesis, we conducted a Binary Logistic Regression Analysis with Personality as a dependent variable (e.g. extraverts vs. non-extraverts) and scores for the four similarities to music-genres as independent variables. In other words, we aimed to predict participants’ binary value on each of the five personality dimensions (e.g. Are you an extravert or an introvert?) by using their essays’ similarity scores to each of the four music genres (e.g. Reflective). For the analysis we used the Binary Logistic Regression Analysis with a bootstrapping procedure of 1000 samples. The results are summarized in Table 2.
Table 2. Results of the Binary Logistic Regression Analysis.
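The structure of this model, a binary trait predicted from four category similarity scores, can be illustrated with a minimal sketch. It does not reproduce the SPSS analysis: the bootstrapping procedure is omitted, the model is fitted by plain gradient descent, and the column order, learning rate, number of epochs, and toy data are all assumptions made for illustration.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Batch gradient descent on the logistic loss (no regularization)."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(w, b, x):
    return 1 if sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) >= 0.5 else 0

# Hypothetical columns: Reflective, Rebellious, Conventional, Energetic.
X = [
    [0.1, -0.2, 0.0, 1.0],
    [-0.3, 0.1, 0.2, 0.8],
    [0.0, -0.1, -0.1, 0.6],
    [0.1, 0.2, 0.0, -0.7],
    [0.3, 0.0, -0.2, -0.9],
    [-0.2, -0.1, 0.1, -0.5],
]
y = [1, 1, 1, 0, 0, 0]  # 1 = extravert, 0 = non-extravert (toy labels)

w, b = fit_logistic(X, y)
predictions = [predict(w, b, x) for x in X]
```

In this toy data the Energetic column separates the two groups, so the fitted weight on that column is positive and dominates the prediction.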
Next we explain the results in Table 2. First, all results were statistically significant, meaning that by using the essay’s similarity scores we can predict the participant’s personality. Second, the recall measure indicates the percent of participants correctly identified by our model as belonging to a certain personality type. For instance, 52% of the participants in our sample are defined as extraverts. Therefore, when asked who in our sample is an extravert, a random guess would yield ~50% correct prediction. This baseline for prediction can be significantly improved by taking into account the participant essay’s similarity scores with the music genres. In the case of extraverts, the model correctly identified 74% of them, an improvement of 22 percentage points over the baseline (i.e. 52%). Comparing our recall to the baseline, we found that on average our analysis gained an improvement of 14 percentage points in prediction. For a complementary analysis we performed a Backward (Conditional) Binary Logistic Regression Analysis. This analysis is informative in identifying the significant predictors, namely the similarity of the essay written by the participant to the Energetic, Rebellious, and Reflective genres. The results are presented in Table 3.
Table 3. Results of the Backward (Conditional) Binary Logistic Regression Analysis.
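The recall measure and its comparison with the base rate can be made concrete with a short sketch. The labels and predictions are invented; only the figures 74% and 52% in the last line come from the text.

```python
def recall(y_true, y_pred, positive=1):
    """Share of truly positive cases that the model labels as positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0

def base_rate(y_true, positive=1):
    """Share of positive cases in the sample."""
    return sum(1 for t in y_true if t == positive) / len(y_true)

# Toy labels: 1 = extravert, 0 = non-extravert.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

toy_recall = recall(y_true, y_pred)
toy_gain_points = 100 * (toy_recall - base_rate(y_true))

# The extraversion figures reported in the text: 74% recall vs. a 52% base rate.
reported_gain_points = 74 - 52
```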
Study 2
In Study 1, we tested the hypothesis that different personalities are associated with different music genres as reflected in the song lyrics. This hypothesis has been confirmed. In the second study, based on this success, we aimed to address a different challenge, the automatic classification of music genres according to their lyrics and the personality dimensions associated with them.
The automatic music genre classification involves the classification of songs into predefined categories. For example, consider a new song released to the public. The song appears in many sites, such as YouTube, and is accompanied by tags that aim to support its retrieval from databases. For instance, Jango [www.jango.com] is an internet radio station in which songs are categorized under different genre tags such as jazz, country, dance, and folk. The automatic music genre classification aims to assign the appropriate tag to the song. This procedure is highly important for the music industry given the high number of songs produced, shared, retrieved, and consumed through various media platforms. Although there has been some work on automatic genre classification (e.g., Howard, Silla, & Johnson, 2011; Liang, Gu, & O’Connor, 2011), most of it concerns non-textual features of the genre and none uses personality dimensions for classification.
For the classification task, we used the k-Nearest Neighbors algorithm (KNN), which is a common classification algorithm. Specifically we used the KNN classification algorithm with a 10-fold cross-validation, which is a common statistical validation technique. The aim of the classification procedure was to test the ability to classify the songs in our dataset into the four music genres (e.g. Rebellious) by using the similarity of song lyrics to personality dimensions.
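A k-nearest-neighbors classifier with k-fold cross-validation can be sketched in plain Python as follows. The value of k, the Euclidean distance, and the unshuffled interleaved fold assignment are assumptions, since the text specifies only KNN and 10 folds. The toy songs are two-dimensional for brevity, whereas the study describes each song by its similarity to 10 personality vectors.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Majority label among the k training points closest to x."""
    nearest = sorted((math.dist(x, tx), ty) for tx, ty in zip(train_X, train_y))
    votes = Counter(label for _, label in nearest[:k])
    return votes.most_common(1)[0][0]

def cross_validate(X, y, k=3, folds=10):
    """Accuracy over `folds` folds; item i is assigned to fold i % folds."""
    correct = 0
    for f in range(folds):
        train_idx = [i for i in range(len(X)) if i % folds != f]
        test_idx = [i for i in range(len(X)) if i % folds == f]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        for i in test_idx:
            correct += knn_predict(train_X, train_y, X[i], k) == y[i]
    return correct / len(X)

# Toy data: 1 = song belongs to the target category, 0 = it does not.
X = [(0.01 * i, 0.0) for i in range(10)] + [(1.0 + 0.01 * i, 1.0) for i in range(10)]
y = [0] * 10 + [1] * 10

accuracy = cross_validate(X, y, k=3, folds=10)
```

In practice the data would be shuffled before the folds are assigned; the deterministic assignment here only keeps the example reproducible.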
The similarity of song lyrics to personality types was measured through the similarity of the song word vectors to each of the personality vectors defined by Neuman and Cohen (2014). These vectors represent the five personality types covered by the Five Factor Model of personality and include words describing both the positive and negative aspects of each of the five factors.
For example, the vector describing the positive aspect of the conscientiousness personality type includes words such as organized, orderly, and tidy, while the vector describing the negative aspect of the conscientious personality type includes words such as distracted, unreliable, and incompetent. Overall, we have used 10 vectors: the five factors of personality × positive/negative dimensions as detailed in Neuman and Cohen (2014).
Using the KNN, we asked binary questions such as, “Does this song belong to the Conventional music genre?”, a genre that holds 53% of the sample. We tested this form of binary question for each of the four music genres: Conventional, Rebellious, Reflective, and Energetic. Using the KNN procedure, we were able to correctly identify 75% of the songs belonging to the Conventional genre, an improvement of 22 percentage points over the baseline. For the Energetic genre, 29% recall was found (an improvement of 19 percentage points over the baseline), for Reflective 32% recall (7 percentage points), and for Rebellious 50% recall (37 percentage points). In all these cases we were able to improve our prediction of whether a song belongs to a certain music genre, with an average improvement of 21 percentage points over the base rate.
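The reported figures can be checked with simple arithmetic. The recall values and improvements below are taken from the text; the base rates are not all stated explicitly and are derived here as recall minus improvement, which is an assumption about how the improvements were computed.

```python
# genre category: (recall in %, improvement over the baseline in points), as reported
results = {
    "Conventional": (75, 22),
    "Energetic": (29, 19),
    "Reflective": (32, 7),
    "Rebellious": (50, 37),
}

# Base rate implied by each pair (only the 53% for Conventional is stated in the text).
implied_baseline = {
    category: rec - gain for category, (rec, gain) in results.items()
}

avg_improvement = sum(gain for _, gain in results.values()) / len(results)
```

The implied base rates are 53%, 10%, 25%, and 13%, and the mean improvement is 21.25 points, which agrees with the reported average of 21 after rounding.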
Conclusion
In the current paper, we studied the relation between personality and the lyrics of music genres. Our contribution can be summarized as follows. First, we showed that there is a difference between personality types (e.g. extraverts vs. non-extraverts) in terms of their writing style’s similarity to various music genres. Second, we showed that participants can be successfully classified into personality types by using their writing style’s similarity to the writing style characterizing the words of four major music genres. Finally, we have shown that automatic classification of songs into genres is possible by using the similarity of the songs to the five personality factors.
What is the nature of the relationship between the person’s personality and the similarity of his/her writing style to the lyrics of various music genres? The current study cannot answer this question as it is limited to a more modest scope of studying a relation that to date has not been empirically studied. However, we can try to address this question with all the necessary qualifications.
The essays that we used in this study are expository essays, in which the participants reflect on their inner world. As such, the words used in the essays represent the participant’s personality. This suggestion is supported by various studies of computational personality that have used the same corpus of essays and have been successful in identifying people’s personalities from linguistic contents of essays.
Personality is not only an individualistic issue and it is legitimate to argue that, similarly to other psychological phenomena, it is primarily grounded at the collective-social level of analysis (Neuman, 2014; Valsiner, 2007; Vološinov, 1986). Therefore the words used by various music genres may be indicative of the genre’s “personality” just as the words used in a person’s essay may be indicative of his/her personality. This explanation is supported by our findings and their convergence with previous results. For example, Hansen and Hansen (1991) found that punk rock fans were less accepting of authority than those who disliked this kind of music. Authority is deeply associated with the personality dimension of Conscientiousness and indeed we found that the essays of conscientious participants were less similar to the lyrics of rebellious music than the essays of non-conscientious participants. This finding is in line with those of Langmeyer et al. (2012) and Cleridou and Furnham (2014).
We also found that the essays of extravert participants were more similar than those of non-extraverts to the lyrics of energetic music, a finding that converges with Langmeyer et al. (2012). A similar convergence with the results of these researchers is also evident with respect to the personality dimension of openness to experience. In sum, our explanation of the relation between personalities and the lyrics of music genres parallels the finding that an individual’s personality is reflected in his/her written essay: in the same way, the personality of a music genre is reflected in its song lyrics. Under a contemporary approach to cultural psychology (e.g. Neuman, 2014), identifying the personality of a genre does not involve a category error; it is a move that is fully justified under the appropriate qualifications and the application of the relevant tools.
Looking ahead, we can outline several directions for future studies. In studying the personality of a genre, one may ask questions such as: What are the dominant emotions evident in the genre? We can scientifically answer this question by using contemporary tools and theories of emotion research such as those presented in Westbury, Keith, Briesemeister, Hofmann, and Jacobs (2014). In addition, we can identify the cognitive and behavioral dimensions of personality in the lyrics of music genres by asking what kind of actions and schemes are evident in the songs. Such future studies may help researchers to better understand the personality of music genres in a scientific way.
In addition, close attention can be given to the embodied aspect of song lyrics. Studies associating music preferences with basic music dimensions, such as tempo, and with the bodily aspect of music (Jola, Pollic, & Calvo-Merino, 2014) are in line with the current trend in cognitive sciences that attempts to ground higher level cognition in basic sensory-motor experience (Lakoff & Johnson, 1999; Meteyard, Cuadrado, Bahrami, & Vigliocco, 2012; Pulvermüller, 2013). In this context, the relation between personality and the lyrics of music genres may be studied through the mediation of basic embodied processes. For example, it could be proposed that extraverts prefer energetic music because both the tempo and the lyrics of energetic songs evoke in their minds an integrated simulation of affect/arousal, behavior (e.g. the mental simulation of energetic dance), and the schemes of thought associated with high activation. Neuroimaging studies might provide a better understanding of this complex relation.
From a critical perspective, we should qualify our findings as they are limited to a certain context of analysis. First, we used categorical measures of personality (e.g. are you an extravert or not) rather than the full range of personality scores. The choice of categorical measures was imposed by the categorical dataset that we used. This dataset, however, was a gold standard for the participants of the workshop on computational personality and has been analyzed by several researchers (e.g. Neuman & Cohen, 2014). Using continuous measures of personality in future studies may improve our ability to study the relation between personality and lyrics. Second, we should acknowledge that although lyrics are an important aspect of a musical genre, they are only a partial aspect. In some musical genres (e.g. punk, folk), the lyrics may be more important than in other musical genres (e.g. classical music). Therefore, future studies should fuse different aspects of songs and music genres, both textual and non-textual, in order to extend and deepen the relation we have studied.
In sum, the idea that the words used in the songs of various music genres are indicative of different personality types is a simple extension of a logic used previously in the analysis of texts. However, this is the first time personality dimensions have been associated with the lyrics of music genres. Based on a large sample of participants and songs, we have shown not only that personality types differ in terms of their texts’ similarity to various music genres, but also that the participant’s personality can be successfully classified according to the music genre’s style in which he/she writes.
Beyond a theoretical interest in the association between the lyrics of music genres and personality types, it is possible to consider the practical implications of these associations. For example, automatic music recommendation engines can be improved by taking into account the personality of the person listening to the music. Algorithms can be designed that automatically analyze the participant’s texts as they are naturally written on social media platforms (e.g. Facebook), identify personality dimensions, and use them in order to provide the participant with recommendations that adhere to his/her character. While such an engine is currently not available, it is a potentially feasible development based on the body of knowledge gathered in studying the psychology of music and the automatic analysis of personality.
Footnotes
Acknowledgements
The authors would like to thank the anonymous reviewers for their constructive reading of the paper, and the editor for her careful reading and proposed revisions that significantly improved the paper.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
