Exploiting MUSIC model to solve cold-start user problem in content-based music recommender systems

Abstract

The new user cold-start problem is a grand challenge in content-based music recommender systems. This happens when the systems do not have sufficient information regarding the user’s preferences. Towards solving this problem, in this study, a rating prediction framework is proposed. The proposed framework allows the systems to predict the user’s rating scores for unrated musical pieces, by which good recommendations can be generated. The core idea here is to leverage the so-called MUSIC model, i.e., a five-factor musical preference model, which is characterized by Mellow, Unpretentious, Sophisticated, Intense, and Contemporary as the user’s musical preference profiles. When a user newly joins the systems, the first five-factor musical preference profile is established based on the user’s age and brain type information which is extracted from questionnaires. When the user experiences the systems for a certain period, his/her rating scores for experienced musical pieces are utilized for generating the second five-factor musical preference profile. The recommendations are then provided based on the rating scores predicted from a non-linear combination of these two five-factor musical preference profiles. The results demonstrated the effectiveness of the five-factor musical preference in alleviating the new user cold-start problem. In addition, the proposed method can potentially provide high-quality recommendations.

Keywords

Music recommender system, content-based recommendation, new user cold-start problem, five-factor MUSIC model


1. Introduction

In online music streaming services such as Spotify and Apple Music, music recommender systems play extremely important roles in suggesting musical pieces that meet users’ preferences. Such music recommender systems suffer from the cold-start problem [1, 2], which is related to the lack of users’ and items’ information in the system. Generally, there are two cases in the cold-start problem: new user case and new item case. The former is when the system does not have sufficient information regarding the new user’s preference profile. The latter refers to the case where new items lack user interactions when they are added to the system. Developing music recommender systems that can provide high-quality recommendations in these cold-start situations is a challenging task. To design recommender systems, there are two mainly adopted approaches: collaborative filtering (CF) [3] and content-based filtering (CBF) [4]. The CF approach tries to retrieve the recommendations by learning past user-item relationships, whereas the CBF approach determines the recommendations based on user preference profiles and item attributes. The CBF does not rely on other user ratings, thus the new item, in other words, an unrated item can be recommended. However, both types of systems suffer from the new user cold-start problem where no historical ratings of the user are available [5]. In this study, the new user cold-start problem in content-based music recommender systems is considered.

To deal with this problem, it is important to effectively construct musical preference profiles for new users. In the literature, there exist several alternatives for creating musical preference profiles. One of the most common approaches is the genre-based method, in which genre-based preferences are obtained via surveys such as the STOMP (Short Test Of Music Preferences) [6] or the MPQ (Music Preference Questionnaire) [7]. However, some music genres might have different connotations in different cultures [8]. Genres are also broad, inconsistent, and ill-defined. As a result, the genre-based preference cannot express the user’s profile in some cases [8]. To solve these problems, Rentfrow et al. conducted a factor analysis of musical preference questionnaires and discovered five factors of musical preferences [9, 10]. This five-factor musical preference consists of Mellow, Unpretentious, Sophisticated, Intense, and Contemporary (MUSIC), and provides better underlying musical descriptions and preferences. Leveraging the five-factor MUSIC model, Soleymani et al. proposed a content-based music recommender system [11]. Their results showed that the five-factor MUSIC model is effective for music recommendations compared to other traditional methods such as genre-based and artist-based preferences. However, only the new item cold-start problem was considered in their study. The main aim of this study, on the other hand, is to exploit the five-factor MUSIC model towards solving the new user cold-start problem in the content-based music recommender system. To this end, a rating prediction framework is proposed in this paper. The output of the proposed framework is the user’s rating scores for musical pieces, which can be used to generate appropriate recommendations for the user. The framework comprises three main components: Feature-Based Profiler, Rating-Based Profiler, and Rating Predictor.

Feature-Based Profiler is responsible for predicting the initial MUSIC model preference profile, in other words, the feature-based profile (FBP), for a new user, based on human characteristics obtained from the user’s age [12] and brain type [13]. Rating-Based Profiler, on the other hand, produces another preference profile called the rating-based profile (RBP) based on the user’s rating scores for experienced musical pieces. In this study, a non-linear function is proposed to combine FBP and RBP to produce a hybrid preference profile (HBP), which is then fed into the third component, Rating Predictor. This component predicts the user’s rating scores for unrated musical pieces, facilitating the creation of appropriate recommendations for the user. The experimental results demonstrated that, by using the five-factor MUSIC model, the proposed approach can provide better rating prediction than the baseline method, that is to say, the genre-based method. Such results emphasize the potential of using the MUSIC model in solving the new user cold-start problem in content-based music recommender systems.

This paper is an extension of our previous study [14] where the effectiveness of the five-factor MUSIC model in describing preference profiles was explored. In this study, a proposed rating prediction framework, using the five-factor MUSIC model as the representation of musical preference, is presented. The extension in this study is threefold:

• First, a non-linear sigmoid function is proposed to combine FBP and RBP for producing HBP. In the previous study [14], a weighted linear combination was used. A linear curve cannot precisely capture how the contributions of either FBP or RBP to HBP change in different situations, whereas the sigmoid function provides a better description of the combination of FBP and RBP.
• Second, a method to predict the user’s rating scores for musical pieces from the obtained HBP is introduced in the predictor component.
• Third, additional evaluation is performed to assess the performance of the proposed framework. This is done by comparing it with a genre-based method as a baseline.

The remainder of the paper is organized as follows: Related work is provided in Section 2. Section 3 describes the proposed method. Sections 4 and 5 provide the performance evaluation of the proposed method and their discussions. Finally, the paper is concluded in Section 6.

Table 1

Descriptions and typical genres of the factors of the five-factor MUSIC model [10]

Factor	Description	Typical genres
Mellow	Romantic, Relaxing, Unaggressive, Sad, Slow, Quiet	Soft Rock, R&B, Adult Contemporary
Unpretentious	Uncomplicated, Relaxing, Unaggressive, Soft, Acoustic	Country, Folk, Singer/Songwriter
Sophisticated	Inspiring, Intelligent, Complex, Dynamic, Cultured	Classical, Operatic, Avant-Garde, World Beat, Traditional Jazz
Intense	Distorted, Loud, Aggressive, Tense	Classical Rock, Punk, Heavy Metal, Power Pop
Contemporary	Percussive, Electric, Rhythmic, Danceable	Rap, Electronic, Latin, Acid Jazz, Euro Pop

2. Background and related work

Content-based recommender systems analyze items themselves to identify items that are of particular interest to the user [4]. The user’s interest is represented by a preference profile. There are several approaches to express the musical preference profile in existing studies. The simplest one is using the raw rating score for each musical piece. However, the computational cost will be exceedingly large when the number of musical pieces in the system is large. In practice, the musical pieces in the database are dynamic: added and removed every day. Moreover, it leads to suffering from the new user cold-start problem when the rating scores are not sufficiently collected. To avoid the use of all pieces, Chou et al. have used a subset of pieces based on the domain knowledge (e.g., albums, artists, or genres) to express the user’s profile [15]. Such a subset might be the minimal set to grasp or convey the user’s musical preferences [16]. However, it could be biased by selected pieces. The genre-based preference is also typically used, and easily constructed by the survey [6, 7]. According to [17], it might achieve good performance for music recommendation more efficiently because users are more capable of providing information regarding genre preferences of those surveys. However, Rafael et al. found that the genre-based preferences did not successfully account for the musical preferences of some individuals in their experiments because finding consensus to define a genre is often a challenging task [18]. Actually, music genres have broad and inconsistent definitions and different connotations in different cultures [8].

Figure 1.

Overview of the proposed rating prediction framework.

To express the appropriate musical preference, Rentfrow et al. identified the underlying musical preference structure, which consists of five factors, namely Mellow, Unpretentious, Sophisticated, Intense, and Contemporary (the five-factor MUSIC model) [9, 10]. This model was discovered by factor analysis using 52 musical pieces representing 26 different genres. Table 1 provides descriptions and typical genres of those five factors. Each musical piece has factor loadings which describe the correlational relation between the musical preference and each factor. Soleymani et al. proposed a content-based music recommender system using the five-factor MUSIC model [11]. They used the auditory temporal modulation features and the regression with sparse representation to detect acoustic attributes, which are factor loadings of the five-factor MUSIC model for each piece. Their method significantly outperformed the other traditional content-based methods in terms of rating prediction. The results demonstrated that the use of the five-factor MUSIC model could provide a better musical description and user’s preference profile. However, their study only focused on the new item cold-start problem.

In this study, the new user cold-start problem in content-based music recommender systems is considered. One way to deal with this problem is to enrich the preference profile of a new user by relying on other information that is known to be correlated with musical preferences. Researchers have examined the relation between musical preferences and human characteristics [8]. Bonneville-Roussy et al. investigated age differences in musical attributes and preferences [12]. They examined age trends in musical preferences represented by the five-factor MUSIC model. The results indicated that preferences for the Intense and Contemporary factors decrease with age, whereas preferences for the Unpretentious and Sophisticated factors increase with age. Greenberg et al. studied the relation between musical preference and human personality information [13]. By employing the empathizing-systemizing (E-S) theory [19], they identified the ways in which musical preferences are differentiated by cognitive ‘brain type.’ The results revealed that those who are type E (bias towards empathizing) preferred musical pieces on the Mellow factor, whereas those who are type S (bias towards systemizing) preferred pieces on the Intense factor.

Inspired by the above investigations, in this study, a new user is encouraged to fill out a questionnaire regarding his/her age and brain type in order to establish the user’s preference profile. This information is then used as the input for the prediction of the five-factor musical preference of the new user.

3. Proposed framework

In this study, a rating prediction framework using the five-factor MUSIC model is proposed, towards solving the new user cold-start problem. The proposed framework is depicted in Fig. 1 and has three main components: Feature-Based Profiler, Rating-Based Profiler, and Rating Predictor. In the Feature-Based Profiler, the user’s five-factor MUSIC preference profile, i.e., the feature-based profile (FBP), is predicted from the user’s age and brain type. When the user newly joins the system, this profiler contributes significantly to the initial recommendations. In the Rating-Based Profiler, the user’s rating data for experienced musical pieces are used to create the preference profile, namely the rating-based profile (RBP). This profiler contributes to updating the profile as users experience the system. The FBP and RBP are combined to create another musical preference profile, namely the hybrid-based profile (HBP). The HBP expresses the contribution of FBP and RBP to the recommendations provided to the user in two considered scenarios: (1) when the user newly joins the system and (2) when the user has experienced the system for a certain period. Based on the HBP, the rating scores for unrated musical pieces are predicted in the Rating Predictor. Finally, the recommended musical pieces are determined from the predicted rating scores. The details of each component are presented in the following subsections.

3.1 Feature-Based Profiler

Feature-Based Profiler is a component to output the new user’s FBP – a five-factor musical preference profile that is predicted from appropriate human characteristics that are obtained from the answered questionnaires. In this subsection, detailed descriptions of feature selection and regression models for the prediction of FBP will be provided.

3.1.1 Feature selection

The connection between the five-factor musical preference and human characteristics has been investigated in some existing studies. Bonneville-Roussy et al. showed that the musical preference for each factor in the five-factor MUSIC model differs with age [12]. Meanwhile, Greenberg et al. indicated that the musical preference differs for each brain type [13]. Thus, in this study, both age and brain type are used as input features for the prediction of the new user’s preference profile of the five-factor MUSIC model.

To calculate the brain type based on the E-S theory [19], the same procedures reported in [20] are used. The brain type is determined by the D score, which is the measurement of the difference between the standardized EQ and SQ scores. The EQ score is calculated from a 60-item questionnaire designed to measure the individual’s empathy level, and the SQ score is calculated from a 75-item questionnaire to measure the individual’s systemizing level. The EQ score ranges from 0 to 80, and the SQ score has a range of 0 to 150. The D score is calculated based on the following equation:

$\displaystyle D=\frac{S-E}{2}$ (1)

where $S$ is the standardized SQ score and $E$ is the standardized EQ score. This equation indicates that the original EQ and SQ axes are rotated by $45^{\circ}$ to produce the $D$ axis, as shown in Fig. 2. Accordingly, the $D$ score represents the individual’s tendency towards systemizing. Based on the $D$ score, those who score below the 35th percentile are classified as type $E$ ($E>S$), those who score between the 35th and 65th percentiles are classified as type $B$ ($E\approx S$), and those who score above the 65th percentile are classified as type $S$ ($S>E$).
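
The classification above can be sketched as follows. This is a minimal illustration rather than the exact procedure of [20]: the function names are hypothetical, the inputs are assumed to be already standardized scores, and the percentile rank is assumed to be the share of the sample with a strictly lower $D$ score.

```python
from bisect import bisect_left
from typing import List


def d_score(s_std: float, e_std: float) -> float:
    """Eq. (1): D = (S - E) / 2 on standardized SQ and EQ scores."""
    return (s_std - e_std) / 2.0


def brain_types(d_scores: List[float]) -> List[str]:
    """Label each D score as type E, B, or S by its percentile rank.

    Assumed convention: the percentile rank of a score is the percentage
    of the sample whose D score is strictly lower.
    """
    ordered = sorted(d_scores)
    n = len(ordered)
    labels = []
    for d in d_scores:
        rank = 100.0 * bisect_left(ordered, d) / n
        if rank < 35.0:
            labels.append("E")   # below the 35th percentile
        elif rank <= 65.0:
            labels.append("B")   # between the 35th and 65th percentiles
        else:
            labels.append("S")   # above the 65th percentile
    return labels
```

Because the thresholds are percentiles of the sample, the label of a given user depends on the other users in the reference population, not only on his/her own $D$ score.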

Figure 2.

Empathizing-systemizing theory and brain type [19].

In the proposed system, when a user registers, he/she is required to answer questionnaires regarding age, the 60-item Empathy Quotient [21], and the 75-item Systemizing Quotient-Revised [20]. Each of those items is answered on a 4-point scale (strongly disagree, slightly disagree, slightly agree, or strongly agree). The brain type is then determined from the $D$ score calculated with Eq. (1).

3.1.2 Polynomial regression model

In order to predict the new user’s FBP based on both age and brain type features, polynomial regression models are applied in this study. In general, the goal of polynomial regression analysis is to model the expected value of a dependent variable as an $n$th-degree polynomial in the value of the independent variable. According to [12], the five-factor musical preference can be modeled by polynomial regression against the individual’s age: the preference for the Mellow factor follows a cubic model, the preferences for the Sophisticated, Intense, and Contemporary factors follow quadratic models, and the preference for the Unpretentious factor follows a linear model. In our previous study, those polynomial degrees were used to build the models. However, since the proposed method considers both age and brain type for FBP prediction, polynomial regression models against the user’s age are built for each brain type as follows:

$\displaystyle p_{\textit{feature}}=p_{F}=\beta_{F,BT,0}+\sum_{i=1}^{n}\beta_{F,BT,i}f^{i},\quad F\in[M,U,S,I,C],\ BT\in[E,B,S]$ (2)

where $p_{\textit{feature}}$ ($=[p_{M},p_{U},p_{S},p_{I},p_{C}]$) is an FBP vector consisting of the profile values for each factor, $f$ is the numerical value of age, and $\beta_{F,BT,i}$ is the polynomial regression coefficient of the $i$th-degree term for factor $F$ and brain type $BT$. The coefficients are estimated by ordinary least squares. The degree of the polynomial $n$ is determined based on the prediction error in the experiments using the actual dataset, which helps to build better-fitting prediction models for each factor and brain type against age.
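
A minimal sketch of fitting Eq. (2) for one factor is given below. It solves the least-squares normal equations directly in plain Python so that the example is self-contained; the function names and the record layout `(age, brain_type, score)` are assumptions for illustration, and a numerically more robust library solver would normally be preferred for real data.

```python
from typing import Dict, List, Tuple


def fit_polynomial(ages: List[float], scores: List[float], degree: int) -> List[float]:
    """Ordinary least squares fit of score = sum_i beta_i * age**i.

    Solves the normal equations (X^T X) beta = X^T y by Gaussian
    elimination with partial pivoting and returns [beta_0, ..., beta_n].
    Fails with ZeroDivisionError if there are too few distinct ages.
    """
    m = degree + 1
    # a[j][k] = sum(age**(j+k)), b[j] = sum(score * age**j)
    a = [[float(sum(x ** (j + k) for x in ages)) for k in range(m)] for j in range(m)]
    b = [float(sum(y * x ** j for x, y in zip(ages, scores))) for j in range(m)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, m):
            factor = a[row][col] / a[col][col]
            for k in range(col, m):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    beta = [0.0] * m
    for row in reversed(range(m)):
        tail = sum(a[row][k] * beta[k] for k in range(row + 1, m))
        beta[row] = (b[row] - tail) / a[row][row]
    return beta


def predict(beta: List[float], age: float) -> float:
    """Evaluate Eq. (2) for one factor and one brain type."""
    return sum(coef * age ** i for i, coef in enumerate(beta))


def fit_by_brain_type(records: List[Tuple[float, str, float]],
                      degree: int) -> Dict[str, List[float]]:
    """Fit one polynomial per brain type from (age, brain_type, score) records."""
    grouped: Dict[str, Tuple[List[float], List[float]]] = {}
    for age, brain_type, score in records:
        ages, scores = grouped.setdefault(brain_type, ([], []))
        ages.append(age)
        scores.append(score)
    return {bt: fit_polynomial(a, s, degree) for bt, (a, s) in grouped.items()}
```

Calling `fit_by_brain_type` once per factor yields the fifteen coefficient sets (five factors by three brain types) that Eq. (2) describes.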

3.2 Rating-Based Profiler

Rating-Based Profiler is a component to create the user’s RBP based on the user’s rating data. The RBP is represented by the weighted preference score for each factor of the five-factor MUSIC model. To obtain the preference score of each factor, the average of the rating scores of the musical pieces is calculated, weighted by each piece’s factor loading value. The factor loading is the correlational relation between the musical preference and a factor; that is, the preference score is high when a user gives a high rating score to a musical piece with a high factor loading value. The factor loading value can be predicted from the acoustic features [11]. In this paper, the values obtained by factor analysis and reported in [10] are used. The calculation is as follows:

$\displaystyle p_{\textit{rating}}=p_{F}=\frac{L_{F,1}r_{1}+L_{F,2}r_{2}+\ldots+L_{F,n}r_{n}}{L_{F,1}+L_{F,2}+\ldots+L_{F,n}},\quad F\in[M,U,S,I,C]$ (3)

where $p_{\textit{rating}}$ ($=[p_{M},p_{U},p_{S},p_{I},p_{C}]$) is the RBP vector representing the weighted average preference score for each factor, $r_{i}$ is the user’s rating score of the $i$th musical piece, $L_{F,i}$ is the factor loading value of the $i$th piece for factor $F$, and $n$ is the number of pieces rated by the user.
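
Eq. (3) can be sketched in a few lines. The function name and the dictionary layout of the factor loadings are assumptions made for this example.

```python
from typing import Dict, List

FACTORS = ("M", "U", "S", "I", "C")


def rating_based_profile(ratings: List[float],
                         loadings: List[Dict[str, float]]) -> Dict[str, float]:
    """Eq. (3): loading-weighted average of the ratings, one value per factor.

    ratings[i] is the user's rating of rated piece i and loadings[i][F] is
    the factor loading of that piece on factor F. The sum of loadings for
    each factor is assumed to be non-zero.
    """
    profile = {}
    for f in FACTORS:
        weight_sum = sum(piece[f] for piece in loadings)
        weighted = sum(piece[f] * r for piece, r in zip(loadings, ratings))
        profile[f] = weighted / weight_sum
    return profile
```

For instance, with two rated pieces whose Mellow loadings are 0.8 and 0.2 and whose standardized ratings are 1.0 and -1.0, the Mellow preference score is (0.8 - 0.2) / 1.0 = 0.6: the highly loaded piece dominates the score.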

3.3 Weighted combination

In this study, the combination of FBP and RBP, in other words, hybrid-based profile (HBP) represents the five-factor MUSIC preference profile of the users in either scenario: when the users newly join the system or when the users experience the system for a certain period. The FBP represents the user’s implicit musical preference since it does not require rating data but the user’s features of age and brain type. Meanwhile, the RBP expresses the user’s explicit musical preference which is established based on rating data under the feedback to experienced musical pieces.

When a user does not provide sufficient rating data i.e., the new user case, the RBP provides a trivial contribution to HBP. Thus, the system relies on the FBP to provide initial recommendations to the user. On the other hand, after the user provides sufficient ratings for experienced pieces, the recommendations are made mostly based on the RBP rather than the FBP. To operate such a combination, the weighted combination of the FBP and the RBP is proposed for the new user’s preference profile prediction. Such a combination is shown in the following equation:

$\displaystyle{\bm{p}}=\alpha\times{\bm{p}}_{\textit{rating}}+(1-\alpha)\times{\bm{p}}_{\textit{feature}}$ (4)

where $p$ is the HBP, i.e., the combined preference profile of the FBP denoted by $p_{\textit{feature}}$ and the RBP denoted by $p_{\textit{rating}}$. Meanwhile, $\alpha$ is a weight parameter controlling the contribution of the RBP to the HBP. In the proposed method, it automatically adapts to the density level of the rating data for each user. Here, the density level is the ratio of rated musical pieces to all pieces in the system. That is, the density level is 0.2 when a user has rated only 20% of the musical pieces in the system, so that 80% are still unrated (the new user case). On the other hand, it is 0.8 when a user has already rated 80% of the pieces in the system.

In the previous study [14], a weighted linear combination was used to express the combination of FBP and RBP. In such a method, $\alpha$ is set to the density level itself, so the contribution of the RBP increases linearly as the user rates more pieces. In other words, the contributions of FBP and RBP change in strict proportion to the density level. In this study, a non-linear function, i.e., the sigmoid function, is hypothesized to describe such changes better. Accordingly, $\alpha$ is set based on a function curve that has a non-negative derivative at all points and exactly one inflection point. The sigmoidal weight parameter is set as follows:

$\displaystyle\alpha=\left\{\begin{array}{ll}0&d=0\\ \varsigma(d)&0<d<1\\ 1&d=1\end{array}\right.$ (5a)

$\displaystyle\varsigma(d)=\frac{1}{1+e^{-10(d-0.5)}}$ (5b)

where $d$ is the density level of each user. Figure 3 illustrates how the weight parameter is changed against the density level. The sigmoid curve starts increasing slowly first, then increases rapidly, and finally levels off. In this case, such changes reflect that the RBP is not reliable for initial recommendations due to small rating data, thus the contribution of the RBP is set to a smaller value. After sufficient rating data is obtained, the contribution of RBP gradually becomes large.
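
The weighting in Eqs. (4), (5a), and (5b) can be sketched as below. The function names and the dictionary representation of the profiles are assumptions for illustration.

```python
import math
from typing import Dict


def sigmoid_weight(d: float) -> float:
    """Eqs. (5a)/(5b): weight of the RBP for a density level d in [0, 1]."""
    if d <= 0.0:
        return 0.0
    if d >= 1.0:
        return 1.0
    return 1.0 / (1.0 + math.exp(-10.0 * (d - 0.5)))


def hybrid_profile(rbp: Dict[str, float],
                   fbp: Dict[str, float],
                   d: float) -> Dict[str, float]:
    """Eq. (4): p = alpha * p_rating + (1 - alpha) * p_feature, per factor."""
    alpha = sigmoid_weight(d)
    return {f: alpha * rbp[f] + (1.0 - alpha) * fbp[f] for f in fbp}
```

At $d=0.2$ the sigmoidal weight is $1/(1+e^{3})\approx 0.047$, whereas the linear scheme would give 0.2, so the sparse RBP is discounted much more strongly for a new user. At $d=0.8$ the weight is about 0.953, compared with 0.8 in the linear case.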

Figure 3.

Weight parameter against density level.

3.4 Rating predictor

Rating Predictor is a component to predict the rating scores for unrated musical pieces based on the obtained HBP. The five-factor musical preference profile is the weighted preference score for each factor. Also, the factor loading represents which factors have a strong influence on musical preference. The preference scores of factors with a high factor loading value strongly affect the rating score of the musical piece. Therefore, the predicted rating score $\hat{r}_{i}$ for the $i$th musical piece is calculated based on a proposed equation as follows:

$\displaystyle\hat{r}_{i}=$ (6)

where $p_{F}$ ($F\in[M,U,S,I,C]$) is the preference profile value for each factor of the five-factor MUSIC model, and $L_{F,i}$ is the factor loading value of the $i$th piece for each factor. The factor loading value can be detected from the audio data itself using the method proposed in [11]; thus, this equation does not rely on other users’ or other musical pieces’ information. This calculation is used to predict the rating scores for unrated musical pieces, and the recommended pieces are then determined based on the predicted rating scores.
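
As an illustration of how a predictor of this kind could combine the two quantities, the sketch below assumes a loading-weighted average of the profile values, mirroring the weighting in Eq. (3). This particular form, the function name `predict_rating`, and the dictionary layout are assumptions made for the example and are not taken from Eq. (6).

```python
from typing import Dict

FACTORS = ("M", "U", "S", "I", "C")


def predict_rating(profile: Dict[str, float],
                   piece_loadings: Dict[str, float]) -> float:
    """Assumed predictor: average of p_F weighted by the piece's loadings L_F.

    Factors on which the piece loads highly contribute most to the
    predicted rating. The sum of loadings is assumed to be non-zero.
    """
    weighted = sum(piece_loadings[f] * profile[f] for f in FACTORS)
    weight_sum = sum(piece_loadings[f] for f in FACTORS)
    return weighted / weight_sum
```

Ranking the unrated pieces by the returned value then gives a candidate recommendation list that uses only the user’s own profile and each piece’s own loadings.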

4. Performance evaluation

In this section, the effectiveness of the proposed framework is evaluated on the dataset [10], based on the following points:

• Evaluation of the performance of Feature-Based Profiler. This evaluation is to confirm the efficacy of the use of both age and brain type for the prediction of FBP of the user whose rating information is still insufficient.
• Evaluation of using HBP as a non-linear combination of FBP and RBP.
• Evaluation of Rating Predictor. Its performance in predicting the user’s rating is compared with that of a genre-based method.

In the following subsections, the dataset used for the evaluation is described first. Then, the evaluation metrics used to measure the performance are described. Finally, the performance for each point is presented.

4.1 Dataset

A dataset collected in [13] is leveraged for this evaluation. To obtain the dataset, the authors of [13] asked 353 participants to complete all measures of the survey. The sample comprised 220 females and 133 males, ranging in age from 18 to 68 (mean age $=$ 34.10). The participants completed the 60-item EQ and the 75-item SQ-R. They were also required to rate the 25 musical pieces used in [9, 10] on a nine-point scale from 1 (extremely dislike) to 9 (extremely like). Factor loading values of the five-factor MUSIC model for each piece are reported in [10].

From the dataset, three participants who gave the same rating to all musical pieces are removed as outliers. The criteria for the 9-point subjective rating scale differ between participants. Therefore, in order to avoid bias in the rating scores, the rating scores are standardized for each participant. After that, based on all rating scores for the 25 musical pieces, the preference scores of each participant are calculated by using Eq. (3). Such preference scores are treated as the ground truth for assessing the performance of the proposed profiler. In addition, 26 genres [10] are associated with each of the 25 pieces used in the dataset.
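
The per-participant standardization can be sketched as a z-score transform. The function name and the choice of the population standard deviation are assumptions; the text does not state which estimator is used.

```python
from statistics import mean, pstdev
from typing import List


def standardize(ratings: List[float]) -> List[float]:
    """Z-score one participant's ratings (mean 0, unit population std).

    Raises ValueError when all ratings are identical, because the
    standard deviation is then zero and the z-score is undefined.
    """
    mu = mean(ratings)
    sigma = pstdev(ratings)
    if sigma == 0:
        raise ValueError("constant ratings cannot be standardized")
    return [(r - mu) / sigma for r in ratings]
```

The zero-variance case shows why participants who gave the same rating to every piece cannot be kept: their ratings carry no information after standardization.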

4.2 Evaluation metric

In this paper, Root Mean Squared Error (RMSE), Pearson Correlation Coefficient (PCC), and Area Under the Curve (AUC) are used as evaluation metrics. RMSE measures the prediction accuracy between observed values $y$ and predicted values $\hat{y}$, and is calculated as follows:

$\displaystyle\textit{RMSE}=\sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_{i}-\hat{y}_{i})^{2}}$ (7)

where $n$ is the number of users in the dataset. For this metric, a lower value illustrates a better result. In this evaluation, this metric is used to measure the performance of Feature-Based Profiler.

PCC is considered for measuring the relation between observed and predicted value, as the following equation:

$\displaystyle\textit{PCC}=\frac{\sum_{i=1}^{n}(y_{i}-\bar{y})(\hat{y}_{i}-\bar{\hat{y}})}{\sqrt{\sum_{i=1}^{n}(y_{i}-\bar{y})^{2}}\sqrt{\sum_{i=1}^{n}(\hat{y}_{i}-\bar{\hat{y}})^{2}}}$ (8)

where $\bar{y}$ and $\bar{\hat{y}}$ represent the average values of $y$ and $\hat{y}$, respectively. PCC ranges from $-1$ to 1, and a higher value is better. This value will be 0 for random prediction. In the evaluation, the important point is to obtain the same tendency between the observed and predicted values rather than to predict exactly the same value as the observed one. This is because the recommended musical pieces will be determined based on the tendency of the ratings.

AUC is the definite integral of a curve that describes the variation as a function of time. It is useful when trying to determine the total performance across time. In this evaluation, AUC is calculated for PCC against the density level. Here, the density level, which is the ratio of rated musical pieces to all pieces, represents the time information. PCC is calculated for each density level. AUC ranges from 0 to 1, and higher is better. In addition, this value will be 0.5 for random prediction.
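
The three metrics can be sketched as follows. RMSE and PCC follow Eqs. (7) and (8) directly; the trapezoidal rule used for the area is an assumption, since the integration scheme is not specified in the text, and the function names are hypothetical.

```python
import math
from typing import Sequence


def rmse(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """Eq. (7): root mean squared error between observed and predicted values."""
    n = len(y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / n)


def pcc(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """Eq. (8): Pearson correlation coefficient.

    Undefined (ZeroDivisionError) when either sequence is constant.
    """
    n = len(y)
    y_bar = sum(y) / n
    h_bar = sum(y_hat) / n
    cov = sum((a - y_bar) * (b - h_bar) for a, b in zip(y, y_hat))
    var_y = sum((a - y_bar) ** 2 for a in y)
    var_h = sum((b - h_bar) ** 2 for b in y_hat)
    return cov / (math.sqrt(var_y) * math.sqrt(var_h))


def area_under_curve(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Area under the piecewise-linear curve through (xs[i], ys[i]).

    Uses the trapezoidal rule; xs (e.g., density levels) must be sorted.
    """
    return sum((xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2.0
               for i in range(len(xs) - 1))
```

With density levels on the horizontal axis, `area_under_curve` summarizes a whole PCC curve in one number, which is how the different profiles are compared across density levels.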

4.3 Performance of Feature-Based Profiler

To evaluate the performance of Feature-Based Profiler, firstly, polynomial regression models based on the proposed user’s features are built using the dataset. The dataset is divided into 90% of training data and 10% of testing data with 10-fold cross-validation to determine the degree of the polynomial for each factor and brain type. The degree with the lowest RMSE is chosen for each model.

Figure 4.

Polynomial regression curve of preference score for each factor and brain type.

Figure 4 illustrates the polynomial regression curves against age for each factor and brain type. The plots on the graphs are mean preference scores for each age group (15–25, 25–35, 35–45, 45–55, and 55–65 years old). The graphs indicate that the age trends of all factors differ by brain type. In general, for type E, the Unpretentious preference increases linearly with age, whereas the Contemporary preference hardly changes. There are preference peaks for the Mellow, Sophisticated, and Intense factors. For type B, the Mellow, Sophisticated, and Contemporary preferences decrease linearly, whereas the preferences for the Unpretentious and Intense factors increase. Also, for type S, the Mellow, Sophisticated, and Contemporary preferences decrease overall, whereas the Unpretentious preference increases with age. The preference for the Intense factor hardly changes. This result shows that the trends of preference scores can be expressed in more detail by using both age and brain type.

To verify the effectiveness of using both age and brain type for the prediction of FBP, two evaluation metrics were considered: RMSE and PCC. Three strategies were evaluated: using only brain type, using only age, and using both age and brain type as the new user’s registration information. Tables 2 and 3 summarize the prediction performance of the regression models for each of the three strategies. The use of both age and brain type produces better prediction performance across all factors. In terms of the correlation coefficient, there are slightly positive correlations between observed and predicted preference scores. Although these values are not high, there is no rating data in the new user case, that is, the RBP cannot be created. Therefore, these positive correlations indicate that, when both age and brain type are used, the predicted FBP is potentially effective for making the initial recommendations.

Table 2

RMSE of feature-based profile prediction

Feature	M	U	S	I	C
BT	0.1931	0.2392	0.2383	0.3119	0.3215
Age	0.1918	0.2372	0.2369	0.3142	0.3217
Age&BT	0.1911	0.2352	0.2326	0.3055	0.3171

Table 3

PCC of feature-based profile prediction

Feature	M	U	S	I	C
BT	0.0417	0.0819	0.1727	0.1962	0.1398
Age	0.1218	0.1494	0.2022	0.1575	0.1358
Age&BT	0.1472	0.2001	0.2751	0.2787	0.2149

4.4 Performance of weighted non-linear combination

There are two main points to be confirmed in this evaluation: (1) the superiority of the combination of FBP and RBP, in other words, HBP, over either FBP or RBP alone, and (2) the superiority of the non-linear HBP over the linear HBP. In this evaluation, PCC is used as the evaluation metric and is calculated for each density level. The density level is varied from 0 to 1 with a step of 0.1 to simulate the increasing number of musical pieces experienced by the user from the time the user newly joins the system. In other words, the more musical pieces are rated by the user, the higher the density level is. In practice, the rating scores of musical pieces for each density level are randomly sampled from each user to create the RBP. Then, the RBP is combined with the FBP to create the HBP with the linear and non-linear methods. Finally, the PCC is calculated between the ground truth and the predicted preference scores for each density level. This operation is performed 100 times for each density level to avoid data bias due to the sampled rating scores. In this paper, the averaged PCC is reported.
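
The repeated-sampling step of this protocol can be sketched as below. The function name, the `score_fn` callback (standing in for "build the profile from the sampled pieces and compute PCC"), the fixed seed, and the rounding of the number of sampled pieces are all assumptions for illustration.

```python
import random
from typing import Callable, List


def average_over_trials(n_pieces: int,
                        density: float,
                        score_fn: Callable[[List[int]], float],
                        trials: int = 100,
                        seed: int = 0) -> float:
    """Average score_fn over random subsets of rated pieces.

    For each trial, round(density * n_pieces) piece indices are drawn
    without replacement and passed to score_fn as the 'rated' pieces.
    """
    rng = random.Random(seed)
    k = round(density * n_pieces)
    total = 0.0
    for _ in range(trials):
        rated = rng.sample(range(n_pieces), k)
        total += score_fn(rated)
    return total / trials
```

Evaluating this function at each density level from 0 to 1 in steps of 0.1 yields the curve whose area is later summarized by the AUC.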

Figure 5.

Performance of profile prediction.

Figure 5 illustrates the prediction performance of the linear and sigmoidal HBPs over different density levels compared to the RBP. The dashed line represents random prediction, and the FBP takes a constant value since it does not depend on the density level. Overall, both HBPs consistently produce better predictions than random prediction and the RBP. In terms of AUC, the linear HBP and the sigmoidal HBP achieve 0.690 and 0.704, respectively, whereas the RBP achieves 0.573. Thus, the HBPs outperform the RBP throughout.
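
The text does not state how the AUC is computed. One plausible reading, assumed here, is the area under the PCC-versus-density curve, approximated with the trapezoidal rule. The sketch below follows that assumption; the function name and the toy curve are made up for illustration.

```python
def auc_trapezoid(x, y):
    """Area under a piecewise-linear curve through the points (x[i], y[i])."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    area = 0.0
    for i in range(1, len(x)):
        # Trapezoid between consecutive points.
        area += 0.5 * (y[i] + y[i - 1]) * (x[i] - x[i - 1])
    return area

# Density levels 0, 0.1, ..., 1 and a made-up PCC curve (not the paper's data).
density = [i / 10 for i in range(11)]
toy_pcc = [d ** 0.5 for d in density]
print("AUC:", auc_trapezoid(density, toy_pcc))
```

Under this reading, a single AUC value summarizes performance over the whole range of density levels, so a method that is better only at low densities can still obtain a higher AUC.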

The RBP gradually produces better predictions once the density level exceeds 0.2, and it outperforms the FBP at a density level of 0.4. The HBPs are created by combining the FBP and the RBP according to the density level, so that the FBP contributes more at low density levels. Therefore, combining the FBP and RBP can produce better predictions across the density levels. In addition, the sigmoidal HBP outperforms the linear one, especially in the new user case. This is likely because the sigmoid curve more accurately represents how the contribution of each profile to the HBP should change with the density level.
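
The difference between the two combination schemes can be sketched as follows. The exact weighting function is defined in Section 3.3 and is not reproduced here; this sketch assumes a convex combination HBP = (1 - w) FBP + w RBP, with w either equal to the density level (linear) or a logistic function of it (sigmoidal). The `steepness` and `midpoint` parameters, and the toy profiles, are hypothetical.

```python
import math

def linear_weight(density):
    """Weight of the RBP grows in proportion to the density level."""
    return min(max(density, 0.0), 1.0)

def sigmoid_weight(density, steepness=10.0, midpoint=0.5):
    """Weight of the RBP follows a logistic curve (parameters are assumptions)."""
    return 1.0 / (1.0 + math.exp(-steepness * (density - midpoint)))

def hybrid_profile(fbp, rbp, weight):
    """Convex combination of the feature-based and rating-based profiles."""
    return [(1.0 - weight) * f + weight * r for f, r in zip(fbp, rbp)]

# Toy five-factor profiles (M, U, S, I, C); values are made up.
fbp = [0.6, 0.4, 0.5, 0.3, 0.7]
rbp = [0.9, 0.1, 0.2, 0.8, 0.4]

for d in (0.1, 0.5, 0.9):
    lin = hybrid_profile(fbp, rbp, linear_weight(d))
    sig = hybrid_profile(fbp, rbp, sigmoid_weight(d))
    print(f"density={d:.1f}  linear={lin}  sigmoid={sig}")
```

With these assumed parameters, the sigmoidal weight stays close to zero at low density levels, so the hybrid profile leans on the FBP for longer than the linear scheme does.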

4.5 Performance of rating predictor

To evaluate the proposed method as a whole, the performance of rating prediction is compared across density levels. The evaluation is performed under the same conditions as the evaluation in Section 4.4. After a preference profile is created from the rating data sampled from the dataset, the rating scores for unrated musical pieces are predicted. Then, the PCC is calculated over the combined sampled and predicted rating scores. In this paper, the proposed sigmoidal HBP is compared with the traditional genre-based method used in [11] as a baseline. The genre-based rating prediction serves as a traditional content-based method. The genres are the 26 categories used in [10]. The average rating score for each genre is used as the profile, and the rating scores for unrated pieces are predicted from it.
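
The genre-based baseline described above can be sketched as follows. This is a minimal illustration of the idea of using per-genre mean ratings as the profile; the function names, the genre labels, and the toy ratings are assumptions, and the behavior for an unseen genre (returning a caller-supplied fallback, `None` by default) is a choice made for this sketch.

```python
from collections import defaultdict

def genre_profile(rated_pieces):
    """Build a profile mapping each genre to the user's mean rating for it.

    rated_pieces: iterable of (genre, rating) pairs.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for genre, rating in rated_pieces:
        totals[genre] += rating
        counts[genre] += 1
    return {genre: totals[genre] / counts[genre] for genre in totals}

def predict_rating(profile, genre, fallback=None):
    """Predict the rating of an unrated piece from its genre's mean rating."""
    # If the user has never rated this genre, the profile carries no information.
    return profile.get(genre, fallback)

# Toy ratings (made up for illustration).
rated = [("rock", 4), ("rock", 2), ("jazz", 5)]
profile = genre_profile(rated)
print(predict_rating(profile, "rock"))
print(predict_rating(profile, "pop"))
```

The second call illustrates the weakness of this baseline at low density levels: a genre the user has not rated yields no usable prediction.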

Figure 6.

Performance of rating prediction.

Figure 6 illustrates the rating prediction performance of each method across the density levels. The results indicate that the proposed HBP outperforms the baseline method overall. In terms of AUC, the proposed HBP obtains 0.660, whereas the genre-based prediction obtains 0.576. As shown in Fig. 5, the sigmoidal HBP, which combines the FBP and RBP with a non-linear weight, achieves better results than the other profiles. In addition, the genre-based method depends on whether the user has rated musical pieces belonging to a specific genre. For example, if a user has never rated a musical piece that belongs to the rock genre, the system cannot tell whether he/she likes rock music. On the other hand, the proposed five-factor preference profile does not depend on any specific musical piece because every musical piece has a loading on each factor. Therefore, the proposed rating prediction method based on the sigmoidal HBP produces better results than the genre-based method.

5. Discussion

The evaluation results above show that the proposed method outperforms the baseline method, especially in the new user case. This suggests that the proposed prediction framework can provide better rating predictions regardless of how much rating data a user has provided. The following subsections discuss in detail the effect of age and brain type on the five-factor musical preference, and the effect of the non-linear combination of FBP and RBP on prediction in the new user case.

5.1 Effect of age and brain type for the five-factor musical preference

Music plays an important role in helping young people explore their identities and form relationships with their friends [22, 23], whereas adults listen to music as relaxation and entertainment for emotion regulation [12]. According to [12], adolescents, who struggle to establish a sense of independence and autonomy, find the rebellious connotations of the Intense factor appealing. The Mellow and Contemporary factors are most popular among people in early adulthood, who invest time and effort in developing intimate bonds of love [24]. This is because these factors of music reinforce desires for intimacy and also complement the settings where young people come together with the goal of establishing close relationships. The Unpretentious factor, which has relaxing and familial themes, and the Sophisticated factor, which reflects aesthetic qualities, are attractive to people in middle adulthood. This is because they are at a life stage that is focused on family life and are also preoccupied with the challenge of establishing social status and career success [25].

According to [13], people prefer music that reflects their empathizing and systemizing tendencies. Because type E has a tendency to perceive and react to the emotional and mental states of others, they prefer music that reflects emotional depth. Elements of emotional depth are often heard in the Mellow and Unpretentious music. On the other hand, because type S has a tendency to construct and analyze systems, they would prefer music that contains intricate patterns and structures. These elements are often heard in Sophisticated music. Also, since they have lower levels of empathy, they are likely to prefer Intense music.

Figure 4 provides further insights by combining age and brain type. The preference score for the Mellow factor peaks in early adulthood (the 20s and 30s) for all brain types and then decreases with age. Whereas the scores for types B and S decrease linearly, those for type E change drastically with age. This is because type E, who perceives and reacts to the emotional and mental states of others, responds relatively strongly to Mellow music, which is related to building intimate relations with peers. Preference scores for the Unpretentious factor increase with age overall. Although type S prefers Unpretentious music less than the other types do, its preference scores are relatively high in the late 20s and 30s. This suggests that even type S prefers Unpretentious music at the life stage where family life is the focus. Sophisticated music, which includes complex patterns and structures, is preferred by type S at all ages. For type E, on the other hand, the preference scores are relatively high in middle adulthood (the 40s), when people are preoccupied with the challenge of establishing social status and career success. Regarding the Intense factor, the preference scores of types B and S hardly change with age. Intense music is preferred by adolescents, who are mentally unstable. Therefore, the preference of type E, who tends to respond emotionally and physiologically to music, likely changes with age. Finally, preference scores for the Contemporary factor decrease with age overall. For type E, however, they hardly change with age. Contemporary music, like Mellow music, is related to building intimate relations with peers. In addition, type E and type S have contrasting properties. Accordingly, type E for the Mellow factor and type S for the Contemporary factor show symmetrical tendencies. These findings indicate that using both age and brain type makes it possible to express preference scores more accurately than either feature alone.

The proposed user features have some limitations that should be addressed in future work. First, several regression models, namely those for type B of the Mellow factor and type E of the Contemporary factor, do not fit the mean scores well. Second, the preferences for types B and S of the Intense factor and for type E of the Contemporary factor show almost no change with age. To predict those preferences accurately, other types of human characteristics [8] should be considered as user features.

5.2 Effect of non-linear combination for hybrid-based profile

In the HBP (as described in Section 3.3), the sigmoid function is applied to combine the FBP with the RBP, and the sigmoidal HBP outperforms the linear one in the new user case (as illustrated in Fig. 5). This suggests that the way the system learns the user’s preference from ratings for musical pieces follows a non-linear relation.

In the sigmoid function curve, the first phase is slow and painstaking. The second phase is a nearly exponential growth associated with simpler parts. The inflection point marks where the growth begins to slow down. Finally, the curve enters the asymptotic stage and approaches the system-specific maximum. In the case of recommender systems, the initial learning from rating data (the new user case) is slow because the system does not know which of a huge number of musical pieces the user prefers, that is, the RBP is unreliable. The second phase is an exponential growth associated with learning based on the musical pieces that have been rated. After the inflection point, learning slows down as the number of rated musical pieces increases. This is because the progress of learning has made the RBP sufficiently reliable.
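
The three phases described above can be seen numerically from the slope of a logistic curve. The sketch below is an illustration only: the logistic form and the `steepness` and `midpoint` values are assumptions and are not the parameters used in this study.

```python
import math

def logistic(x, steepness=10.0, midpoint=0.5):
    """Logistic (sigmoid) curve; the parameter values are assumptions."""
    return 1.0 / (1.0 + math.exp(-steepness * (x - midpoint)))

def logistic_slope(x, steepness=10.0, midpoint=0.5):
    """Derivative of the logistic curve: k * s * (1 - s)."""
    s = logistic(x, steepness, midpoint)
    return steepness * s * (1.0 - s)

# Slope is small at first, largest at the inflection point, then small again.
for d in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"density={d:.2f}  weight={logistic(d):.3f}  slope={logistic_slope(d):.3f}")
```

The slope is largest at the midpoint, which corresponds to the inflection point, and small near both ends, matching the slow start and the asymptotic stage.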

These insights suggest that the sigmoidal HBP can learn the user’s preferences and create more accurate preference profiles even when the user’s rating data is scarce. This in turn means that the sigmoidal HBP can potentially provide higher-quality recommendations in the new user case.

6. Conclusion

In this paper, a method to predict users’ rating scores for musical pieces based on the five-factor musical preference is proposed to alleviate the new user cold-start problem in content-based music recommender systems. The preference profile is represented by the so-called five-factor MUSIC model. The results showed that the combination of age and brain type enables the prediction model to yield better predictions of the new user’s preference profile. In addition, the weighted non-linear combination of feature-based and rating-based profiles allows content-based recommender systems to produce better rating score predictions, especially in the new user case where the user’s rating data is unavailable or insufficient. These results indicate that the proposed method can potentially provide high-quality recommendations.

In the future, user features will be further investigated to predict more precise musical preference profiles in the feature-based profiler. In addition, the proposed method will be compared with other existing methods that address the new user cold-start problem.

Acknowledgments

The authors would like to thank the IISA2021 general chairs for the invitation to the special issue of the Intelligent Decision Technologies Journal.

References

1. Gope, Jain. A survey on solving cold start problem in recommender systems. In 2017 International Conference on Computing, Communication and Automation (ICCCA), 2017 May 5, IEEE, pp. 133-138.

2. Celma Herrada. Music recommendation and discovery in the long tail. Universitat Pompeu Fabra, 2009 Feb 16.

3. Schafer, Frankowski, Herlocker, Sen. Collaborative filtering recommender systems. In The adaptive web 2007, Springer, Berlin, Heidelberg, pp. 291-324.

4. Pazzani, Billsus. Content-based recommendation systems. In The adaptive web 2007, Springer, Berlin, Heidelberg, pp. 325-341.

5. Schedl, Zamani, Chen, Deldjoo, Elahi. Current challenges and visions in music recommender systems research. International Journal of Multimedia Information Retrieval. 2018 Jun; 7(2): 95-116.

6. Rentfrow, Gosling. The do re mi’s of everyday life: the structure and personality correlates of music preferences. Journal of Personality and Social Psychology. 2003 Jun; 84(6): 1236.

7. Interview. Jongeren’99, een generatie waar om gevochten wordt.

8. Laplante. Improving music recommender systems: What can we learn from research on music tastes? In ISMIR, 2014 Oct, pp. 451-456.

9. Rentfrow, Goldberg, Levitin. The structure of musical preferences: a five-factor model. Journal of Personality and Social Psychology. 2011 Jun; 100(6): 1139.

10. Rentfrow, Goldberg, Stillwell, Kosinski, Gosling, Levitin. The song remains the same: A replication and extension of the MUSIC model. Music Perception. 2012 Dec 1; 30(2): 161-85.

11. Soleymani, Aljanaki, Wiering, Veltkamp. Content-based music recommendation using underlying music preference structure. In 2015 IEEE International Conference on Multimedia and Expo (ICME), 2015 Jun 29, IEEE, pp. 1-6.

12. Bonneville-Roussy, Rentfrow, Potter. Music through the ages: Trends in musical engagement and preferences from adolescence through middle adulthood. Journal of Personality and Social Psychology. 2013 Oct; 105(4): 703.

13. Greenberg, Baron-Cohen, Stillwell, Kosinski, Rentfrow. Musical preferences are linked to cognitive styles. PloS One. 2015 Jul 22; 10(7): e0131151.

14. Okada, Tan, Kamioka. Five-Factor Musical Preference Prediction for Solving New User Cold-Start Problem in Content-Based Music Recommender System. In 2021 12th International Conference on Information, Intelligence, Systems & Applications (IISA), 2021 Jul 12, IEEE, pp. 1-7.

15. Chou, Yang, Jang. Conditional preference nets for user and item cold start problems in music recommendation. In 2017 IEEE International Conference on Multimedia and Expo (ICME), 2017 Jul 10, IEEE, pp. 1147-1152.

16. Haro, Xambó, Fuhrmann, Bogdanov, Gómez, Herrera. The musical avatar: A visualization of musical preferences by means of audio content description. In Proceedings of the 5th Audio Mostly Conference: A Conference on Interaction with Sound, 2010 Sep 15, pp. 1-8.

17. Hoashi, Matsumoto, Inoue. Personalization of user profiles for content-based music retrieval based on relevance feedback. In Proceedings of the eleventh ACM international conference on Multimedia, 2003 Nov 2, pp. 110-119.

18. Ferrer, Eerola, Vuoskoski. Enhancing genre-based measures of music preference by user-defined liking and social tags. Psychology of Music. 2013 Jul; 41(4): 499-518.

19. Baron-Cohen. Autism: the empathizing-systemizing (ES) theory. Annals of the New York Academy of Sciences. 2009 Mar 1; 1156(1): 68-80.

20. Wheelwright, Baron-Cohen, Goldenfeld, Delaney, Fine, Smith, Weil, Wakabayashi. Predicting autism spectrum quotient (AQ) from the systemizing quotient-revised (SQ-R) and empathy quotient (EQ). Brain Research. 2006 Mar 24; 1079(1): 47-56.

21. Baron-Cohen, Wheelwright. The empathy quotient: an investigation of adults with Asperger syndrome or high functioning autism, and normal sex differences. Journal of Autism and Developmental Disorders. 2004 Apr; 34(2): 163-75.

22. Delsing, Ter Bogt, Engels, Meeus. Adolescents’ music preferences and personality characteristics. European Journal of Personality: Published for the European Association of Personality Psychology. 2008 Mar; 22(2): 109-30.

23. Ter Bogt, Keijsers, Meeus. Early adolescent music preferences and minor delinquency. Pediatrics. 2013 Feb 1; 131(2): e380-9.

24. Carstensen. Social and emotional patterns in adulthood: support for socioemotional selectivity theory. Psychology and Aging. 1992 Sep; 7(3): 331.

25. Hogan, Roberts. A socioanalytic model of maturity. Journal of Career Assessment. 2004 May; 12(2): 207-17.