Abstract
Neighbourhood-based collaborative filtering (CF) methods typically rely only on user rating information for similarity calculation, without considering linguistic concepts (terms) that reflect users’ fuzzy preferences. However, in real-world decision-making processes, users often prefer to express their preferences for items linguistically rather than numerically. Inspired by this, we propose a probabilistic linguistic term set–based item similarity method that transforms absolute ratings into linguistic terms to capture the degree of importance users place on explicit aspects and opinions. Furthermore, we take into account the positive impact of users’ preference consistency towards items on similarity results and introduce a Bhattacharyya coefficient–based item tendency to adjust semantic similarities, enhancing the reliability of predictions. In addition, we account for the asymmetric relation between items when selecting appropriate neighbours to optimise rating predictions. Experiments on two benchmark data sets indicate that our method outperforms existing similarity methods across various evaluation metrics. Specifically, compared with the state-of-the-art intuitionistic fuzzy set–based hybrid similarity model (IFS-HSM), the proposed model improves performance by at least 2.1% and 1.9% in terms of mean absolute error (MAE) and F1, respectively. Moreover, our approach provides a new insight for measuring similarity between items from both qualitative and quantitative perspectives.
1. Introduction
Decision-making is a common and essential activity in various aspects of life, involving ranking alternatives or selecting the most favourable options based on specific decision criteria [1]. However, as decision-making becomes more complex, individuals find it challenging to make reliable choices solely based on their own experiences. This challenge is particularly evident in digital content platforms like Yahoo Music (YM), Netflix and YouTube, where online users face a vast array of choices in songs, movies and videos, respectively [2–4]. Consequently, recommender systems (RSs) have emerged to assist individuals in making decisions by suggesting the most suitable products or services from overwhelming information sources [5,6]. An effective RS can significantly enhance user satisfaction, increase platform loyalty, attract new users and boost sales.
Collaborative filtering (CF) [7] is an efficient method used to filter out irrelevant information and address the issue of information overload. It is widely employed in both academia and industry for generating personalised recommendations. Among CF methods, the neighbourhood-based CF, also known as the K-nearest neighbour (KNN) method, has emerged as the most successful and widely applied approach on various online platforms [8]. The KNN method’s working principle is based on the assumption that if an item is favoured by a group of neighbourhood users similar to a target user, then the target user might also be interested in this item [9].
The reliability and accuracy of neighbourhood-based CF largely depend on the similarity calculation between items [10]. Classic similarity methods such as the Pearson correlation coefficient (PCC) and cosine similarity (COS) primarily rely on co-rated items for similarity evaluation and do not fully utilise all rating information [9,11]. As a result, inaccurate computations can arise in sparse data sets due to a lack of co-rated items. In addition, the influence of other non-co-rated items, which can provide valuable information such as user preferences and item tendency, cannot be neglected in measuring similarity [12]. Recent studies have indicated that fully leveraging all ratings in a given matrix significantly improves the performance of RSs [13,14]. However, previous methods have focused solely on rating probability distributions, neglecting the impact of item tendency (i.e. the numbers of positive and negative ratings on an item) in similarity calculations. Furthermore, they have not considered the interaction effect between a pair of items or users when selecting appropriate neighbours, which includes the influence of co-rated items and asymmetric relations.
In real-world decision-making processes, individuals often prefer to express their opinions or preferences linguistically due to the higher cost of obtaining precise quantitative information [15–17]. However, most RSs require online users to express their preferences through rating values, which may not entirely reflect users’ true preferences due to the complexity and subjectivity of user behaviours. Previous CF studies have mainly focused on rating-based similarity calculations, overlooking the significance of semantic information in reflecting users’ importance placed on specific aspects and opinions during decision-making.
Based on the background presented above, this study aims to address the following research questions:
RQ1. Can item-item similarity be measured from both qualitative and quantitative perspectives?
RQ2. Can the reliability of similarity calculations be further enhanced by introducing the influence of item tendency in a similarity model?
RQ3. Can the accuracy of rating predictions be improved by considering the asymmetry of the similarity model when selecting appropriate neighbours for target items?
To answer RQ1, we propose using probabilistic linguistic term sets (PLTSs) in CF to evaluate item-item similarity by transforming numerical information into linguistic information. This approach takes full advantage of all user information, avoids information loss and allows for the expression of user preferences with variable importance. For RQ2, we introduce a Bhattacharyya coefficient–based item tendency (BCIT) to adjust semantic similarity results, thereby improving the reliability of similarity calculations. Finally, for RQ3, we present an improved neighbourhood selection method (INSM) that considers the interaction effect between items by accounting for the asymmetric relation between items and the importance of co-rated items. This method facilitates the selection of more appropriate neighbours, leading to optimised prediction results compared with the traditional selection method.
The main contributions of this study are as follows:
Introduction of the concept and definitions of PLTSs in CF to calculate item-item similarity based on linguistic information, effectively reflecting users’ preferences during the decision-making process.
Proposal of the BCIT method, which demonstrates the consistency of users’ preferences towards items and improves the reliability of similarity calculations.
Design of the INSM, a neighbourhood selection method that emphasises the interaction between items and considers asymmetric relations, leading to more accurate neighbour selection and improved prediction results compared with traditional methods.
Empirical experiments on popular data sets to verify the superiority of our method over other similarity measures in various accuracy metrics.
The rest of this article is organised as follows. Section 2 reviews related works on neighbourhood-based CF, linguistic term sets (LTSs) and PLTSs. In section 3, we present the similarity decision model and the INSM. Section 4 discusses and analyses the experimental results, and section 5 provides conclusions and directions for future work.
2. Related work
In this section, neighbourhood-based CF is first discussed. Then, we introduce some basic definitions related to LTSs and PLTSs, which are used to reflect the decision-makers’ (DMs) opinions in group decision-making. Finally, we elaborate on the use of PLTSs in RSs.
2.1. Neighbourhood-based CF
The KNN method is one of the most common forms of CF and involves four main steps as follows: (1) construction of a user-item rating matrix [rui]M×N, where M and N denote the number of users and items, respectively, and rui is a rating of an item i made by a user u; (2) calculation of item–item or user–user similarities using similarity measures; (3) selection of the top K neighbours according to the similarity values arranged in descending order and (4) prediction of ratings on unrated items and recommendation of the n items with the greatest predictions as a recommendation list. The PCC and COS methods and their variants are the most common similarity measures in neighbourhood-based CF [18]. These methods generate good recommendation results in a dense data set, as there are sufficient co-rated items. However, in a sparse data set, these methods usually have unsatisfactory performance in terms of accuracy of RSs, due to the rarity or absence of co-rated items.
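As a rough illustration of steps (2) and (3), the sketch below computes cosine similarity between two items over the users who rated both and then keeps the top K neighbours. The dictionary layout `ratings[item][user]` and the function names are assumptions made for this example and are not taken from any cited method. A pair of items with no common raters receives no similarity at all, which is the sparsity problem noted above.

```python
import math

def cosine_on_corated(ratings, i, j):
    """Cosine similarity of items i and j over users who rated both.

    Returns None when no user rated both items (similarity undefined).
    """
    common = ratings[i].keys() & ratings[j].keys()
    if not common:
        return None
    dot = sum(ratings[i][u] * ratings[j][u] for u in common)
    norm_i = math.sqrt(sum(ratings[i][u] ** 2 for u in common))
    norm_j = math.sqrt(sum(ratings[j][u] ** 2 for u in common))
    return dot / (norm_i * norm_j)

def top_k_neighbours(ratings, target, k):
    """Rank all other items by similarity to `target` and keep the best k."""
    scored = []
    for other in ratings:
        if other == target:
            continue
        s = cosine_on_corated(ratings, target, other)
        if s is not None:
            scored.append((other, s))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy data (hypothetical): item -> {user: rating on a 1-5 scale}
ratings = {
    "i1": {"u1": 5, "u2": 3},
    "i2": {"u1": 4, "u2": 2},
    "i3": {"u3": 5},  # shares no user with i1
}
print(top_k_neighbours(ratings, "i1", k=2))
```

Here item `i3` never appears among the neighbours of `i1`, regardless of how its ratings are distributed, because the two items have no rater in common.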
To effectively improve the accuracy and credibility of RSs, various similarity methods have been proposed to address the issue of data sparsity. For instance, Liu et al. [19] utilised a heuristic similarity model with nonlinear functions to calculate the similarity between users, considering rating preferences and the influences of co-rated items compared with the Proximity–Impact–Popularity (PIP) method [20]. Fu et al. [21] incorporated the natural mechanism of attention in a neighbourhood-based CF method to establish a highly accurate and rational RS. However, the reliance on co-rated items and the neglect of the effect of other non-co-rated items still limit the performance of these methods. To overcome these limitations, Patra et al. [9] introduced a Bhattacharyya coefficient-based collaborative filtering (BCF) method for sparse data, fully utilising all ratings made by two users and considering the effect of items. Deng et al. [13] proposed a K-medoids recommendation based on probability distribution to achieve good recommendations and alleviate the sparsity problem within an acceptable runtime. Singh et al. [22] modified the Bhattacharyya coefficient and proposed an improved item-based CF method to investigate the behaviour of similarity measures in different rating patterns. Wang et al. [14] introduced a similarity measure based on α-divergence to reduce the dependence on co-rated cases from the perspective of the probability density distribution of ratings. Guo et al. [23] designed a multi-criteria classification recommendation method using the Hellinger distance to efficiently calculate similarities between items within the same class. Inspired by this idea, in this study, a similarity decision method is developed to fully exploit all user information using probabilistic linguistic terms.
Notably, none of the existing neighbourhood-based CF methods has considered item tendency, that is, the numbers of positive and negative ratings on an item, in similarity calculation. We believe that item tendency has a crucial effect on improving the reliability of similarity calculations. In addition, the selection of credible neighbours is an important step in determining the prediction accuracy [24]. However, when selecting appropriate neighbours, existing methods have not taken into consideration the effect of interaction between items or users, such as the proportion of the number of co-rated items to the number of times a target item was rated. Therefore, we have considered these two factors in the proposed approach to achieve improved recommendation results.
2.2. LTSs
LTSs are the basis of linguistic decision-making, and DMs can employ them to provide their views on, or preferences for, considered objects [25]. The additive LTS, one of the most commonly used forms, is finite and ordered [26]
where
However, most studies about LTSs assume that the importance or weight of all linguistic information is the same. Hesitancy or uncertainty may exist when DMs express their opinions using several possible linguistic terms, and assigning different degrees of importance to the possible values can effectively and accurately reflect their preferences on an object [27]. Therefore, to address this drawback of LTSs, PLTSs [28] have been proposed to increase the probability of capturing accurate linguistic information and avoid information loss.
2.3. PLTSs
where L(k)(p(k)) represents the linguistic term L(k) with a probability of p(k), and #L(p) is the number of all linguistic terms in L(p). It is noted that only when the probabilistic distributions of all possible linguistic terms exist, otherwise,
To ensure the uniqueness of operational results, the ordered PLTS [28] is proposed to fix the positions of elements in a linguistic set.
In addition, for any two ordered PLTSs, the k values of their linguistic terms are often different in the process of decision-making, which poses challenges for operations using these PLTSs. To deal with this issue, the LTS of the smaller PLTS must be lengthened to make the number of linguistic terms equal in both sets [28].
Note that the added linguistic terms are L2(a) (a ∈ k) with the smallest subscript r2(a) in L2(p), and their probabilities p2(a) are zero.
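The lengthening step described above can be sketched as follows. This is a minimal illustration under assumed conventions: a PLTS is represented as a list of `(subscript, probability)` pairs, and the function name `pad_plts` is hypothetical. The added terms take the smallest subscript already present in the shorter set and a probability of zero, as stated in the text.

```python
def pad_plts(shorter, longer):
    """Lengthen `shorter` so both PLTSs have the same number of terms.

    Each PLTS is a list of (subscript, probability) pairs. Added terms use
    the smallest subscript already present in `shorter` and probability 0.
    Assumes `shorter` contains at least one term.
    """
    missing = len(longer) - len(shorter)
    if missing <= 0:
        return list(shorter)
    smallest = min(r for r, _ in shorter)
    return list(shorter) + [(smallest, 0.0)] * missing

l1 = [(4, 0.5), (3, 0.3), (2, 0.2)]
l2 = [(5, 0.6), (3, 0.4)]
print(pad_plts(l2, l1))  # [(5, 0.6), (3, 0.4), (3, 0.0)]
```

Because the padded terms carry zero probability, they change the length of the set without changing the information it expresses.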
Based on the above definitions, the degree of deviation between two PLTSs, L1(p) and L2(p), is given as [27]
RSs also need to provide a series of decisions for online users based on desired products or services. To facilitate the assessment of user preference, RSs transform linguistic concepts into a series of numerical information to express user satisfaction with an item. For example, a rating scale from 1 to 5 on the data set MovieLens represents the five levels of user opinions: very low (1), low (2), medium (3), high (4) and very high (5). However, individuals often prefer to use semantic information to provide their opinions or preferences rather than numerical values in the real-world decision process [29,30]. This preference for semantic information highlights a gap in previous studies of CF, where few scholars have focused on linguistic information for similarity computation. For instance, Moses and Babu [12] presented a fuzzy linguistic recommendation algorithm to detect non-malicious or natural noise but did not use probabilistic linguistic terms to evaluate the similarity between items. Similarly, Guo et al. [31] proposed a similarity method based on the concept of intuitionistic fuzzy sets from the perspective of user preference probability to guarantee the quality of recommendations and achieve favourable system efficiency. Shojaei and Saneifar [32] presented a multi-level fuzzy similarity method based on popularity and significance to identify uncertainty for similarity calculations. Karthik and Ganapathy [33] introduced a fuzzy logic–based recommendation algorithm to dynamically predict items that users are currently most interested in. However, these methods did not take linguistic concepts into account in measuring similarity and ignored the effect of different user preferences.
Through the above analysis and identification of the literature gap, in this study, we apply the definitions of PLTSs to CF to measure the degree of deviation between items. Moreover, we introduce a BCIT [9,34] in the linguistic similarity calculations to evaluate the desirability (like and dislike) of items rated by users. To further investigate the impact of the nearest neighbours on prediction accuracy, we consider the effect of interactions between items when selecting appropriate neighbours. This involves accounting for the asymmetric relation between items and the influence of co-rated items, which allows us to refine the process of neighbour selection for improved prediction results.
3. Methodology
In this section, a PLTS-based similarity decision model is first introduced to measure item–item similarity. Then, a BCIT is used as a weight in the decision model to enhance the accuracy and reliability of the similarity calculations. Finally, the effect of interaction between items is considered in the process of neighbourhood selection to improve prediction accuracy.
3.1. PLTS-based similarity decision model
Building on the definitions in section 2.3, PLTSs can be used in CF to evaluate item–item similarity and to provide uncertain decisions for active users using their previous rating preferences. In our proposed model, we first eliminate two terms (none and perfect) from the well-known set of seven linguistic terms because a 1–5 rating scale is used in our experiments. Then, according to Definition 1 presented in section 2.3, a PLTS of an item i in the RS is described as
where I(r)(pi(r)) represents the linguistic term I(r) or sr with a probability of pi(r),
Based on Definition 2 described in section 2.3, the linguistic terms I(r)(pi(r)) are reordered in descending order according to the values of rpi(r) to ensure the uniqueness of similarity results. As such, an ordered PLTS
Thus, we can evaluate the degree of deviation between the ordered PLTSs of items i and j,
where k represents a reordered position of the PLTSs, ri is the subscript of linguistic term sr for item i, and
To better distinguish the slight differences between items, especially between two pairs of similar items, a PLTS-based similarity model shaped like the forgetting curve is proposed to calculate item–item similarity. Its advantage is presented in Appendix 1.
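The steps of this subsection can be sketched in code. Only two parts follow the text directly: the probability of each linguistic term is the share of ratings with that value, and terms are ordered by r·p in descending order. The deviation formula (a root mean squared gap of r·p at each ordered position, a form common in the PLTS literature) and the exponential decay used as a forgetting-curve-like mapping are assumptions made for illustration, as are all function names. They should not be read as the exact equations of the model.

```python
import math
from collections import Counter

SCALE = (1, 2, 3, 4, 5)  # subscripts r of the five linguistic terms

def ordered_plts(item_ratings):
    """Turn an item's ratings into an ordered PLTS.

    Returns a list of (r, p) pairs, one per scale value, sorted by r * p
    in descending order; p is the share of ratings equal to r.
    """
    counts = Counter(item_ratings)
    total = len(item_ratings)
    terms = [(r, counts[r] / total) for r in SCALE]
    return sorted(terms, key=lambda t: t[0] * t[1], reverse=True)

def deviation(plts_i, plts_j):
    """ASSUMED form: root mean squared gap of r * p at each ordered position."""
    gaps = [(ri * pi - rj * pj) ** 2
            for (ri, pi), (rj, pj) in zip(plts_i, plts_j)]
    return math.sqrt(sum(gaps) / len(gaps))

def plts_similarity(plts_i, plts_j):
    """ASSUMED forgetting-curve-like mapping from deviation to (0, 1]."""
    return math.exp(-deviation(plts_i, plts_j))

a = ordered_plts([5, 5, 4, 3])
b = ordered_plts([5, 4, 4, 3])
print(a)
print(plts_similarity(a, b))
```

Since every scale value is kept in the set, with probability zero when unused, the two PLTSs always have the same length here and no padding is required. Note also that the two items need no common raters for the similarity to be defined.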
3.2. BCIT
The PLTS-based model, however, considers only the linguistic similarity between items and ignores user rating preferences for items, which can lead to inaccurate similarity results. Specific cases are presented in Appendix 1 to illustrate this issue. Thus, item tendency is evaluated to capture users’ preferences for a rated item [12]. Each item involves two tendencies: positive user preference (rating ≥ 3) and negative user preference (rating ≤ 3). To measure the two tendencies (PP and NP) of item i, the formulas based on a 1–5 rating scale are described as
where m is the number of times item i was rated in the system, and I(rk) represents the number of users who rate k on item i; rui denotes a rating of user u on rated item i. It is noted that a rating value of 3 is considered the rating of a dilemma, so it is commonly divided between positive and negative preferences.
To evaluate item–item tendency similarity, the BC is used in similarity computation. The BC has been successfully applied in CF to calculate item–item similarity using rating probability distribution [9]. Thus, a BCIT similarity is described as
where t represents a positive (P) or negative (N) item tendency, and pi,t is the probability of tendency t for item i, calculated as follows
Because item tendency denotes the preference consistency of users towards items, the BCIT similarity (equation (9)) can be used as a similarity weight to adjust the PLTS-based similarity decision model (equation (6)) to increase the reliability of calculation results. Accordingly, the final similarity model PLTS-BCIT is given by
In Appendix 1, we illustrate the positive impact of the BCIT method on the calculations of the PLTS-based similarity model.
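A minimal sketch of the item tendency computation is given here. The Bhattacharyya coefficient itself is standard (the sum over outcomes of the square root of the product of the two probabilities). The half-and-half split of ratings equal to 3 is an assumption about how the "dilemma" rating is divided, and the function names are hypothetical.

```python
import math

def tendency(item_ratings):
    """Return (positive, negative) tendency probabilities of one item.

    ASSUMPTION: ratings of 3 are split half-and-half between the two sides.
    """
    m = len(item_ratings)
    neutral = sum(1 for r in item_ratings if r == 3)
    positive = sum(1 for r in item_ratings if r > 3) + neutral / 2
    negative = sum(1 for r in item_ratings if r < 3) + neutral / 2
    return positive / m, negative / m

def bcit(ratings_i, ratings_j):
    """Bhattacharyya coefficient over the two-point tendency distributions."""
    return sum(math.sqrt(p * q)
               for p, q in zip(tendency(ratings_i), tendency(ratings_j)))

print(tendency([5, 3, 1, 1]))
print(bcit([5, 4], [5, 5]))  # both items purely positive
print(bcit([5, 5], [1, 1]))  # opposite tendencies
```

Under these assumptions the coefficient equals 1 when two items have identical tendency distributions and 0 when one is purely positive and the other purely negative, so using it as a weight would leave the linguistic similarity of consistent items untouched and suppress that of inconsistent ones.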
3.3. INSM
It is well-known that the credible neighbourhood has a crucial impact on the final predictions. The traditional neighbourhood selection method (TNSM) simply selects the top K neighbours with the greatest similarity values between a target item and its possible neighbours. However, this method cannot consider the effect of interaction between items, which directly affects the order of neighbours.
As an example, if sim(i, j) = sim(i, k), but the number of users who rated both items i and j, |U(i, j)|, is greater than the number who rated both items i and k, |U(i, k)|, then item j may be a more appropriate neighbour of item i than item k. This example shows that the influence of co-rated items should be taken into account when selecting appropriate neighbours.
The rating numbers of different items can also be considered in neighbourhood selection. Generally, the similarity between items is treated as symmetric, such that sim(i, j) = sim(j, i). However, this ignores a critical problem: the effect of interaction between items is asymmetric, such that sim(i, j) ≠ sim(j, i). Suppose that the rating number |Ui| of item i is greater than the rating number |Uj| of item j; then, the influence of item i on item j is stronger than that of item j on item i. Thus, item i may be added to the nearest neighbour set of item j, but item j may not be selected as the nearest neighbour of item i. Specific examples showing the effect of interaction between items on neighbour selection are given in Appendix 1.
Based on the above analysis, an INSM needs to consider the number of co-rated items as well as the rating number of the target item to fully reflect the effect of interaction between items using a sigmoid function. Thus, the INSM based on PLTS-BCIT is constructed as follows
where |U(i, j)| represents the number of users who rated both items i and j, and |Ui| is the number of times item i was rated. Note that the INSM is only adopted to find the top KNNs Cneii for item i according to its similarity values in descending order; it does not change the similarity values calculated by the PLTS-BCIT model.
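The selection idea can be illustrated with a short sketch. The text states that the method uses a sigmoid function, the number of common raters |U(i, j)| and the rating number |Ui| of the target item, and that it only reorders neighbours without altering similarity values. The specific score used here, the similarity multiplied by a sigmoid of the ratio |U(i, j)|/|Ui|, is an assumed form chosen for illustration, and the function and argument names are hypothetical.

```python
import math

def insm_score(sim_ij, n_corated, n_rated_i):
    """ASSUMED ranking score: similarity scaled by a sigmoid of the overlap ratio.

    n_corated  -- |U(i, j)|, users who rated both items
    n_rated_i  -- |U_i|, number of ratings of the target item i
    """
    ratio = n_corated / n_rated_i
    return sim_ij * (1.0 / (1.0 + math.exp(-ratio)))

def select_neighbours(target, sims, corated, rated, k):
    """Rank candidates by insm_score but return their original similarities."""
    ranked = sorted(
        sims[target],
        key=lambda j: insm_score(sims[target][j], corated[target][j], rated[target]),
        reverse=True,
    )
    return [(j, sims[target][j]) for j in ranked[:k]]

# Hypothetical case: equal similarities, different overlap with target "i"
sims = {"i": {"j": 0.8, "k": 0.8}}
corated = {"i": {"j": 30, "k": 5}}
rated = {"i": 40}
print(select_neighbours("i", sims, corated, rated, k=1))
```

With equal similarities, the candidate sharing more raters with the target is ranked first. The score is also asymmetric: the same pair yields a different value depending on which item is the target, because the denominator is the target item's own rating number.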
To compute the prediction value pui for a given user u on unrated item i, the formula is given by
where
3.4. Discussion of the proposed similarity model
The proposed similarity decision method considers that linguistic information can express users’ fuzzy preferences on rated items better than numerical rating information. To obtain linguistic information of users on rated items without external methods or additional information, the concept of LTSs is used to transform user ratings into linguistic terms. This approach enhances the probability of capturing accurate linguistic information and avoids information loss by successfully applying the definitions of PLTSs in CF to compute item–item similarity. By emphasising the potential importance of non-co-rated items and the influence of co-rated items on similarity calculations, our method (equation (6)) effectively utilises all user information from the perspective of the probabilities of linguistic information, thus improving the utilisation of information and the accuracy of similarity results. Therefore, our method is more suitable for sparse data sets and reduces the dependence on co-rated items.
Moreover, the influence of item tendency cannot be ignored for linguistic similarity computation (equation (6)), that is, the numbers of positive and negative ratings on some item, as it reflects the preferences of users towards items. Thus, we design a BCIT similarity method (equation (9)) to adjust the similarity results of equation (6) and demonstrate the preference consistency of users on two items. This method leverages all ratings to obtain an accurate consistency result from the perspective of a probability distribution. Integrating equations (6) and (9) in our model aims to strengthen or weaken the PLTS-based similarity results and improve the accuracy and reliability of similarity calculations.
Although the similarity results have a great influence on the nearest neighbourhood selection, finding credible neighbours and obtaining good prediction results are also of great importance. TNSMs may not fully consider the asymmetric relations between items and the impact of co-rated items. To address this issue, we introduce an INSM (equation (12)) that takes into account the effect of interactions between items and attempts to reorder the neighbours of the target item. By considering these two factors, our method can select more appropriate neighbours, leading to optimised prediction results.
Through the above qualitative analysis and discussion, our similarity method considers linguistic information, item tendency, and the effect of interaction between items, which facilitates improving the quality of recommendations. Furthermore, we provide a mathematical discussion of our proposed method in Appendix 1 to illustrate its properties and advantages. As a result, the proposed method can effectively deal with data sparsity and possesses good flexibility and universality.
4. Experiments
This section first introduces the experimental setup including the data set description and evaluation indicators in sections 4.1 and 4.2. Later, the comparison experiments and discussion are presented in sections 4.3 and 4.4, respectively.
4.1. Data set preparation
In the experiments, two widely accepted data sets, MovieLens-100k (ML-100k) and YM, are used in their entirety to verify the effectiveness of our model compared with several widely used similarity methods under the same conditions. ML-100k is a standardised, public and popular data set that is widely used in the field of RSs. The data set consists of 943 users, 1682 movies, and 100,000 ratings on the 1–5 rating scale that represents the five levels of user opinions for movies: very low (1), low (2), medium (3), high (4) and very high (5). Another public data set, called YM, includes 365,704 ratings on the same 1–5 rating scale from 15,400 users for 1000 songs.
The general testing methodology of RSs is that data sets are divided into two subsets: one is the training set, which includes 80% of the ratings randomly selected from each user’s rated items, and the other is the testing set, which consists of the remaining 20% of the data.
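A per-user 80/20 split of this kind can be sketched as follows. The data layout, the function name and the fixed seed are assumptions for the example; the rounding rule for users whose rating count is not a multiple of five is also a choice made here, not one specified in the text.

```python
import random

def split_per_user(user_ratings, train_share=0.8, seed=0):
    """Randomly split each user's ratings into training and testing parts.

    user_ratings -- {user: [(item, rating), ...]}
    Returns (train, test) dictionaries with the same layout.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = {}, {}
    for user, pairs in user_ratings.items():
        shuffled = list(pairs)
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * train_share)
        train[user] = shuffled[:cut]
        test[user] = shuffled[cut:]
    return train, test

# Hypothetical user with ten rated items
data = {"u1": [(f"i{n}", 3) for n in range(10)]}
train, test = split_per_user(data)
print(len(train["u1"]), len(test["u1"]))  # 8 2
```

Splitting within each user, rather than over the pooled ratings, keeps every user represented in both subsets whenever the user has enough ratings.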
4.2. Evaluation indicator
To show the performance of our model compared with other methods, we use prediction accuracy and recommendation accuracy as evaluation metrics.
We first employ two prediction accuracy metrics, Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE), to measure the predictive ability of CF methods. A smaller MAE or RMSE value indicates better predictive ability. Their formulas are given by
Here, m is the number of users tested, n is the number of items to be predicted for user u, and rui and pui represent the actual and predicted ratings of item i made by user u.
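For concreteness, the two error metrics can be computed as in the sketch below. As a simplifying assumption, the errors are pooled over a flat list of (actual, predicted) pairs rather than indexed by user and item; the function names and sample values are hypothetical.

```python
import math

def mae(pairs):
    """Mean absolute error over (actual, predicted) rating pairs."""
    return sum(abs(r - p) for r, p in pairs) / len(pairs)

def rmse(pairs):
    """Root mean squared error over (actual, predicted) rating pairs."""
    return math.sqrt(sum((r - p) ** 2 for r, p in pairs) / len(pairs))

pairs = [(4, 3.5), (2, 3.0), (5, 4.0)]
print(mae(pairs))   # (0.5 + 1.0 + 1.0) / 3
print(rmse(pairs))  # sqrt((0.25 + 1.0 + 1.0) / 3)
```

RMSE squares each error before averaging, so it penalises large individual errors more heavily than MAE does.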
Then, we adopt three widely used recommendation accuracy metrics, Precision, Recall, and F1-value, to evaluate the ability of a CF method to accurately recommend items. Precision is a measure of the preciseness or exactness of recommendation results. It is calculated as the ratio of relevant recommendations to all predicted recommendations. Recall is a measure of the completeness of recommendations. It is described as the ratio of relevant recommendations to all actual recommendations. Typically, to fully reflect the performance of an RS, F1-value is introduced as a comprehensive indicator to integrate the metrics Precision and Recall. A greater F1-value indicates better recommendation ability. The three metrics are given by
where Iar and Ipr are the numbers of actual and predicted recommendations for active users, respectively. Note that the recommendation mechanism in our experiments meets the following rule
where rui and pui are the actual and predicted ratings, respectively, and
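The recommendation metrics can be sketched as follows. The rule that an item counts as recommended when its rating reaches a threshold is an assumption made for illustration, and the `threshold` parameter, the function name and the sample values are hypothetical. F1 is computed as the harmonic mean of Precision and Recall, which is its standard definition.

```python
def precision_recall_f1(pairs, threshold):
    """Compute Precision, Recall and F1 from (actual, predicted) ratings.

    An item counts as an actual recommendation when actual >= threshold and
    as a predicted recommendation when predicted >= threshold (ASSUMED rule).
    """
    actual = sum(1 for r, _ in pairs if r >= threshold)
    predicted = sum(1 for _, p in pairs if p >= threshold)
    relevant = sum(1 for r, p in pairs if r >= threshold and p >= threshold)
    precision = relevant / predicted if predicted else 0.0
    recall = relevant / actual if actual else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

pairs = [(5, 4.2), (4, 3.1), (2, 3.8), (1, 1.5)]
print(precision_recall_f1(pairs, threshold=3.5))  # (0.5, 0.5, 0.5)
```

In this toy case two items are actually relevant and two are predicted relevant, but only one item is in both groups, so all three metrics equal 0.5.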
4.3. Comparison results and analysis
The experiments are conducted on the data sets ML-100k and YM. The proposed model is compared with five widely used similarity measures in accuracy indicators described in section 4.2 to show the competitiveness of our scheme. The five similarity methods for comparison are presented as follows:
PCC: PCC [35] is usually used to measure the degree of linear correlation between items, but it relies on the co-rated items.
COS: COS [35] is often used to calculate the cosine angle between items, but it relies on the co-rated items.
BC: BCF method [9] is used to evaluate similarity between probability distributions of two items, and it can fully utilise all ratings.
KL: Kullback–Leibler divergence–based similarity method [11] is also used to measure consistency between probability distributions of two items, but it is an asymmetric similarity method compared with the BC method.
IFS-HSM: intuitionistic fuzzy set–based hybrid similarity model [31] is a hybrid semantic similarity method to compute item–item similarity under the condition of different co-rated items through integrating the advantages of both types of methods (Google similarity and KL similarity).
Prior to comparison with the other algorithms, we first verify on the two data sets that using the BCIT as the weight of the PLTS-based similarity decision model improves the reliability of similarity results and that the INSM enhances prediction accuracy by considering the interaction between items.
4.3.1. ML-100k data set
As depicted in Figure 1, the proposed PLTS-BCIT method using INSM outperforms other variants of the PLTS-BCIT method in terms of MAE and F1-value. The curves of PLTS (-BCIT) using TNSM and PLTS (-BCIT) using INSM show that INSM significantly improves the accuracy of RSs. Furthermore, the introduction of BCIT has a positive impact on accuracy when comparing PLTS-BCIT with the traditional PLTS. These experiments emphasise the importance of considering item tendency for similarity calculations and the effect of interaction between items during neighbourhood selection, thereby enhancing the recommendation quality. Based on these results, we select the best-performing scheme (PLTS-BCIT using INSM) for further investigation and comparison with five other similarity methods on the ML-100k data set.

Figure 1. The MAE and F1-value of our scheme under different conditions for the ML-100k data set, where TNSM and INSM represent the traditional and our proposed improved neighbourhood selection methods, respectively.
Figure 2 displays the prediction accuracy of different similarity methods for various numbers of nearest neighbours on the ML-100k data set. The x-axis represents the number of neighbours (K), and the y-axis represents the prediction accuracy metrics MAE and RMSE. From Figure 2, we observe that the PLTS-BCIT method demonstrates superior predictive ability, as it consistently outperforms the other comparison methods in terms of prediction accuracy. Notably, traditional methods like PCC and COS exhibit the poorest performance, while state-of-the-art methods like BC, KL and IFS-HSM significantly enhance prediction accuracy compared with traditional methods. Apart from our proposed model, the IFS-HSM method achieves the most accurate predictions, with MAE = 0.753 and RMSE = 0.957 when K = 100. Compared with the IFS-HSM method, our scheme reduces the MAE and RMSE prediction errors by at least 2.1% and 1.8%, respectively.

Figure 2. Prediction accuracy of similarity measures on the data set ML-100k (MAE and RMSE).
Figure 3 illustrates the recommendation accuracy of various decision-making systems evaluated using the comprehensive F1-value metric on the ML-100k data set. It can be seen from Figure 3 that the proposed PLTS-BCIT model outperforms other methods in terms of F1-value, and the output values remain consistently high within the interval (0.67, 0.683), demonstrating the stability and reliability of our algorithm. The F1-values of PCC and COS methods are significantly lower than 0.45, while the recently proposed methods – BC, KL and IFS-HSM – achieve good recommendation accuracy with F1-values greater than 0.62. The closest result to our F1-value is achieved by the IFS-HSM method, with an F1-value of approximately 0.76 when K = 100.

Figure 3. Recommendation accuracy of similarity measures on the data set ML-100k (F1-value).
Table 1 shows the performance of all comparison methods on the ML-100k data set when the number of nearest neighbours K is fixed at 20. It can be seen from Table 1 that the PLTS using TNSM, viz. the PLTS-based item semantic similarity (equation (6)), outperforms the traditional similarity methods in terms of MAE and F1. Moreover, methods that take full advantage of all user information, such as BC, KL, IFS-HSM and our method (equation (6)), obtain better results, and the IFS-HSM method performs best among them because it is a hybrid similarity model that considers other auxiliary conditions. However, these similarity methods rely on user ratings and do not work without them, whereas our proposed similarity model can be used when only linguistic information is available. Table 1 also shows that the proposed PLTS-BCIT using TNSM (equation (11)) is superior to the BC- and KL-based similarity methods, indicating that the BCIT method has a positive effect on the PLTS-based similarity method (equation (6)). Finally, compared with the other methods, the PLTS and PLTS-BCIT using INSM achieve better prediction results, showcasing the effectiveness of our INSM; the PLTS-BCIT method performs best, indicating that the final similarity decision model (equations (11) and (12)) improves the quality of recommendations.
Table 1. Performance comparison on the ML-100k data set.
MAE: mean absolute error; PLTS: probabilistic linguistic term set; TNSM: traditional neighbourhood selection method; BCIT: Bhattacharyya coefficient–based item tendency; INSM: improved neighbourhood selection method; BC: Bhattacharyya coefficient; KL: Kullback–Leibler; PCC: Pearson correlation coefficient; COS: cosine similarity; IFS-HSM: intuitionistic fuzzy set–based hybrid similarity model. The best performance is in boldface.
4.3.2. YM data set
Similar to the ML-100k data set, Figure 4 shows that, for the YM data set, our proposed algorithm (PLTS-BCIT using INSM) provides the best prediction and recommendation accuracy for any number of nearest neighbours K. Note that the differences between the various forms of this algorithm are more pronounced than on ML-100k, particularly for the MAE metric between the cases with and without item tendency. The neighbourhood selection method still has a crucial influence on the accuracy of the RS, and using a robust selection method can effectively improve prediction accuracy. Furthermore, the consideration of item tendency is essential in maximising the reliability of similarity results. Therefore, these two factors are reasonable and impactful for the PLTS-based similarity decision method.

Figure 4. The MAE and F1-value of our scheme under different conditions for the YM data set, where TNSM and INSM represent the traditional and our proposed improved neighbourhood selection methods, respectively.
As with the ML-100k data set, we evaluate the prediction accuracy of all similarity methods on the YM data set using MAE and RMSE. These results are presented in Figure 5: the PCC method has the greatest prediction error among all methods, whereas our scheme has the lowest, with an error of less than 0.996 for K = 20. The recently proposed schemes BC, KL and IFS-HSM exhibit greater prediction accuracy than the earlier methods PCC and COS in terms of these two metrics. Compared with the closest competitor, IFS-HSM, our scheme has at least 3.5% better accuracy in terms of MAE. Moreover, it is evident from Figure 6 that the recommendation ability of our scheme is better than that of the comparison methods in terms of F1-value. The F1-value of PLTS-BCIT is greater than 0.6 for all K, with the best result of approximately 0.625 for K = 100. IFS-HSM is still the most comparable method and has improved recommendation accuracy compared with the BC, KL, PCC and COS methods.
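As a reminder of how the reported metrics are typically computed, the following is a minimal sketch using the standard definitions of MAE, RMSE and F1. The function names, the relevance threshold of 4.0 used to decide whether an item counts as recommended or relevant, and the toy ratings are illustrative assumptions and are not taken from the experimental setup of this study.

```python
import math


def mae(actual, predicted):
    """Mean absolute error between true and predicted ratings."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error between true and predicted ratings."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def f1_value(actual, predicted, threshold=4.0):
    """F1 over items whose rating reaches a relevance threshold.

    The threshold is a hypothetical choice made for this sketch.
    """
    hits = sum(1 for a, p in zip(actual, predicted)
               if a >= threshold and p >= threshold)
    if hits == 0:
        return 0.0
    precision = hits / sum(1 for p in predicted if p >= threshold)
    recall = hits / sum(1 for a in actual if a >= threshold)
    return 2 * precision * recall / (precision + recall)


def relative_improvement(baseline_error, our_error):
    """Relative error reduction with respect to a baseline method."""
    return (baseline_error - our_error) / baseline_error


# Toy ratings, purely illustrative.
actual = [5, 3, 4, 1]
predicted = [4, 3, 5, 2]
print(mae(actual, predicted), rmse(actual, predicted), f1_value(actual, predicted))
```

Under this convention, an improvement of "at least 3.5%" in MAE means that `relative_improvement` applied to the MAE of the baseline and the MAE of the proposed scheme is at least 0.035.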

Figure 5. Prediction accuracy of similarity measures on the data set YM (MAE and RMSE).

Figure 6. Recommendation accuracy of similarity measures on the data set YM (F1-value).
Table 2 shows the performance comparison of all similarity methods on the YM data set when the number of nearest neighbours K is set to 20. The analysis and discussion of the comparison results in Table 2 are similar to those for Table 1.
Table 2. Performance comparison on the YM data set.
MAE: mean absolute error; PLTS: probabilistic linguistic term set; TNSM: traditional neighbourhood selection method; BCIT: Bhattacharyya coefficient–based item tendency; INSM: improved neighbourhood selection method; BC: Bhattacharyya coefficient; KL: Kullback–Leibler; PCC: Pearson correlation coefficient; COS: cosine similarity; IFS-HSM: intuitionistic fuzzy set–based hybrid similarity model. The best performance is in boldface.
The above experiments are conducted on two public and popular data sets, ML-100k and YM, with density levels (the proportion of user–item pairs that have an observed rating) of 6.3% and 2.37%, respectively. These experimental results and analysis show that the proposed scheme, PLTS-BCIT using INSM, produces better recommendation results than the other comparable similarity methods, including two traditional methods and three recently proposed methods. This verifies that our system can provide a reliable and accurate decision-making recommendation list for active users.
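The 6.3% figure for ML-100k can be checked from the widely published statistics of that data set (100,000 ratings from 943 users on 1,682 items): it is the share of user–item pairs that carry a rating, so the remaining share of roughly 93.7% is unrated. The short sketch below shows the arithmetic; the function name is an illustrative assumption, and the corresponding counts for the YM subset are not given in this section, so they are not reproduced here.

```python
def rating_density(num_ratings, num_users, num_items):
    """Fraction of user-item pairs that have an observed rating."""
    return num_ratings / (num_users * num_items)


# Published statistics of the MovieLens 100K data set.
ml100k_density = rating_density(100_000, 943, 1_682)
print(f"density: {ml100k_density:.2%}, unrated share: {1 - ml100k_density:.2%}")
```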
4.4. Discussion and implications
The evaluation results of the proposed model on two popular data sets have yielded several interesting findings. First, the state-of-the-art methods based on probability distributions outperform classic similarity measures that rely only on co-rated items, demonstrating that leveraging all ratings leads to improved recommendation performance. Second, considering the asymmetry between items in similarity calculations has a significant positive impact on the accuracy of RSs. Specifically, methods such as KL, IFS-HSM and PLTS-BCIT, which account for the differences in item interactions, outperform some symmetric similarity measures. This highlights the importance of selecting appropriate neighbourhoods, as item interactions tend to differ. Moreover, the PLTS method utilising the proposed INSM significantly enhances the accuracy of rating predictions compared with the PLTS method using TNSM. This result suggests that the INSM may be generalised to other similarity measures, offering potential improvements in prediction accuracy across different methods. Furthermore, while evaluating the semantic similarity between items from a qualitative perspective yields better prediction results than some comparative methods, it is insufficient on its own because it overlooks the consistency of user preferences for items. Thus, incorporating item tendency into the model becomes crucial for adjusting similarity calculations and further enhancing the reliability of similarity measures.
In conclusion, these findings provide valuable insights into designing effective similarity measures to enhance the quality of RSs. In addition, this study introduces a novel approach for measuring item–item similarity from both qualitative and quantitative perspectives, offering promising avenues for future CF research.
5. Conclusion and future work
Users usually prefer to express their opinions regarding items linguistically rather than using numerical values in the actual decision-making process. In our study, we rely on the concept of LTSs to transform numerical ratings into linguistic terms. This approach avoids the use of external methods or additional information, such as user reviews, to obtain linguistic information, demonstrating its universality and flexibility.
To highlight the importance of different linguistic information and improve the accuracy of capturing information, we successfully apply the definitions of PLTSs to CF and propose a PLTS-based similarity decision model for evaluating the similarity between items. This model leverages the probabilities of linguistic information to fully utilise all user information, preventing information loss and effectively enhancing the utilisation of information and the accuracy of similarity calculations. Moreover, in our semantic similarity model, we incorporate a BCIT as a similarity weight to adjust the calculated results and emphasise users’ preference characteristics for items. This inclusion enhances the reliability of similarity computations and ensures the model captures user preferences more effectively. In addition, we design an INSM that considers the interaction between items when selecting appropriate neighbours for target items. This method not only emphasises the asymmetric relationship between items but also takes into account the importance of co-rated items. The proposed method is validated on two benchmark data sets, and the experiments demonstrate its superiority over alternative similarity methods across various metrics. As a result, our similarity decision model provides reliable and accurate rating predictions.
Future work will focus on two main aspects to enhance our similarity model. First, we will investigate the performance of the PLTS-based similarity model in multi-attribute decision-making scenarios, as attributes significantly impact user decision-making. Second, we recognise the potential bias or information loss resulting from directly converting numerical ratings into linguistic terms due to the subjective nature of language and varying interpretations by different users. To address this, our future work aims to establish clear definitions for each linguistic term, reducing ambiguity and further improving the accuracy and reliability of similarity calculations.
Footnotes
Appendix 1
Let m1 (n1), m2 (n2) and m3 (n3) be the numbers of users who rated item I (J) with the rating values 1, 2 and 3, respectively, where m1 + m2 + m3 = m and n1 + n2 + n3 = n (m > n > 0, and each count is non-zero). We assume that the rating scale is 1–3. According to Definition 2 described in section 2.3, two ordered PLTSs
Thus, we can obtain the degree of deviation between the ordered PLTSs of items I and J using equation (5) mentioned in section 3.1, as follows
To analyse the property of the above equation (21), let only m1 = 0, and we have to increase the value of m3 to keep the m value constant. The result is given by
We can easily find from equations (21) and (22) that
Then, we explain why the like-forgetting curve model (see equation (6) mentioned in section 3.1) is adopted to evaluate the similarity between items I and J. The traditional method that transforms the deviation/distance into a similarity is described by
To visually compare the effectiveness of these two methods, we plot their similarity trends in Figure 7.
It can be seen from Figure 7 that the downwards trend of the red plot is more obvious than that of the blue plot as the deviation increases, which indicates that our proposed method is more sensitive to deviations between items. Thus, the proposed method can better distinguish the slight differences between items compared with equation (23). This has a significant impact on the neighbourhood selection.
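The difference in sensitivity can be illustrated numerically. Because equations (6) and (23) are not reproduced in this appendix, the sketch below substitutes two commonly used forms as stand-ins: a reciprocal transform 1/(1 + d) for the traditional method and an exponential decay exp(-d) for a forgetting-curve-like model. Both functional forms, the function names and the `decay` parameter are assumptions made for illustration only and may differ from the definitions in sections 3.1 and Appendix 1.

```python
import math


def reciprocal_similarity(deviation):
    """Assumed stand-in for a traditional distance-to-similarity transform."""
    return 1.0 / (1.0 + deviation)


def exponential_similarity(deviation, decay=1.0):
    """Assumed stand-in for a forgetting-curve-like transform.

    `decay` is a hypothetical parameter introduced for this sketch.
    """
    return math.exp(-decay * deviation)


for d in (0.0, 0.1, 0.5, 1.0, 2.0):
    print(f"d={d:.1f}  reciprocal={reciprocal_similarity(d):.3f}  "
          f"exponential={exponential_similarity(d):.3f}")
```

With these stand-ins, both transforms return 1 at zero deviation, and the exponential form falls below the reciprocal form for every positive deviation (since exp(d) > 1 + d), which is the kind of faster decay described for Figure 7.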
Next, we discuss the advantage of the proposed BCIT method (see equation (9) mentioned in section 3.2) in our model. Here, we take two extreme cases to illustrate the impact of item tendency on the linguistic similarity between items.
Let only m2 and n1 be non-zero, with all other values set to 0. Thus, m2 = m, n1 = n and
We get the item tendency similarity between I and J using the BCIT method, as follows
Thus, the final similarity between items is 0.26 using equation (11) mentioned in section 3.2.
Let only m3 and n1 be non-zero, with all other values set to 0. Thus, m3 = m, n1 = n, and
It can be found from the two cases that item tendency must be taken into account in our scheme to show the preference consistency of users on two items. Thus, the proposed BCIT method can be used to adjust the linguistic similarity between items to enhance the credibility of calculations.
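For reference, the textbook Bhattacharyya coefficient between two discrete distributions p and q is the sum of sqrt(p_k q_k) over all categories k. The sketch below implements only this standard coefficient; it is not equation (9), whose exact form (including how item tendency is derived from the ratings) is defined in section 3.2 and may differ. The function name and the example distributions are illustrative assumptions.

```python
import math


def bhattacharyya_coefficient(p, q):
    """Textbook Bhattacharyya coefficient of two discrete distributions.

    Returns 1 for identical distributions and 0 for distributions
    with no category in common.
    """
    return sum(math.sqrt(pk * qk) for pk, qk in zip(p, q))


identical = bhattacharyya_coefficient([0.5, 0.3, 0.2], [0.5, 0.3, 0.2])
disjoint = bhattacharyya_coefficient([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
partial = bhattacharyya_coefficient([0.5, 0.5, 0.0], [0.0, 0.5, 0.5])
print(identical, disjoint, partial)
```

The coefficient therefore behaves as an overlap score between two rating distributions, which is why it is a natural building block for a weight that rewards consistent user preferences.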
Finally, we study the property of the proposed INSM model (see equation (12) mentioned in section 3.3). Let
Let
As an example, let m1, m2 and m3 be 20, 5 and 5, respectively, and let n1, n2 and n3 be 4, 1 and 1, respectively; let the number |U(I, J)| of users who rated both items I and J be 6. We get the final similarity between I and J as 1, and the results of their INSM are
Thus, the proposed INSM method considering the effect of interaction between items can better select appropriate neighbours for the target items to optimise prediction results.
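The numbers in the example above can be checked with a short sketch. It normalises the rating counts of each item into term probabilities and shows that the two items have identical distributions, which is consistent with a final similarity of 1. It then computes the share of each item's raters who rated both items. These shares are not equation (12); they are an illustrative assumption used only to show why a symmetric similarity of 1 can still hide an asymmetric relation between I and J. The helper name `term_probabilities` is likewise hypothetical.

```python
from fractions import Fraction


def term_probabilities(counts):
    """Normalise per-rating counts into exact probabilities."""
    total = sum(counts)
    return [Fraction(c, total) for c in counts]


# Counts from the example: item I has 30 raters, item J has 6.
item_i = term_probabilities([20, 5, 5])
item_j = term_probabilities([4, 1, 1])
print(item_i == item_j, item_i)

# Share of each item's raters who rated both items (|U(I, J)| = 6).
co_raters = 6
share_i = Fraction(co_raters, 20 + 5 + 5)
share_j = Fraction(co_raters, 4 + 1 + 1)
print(share_i, share_j)
```

Both items reduce to the distribution (2/3, 1/6, 1/6), yet the co-raters make up only one fifth of the users who rated I while covering all users who rated J, so the evidence behind the similarity is much stronger from the viewpoint of J than from that of I.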
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship and/or publication of this article: This work was supported by the National Natural Science Foundation of China (grant nos 62272077 and 72301050), the Science and Technology Research Program of Chongqing Municipal Education Commission (grant nos KJQN202300605 and KJQN202100603) and the China Postdoctoral Science Foundation (grant no. 2021M702321).
