Bayesian recommender system for social information sharing: Incorporating tag-based personalized interest and social relationships

Abstract

Personal information management enables users to manage and classify information via social tagging. Personal information management platforms have recently adopted social networks successfully, enabling users to conveniently share their information preferences with each other. The emerging social networks generate new concepts for designing modern recommender systems in personal information management and sharing platforms.

To design a recommender mechanism for the personal information management and sharing platforms, this work incorporates tag-based personalized interest and social network relationships into a modified Bayesian probability model.

The proposed system is demonstrated with experimental datasets obtained from a popular social resource sharing website. The performance of the proposed system is evaluated based on the word2vec word embedding model. Experimental results indicate that incorporating social network information and personalized tag-based preference with the Bayesian model can improve the recommendation quality for social information sharing websites.

Keywords

Knowledge filtering, recommender systems, social networks, folksonomy, social tagging, personal information management, Bayesian classifier

1. Introduction

Personal information management (PIM) enables users to collect, highlight, access and share a variety of information on a variety of devices. Many people nowadays manage their personal information on the Internet via Web 2.0 service platforms, including Delicious, Diigo and Pinboard. These systems are also known as social information/resource sharing systems or social annotation systems, which allow users to store, organize, share and discover internet resources with user-defined tags [9]. The use of user-defined tags, also known as “Tagging”, is a characteristic of Web 2.0 technology, allowing users to collectively tag and classify internet resources [13, 25, 33, 34]. Users of these tag-based social information management websites can easily classify, store and retrieve resources of interest (such as websites, documents, photos, and videos), and these activities are essential parts of the knowledge management cycle [31].

Social networks are growing rapidly in various applications, including Web 2.0 PIM services. Users not only accumulate internet resources, but also follow or are followed by other users. This social knowledge network is based on personal interests or preferences with a bundle of tags and internet resource bookmarks. A social resource sharing network enlarges the scale of sharing information, extending opportunities to find similar users or interested internet resources, and also enhances the knowledge management cycle: capture, refine, store, manage, disseminate, and create knowledge [8, 31]. More importantly, social networks raise a challenge to designing a modern recommender system that can harness interest-based personalized recommendation systems.

Recommender systems using tag-based personal interest to model user profiles and internet resources have been proposed [12, 16]. In contrast with those works, this investigation presents a recommender system that is associated with social networks on social information sharing platforms. Some previous studies focus on the use of social networks to improve recommender systems [6, 20, 22, 36]. Unlike previous research, the main idea of this study is to employ social network information in both the collaborative stage and the content-based recommendation stage. In the collaborative stage, the nearest neighbors for the target user are discovered from the social networks; in the content-based stage, the recommended items are generated with a modified Bayesian probability scheme that combines the tag-based personal interest, the relation strength on the social network and the popularity of items from the target user’s social neighbors.

This study aims to investigate the recommendation mechanism that integrates social relationships, tag-based interests and item popularity on social information management systems for knowledge (information) filtering. Therefore, Section 2 of this paper describes related works on personal information management, tagging, social network and recommender systems. Section 3 develops a hybrid collaborative and content-based recommender system by using the social network and Bayesian probability to predict items that interest the target users. Section 4 analyzes the performance of the proposed Bayesian recommender system. Section 5 draws conclusions and suggests future research.

2. Related works

2.1 Tag-based social information sharing system

People like to accumulate, classify, store, search, retrieve and share knowledge or information [11]. In an environment of huge internet resources, a social information sharing system is a helpful tool for people to gather, classify, store, search, retrieve and share information or resources. In these tag-based systems, users tag information or resources for future retrieval and sharing [5, 21]. This tagging plays a significant role as part of useful functions for personal information management. Users, resource items, and tags are three major roles in web 2.0 where users label the resource items using social tags. On the internet, various applications support tagging including photos (Flickr), bibliographic references (CiteULike), bookmarks (Delicious, Diigo, and Pinboard), merchandise (Amazon) and videos (YouTube).

Users gather and classify social resource items using tags, which convey information about the characteristics, content and creation of an internet resource [10]. Therefore, a user’s interest can be partially modeled by analyzing the user’s tag information [12]. Based on the personalized tag information, this investigation constructs a recommender system for the social information sharing websites.

2.2 Tag-based recommender systems

Recommender systems are active information filtering systems that suggest resource items (film, television, music, books, news, web pages) of interest to the user. Recommender systems are developed to handle information overload, and to provide personalized contents and services to users [2, 3, 28].

Recommender systems make recommendations by three basic steps: acquiring preferences from the user’s input data; computing recommendations using proper techniques, and presenting recommendations to users [30]. Three basic categories of recommendation techniques are content-based filtering (CBF), collaborative filtering (CF) and hybrid-based recommender systems combining CBF and CF [2, 35].

The recommender systems discover the useful information in the social bookmark platform to raise productivity and decrease information overload for users. A tagging system, which is useful for managing and filtering the resource items, can be combined with recommender systems to model users’ interest profiles on the resource sharing platforms. Researchers have used the social tags to classify blogs [5], enhance information retrieval [17, 25], and to improve personalized recommendations [32]. The tags accumulated by the user represent part of this user’s preference or interest in the social bookmarking website. Hence, [16] adopted tag-based user profiles in the collaborative filtering-based recommender systems to alleviate the limitation of the cold-start and sparsity problems, and [34] employed the clustered social ranking to support new users of Web 2.0 websites finding content of interest.

2.3 Social networks for knowledge sharing

Disseminating knowledge is an important step in the cycle that a functioning knowledge management system follows. Through the knowledge management cycle, the knowledge learnt is gradually refined over a period of time [31]. A tag-based system provides functions for knowledge storage and management, while social networks enable knowledge dissemination. In addition to the three key components of a tag-based system, namely users, resources and tags, a social network is another indispensable function for an innovative social information sharing system. People share information with their friends (fans or followers), and learn new information from their friends in the social information systems. Knowledge or information learning and sharing in social information systems often occurs through connections, dialogue and social interaction. Social networks have two basic relationship types: following and follower. Target users follow people in social media because they like to read or collect the information or knowledge from the people that they follow. This is called “following” for this target user. In contrast, a target user may be followed by followers (also known as “fans”), because the followers are interested in this target user’s information. The majority of people use this kind of social media mainly to seek or collect knowledge. A user with a large number of followers is a popular person whose internet resources or knowledge repository are of interest to other users.

The recommender systems need to filter information to avoid providing useless information to users on a social information sharing platform. This triggers our research to incorporate the social network to design a hybrid recommender system that performs collaborative filtering by analyzing the social network, and content-based filtering by deriving the target user’s preferences. Some researchers have worked in this field [6, 20, 22, 36]. However, unlike previous research, this investigation focuses on the combination of personal preference, social network and item popularity in a modified Bayesian probability model.

3. Tag-based Bayesian probability incorporated with social network information for resource items recommendation

How to discover similar users from the social networks and then suggest items to target users is a critical issue for a recommender system in personal information management platforms. The aim of this investigation is to recommend resource bookmark items that may interest users via social networks.

This study contributes a modified Bayesian recommendation mechanism that tightly integrates social relationships, tag-based interests and item popularity on social information management systems for knowledge (information) filtering. The recommendation procedure includes two main steps: (1) selecting the target user’s collaborative neighbors from the “friend network” comprising followings and followers on the target user’s social network, and (2) suggesting resource items to the target user according to the proposed modified Bayesian probability.

The collaborative stage identifies a target user’s similar neighbors from followings and followers on the social network. The following and follower on the social network provide valuable information for recommending resource items to target users. In the stage of suggesting resource items, the Bayesian probability incorporating the social network information and tag-based personalized preference is computed from the resource items gathered by the target user’s neighbors.

Finding similar users from the social networks may significantly reduce the computation time compared to that of calculating similarity with every user in the platform. The number of candidate items can also be reduced in the item suggestion stage.

3.1 Proposed recommendation procedure

The proposed procedure has two main steps, collaborative filtering and content-based filtering. (1) In collaborative filtering, the system searches social networks to find collaborative users with similar interests to the target user. (2) In content-based filtering, the system recommends resource items that are similar to the target user’s interest via the Bayesian probability model. The main steps are as follows:

Step 1:
Prepare data and set predefined parameters.

Step 1.1:
Prepare item, user and tag information: User-Tag (Eq. (2)), Item-User (Eq. (7)) and Item-Tag (Eq. (13)).
Step 1.2:
Find the target user’s relation type with other users from the friend network. Set weights for the four relation types (Eq. (1)).
Step 1.3:
Set predefined system parameters $N$, $M$, ${\theta}_{fg}$, ${\theta}_{fr}$, ${\theta}_{fg+fr}$, ${\theta}_{\textit{indf}}$, $\alpha$, $\beta$, $\tau$ and the word vector size.

For each target user, perform Step 2 and Step 3:

Step 2:
Find collaborative neighbors via the social network for the target user.

Step 2.1:
Determine the tag-based similarity between the target user and the users on the target user’s friend network (following, follower) using Eq. (4). The maximum depth of the friend network is predefined as two layers.
Step 2.2:
Adjust the tag-based similarity with the social relation type by Eq. (5).
Step 2.3:
Rank neighbors using the above adjusted tag-based similarity.
Step 2.4:
Select the top-$N$ neighbors for the target user according to the rank of the adjusted tag-based similarity.

Step 3:
Suggest content-based bookmark items to the target users based on the proposed Bayesian probability.

Step 3.1:
For each candidate item from the top-$N$ neighbors, perform Steps 3.1.1–3.1.5

Step 3.1.1:
Adjust the item-user value with the adjusted tag-based similarity by Eq. (10).
Step 3.1.2:
Calculate item probability using Eq. (12).
Step 3.1.3:
Calculate the weighted item-tag value by Eq. (18).
Step 3.1.4:
Calculate conditional probability using Eq. (20).
Step 3.1.5:
Compute recommendation score RScore with Eq. (21).

Step 3.2:
Rank candidate items according to RScore.
Step 3.3:
Suggest top- $M$ items to the target user based on their RScore rank.

3.2 Finding similar neighbors via social network

The social network in the social bookmarking website reveals a part of a user’s knowledge preferences. This section describes the collaborative filtering procedures, and demonstrates the use of social networks in collaborative filtering.

An essential part of the collaborative filtering stage is finding similar neighbors via the social network for the target users. This process has the following steps: obtain the user’s social relationship on the social network; establish user’s tag-based interest profile, and compute and rank the similarity score of the target user’s neighbors.

The study collects candidate users from the target user’s “friends” (followings and followers) on the social network. A social resource sharing website has two user relations, followers and followings. Users follow people that they like, and are followed by others who are interested in them. Unlike other social network platforms such as Facebook, users on a social resource sharing website usually follow people due to the preference of knowledge content. Therefore, this study exploits the implicit and explicit information provided by the social relationship on the target user’s friend network.

Previous studies (e.g., [12]) focus on collaborative filtering via similarity searches over all users, which consumes considerable search time. This study finds similar users from a user’s social network. The small number of users on a target user’s social network significantly reduces the time taken to search for similar users. Additionally, the recommendation accuracy is maintained, because users in the same social network generally have common preferences on social information sharing systems.
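The restriction of the candidate set to a two-layer friend network can be illustrated with a short sketch. The function name `friend_network`, the `follows` dictionary (user mapped to the set of users he/she follows) and the breadth-first traversal are illustrative assumptions; the paper does not prescribe a particular data structure or traversal.

```python
def friend_network(target, follows, max_layers=2):
    """Collect users within max_layers of the target via following or follower links.

    follows: dict mapping each user to the set of users he/she follows (assumed layout).
    Returns a dict mapping each reachable user to the layer at which it was first found.
    """
    # Invert the following relation to obtain followers.
    followers = {}
    for user, followed in follows.items():
        for other in followed:
            followers.setdefault(other, set()).add(user)

    layer = {}
    seen = {target}
    frontier = {target}
    for depth in range(1, max_layers + 1):
        next_frontier = set()
        for user in frontier:
            # Friends are the union of followings and followers.
            for friend in follows.get(user, set()) | followers.get(user, set()):
                if friend not in seen:
                    seen.add(friend)
                    layer[friend] = depth
                    next_frontier.add(friend)
        frontier = next_frontier
    return layer
```

Only the users returned by such a traversal need to be scored for similarity, which is the source of the reduced search time discussed above.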

Figure 1.

Social relations between target user and the other users.

3.2.1 User’s relation type and strength on the social network

To quantify the strength of the social relationship for the target user, ${\textit{user}}_{t}$, this study defines four kinds of relationships between ${\textit{user}}_{t}$ and another user on the social network, ${\textit{user}}_{v}$, as follows (as depicted in Fig. 1). (1) following: ${\textit{user}}_{t}$ follows ${\textit{user}}_{v}$; (2) follower: ${\textit{user}}_{v}$ follows ${\textit{user}}_{t}$; (3) following and follower: ${\textit{user}}_{t}$ and ${\textit{user}}_{v}$ follow each other; (4) indirect friend: friends (followings or followers) on or beyond the second layer. These relations are assigned different weights of relation strength. The predefined weight of relationship, ${\textit{Rel}}_{{\textit{user}}_{t},{\textit{user}}_{v}}$, can be represented as follows:

$\displaystyle{\textit{Rel}}_{{\textit{user}}_{t},{\textit{user}}_{v}}=\left\{\begin{array}{ll}{\theta}_{fg},&\textit{following}\\ {\theta}_{fr},&\textit{follower}\\ {\theta}_{fg+fr},&\textit{following and follower}\\ {\theta}_{\textit{indf}},&\textit{indirect friend}\end{array}\right.$ (1)

where $0<\theta_{i}<1$ , $\sum{\theta}_{i}=1.0$ , $i\in\{fg,fr,fg+fr,\textit{indf}\}$ .
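A minimal sketch of Eq. (1) follows. The numeric weights are hypothetical placeholders chosen only to satisfy the stated constraints ($0<\theta_{i}<1$ and a sum of 1.0); the function names and the `follows` dictionary layout are likewise assumptions. Any candidate that has no direct link to the target user is treated as an indirect friend, on the assumption that it was already found within the friend network.

```python
# Hypothetical weights: the paper only requires 0 < theta_i < 1 and sum(theta_i) = 1.
REL_WEIGHTS = {
    "following": 0.3,
    "follower": 0.2,
    "following_and_follower": 0.4,
    "indirect_friend": 0.1,
}


def relation_type(target, other, follows):
    """Classify the relation between the target user and another user.

    follows: dict mapping each user to the set of users he/she follows (assumed layout).
    """
    target_follows_other = other in follows.get(target, set())
    other_follows_target = target in follows.get(other, set())
    if target_follows_other and other_follows_target:
        return "following_and_follower"
    if target_follows_other:
        return "following"
    if other_follows_target:
        return "follower"
    # No direct link: assumed to be a friend found on a deeper layer.
    return "indirect_friend"


def rel(target, other, follows, weights=REL_WEIGHTS):
    """Relation strength Rel(user_t, user_v) in the sense of Eq. (1)."""
    return weights[relation_type(target, other, follows)]
```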

3.2.2 Tag-based user interest profile

The user’s tag information includes the tag name and the corresponding items collected by this user. A user accumulates a set of tags of interest, implying that the tag-based information can represent the user’s preferences on knowledge collection. A user’s overall preference can be defined as the distribution of tag frequencies. A tag with high frequency means that the user is strongly interested in the category of knowledge represented by this tag. The User-Tag (UT) matrix records the frequencies of tags owned by the user. The frequency of ${\textit{tag}}_{k}$ for target user ${\textit{user}}_{t}$ is given by the number of items tagged by ${\textit{user}}_{t}$ as follows.

$\displaystyle{UT}_{{\textit{user}}_{t},{\textit{tag}}_{k}}=\textit{Number of items bookmarked by }{\textit{user}}_{t}\textit{ using }{\textit{tag}}_{k}$ (2)

The user’s tag frequency can be converted into other weighted values, including normalized term frequency and term frequency-inverse user frequency (TF-IUF). This study adopts a logarithmic normalized term frequency, defined as the following formula:

$\displaystyle{\textit{UTLog}}_{{\textit{user}}_{t},{\textit{tag}}_{k}}=\left\{\begin{array}{ll}0,&{UT}_{{\textit{user}}_{t},{\textit{tag}}_{k}}=0\\ 1+\log({UT}_{{\textit{user}}_{t},{\textit{tag}}_{k}}),&\textit{otherwise}\end{array}\right.$ (3)
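Eqs (2) and (3) can be sketched as follows. The input layout (a list of item and tag-list pairs for one user) and the function names are assumptions made for illustration. The base of the logarithm is not stated in the text, so the natural logarithm is assumed here.

```python
import math
from collections import Counter


def user_tag_counts(bookmarks):
    """Eq. (2): number of items a user bookmarked with each tag.

    bookmarks: iterable of (item_id, tags) pairs for a single user (assumed layout).
    """
    counts = Counter()
    for _item_id, tags in bookmarks:
        # A tag is counted once per item, even if it is repeated in the list.
        counts.update(set(tags))
    return counts


def log_tf(count):
    """Eq. (3): logarithmic normalized frequency (natural log assumed)."""
    return 0.0 if count == 0 else 1.0 + math.log(count)


def ut_log(bookmarks):
    """Tag-based interest profile: tag -> UTLog value."""
    return {tag: log_tf(count) for tag, count in user_tag_counts(bookmarks).items()}
```

The logarithm dampens very frequent tags, so a tag used on 100 items does not weigh 100 times as much as a tag used once.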

3.2.3 Tag-based user similarity

To determine a target user’s collaborative neighbors from the social networks, the tag-based similarity of the target user to other users from the friend network should be calculated first. This work represents the user’s interest similarity using the tag-based cosine similarity. The cosine similarity between ${\textit{user}}_{t}$ and ${\textit{user}}_{v}$ is given by the inner product of the normalized tag frequencies of the two users as follows:

$\displaystyle{\textit{UserSim}}_{{\textit{user}}_{t},{\textit{user}}_{v}}=\frac{\sum_{{\textit{tag}}_{k}\in T}{\left({\textit{UTLog}}_{{\textit{user}}_{t},{\textit{tag}}_{k}}\cdot{\textit{UTLog}}_{{\textit{user}}_{v},{\textit{tag}}_{k}}\right)}}{\sqrt{\sum_{{\textit{tag}}_{k}\in T}{{\textit{UTLog}}^{2}_{{\textit{user}}_{t},{\textit{tag}}_{k}}}}\sqrt{\sum_{{\textit{tag}}_{k}\in T}{{\textit{UTLog}}^{2}_{{\textit{user}}_{v},{\textit{tag}}_{k}}}}}$ (4)

where $T$ denotes a set of common tags owned by both users.
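The following sketch implements Eq. (4) literally, restricting both the inner product and the norms to the tags common to the two users, as the definition of $T$ states. The function name and the dictionary-based profiles (tag mapped to its logarithmic frequency) are assumptions.

```python
import math


def user_sim(profile_t, profile_v):
    """Eq. (4): cosine similarity over the tags shared by both users.

    profile_t, profile_v: dicts mapping tag -> UTLog value (assumed layout).
    """
    common = set(profile_t) & set(profile_v)
    if not common:
        return 0.0
    dot = sum(profile_t[tag] * profile_v[tag] for tag in common)
    norm_t = math.sqrt(sum(profile_t[tag] ** 2 for tag in common))
    norm_v = math.sqrt(sum(profile_v[tag] ** 2 for tag in common))
    if norm_t == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_t * norm_v)
```

One consequence of restricting the norms to common tags is that two users who share exactly one tag always obtain a similarity of 1, regardless of how many other tags each of them uses.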

3.2.4 Adjusted tag-based user interest similarity

To determine the target user’s similar neighbors on the social networks, this study utilizes the adjusted tag-based interest similarity, which combines the user similarity and the strength of social relations introduced in the above section, as follows.

$\displaystyle{\textit{AdjUserSim}}_{{\textit{user}}_{t},{\textit{user}}_{v}}=\alpha\times{\textit{UserSim}}_{{\textit{user}}_{t},{\textit{user}}_{v}}+(1-\alpha)\times{\textit{Rel}}_{{\textit{user}}_{t},{\textit{user}}_{v}}$ (5)

where $\alpha$ $(0\leqslant\alpha\leqslant 1)$ denotes a predefined weight that determines the relative importance of the two positive-valued variables.

Based on the adjusted tag-based user interest similarity, the followings and followers found within the predefined number of layers on the social network can be ranked. The top $N$ users can then be selected as the target user’s similar neighbors.

The reason for combining the tag-based similarity and the user’s relation strength is as follows. When ${\textit{user}}_{t}$ follows ${\textit{user}}_{v}$, ${\textit{user}}_{t}$ may not tag those items that belong to ${\textit{user}}_{v}$, because ${\textit{user}}_{t}$ can easily discover the items he/she likes from ${\textit{user}}_{v}$. In this case, ${\textit{user}}_{t}$ has a low tag-based cosine similarity with ${\textit{user}}_{v}$; however, the two users are linked by an implicit interest similarity. The adjusted tag-based user interest similarity takes advantage of both the tag-based interest preference and the relation strength on the social network.
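As an illustration of Eq. (5) and the subsequent top-$N$ selection, consider the sketch below. The function names, the default value of `alpha` and the candidate layout (user mapped to a pair of tag-based similarity and relation strength) are assumptions, not values taken from the study.

```python
def adj_user_sim(similarity, relation, alpha=0.5):
    """Eq. (5): blend tag-based similarity with social relation strength."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * similarity + (1.0 - alpha) * relation


def top_n_neighbors(candidates, n, alpha=0.5):
    """Rank candidates by adjusted similarity and keep the best n.

    candidates: dict mapping user -> (UserSim, Rel) with respect to the target user.
    Returns a list of (user, adjusted similarity) pairs, best first.
    """
    scored = {
        user: adj_user_sim(similarity, relation, alpha)
        for user, (similarity, relation) in candidates.items()
    }
    ranked = sorted(scored.items(), key=lambda pair: pair[1], reverse=True)
    return ranked[:n]
```

With `alpha=1.0` the ranking reduces to pure tag-based similarity, and with `alpha=0.0` it depends on the relation type alone.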

3.3 Recommending bookmark items via Bayesian probability

The purpose of this section is to recommend items to the target user. Following the previous process of finding similar neighbors via the target user’s social network, this study suggests items from the similar neighbors’ collection based on Bayesian probability.

3.3.1 Item probability based on the Bayes’ theorem

Bayesian classifiers can predict class membership probabilities, such as the probability that a given tuple belongs to a particular class [2]. This study modifies the Bayesian probability based on the Naïve Bayes classifier to suggest items to the target users.

If $I=\{\textit{item}_{i}|i\in 1,2,\dots,N_{I}\}$ is a set of items collected by the target user’s neighbors, and $T=\{\textit{tag}_{k}|k\in 1,2,\dots,N_{T}\}$ denotes a set of tags owned by the target user, then the conditional probability, $P\left({\textit{item}_{i}|T}\right)$, denotes the item probability that is conditioned on $T$. Items with high conditional probability are recommended to the target user. According to Bayes’ theorem, $P\left({\textit{item}_{i}|T}\right)$ is computed as follows:

$\displaystyle P\left({\textit{item}_{i}|T}\right)=\frac{P\left({\textit{item}_{i}}\right)P\left(T|\textit{item}_{i}\right)}{P\left(T\right)}$ (6)

Unlike previous studies (e.g., [12, 16]), this formula incorporates the strength of relationship on the social network and the tag-based personalized interest.

Because $P\left(T\right)$ is the same for every candidate item, the Bayesian probability focuses on the numerator of the above formula, namely the probabilities $P\left({\textit{item}}_{i}\right)$ and $P\left(T|{\textit{item}}_{i}\right)$, which are described in detail in the next sections.

3.3.2 Item probability

The prior probability of the item, $P\left({\textit{item}}_{i}\right)$, can be obtained from the frequency (popularity) of an item in the Item-User (IU) matrix, where the ownership of ${\textit{item}}_{i}$ is defined as follows:

$\displaystyle{IU}_{{\textit{item}}_{i},{\textit{user}}_{u}}=\left\{\begin{array}{ll}1,&\textit{if }{\textit{item}}_{i}\textit{ is owned by }{\textit{user}}_{u}\\ 0,&\textit{otherwise}\end{array}\right.$ (7)

Traditionally, the item frequency is the total number of users that own an item, as follows:

$\displaystyle\textit{ItemFreq}=\sum_{{\textit{user}}_{u}\in U}{{IU}_{{\textit{item}}_{i},{\textit{user}}_{u}}}$ (8)

where $U=\{{\textit{user}}_{u}|u\in 1,2,\dots,N_{U}\}$ denotes a set of users from the target user’s friend network.

Thus, the relative item probability can be calculated as follows:

$\displaystyle P\left({\textit{item}}_{i}\right)=\frac{\sum_{{\textit{user}}_{u}\in U}{{IU}_{{\textit{item}}_{i},{\textit{user}}_{u}}}}{\sum_{{\textit{user}}_{u}\in U}{\sum_{{\textit{item}}_{i}\in I}{({IU}_{{\textit{item}}_{i},{\textit{user}}_{u}})}}}$ (9)

Unlike traditional approaches for computing the item probability, the proposed method weights ${IU}_{{\textit{item}}_{i},{\textit{user}}_{u}}$ by incorporating the adjusted user similarity, AdjUserSim (Eq. (5)). A larger AdjUserSim indicates that the item is more important to the target user. Thus, the adjusted user similarity between the target user and his/her neighbors can be used to weight an item’s relative importance to the target user. The new ${IU}^{\prime}_{{\textit{item}}_{i},{\textit{user}}_{u}}$ can be defined as follows:

$\displaystyle{IU}^{\prime}_{{\textit{item}}_{i},{\textit{user}}_{u}}={IU}_{{\textit{item}}_{i},{\textit{user}}_{u}}\times(\beta+{\textit{AdjUserSim}}_{{\textit{user}}_{t},{\textit{user}}_{u}})$ (10)

where $\beta>0$ denotes a predefined scale value to determine the relative weight between the two variables.

The new item frequency is obtained from ${IU}^{\prime}_{{\textit{item}}_{i},{\textit{user}}_{u}}$ , as follows:

$\displaystyle\textit{ItemFreq}^{\prime}=\sum_{{\textit{user}}_{u}\in U}{{IU}^{\prime}_{{\textit{item}}_{i},{\textit{user}}_{u}}}$ (11)

Thus, the weighted relative item probability can be calculated as follows:

$\displaystyle P\left({\textit{item}}_{i}\right)=\frac{\sum_{{\textit{user}}_{u}\in U}{{IU}^{\prime}_{{\textit{item}}_{i},{\textit{user}}_{u}}}}{\sum_{{\textit{user}}_{u}\in U}{\sum_{{\textit{item}}_{i}\in I}{({IU}^{\prime}_{{\textit{item}}_{i},{\textit{user}}_{u}})}}}$ (12)

3.3.3 Conditional probability incorporating tag-based personalized interest

The conditional probability, $P\left(T|{\textit{item}}_{i}\right)$, is the probability of the target user’s tag set conditioned on the item. To decrease the computation cost of evaluating $P\left(T|{\textit{item}}_{i}\right)$, this study makes the naive assumption of conditional independence among tag attributes, i.e., the attributes have no dependence relationships.

Each item is labeled with several tags by all users on the social network. The item’s tag information (item profile) is stored in the item-tag (IT) matrix, which consists of the tag name and the corresponding frequency with which users applied it. The frequency with which ${\textit{item}}_{i}$ is labeled using ${\textit{tag}}_{k}$ is as follows:

$\displaystyle{IT}_{{\textit{item}}_{i},{\textit{tag}}_{k}}=\text{Number of users labelling }{\textit{item}}_{i}\text{ using }{\textit{tag}}_{k}$ (13)

This study adopts the logarithmic normalized term frequency, defined as the following formula:

$\displaystyle{\textit{ITLog}}_{{\textit{item}}_{i},{\textit{tag}}_{k}}=\left\{\begin{array}{ll}0,&{IT}_{{\textit{item}}_{i},{\textit{tag}}_{k}}=0\\ 1+\log({IT}_{{\textit{item}}_{i},{\textit{tag}}_{k}}),&\textit{otherwise}\end{array}\right.$ (14)

Traditionally, the relative probability of ${\textit{tag}}_{k}$ for ${\textit{item}}_{i}$ can be computed as follows:

$\displaystyle\frac{{\textit{ITLog}}_{{\textit{item}}_{i},{\textit{tag}}_{k}}}{\sum_{{\textit{item}}_{j}\in I}{{\textit{ITLog}}_{{\textit{item}}_{j},{\textit{tag}}_{k}}}}$ (15)

Thus, the traditional conditional probability $P\left(T|{\textit{item}}_{i}\right)$ can be represented as follows:

$\displaystyle P\left(T|{\textit{item}}_{i}\right)=\prod_{{\textit{tag}}_{k}\in T}{{\left(\frac{{\textit{ITLog}}_{{\textit{item}}_{i},{\textit{tag}}_{k}}}{\sum_{{\textit{item}}_{j}\in I}{{\textit{ITLog}}_{{\textit{item}}_{j},{\textit{tag}}_{k}}}}\right)}}$ (16)

In contrast with the traditional approach for calculating the item-tag frequency, this study incorporates the target user’s personalized interest to adjust the value of the item-tag information. The personalized tag-based preference for the target user is stored in the user-tag (UT) matrix, where the tag distribution indicates the target user’s preferences of interest. The normalized frequency of ${\textit{tag}}_{k}$ tagged by the target user ${\textit{user}}_{t}$ is expressed as Eq. (3). The relative percentage weight of the target user’s interest is expressed as follows:

$\displaystyle\textit{WUT}_{{\textit{user}}_{t},{\textit{tag}}_{k}}=\textit{UTLog}_{{\textit{user}}_{t},{\textit{tag}}_{k}}\left/\sum_{{\textit{tag}}_{k}\in T}{{\textit{UTLog}}_{{\textit{user}}_{t},{\textit{tag}}_{k}}}\right.$ (17)

Thus, the weighted item-tag information, represented as ${\textit{ITLog}}^{\prime}_{{\textit{item}}_{i},{\textit{tag}}_{k}}$, is computed as follows:

$\displaystyle{\textit{ITLog}}^{\prime}_{{\textit{item}}_{i},{\textit{tag}}_{k}}={\textit{ITLog}}_{{\textit{item}}_{i},{\textit{tag}}_{k}}\times(\tau+{\textit{WUT}}_{{\textit{user}}_{t},{\textit{tag}}_{k}})$ (18)

where $\tau>0$ denotes a predefined value to determine the relative importance between ITLog and WUT.

Thus, the conditional probability $P\left(T|{\textit{item}}_{i}\right)$ can be defined as the following new formula:

$\displaystyle P\left(T|{\textit{item}}_{i}\right)=\prod_{{\textit{tag}}_{k}\in T}{{\left(\frac{{\textit{ITLog}}^{\prime}_{{\textit{item}}_{i},{\textit{tag}}_{k}}}{\sum_{{\textit{item}}_{i}\in I}{{\textit{ITLog}}^{\prime}_{{\textit{item}}_{i},{\textit{tag}}_{k}}}}\right)}}$ (19)

This study also adopts the Laplacian correction to avoid calculating probability values of zero, as follows:

$\displaystyle P\left(T|{\textit{item}}_{i}\right)=\prod_{{\textit{tag}}_{k}\in T}{{\left(\frac{1+{\textit{ITLog}}^{\prime}_{{\textit{item}}_{i},{\textit{tag}}_{k}}}{\left|I\right|+\sum_{{\textit{item}}_{i}\in I}{{\textit{ITLog}}^{\prime}_{{\textit{item}}_{i},{\textit{tag}}_{k}}}}\right)}}$ (20)

where $\left|I\right|$ denotes the size of the item set.
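A possible realization of Eqs (17), (18) and (20) is sketched below. The function names, the dictionary layouts and the default value of `tau` are assumptions. As an implementation choice that is not part of the paper, the sketch returns the logarithm of the product in Eq. (20): the logarithm is monotonic, so the item ranking is unchanged, while the product of many small factors can no longer underflow.

```python
import math


def wut_weights(profile_t):
    """Eq. (17): relative weight of each tag in the target user's profile.

    profile_t: dict mapping tag -> UTLog value for the target user (assumed layout).
    """
    total = sum(profile_t.values())
    if total == 0:
        return {}
    return {tag: value / total for tag, value in profile_t.items()}


def log_cond_prob(it_log, profile_t, tau=1.0):
    """Logarithm of Eq. (20) for every candidate item.

    it_log: dict mapping item -> {tag: ITLog value}; missing tags count as 0.
    tau: positive balance value (hypothetical default).
    """
    wut = wut_weights(profile_t)
    items = list(it_log)
    # Eq. (18): personalize the item-tag values with the target user's interest.
    weighted = {
        item: {tag: it_log[item].get(tag, 0.0) * (tau + w) for tag, w in wut.items()}
        for item in items
    }
    # Per-tag sum over all candidate items (denominator of Eq. (20)).
    column_sum = {tag: sum(weighted[item][tag] for item in items) for tag in wut}
    size = len(items)
    return {
        item: sum(
            math.log((1.0 + weighted[item][tag]) / (size + column_sum[tag]))
            for tag in wut
        )
        for item in items
    }
```

Because of the Laplacian correction, an item that lacks one of the target user’s tags still receives a finite log-probability instead of being eliminated.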

3.3.4 Ranked posterior probability and recommending items

Ranking the posterior probability $P\left({\textit{item}}_{i}|T\right)$ is equivalent to ranking its numerator, $P({\textit{item}}_{i})\times P\left(T|{\textit{item}}_{i}\right)$. The ranking score is obtained by combining Eqs (12) and (20) according to the following equation, and is named the recommendation score (RScore).

$\displaystyle\textit{RScore}=\frac{\sum_{{\textit{user}}_{u}\in U}{{IU}^{\prime}_{{\textit{item}}_{i},{\textit{user}}_{u}}}}{\sum_{{\textit{user}}_{u}\in U}{\sum_{{\textit{item}}_{i}\in I}{({IU}^{\prime}_{{\textit{item}}_{i},{\textit{user}}_{u}})}}}\times\prod_{{\textit{tag}}_{k}\in T}{{\left(\frac{1+{\textit{ITLog}}^{\prime}_{{\textit{item}}_{i},{\textit{tag}}_{k}}}{\left|I\right|+\sum_{{\textit{item}}_{i}\in I}{{\textit{ITLog}}^{\prime}_{{\textit{item}}_{i},{\textit{tag}}_{k}}}}\right)}}$ (21)

Based on the RScore, this study recommends the top $M$ items that the target user has not yet collected.

4. Experiments and evaluations

4.1 Evaluation metrics of the recommendation quality

This study suggests the top $M$ items that the target user has not yet collected. Recall, precision and F1-measure are three common performance measures in information retrieval. They are obtained by counting the correctly predicted items. Unlike e-commerce, where the number of product items is limited, users on social information sharing platforms collect a huge and diverse range of resource items. Social followings can also reduce duplicate work on tagging items, because users can easily find the items they like from the users that they follow on the social networks. This implies that users need not explicitly collect the same items. Therefore, traditional performance measures are not well suited to social information sharing platforms.

The quality of the recommended items can alternatively be evaluated by calculating the similarity of keywords between the recommended items and the target user’s items using a word embedding model. Word2vec is a popular word embedding model in natural language processing, which was created by a team of researchers led by Tomas Mikolov at Google [18, 23, 24]. It trains a shallow, two-layer neural network to reconstruct the linguistic contexts of words, mapping words to vectors of real numbers.

Word2vec can produce a distributed representation of words using either of two model architectures, continuous bag-of-words (CBOW) or continuous skip-gram. In the continuous bag-of-words architecture, the model predicts the current word from a window of surrounding context words. The order of the context words does not influence prediction. This study adopts the CBOW model, because it is faster than continuous skip-gram. Wikipedia text is used as the training corpus, because it covers a large number of terms.

In the word2vec model, each unique word in a corpus is assigned a corresponding vector in the word vector space. A word vector is positioned in the vector space such that words that share common contexts in the corpus are located in close proximity to one another in the space.

A word vector is represented as $\vec{v}_{k}=(x_{1},x_{2},\dots,x_{d},\dots,x_{N_{D}})$, where $N_{D}$ denotes the dimension of the word vector. An item consists of many keywords, so its vector can be inferred by computing a simple mean of the projection weight vectors of the given words as follows:

$\displaystyle\overrightarrow{vec}_{i}=\left.\left(\sum_{k\in K_{\textit{item}_{i}}}\vec{v}_{k}\right)\right/||K_{\textit{item}_{i}}||$ (22)

where $\vec{v}_{k}$ is the word vector of keyword $k$ in $K_{\textit{item}_{i}}$, and $K_{\textit{item}_{i}}$ is the keyword set of $\textit{item}_{i}$.
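A minimal Python sketch of the averaging in Eq. (22) is given below. The function name `item_vector` and the plain dictionary `word_vectors` (keyword to list of floats) are hypothetical stand-ins for a trained word2vec model. Skipping keywords that have no vector, and dividing by the number of keywords actually found, is an assumption of this sketch; the paper does not state how out-of-vocabulary keywords were handled.

```python
def item_vector(keywords, word_vectors):
    """Eq. (22): component-wise mean of the word vectors of an item's keywords."""
    # Assumption: keywords without a trained vector are ignored.
    vecs = [word_vectors[k] for k in keywords if k in word_vectors]
    if not vecs:
        return None  # no usable keyword, so the item has no vector
    dim = len(vecs[0])
    return [sum(v[d] for v in vecs) / len(vecs) for d in range(dim)]
```

For example, with two-dimensional vectors `{"a": [1.0, 0.0], "b": [0.0, 1.0]}`, an item with keywords `["a", "b"]` is mapped to `[0.5, 0.5]`.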

The cosine similarity between ${\textit{item}}_{i}$ and ${\textit{item}}_{j}$ is defined as follows:

$\displaystyle\textit{cosSim}(\overrightarrow{vec}_{i},\overrightarrow{vec}_{j})=\frac{\overrightarrow{vec}_{i}\cdot\overrightarrow{vec}_{j}}{||\overrightarrow{vec}_{i}||\,||\overrightarrow{vec}_{j}||}$ (23)

The above cosine similarity ranges from $-1$ to 1, and can be scaled between 0 and 1 as follows:

$\displaystyle\textit{scaled}\_\textit{cosSim}(\overrightarrow{vec}_{i},\overrightarrow{vec}_{j})=1-\frac{\cos^{-1}\left(\textit{cosSim}\left(\overrightarrow{vec}_{i},\overrightarrow{vec}_{j}\right)\right)}{\pi}$ (24)
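Equations (23) and (24) can be illustrated with the following Python sketch. The function names are hypothetical, vectors are assumed to be non-zero lists of floats, and the clamping step is an addition of this sketch to guard against floating-point rounding.

```python
import math


def cos_sim(u, v):
    """Eq. (23): cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def scaled_cos_sim(u, v):
    """Eq. (24): angular rescaling of the cosine similarity to [0, 1]."""
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    c = max(-1.0, min(1.0, cos_sim(u, v)))
    return 1.0 - math.acos(c) / math.pi
```

Because the rescaling goes through the angle between the vectors, identical directions map to 1, opposite directions to 0, and orthogonal vectors to 0.5.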

An item’s tags and keywords (mainly from the title, subtitle, headers, meta tags with keywords and description, and other content text defined with paragraph tags) are used to evaluate whether the recommended items are appropriate for the target user’s bookmark preferences. An item is described by its content keywords and by its tags, which are assigned by the users who collect it. Thus, the tags and keywords of the items are adopted to evaluate recommendation quality. The keywords are extracted from an item after removing the stop words, retaining only the noun keywords. Link items without tags or text content are skipped. The frequent tags and keywords of each item are used to calculate the similarity between items. The recommendation quality is measured from the similarity between the recommended items (represented as REC) and the target user’s recently collected items (represented as TAR). Thus, the recommendation quality for a target user is measured by calculating the average inter-group item similarity between REC and TAR, as follows:

$\displaystyle\textit{quality}_{\textit{user}_{t}}=\frac{1}{||\textit{REC}||\,||\textit{TAR}||}\sum_{i\in\textit{REC}}\sum_{j\in\textit{TAR}}\textit{scaled}\_\textit{cosSim}(\overrightarrow{vec}_{i},\overrightarrow{vec}_{j})$ (25)

A higher quality means that the recommended items are more similar to the target user’s currently collected items, that is, the recommendation is better.
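The quality measure of Eq. (25) can be sketched in Python as below, assuming that each item in REC and TAR has already been reduced to a non-zero item vector. The function names and the list-of-lists representation are hypothetical choices for this sketch.

```python
import math


def scaled_cos_sim(u, v):
    """Eq. (24): cosine similarity rescaled to [0, 1] through the angle."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - math.acos(max(-1.0, min(1.0, dot / norms))) / math.pi


def quality(rec_vecs, tar_vecs):
    """Eq. (25): mean scaled similarity over all (REC item, TAR item) pairs."""
    total = sum(scaled_cos_sim(r, t) for r in rec_vecs for t in tar_vecs)
    return total / (len(rec_vecs) * len(tar_vecs))
```

With one recommended item `[1.0, 0.0]` and two target items `[1.0, 0.0]` and `[0.0, 1.0]`, the pairwise scaled similarities are 1 and 0.5, giving a quality of 0.75.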

4.2 Data set and parameter setting

This study collected users, social relationships of users, and basic profile information of user-tag, item-user and item-tag from the Diigo social network information sharing website (https://www.diigo.com). The experimental dataset comprised 547 target users. Each user had an average of 265 tags, 411 items and 36 direct or indirect friends (followings and followers) within the target user’s 2-layer network. This study limited the network size within two layers.

The proposed model selected the top 15 neighbors from the target user’s social network. The Bayesian probability of each candidate item was calculated from these similar neighbors, and the top 25 items with the highest recommendation scores were recommended to the target user. Other necessary parameters were set as $\theta_{fg}=0.25$, $\theta_{fr}=0.25$, $\theta_{fg+fr}=0.25$, $\theta_{\textit{indf}}=0.25$, $\alpha=0.8$, $\beta=1.0$ and $\tau=1.0$, and the word vector size of the word2vec model was set to 120.

4.3 Experimental results

4.3.1 Comparisons with traditional approaches

To evaluate the relative improvement of the proposed model, this study built the proposed model and four other baseline models, namely popular and random recommenders with and without the target user’s friend network, as described in Table 1. The first two baseline models were based on the same friend network of the target user, as follows. (1) The popular recommender suggested the popular items from neighbors (users in the friend network) to the target user, without considering the Bayesian probability. (2) The random recommender suggested random items from neighbors (users in the friend network) to the target user, also without considering the Bayesian probability.

Table 1
Description and abbreviation of studied models

	Model description	Abbreviation
I	Proposed Bayesian model	Bayesian
II	Popular model with neighbors on friend network	Popular_FN
III	Random model with neighbors on friend network	Random_FN
IV	Popular model with users outside friend network	Popular_NoneFN
V	Random model with users outside friend network	Random_NoneFN

Table 2

Result of paired $t$ test for the studied models

Model	Bayesian	Popular_FN	Random_FN	Popular_NoneFN	Random_NoneFN
Bayesian	–	Sig.	Sig.	Sig.	Sig.
Popular_FN		–	Not Sig.	Sig.	Sig.
Random_FN			–	Not Sig.	Sig.
Popular_NoneFN				–	Not Sig.
Random_NoneFN					–

This study also constructed popular and random models that did not consider the target user’s friend network. These models suggested items to each target user from 15 randomly selected users who did not belong to the target user’s two-layer friend network. All models used the same number of collaborative users, the same number of suggested items, and the same system parameters.

Figure 2 compares the word2vec cosine quality among the proposed, popular and random models. The study employed the paired $t$ test at the 0.05 significance level to examine the differences in average recommendation quality, as listed in Table 2.
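For readers unfamiliar with the test, the statistic of a paired $t$ test can be sketched as follows, assuming that each model yields one quality score per target user and that the two score lists are aligned by user. The function name and inputs are hypothetical, and the sketch stops at the statistic: deciding significance at the 0.05 level additionally requires comparing it against the critical value of a $t$ distribution with $n-1$ degrees of freedom.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(scores_a, scores_b):
    """t statistic of a paired t test on per-user quality scores."""
    # Work on the per-user differences between the two models.
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    # Mean difference divided by its standard error (sample standard deviation).
    return mean(diffs) / (stdev(diffs) / math.sqrt(n))
```

The statistic is undefined when all differences are identical, because the sample standard deviation is then zero.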

The experimental results indicate that the proposed model performed significantly better than the other baseline models, as shown in Table 2. The proposed Bayesian probability model, which incorporates the tag-based personalized interest and weighted social relationship information, can improve recommendation quality. The relative improvement of the proposed model over the Popular_FN, Random_FN, Popular_NoneFN and Random_NoneFN models was 82.1%, 96.5%, 224.2% and 234.7%, respectively, as shown in Table 3.

Table 3

Relative improvement of the proposed model against other baseline models

Proposed model against the following models	Relative improvement
Bayesian against Popular_FN	82.1%
Bayesian against Random_FN	96.5%
Bayesian against Popular_NoneFN	224.2%
Bayesian against Random_NoneFN	234.7%

Figure 2.

Recommendation quality (word2vec similarity) of the studied models.

The social network information used in the collaborative stage can improve the recommendation quality. The popular and random models considering friend network had significantly better recommendation quality than the popular and random models without considering the friend network.

The popular model had slightly better recommendation quality than the random model, but the differences between popular and random models were not significant in this study.

4.3.2 Analysis on adjusted user similarity with social relation

Equation (5) combines the cosine similarity and social relation type to find similar neighbors at the collaborative stage. The predefined parameter $\alpha$ may affect the recommendation quality. This study examined several possible values of $\alpha$, as shown in Fig. 3. The word2vec similarity results indicate that recommendation quality degraded slightly as $\alpha$ decreased. This finding indicates that the cosine similarity is critical in the collaborative stage, with the maximum recommendation quality occurring when $\alpha=1.0$ (without considering social relations).

Figure 3.

Result of average diversity and quality for tuning parameter $\alpha$ .

The recommendation quality of $\alpha\leqslant$ 0.5 was lower than that of $\alpha\geqslant$ 0.5, but was still better than the other baseline approaches described in the previous section. Although combining the cosine similarity with the social relation may slightly degrade recommendation quality, it may increase the “diversity” of the candidate neighbors and the suggested items for the target user.

Diversity measures the dissimilarity of recommended items for a user [7, 27]. This similarity is often determined using the item’s content, but can also be determined using how similarly users rate items [1]. Unlike previous research, this study measured the diversity among recommended items using the item vector similarity based on the word2vec vector of the item’s tags and keywords. The diversity of recommended items was measured by calculating the item-similarity within the recommendation lists for the target user, as follows:

$\displaystyle\textit{diversity}_{\textit{user}_{t}}=1-\frac{1}{||\textit{REC}||(||\textit{REC}||-1)}\sum_{i\in\textit{REC}}\sum_{j\in\textit{REC},j\neq i}\textit{scaled}\_\textit{cosSim}(\overrightarrow{vec}_{i},\overrightarrow{vec}_{j})$ (26)
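The diversity measure of Eq. (26) can be sketched in Python as follows, assuming a recommendation list of at least two items, each represented by a non-zero item vector. The function names are hypothetical.

```python
import math


def scaled_cos_sim(u, v):
    """Eq. (24): cosine similarity rescaled to [0, 1] through the angle."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - math.acos(max(-1.0, min(1.0, dot / norms))) / math.pi


def diversity(rec_vecs):
    """Eq. (26): one minus the mean scaled similarity over ordered pairs i != j."""
    n = len(rec_vecs)  # must be at least 2
    total = sum(
        scaled_cos_sim(rec_vecs[i], rec_vecs[j])
        for i in range(n)
        for j in range(n)
        if i != j
    )
    return 1.0 - total / (n * (n - 1))
```

A list of identical items has diversity 0, whereas two mutually orthogonal items give 0.5 because their scaled similarity is 0.5.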

Figure 3 shows that, in general, the recommendation diversity slightly decreased and the recommendation quality slightly improved as $\alpha$ increased. Tuning $\alpha$ can thus balance similarity and diversity to a certain degree. This study used $\alpha=0.8$ in the benchmark model for comparison with other models.

4.3.3 Analysis of model without adjusting IU and IT information

(1) Analysis of model without adjusting item-user information

To examine the recommendation quality without the adjusted user similarity, this study calculated the item probability from the traditional item-user information, ${IU}_{{\textit{item}}_{i},{\textit{user}}_{u}}$ (Eq. (7)). As indicated in Fig. 4, the Traditional-IU model had slightly (but not significantly) lower recommendation quality than the proposed model using ${IU}^{\prime}_{{\textit{item}}_{i},{\textit{user}}_{u}}$ in Eq. (10). This indicates that using the proposed adjusted user similarity between the target user and his/her neighbors to weight the relative importance of items to the target user can slightly improve the recommendation quality of the Bayesian probability.

Figure 4.

Recommendation quality of models without adjusting IU and IT information.

Figure 5.

Word vector, keyword co-occurrence and item precision of 15 sampled target users.

(2) Analysis of model without weighting item-tag frequency

To examine the recommendation quality of the model without weighting the item-tag frequency, this study calculated the conditional probability from the traditional item-tag frequency (Eq. (16)). Namely, the item-tag values were not adjusted with WUT as defined in Eq. (18). This approach affected the result of the Bayesian probability in Eq. (21). As shown in Fig. 4, the Traditional-IT model had significantly inferior recommendation quality compared with the proposed model using ${\textit{ITLog}}^{\prime}_{{\textit{item}}_{i},{\textit{tag}}_{k}}$. This result indicates that the proposed model benefits from adjusting the value of item-tag information according to the target user’s personalized interest preference. The merit of using the target user’s personalized tag-based preference to weight the candidate items is that it emphasizes the relative importance of the candidate item’s corresponding tags, thus improving the recommendation quality of the Bayesian probability.

4.3.4 Comparison of evaluation measures with word vector similarity, keyword co-occurrence and item precision

The word vector is adequate for evaluating the recommendation quality of item content. This section compares the word vector similarity with the traditional keyword co-occurrence and item precision measures. The co-occurrence measure is defined as follows:

$\displaystyle\textit{co-occurrence}=\frac{||\textit{TAR}\cap\textit{REC}||}{||\textit{TAR}\cup\textit{REC}||}$ (27)

High co-occurrence means that TAR and REC share a large number of common keywords. In other words, the two groups contain many identical word vectors and therefore have a large word vector similarity.
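Equation (27) has the form of a Jaccard ratio over keyword sets, which can be sketched in Python as follows. The function name is hypothetical, and returning 0.0 for two empty keyword sets is an assumption of this sketch.

```python
def co_occurrence(tar_keywords, rec_keywords):
    """Eq. (27): shared keywords divided by all distinct keywords of TAR and REC."""
    tar, rec = set(tar_keywords), set(rec_keywords)
    union = tar | rec
    # Assumption: two empty keyword sets are given a co-occurrence of 0.
    return len(tar & rec) / len(union) if union else 0.0
```

For keyword sets {a, b, c} and {b, c, d}, two of the four distinct keywords are shared, so the co-occurrence is 0.5.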

Another traditional measure is item precision, which is the proportion of recommended items that successfully hit (predict) the target user’s item URLs, defined as follows:

$\displaystyle\textit{precision}=\frac{\textit{Number of hit items in recommended items}}{\textit{Number of recommended items}}$ (28)
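A short Python sketch of Eq. (28) follows, assuming items are identified by their URLs. The function name and the argument layout are hypothetical.

```python
def item_precision(recommended_urls, target_urls):
    """Eq. (28): fraction of recommended items found among the target user's items."""
    if not recommended_urls:
        return 0.0  # assumption: an empty recommendation list scores 0
    target = set(target_urls)
    hits = sum(1 for url in recommended_urls if url in target)
    return hits / len(recommended_urls)
```

A list of four recommendations with one exact URL match gives a precision of 0.25, regardless of how similar the other three items are in content.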

Figure 5 illustrates the word vector, keyword co-occurrence, and item precision measures for fifteen sampled target users. For the first nine samples with non-zero item precision, the three measures generally correlate with each other. The correlation coefficient between word vector and keyword co-occurrence was 0.92, and the correlation coefficient between word vector and item precision was 0.89. For the last six samples with zero item precision, the correlation coefficient between word vector and keyword co-occurrence was 0.70, and there was no correlation between word vector and item precision. As mentioned in the previous section, because a user’s collection may have a wide variety of items, a recommended item rarely hits one of the target user’s items, but may be similar to it in content. In this case, item precision fails to measure the recommendation quality, but the word vector measure, as an alternative and complement, can still evaluate the recommendation quality in terms of content similarity.
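The paper does not state which correlation coefficient was used. Assuming the Pearson coefficient, the comparison between two measures over the sampled users can be sketched as follows; the function name is hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equally long score lists; None if undefined."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        return None  # a constant series has no defined correlation
    return cov / math.sqrt(var_x * var_y)
```

Under this assumption, a group of samples whose item precision is zero throughout forms a constant series, so its correlation with any other measure is undefined rather than merely small.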

5. Conclusions and future research

5.1 Conclusions

Social information sharing platforms implement main functions in a cycle of knowledge management, and enable users to store, create, and disseminate any information on the internet, including articles, reports, documents, photos and videos. Modern social information sharing platforms allow people to follow other users, or be followed by fans, based on common interest via the social networks. This study designs recommendation systems for social networks incorporating personalized tag-based interest, based on Bayesian probability.

The experimental results indicate that the proposed Bayesian recommender system is promising for social information sharing platforms. Social relations on a target user’s social network help find users with similar interests in collaborative filtering. The Bayesian model can incorporate both the strength of a target user’s similarity with friends on social networks and the user’s tag-based personalized preference.

This study redesigns the Bayesian recommendation mechanism to tightly integrate social relationships, tag-based interests and item popularity into social information management systems. The proposed method can be applied in applications that provide tagging systems and social networks, such as blogs, books, articles, documents, pictures, audios and videos.

5.2 Future research

Future work will include investigating extensions to the proposed model, including enlarging the layers of social networks, analyzing the relationships in social networks, and combining users’ other explicit and implicit interests (e.g., clicking, sharing, forwarding, comments, responses). In addition, future research aims to alleviate the problems of tag ambiguity and tag synonyms in tagging systems, and cold start problems in recommender systems [14, 19, 26, 29]. Employing state-of-the-art word embedding technology, e.g., word2vec [23, 24] and fastText [4, 15], to integrate the users’ tag-based interest model in both collaborative and content-based filtering systems is a promising way to further enhance the proposed recommender system.

Acknowledgments

The authors would also like to thank Mr. W.-C. Chen, Mr. C.-H. Liao, and Mr. H.-J. Chiang, members of the Business Intelligence and Big Data Laboratory in NKUST, for collecting the experimental data and training the word2vec model.

References

1. Adamopoulos and Tuzhilin, On unexpectedness in recommender systems: Or how to better expect the unexpected, ACM Transactions on Intelligent Systems and Technology, Special Section on Novelty and Diversity in Recommender Systems 4(5) (2015), 54.

2. Adomavicius and Tuzhilin, Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions, IEEE Transactions on Knowledge and Data Engineering 17(6) (2005), 734–749.

3. Balabanovic and Shoham, Fab: content-based, collaborative recommendation, Communications of the ACM 40(3) (1997), 66–72.

4. Bojanowski, Grave, Joulin and Mikolov, Enriching word vectors with subword information, Transactions of the Association for Computational Linguistics 5 (2017), 135–146.

5. Brooks C.H. and Montanez, An analysis of the effectiveness of tagging in blogs, in: AAAI Spring Symposium: Computational Approaches to Analyzing Weblogs, Vol. 6, 2006, pp. 9–14.

6. Chen, Liu, Yang and Zou, Inferring tag co-occurrence relationship across heterogeneous social networks, Applied Soft Computing 66 (2018), 512–524.

7. Chen and He, Personality and recommendation diversity, in: Emotions and Personality in Personalized Services, Springer International Publishing, 2016, pp. 201–225.

8. Dalkir, Knowledge Management In Theory and Practice, Boston: Elsevier/Butterworth-Heinemann, 2005.

9. Gemmell, Schimoler, Mobasher and Burke, Tag-based resource recommendation in social annotation applications, in: User Modeling, Adaptation and Personalization, Girona, Spain, 2011.

10. Golder S.A. and Huberman B.A., The structure of collaborative tagging systems, Journal of Information Science 32 (2006), 198–208.

11. Grundspenkis, Agent based approach for organization and personal knowledge modelling: knowledge management perspective, Journal of Intelligent Manufacturing 18(4) (2007), 451–457.

12. Huang C.-L., Yeh P.-H., Lin C.-W. and Wu D.-C., Utilizing user tag-based interests in recommender systems for social resource sharing websites, Knowledge-Based Systems 56(2) (2014), 86–96.

13. Jäschke, Hotho, Schmitz, Ganter and Stumme, Discovering shared conceptualizations in folksonomies, Journal of Web Semantics 6(1) (2008), 38–53.

14. and Shen, Addressing cold-start: Scalable recommendation with tags and keywords, Knowledge-Based Systems 83 (2015), 42–50.

15. Joulin, Grave, Bojanowski and Mikolov, Bag of tricks for efficient text classification, in: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, Vol. 2, pp. 427–431.

16. Kim H.-N., A.-T. and Jo G.-S., Collaborative filtering based on collaborative tagging for enhancing the quality of recommendation, Electronic Commerce Research and Applications 9(1) (2010), 73–83.

17. Lamere, Social tagging and music information retrieval, Journal of New Music Research 37(2) (2008), 101–114.

18. and Mikolov, Distributed representations of sentences and documents, in: Proceedings of the 31st International Conference on Machine Learning (ICML-14), 2014, pp. 1188–1196.

19. Lika, Kolomvatsos and Hadjiefthymiades, Facing the cold start problem in recommender systems, Expert Systems with Applications 41(4) (2014), 2065–2073.

20. Jia, Zhang and Lin, Combining tag correlation and user social relation for microblog recommendation, Information Sciences 385 (2017), 325–337.

21. Marlow, Naaman, Boyd and Davis, HT06, tagging paper, taxonomy, Flickr, academic article, to read, in: Proceedings of the Seventeenth Conference on Hypertext and Hypermedia, 2006, pp. 31–40.

22. Mezghani, Péninou, Zayani C.A., Amous and Sèdes, Producing relevant interests from social networks by mining users’ tagging behaviour: A first step towards adapting social information, Data & Knowledge Engineering 108 (2017), 15–29.

23. Mikolov, Chen, Corrado and Dean, Efficient estimation of word representations in vector space, arXiv preprint arXiv:1301.3781v3, 2013.

24. Mikolov, Sutskever, Chen, Corrado and Dean, Distributed representations of words and phrases and their compositionality, in: Advances in Neural Information Processing Systems, 2013, pp. 3111–3119.

25. Morrison P.J., Tagging and searching: Search retrieval effectiveness of folksonomies on the World Wide Web, Information Processing & Management 44(4) (2008), 1562–1579.

26. Pereira A.L.V. and Hruschka E.R., Simultaneous co-clustering and learning to address the cold start problem in recommender systems, Knowledge-Based Systems 82 (2015), 11–19.

27. Pouli, Baras J.S. and Arvanitis, Increasing recommendation accuracy and diversity via social networks hyperbolic embedding, in: IEEE 11th Consumer Communications and Networking Conference, 2014, pp. 225–232.

28. Rashid A.M., Albert, Cosley, Lam S.K. and Konsta S.M., Getting to know you: Learning new user preferences in recommender systems, in: Proceedings of the 7th International Conference on Intelligent User Interfaces, 2002, pp. 127–134.

29. Son L.H., Dealing with the new user cold-start problem in recommender systems: A comparative review, Information Systems 58 (2016), 87–104.

30. Wei, Huang and Fu, A survey of e-commerce recommender systems, in: IEEE International Conference on Service Systems and Service Management, 2007, pp. 1–5.

31. Wetherbe J.C., Turban, Leidner D.E. and McLean E.R., Information technology for management: Transforming organizations in the digital economy, New York: Wiley, ISBN 0-471-78712-4, 2007.

32. and Zhang Z.K., Enhancing personalized recommendations on weighted social tagging networks, Physics Procedia 3(5) (2010), 1877–1885.

33. Zanardi and Capra, A scalable tag-based recommender system for new users of the social web, in: 22nd International Conference on Database and Expert Systems Applications, Toulouse, France, 2011.

34. Zanardi and Capra, Social ranking: Uncovering relevant content using tag-based recommender systems, in: 2nd ACM International Conference on Recommender Systems, Lausanne, Switzerland, 2008, pp. 51–58.

35. Zhang, Yao and Sun, Deep learning based recommender system: A survey and new perspectives, arXiv preprint arXiv:1707.07435, 2017.

36. Zhang, Gao, Guo and Sun, Combining content and relation analysis for recommendation in social tagging systems, Physica A: Statistical Mechanics and its Applications 391(22) (2012), 5759–5768.
