A Social Network-Based Recommender System (SNRS) 
Jianming He and Wesley W. Chu 
 
 
Computer Science Department  
University of California, Los Angeles, CA 90095 
jmhek@cs.ucla.edu, wwc@cs.ucla.edu 
Abstract. Social influence plays an important role in product marketing. However, it has 
rarely been considered in traditional recommender systems. In this paper we present a new 
paradigm  of  recommender  systems  which  can  utilize  information  in  social  networks,  in-
cluding user  preferences,  item's  general  acceptance,  and  influence  from  social  friends.  A 
probabilistic model is developed to make personalized recommendations from such infor-
mation. We extract data from a real online social network, and our analysis of this large da-
taset reveals that friends have a tendency to select the same items and give similar ratings. 
Experimental results on this dataset show that our proposed system not only improves the 
prediction accuracy of recommender systems but also remedies the data sparsity and cold-
start issues inherent in collaborative filtering. Furthermore, we propose to improve the per-
formance of our system by applying semantic filtering of social networks, and validate its 
improvement via a class project experiment. In this experiment we demonstrate how rele-
vant friends can be selected for inference based on the semantics of friend relationships and 
finer-grained user ratings. Such technologies can be deployed by most content providers. 
1  Introduction 
In order to overcome information overload, recommender systems have become a 
key tool for providing users with personalized recommendations on items such as 
movies, music, books, news, and web pages. Intrigued by many practical applica-
tions,  researchers  have  developed  algorithms  and  systems  over  the  last  decade. 
Some of them have been commercialized by online venders such as Amazon.com, 
Netflix.com,  and  IMDb.com.  These  systems  predict  user  preferences  (often 
represented as numeric ratings) for new items based on the user's past ratings on 
other items. There are typically two types of algorithms for recommender systems 
--  content-based methods  and  collaborative filtering.  Content-based  methods 
measure the similarity of the  recommended item (target item) to the ones that a 
target user (i.e., user who receives recommendations) likes or dislikes [25, 22, 30] 
based on item attributes. On the other hand, collaborative filtering finds users with 
tastes that are similar to the target user’s based on their past ratings. Collaborative 
filtering will then make recommendations to the target user based on the opinions 
of those similar users [3, 5, 27].  
Despite all of these efforts, recommender systems still face many challenging 
problems. First, there are demands for further improvements on the prediction ac-
2  
curacy  of  recommender  systems.  In  October  2006,  Netflix  announced  an  open 
competition with the grand prize of $1,000,000 for the best algorithm that predicts 
user ratings for films (http://www.netflixprize.com). The improvement in the pre-
diction accuracy can increase user satisfaction, which in turn leads to higher prof-
its for those e-commerce websites. Second, algorithms for recommender systems 
suffer from many issues. For example, in order to measure item similarity, con-
tent-based methods rely on explicit item descriptions. However, such descriptions 
may be difficult to obtain for items like ideas or opinions. Collaborative filtering 
has  the data sparsity  problem  and  the cold-start  problem  [1].  In  contrast  to  the 
huge number of items in recommender systems, each user normally only rates a 
few. Therefore, the user/item rating matrix is typically very sparse. It is difficult 
for  recommender  systems  to  accurately  measure  user  similarities  from  those  li-
mited number of reviews. A related problem is the cold-start problem. Even for a 
system that is not particularly sparse, when a user initially joins, the system has 
none or perhaps only a few reviews from this user. Therefore, the system cannot 
accurately interpret this user's preference.  
To tackle those problems, two approaches have been proposed [3, 29, 21, 23]. 
The first approach is to condense the user/item rating matrix through dimensional-
ity reduction techniques such as Singular Value Decomposition (SVD) [3, 29]. By 
clustering users or items according to their latent structure, unrepresentative users 
or items can be discarded, and thus the user/item matrix becomes denser. Howev-
er, these methods do not significantly improve the performance of recommender 
systems, and sometimes make the performance even worse. 
The second approach is to "enrich" the user/item rating matrix by 1) introduc-
ing default ratings or implicit user ratings, e.g., the time spent on reading articles 
[23]; 2) using half-baked rating predictions from content-based methods [21]; or 
3)  exploiting  transitive  associations  among  users  through  their  past  transactions 
and feedback [12]. These methods improve the performance of recommender sys-
tems to some extent. In this paper we try to solve these problems from a different 
perspective. In particular,  we propose a new paradigm of  recommender systems 
by utilizing information in social networks, especially that of social influence. 
Traditional recommender systems do not take into consideration explicit so-
cial relations among users, yet the importance of social influence in product mar-
keting  has  long  been  recognized  [32,  35].  Intuitively,  when  we  want  to  buy  a 
product that is not familiar, we often consult with our friends who have already 
had experience with the product, since they are those that we can reach for imme-
diate advice. When friends recommend a product to us, we also tend to accept the 
recommendation because their inputs are trustworthy. Many marketing strategies 
that have leveraged this aspect of human nature have achieved great success. One 
classic  example  is  the  Hotmail's  free  email  service.  The  marketing  strategy  of 
Hotmail is to attach a promotion message at the bottom of every outgoing email: 
“Get your private, free email at http://www.hotmail.com.” People who receive the 
email will sign up and then further propagate this promotion message. As a result, 
the number of Hotmail user accounts grew from zero to 12 million in 18 months 
on  only  a  $500,000  advertising  budget—thereby  out-performing  many  conven-
3 
tional  marketing  strategies  [14].  Thus,  social  influences  play  a  key  role  when 
people are making decisions of adopting products. 
Additionally, the integration of social networks can theoretically improve the 
performance of current recommender systems. First, in terms of the prediction ac-
curacy, the additional information about users and their friends obtained from so-
cial  networks  improves  the  understanding  of  user  behaviors  and  ratings.  There-
fore,  we  can  model  and  interpret  user  preferences  more  precisely,  and  thus 
improve  the  prediction  accuracy.  Second,  with  friend  information  in  social  net-
works, it is no longer necessary to find similar users by measuring their rating si-
milarity, because  the  fact that two people are friends already indicates that they 
have things in common. Thus, the data sparsity problem can be alleviated. Finally, 
for the cold-start issue, even if a user has no past reviews, recommender system 
still  can  make  recommendations  to  the  user  based  on  the  preferences  of  his/her 
friends  if it integrates  with social networks.  All of these intuitions and observa-
tions motivate us to design a new paradigm of recommender systems that can take 
advantage of information in social networks.  
The recent emergence of online social networks (OSNs) gives us an opportu-
nity to investigate the role of social influence in recommender systems. With the 
increasing  popularity  of  Web  2.0,  many  OSNs,  such  as  Myspace.com,  Face-
book.com,  and  Linkedin.com  have  emerged.  Members  in  those  networks  have 
their own personalized space where they not only publish their biographies, hob-
bies, interests, blogs, etc., but also list their friends. Friends or visitors can visit 
these  personal  spaces  and  leave  comments.  Note  that  in  this  paper  we  define 
friends as any two users who are connected by an explicit social link. We define 
immediate friends as those friends who are just one hop away from each other in a 
social network graph, and distant friends as friends who are multiple hops away. 
OSNs provide platforms where people can place themselves on exhibit and main-
tain connections with friends. As OSNs continue to gain more popularity, the un-
precedented  amount  of  personal  information  and  social  relations  improve  social 
science research where it was once limited by a lack of data.  
In our research, we are interested in the role of explicit social relations in re-
commender systems, such as how user preferences or ratings are correlated with 
those of friends, and how to use such correlations to design a better recommender 
system.  In  particular,  we  design  an  algorithm  framework  which  makes  recom-
mendations based on user's own preferences, the general acceptance of the target 
item, and the opinions from social friends. We crawl a real online social network 
from Yelp.com, and perform extensive analysis on this dataset. Some of the key 
questions, such as whether or not friends tend to select the same item, and whether 
or not friends tend to give similar ratings, have been studied in this dataset. We al-
so use this dataset to evaluate the performance of our proposed system on the pre-
diction accuracy, data sparsity, and cold-start. The experimental results of our sys-
tem show significant improvement against traditional collaborative filtering in all 
of  those  aspects.  For  example,  the  prediction  accuracy  has  improved  by  17.8% 
compared to traditional collaborative filtering. Furthermore, we propose to use the 
semantics  of  friend  relationships  and  finer-grained  user  ratings  to  improve  the 
prediction accuracy.   
4  
The remainder of the paper is organized as follows. First, in Section 2 we give 
a background of  traditional collaborative filtering algorithms. Then  we  formally 
propose a social network-based recommender system in Section 3. In Section 4 we 
introduce the dataset that we crawled from Yelp, and present some analytical stu-
dies on this dataset. Following that, we evaluate the performance of the proposed 
system on the Yelp dataset in Section 5. In Section 6 we propose to further im-
prove the prediction accuracy of the system by applying semantic filtering of so-
cial networks, and validate its improvement via a class experiment. In Section 7 
we review related studies, and conclude in Section 8.  
2  Background 
After the pioneering work in the Grouplens project in 1994 [27], collaborative fil-
tering (CF) soon became one of the most popular algorithms in recommender sys-
tems. Many variations of this algorithm have also been proposed [2, 21, 11, 36, 
13]. In this paper we will use the traditional CF as one of the comparison methods. 
Therefore, the remainder of this section will focus on this algorithm. 
The assumption of CF is that people who agree in the past tend to agree again 
in the future. Therefore, CF first finds users with taste similar to the target user's. 
CF will then make recommendations to the target user by predicting the target us-
er's rating  to the target item  based on the ratings of  his/her top-K  similar  users. 
User ratings are often represented by discrete values within a certain range, e.g., 
one  to  five.  A  one  indicates  an  extreme  dislike  to  the  target  item,  while  a  five 
shows high praise. Let RUI be the rating of the target user U on the target item I. 
Thus, RUI is estimated as the weighted sum of the votes of similar users as follows. 
R
UI
=
R
U
+
Z
×
(
R
VI
−
R
V
)
, 
(1) 
 
VUw
)
(
,
∑
V
Ψ∈
 
where  UR and 
VR represent the average ratings of the target user U and every us-
er V in U's neighborhood, Ψ, which consists of the top-K similar users of U. w(U, 
 is a normalizing 
V) is the weight between users U and V, and Z =  ∑V
constant to normalize total weight to one. Specifically, w(U, V) can be defined us-
ing the Pearson correlation coefficient [27]. 
 
1
VUw
(
)
,
(2) 
VUw
(
,
)
=
 
(
∑
I
R
(
UI
I
∑
R
UI
−
R
U
2
−
R
U
)
R
)(
VI
∑
(
I
−
R
V
)
R
VI
2
−
R
V
)
 
5 
where the summations over I are over the common items for which both user U 
and V have voted. 
Other variations to this algorithm include different weighting techniques. For 
example, when two users have less than 50 co-rated items, [11] proposed to insert 
a significance weighting factor of n/50 to the original weight, where n is the num-
ber of co-rated items. As we can see, traditional collaborative filtering and its var-
iations do not utilize the semantic friend relations among users in recommender 
systems; however, this is essential to the buying decisions of users. In the follow-
ing  sections,  we  are  going  to  present  a  new  paradigm  of  recommender  systems 
which improves the performance of traditional recommender systems by using the 
information in social networks.   
3  A Social Network-Based Recommender System 
Before we introduce the system, let us first show a typical scenario. Angela wants 
to watch a movie on a weekend. Her favorite movies are dramas. From the Inter-
net, she finds two movies particularly interesting, "Revolutionary Road" and "The 
Curious Case of Benjamin Button." These two movies are all highly rated in the 
message  board  at  Yahoo  Movies.  Because  she  cannot  decide  which  movie  to 
watch, she calls her best friend Linda whom she often hangs out with. Linda has 
not viewed these two movies either, but she knew that one of her officemates had 
just  watched  "Revolutionary  Road"  and  highly  recommended  it.  So  Linda  sug-
gests "Why don't  we  go to  watch Revolutionary  Road together?" Angela is cer-
tainly willing to take Linda’s recommendation, and therefore has a fun night at the 
movies with her friend. If we review this scenario, we can see at least three factors 
that  really  contribute  to  the  Angela's  final  decision.  The  first  factor  is  Angela's 
own preference for drama movies. If Angela did not like drama movies, she would 
be  less  likely  to  pick  something  like  "Revolutionary  Road"  to  begin  with.  The 
second factor is the public reviews on these two movies. If these movies received 
horrible reviews, Angela would most likely lose interest and stop any further in-
vestigation.  Finally,  it  is  the  recommendation  from  Angela's  friend,  Linda,  that 
makes Angela finally choose "Revolutionary Road." Interestingly, Linda's opinion 
is also influenced by her officemate. If we recall the decisions that we make in our 
daily life, such as finding restaurants, buying a house, and looking for jobs, many 
of them are actually influenced by these three factors.  
Figure 3.1 further illustrates how these three factors impact customers' final 
buying decisions. Intuitively, a customer's buying decision or rating is decided by 
both  his/her  own  preference  for  similar  items  and  his/her  knowledge  about  the 
characteristics of the target item. A user's preference, such as Angela’s interest in 
drama  movies,  is  usually  reflected  from  the  user’s  past  ratings  to  other  similar 
items,  e.g.  the  number  of  drama  movies  that  Angela  previously  viewed  and  the 
average rating that Angela gave to those movies. Knowledge about the target item 
can be obtained from public media such as magazines, television, and the Internet. 
Meanwhile, the feedbacks from friends are another source of knowledge regarding 
6  
the item, and they are often more trustworthy than advertisements. When a user 
starts considering the feedbacks from his/her friends, he/she is then influenced by 
his/her friends. Note that this influence is not limited to that from our immediate 
friends. Distant friends can also cast their influence indirectly to us; e.g., Angela 
was influenced by Linda's officemate in the previous scenario. Each one of these 
three factors has an impact on a user’s final buying decision. If the impact from all 
of them is positive, it is very likely that the target user will select the item. On the 
contrary, if any has a negative influence, e.g., very low ratings in other user re-
views, the chance that the target user will select the item will decrease. With such 
an  understanding  in  mind,  we  are  going  to  propose  a  social  network-based  re-
commender system (SNRS) in the following subsections. As we mentioned, social 
influences can come from not only immediate friends but also distant friends. The 
techniques  for  handling  these  types  of  influences  are  different.  We  shall  begin 
with the immediate friend inference, in which we only consider influences from 
immediate friends. Then, in the distant friend inference, we will describe how we 
incorporate influences from distant friends via leveraging the immediate friend in-
ference.  
Figure 3.1: The three factors that influence a customer’s buying decision: user preference 
for similar items, information regarding the target item from the public media, and feed-
backs from friends. 
 
 
3.1 Immediate Friend Inference 
We introduce the following naming conventions for the variables used in this pa-
per. We use capitalize letters to represent  variables, and use capitalize and bold 
7 
letters to represent the corresponding sets. The value for each variable or variable 
set is represented by the corresponding lowercase letter. 
Formally, let us consider a social network as a graph G = (U, E) in which U 
represents nodes (users) and E represents links (social relations). Each user U in U 
has a set of attributes AU as well as immediate neighbors (friends) N(U) such that 
if V ∈ N(U), (U, V) ∈ E. The values of attributes AU are represented as aU. More-
over, a recommender system contains the records of users’ ratings, which can be 
represented by a triple relation of T = (U, I, R) in which U is the users in the so-
cial network G; I is the set of items (products or services), and each item I in I has 
a set of attributes A'I. R stands for the ratings such that each RUI in R is user U’s 
rating on item I. RUI has a numeric value k (e.g. k∈{1, 2,… 5}). Moreover, we de-
fine I(U) as the set of items that user U has reviewed, and refer to the set of re-
viewers  of  item I  as U(I).  The  goal  of  this  recommender  system  is  to  predict 
Pr(RUI = k | A’=a'I, A=aU, {RVI = rVI : ∀V∈U(I)∩N(U)}); i.e., the probability dis-
tribution of the target user U's rating on the target item I given the attribute values 
of item I, the attribute values of user U, and the ratings on item I rated by U's im-
mediate friends. Once we obtain this distribution, RUI is calculated as the expecta-
tion of the distribution. Items with high estimated ratings will be recommended to 
the target user, and users with high estimated ratings on the target item are the po-
tential buyers. 
In order to estimate Pr(RUI = k | A’=a'I, A=aU, {RVI = rVI : ∀V∈U(I)∩N(U)}), 
we adopt the naive Bayes assumption which assumes that the influences from item 
attribute values, user attribute values, and immediate friends' ratings are indepen-
dent. Although this assumption simplifies the correlations among these variables, 
the naive Bayes model has been shown to be quite effective in many applications 
including  textual  document  classification  [16].  By  making  this  assumption,  the 
original conditional probability can be factorized as follows,  
Pr(
R
UI
=
aAaA
'
U
=
=
,
'
I
{,
R
VI
=
r
VI
:
∈∀
V
U
I
)(
∩
N
U
(
)})
(3) 
 
|
k
1
Z
Pr(
R
UI
=
=
Pr(
=
k
|
aA
'
=
)'
I
×
Pr(
R
UI
Rk
{|
VI
=
r
VI
:
∈∀
V
=
R
UI
U
I
)(
×
 
k
|
∩
)
=
aA
U
)})
U
(
N
 
First, Pr(RU = k | A’= a'I,) is the conditional probability that the target user U 
will give a rating k to an item with the same attribute values as item I. This proba-
bility represents U's preference for items similar to I. Because this value depends 
on the attribute values of items rather than an individual item, we drop the sub-
script I in RUI for simplification. Second, Pr(RI = k | A = aU) is the probability that 
the target item I will receive a rating value k from a reviewer whose attribute val-
ues are the same as U. This probability reflects the general acceptance of the target 
item I  by  users  like U.  For  the  same  reason,  because  this  value  depends  on  the 
attribute values of users rather than a specific user, we drop the subscript U in RUI. 
Finally, Pr(RUI = k | {RVI = rVI : ∀V∈ U(I)∩N(U)}) is the probability that the tar-
get user U gives a rating value k to the target item I given the ratings of U's imme-
diate friends on item I. This is where we actually take social influences into con-
8  
sideration in our system. In addition, Z is a normalizing constant. We shall present 
the methods to estimate each of the factors in the following subsections.   
3.1.1  User Preference 
As we pointed out, Pr(RU = k | A' = a'I) measures the target user U's preference 
for the items similar to item I. For example, if we want to know how high Angela 
will rate "Revolutionary Road," Pr(RU = k | A' = a'I) gives us a hint of how likely 
it is that Angela will give a rating k to a drama movie which is also casted by Kate 
Winslet. To estimate this probability, we adopt the naive Bayes assumption again. 
We assume that the item attributes in A', e.g., category and cast, are independent 
of each other. Therefore, we have 
 
(4) 
(5) 
(6) 
Pr(
R
U
=
k
|
aA
'
=
)'
I
=
Pr(
R
U
=
k
Pr(
R
U
=
nj
∏ =
k
)
=
×
j
1
=
AA
'
Pr(
,'
1
2
Pr(
,...,
RA
|
'
j
U
A
)'
n
AA
Pr(
'
,'
)
×
1
2
AA
Pr(
,'
'
,...,
1
k
)
=
2
=
k
)
n
|
RA
,...,
'
U
A
)'
n
,
A
,'{'
=
1
AA
'
,...,
A
}'
n
2
 
 
where Pr(A'1, A'2, ..., A'n) can be treated as a normalizing constant, Pr(RU = k) is 
the prior probability that U gives a rating k, and Pr(A'j | RU = k) is the conditional 
probability that each item attribute A'j in A' has a value a'j given U rated k; e.g., 
Pr(movie type = drama | RU = 4) is the probability that a movie will be a type of 
drama movie, given that U gives a rating 4. The last two probabilities can be esti-
mated from counting the review ratings of the target user U. Specifically,  
Pr(
R
U
=
k
)
=
I
(
k
1)
+
n
+
,
and 
 
R
=
U
I
U
)
(
 
Pr(
A
'
j
=
Ra
'
U
|
j
=
k
)
=
I
(
A
'
j
I
Ra
'
=
U
R
k
(
)
U
,
j
=
=
+
k
m
1)
+
,
 
 
where |I(U)| is the number of reviews of user U's in the training set, |I(RU = k)| is 
the number of reviews that user U gives a rating value k, and |I (A'j = a'j, RU = k)| 
is the number of reviews that U gives a rating value k while attribute A'j of the cor-
responding target item has a value a'j. Notice that we insert an extra value 1 to the 
numerators in both equations, and add n, the range of review ratings to the deno-
minator in Eq. (5), and m, the range of A'j's values, to the denominator in Eq. (6). 
This method is also known as Laplace estimate, a well-known technique in esti-
mating probabilities [7], especially on a small size of training samples. Because of 
Laplace estimate, "strong" probabilities, like 0 or 1, from direct probability com-
putation can be avoided.