Foundations and Trends in Information Retrieval
Vol. 2, No 1-2 (2008) 1–135
© 2008 Bo Pang and Lillian Lee. This is a pre-publication version; there
are formatting and potentially small wording differences from the final
version.
DOI: xxxxxx
Opinion mining and sentiment analysis
Bo Pang1 and Lillian Lee2
1 Yahoo! Research, 701 First Ave. Sunnyvale, CA 94089, U.S.A., bopang@yahoo-inc.com
2 Computer Science Department, Cornell University, Ithaca, NY 14853, U.S.A., llee@cs.cornell.edu
Abstract
An important part of our information-gathering behavior has always been to find out what other people
think. With the growing availability and popularity of opinion-rich resources such as online review sites and
personal blogs, new opportunities and challenges arise as people now can, and do, actively use information
technologies to seek out and understand the opinions of others. The sudden eruption of activity in the area of
opinion mining and sentiment analysis, which deals with the computational treatment of opinion, sentiment,
and subjectivity in text, has thus occurred at least in part as a direct response to the surge of interest in new
systems that deal directly with opinions as a first-class object.
This survey covers techniques and approaches that promise to directly enable opinion-oriented information-
seeking systems. Our focus is on methods that seek to address the new challenges raised by sentiment-
aware applications, as compared to those that are already present in more traditional fact-based analysis. We
include material on summarization of evaluative text and on broader issues regarding privacy, manipulation,
and economic impact that the development of opinion-oriented information-access services gives rise to. To
facilitate future work, a discussion of available resources, benchmark datasets, and evaluation campaigns is
also provided.
Contents
1 Introduction
1.1 The demand for information on opinions and sentiment
1.2 What might be involved? An example examination of the construction of an opinion/review search engine
1.3 Our charge and approach
1.4 Early history
1.5 A note on terminology: Opinion mining, sentiment analysis, subjectivity, and all that
2 Applications
2.1 Applications to review-related websites
2.2 Applications as a sub-component technology
2.3 Applications in business and government intelligence
2.4 Applications across different domains
3 General challenges
3.1 Contrasts with standard fact-based textual analysis
3.2 Factors that make opinion mining difficult
4 Classification and extraction
Part One: Fundamentals
4.1 Problem formulations and key concepts
4.1.1 Sentiment polarity and degrees of positivity
4.1.2 Subjectivity detection and opinion identification
4.1.3 Joint topic-sentiment analysis
4.1.4 Viewpoints and perspectives
4.1.5 Other non-factual information in text
4.2 Features
4.2.1 Term presence vs. frequency
4.2.2 Term-based features beyond term unigrams
4.2.3 Parts of speech
4.2.4 Syntax
4.2.5 Negation
4.2.6 Topic-oriented features
Part Two: Approaches
4.3 The impact of labeled data
4.4 Domain adaptation and topic-sentiment interaction
4.4.1 Domain considerations
4.4.2 Topic (and sub-topic or feature) considerations
4.5 Unsupervised approaches
4.5.1 Unsupervised lexicon induction
4.5.2 Other unsupervised approaches
4.6 Classification based on relationship information
4.6.1 Relationships between sentences and between documents
4.6.2 Relationships between discourse participants
4.6.3 Relationships between product features
4.6.4 Relationships between classes
4.7 Incorporating discourse structure
4.8 Language models
4.9 Special considerations for extraction
4.9.1 Identifying product features and opinions in reviews
4.9.2 Problems involving opinion holders
5 Summarization
5.1 Single-document opinion-oriented summarization
5.2 Multi-document opinion-oriented summarization
5.2.1 Some problem considerations
5.2.2 Textual summaries
5.2.3 Non-textual summaries
5.2.4 Review(er) quality
6 Broader implications
6.1 Economic impact of reviews
6.1.1 Surveys summarizing relevant economic literature
6.1.2 Economic-impact studies employing automated text analysis
6.1.3 Interactions with word of mouth (WOM)
6.2 Implications for manipulation
7 Publicly available resources
7.1 Datasets
7.1.1 Acquiring labels for data
7.1.2 An annotated list of datasets
7.2 Evaluation campaigns
7.2.1 TREC opinion-related competitions
7.2.2 NTCIR opinion-related competitions
7.3 Lexical resources
7.4 Tutorials, bibliographies, and other references
8 Concluding remarks
References
1 Introduction
Romance should never begin with sentiment. It should begin with science and end with a
settlement. — Oscar Wilde, An Ideal Husband
1.1 The demand for information on opinions and sentiment
“What other people think” has always been an important piece of information for most of us during the
decision-making process. Long before awareness of the World Wide Web became widespread, many of us
asked our friends to recommend an auto mechanic or to explain who they were planning to vote for in
local elections, requested reference letters regarding job applicants from colleagues, or consulted Consumer
Reports to decide what dishwasher to buy. But the Internet and the Web have now (among other things) made
it possible to find out about the opinions and experiences of those in the vast pool of people that are neither
our personal acquaintances nor well-known professional critics — that is, people we have never heard of.
And conversely, more and more people are making their opinions available to strangers via the Internet.
Indeed, according to two surveys of more than 2000 American adults each [63, 127],
• 81% of Internet users (or 60% of Americans) have done online research on a product at least once;
• 20% (15% of all Americans) do so on a typical day;
• among readers of online reviews of restaurants, hotels, and various services (e.g., travel agencies
or doctors), between 73% and 87% report that reviews had a significant influence on their
purchase;1
• consumers report being willing to pay from 20% to 99% more for a 5-star-rated item than a
4-star-rated item (the variance stems from what type of item or service is considered);
• 32% have provided a rating on a product, service, or person via an online ratings system, and 30%
(including 18% of online senior citizens) have posted an online comment or review regarding a
product or service.2
1 Section 6.1 discusses quantitative analyses of actual economic impact, as opposed to consumer perception.
2 Interestingly, Hitlin and Rainie [123] report that “Individuals who have rated something online are also more skeptical of the
information that is available on the Web”.
We hasten to point out that consumption of goods and services is not the only motivation behind people’s
seeking out or expressing opinions online. A need for political information is another important factor.
For example, in a survey of over 2500 American adults, Rainie and Horrigan [249] studied the 31% of
Americans (over 60 million people) who were 2006 campaign internet users, defined as those who
gathered information about the 2006 elections online and exchanged views via email. Of these,
• 28% said that a major reason for these online activities was to get perspectives from within
their community, and 34% said that a major reason was to get perspectives from outside their
community;
• 27% had looked online for the endorsements or ratings of external organizations;
• 28% said that most of the sites they use share their point of view, but 29% said that most of the
sites they use challenge their point of view, indicating that many people are not simply looking
for validations of their pre-existing opinions; and
• 8% posted their own political commentary online.
The user hunger for and reliance upon online advice and recommendations that the data above reveals
is merely one reason behind the surge of interest in new systems that deal directly with opinions as a
first-class object. However, Horrigan [127] reports that while a majority of American internet users report
positive experiences during online product research, 58% also report that online information was
missing, impossible to find, confusing, and/or overwhelming. Thus, there is a clear need to aid consumers of
products and of information by building better information-access systems than are currently in existence.
The interest that individual users show in online opinions about products and services, and the potential
influence such opinions wield, is something that vendors of these items are paying more and more attention
to [124]. The following excerpt from a whitepaper is illustrative of the envisioned possibilities, or at the least
the rhetoric surrounding the possibilities:
With the explosion of Web 2.0 platforms such as blogs, discussion forums, peer-to-peer networks,
and various other types of social media ... consumers have at their disposal a soapbox
of unprecedented reach and power by which to share their brand experiences and opinions,
positive or negative, regarding any product or service. As major companies are increasingly
coming to realize, these consumer voices can wield enormous influence in shaping
the opinions of other consumers – and, ultimately, their brand loyalties, their purchase decisions,
and their own brand advocacy. ... companies can respond to the consumer insights
they generate through social media monitoring and analysis by modifying their marketing
messages, brand positioning, product development, and other activities accordingly. [328]
But industry analysts note that the leveraging of new media for the purpose of tracking product image
requires new technologies; here is a representative snippet describing their concerns:
Marketers have always needed to monitor media for information related to their brands –
whether it’s for public relations activities, fraud violations3, or competitive intelligence.
But fragmenting media and changing consumer behavior have crippled traditional monitoring
methods. Technorati estimates that 75,000 new blogs are created daily, along with 1.2
million new posts each day, many discussing consumer opinions on products and services.
Tactics [of the traditional sort] such as clipping services, field agents, and ad hoc research
simply can’t keep pace. [154]
3 Presumably, the author means “the detection or prevention of fraud violations”, as opposed to the commission thereof.
Thus, aside from individuals, an additional audience for systems capable of automatically analyzing
consumer sentiment, as expressed in no small part in online venues, is companies anxious to understand how
their products and services are perceived.
1.2 What might be involved? An example examination of the construction of an opinion/review search engine
Creating systems that can process subjective information effectively requires overcoming a number of novel
challenges. To illustrate some of these challenges, let us consider the concrete example of what building an
opinion- or review-search application could involve. As we have discussed, such an application would fill an
important and prevalent information need, whether one restricts attention to blog search [213] or considers
the more general types of search that have been described above.
The development of a complete review- or opinion-search application might involve attacking each of
the following problems.
(1) If the application is integrated into a general-purpose search engine, then one would need to
determine whether the user is in fact looking for subjective material. This may or may not be a
difficult problem in and of itself: perhaps queries of this type will tend to contain indicator terms
like “review”, “reviews”, or “opinions”, or perhaps the application would provide a “checkbox” to
the user so that he or she could indicate directly that reviews are what is desired; but in general,
query classification is a difficult problem — indeed, it was the subject of the 2005 KDD Cup
challenge [185].
(2) Besides the still-open problem of determining which documents are topically relevant to an
opinion-oriented query, an additional challenge we face in our new setting is simultaneously
or subsequently determining which documents or portions of documents contain review-like
or opinionated material. Sometimes this is relatively easy, as in texts fetched from
review-aggregation sites in which review-oriented information is presented in a relatively stereotyped
format: examples include Epinions.com and Amazon.com. However, blogs also notoriously contain
quite a bit of subjective content and thus are another obvious place to look (and are more relevant
than shopping sites for queries that concern politics, people, or other non-products), but the
desired material within blogs can vary quite widely in content, style, presentation, and even level
of grammaticality.
(3) Once one has target documents in hand, one is still faced with the problem of identifying the
overall sentiment expressed by these documents and/or the specific opinions regarding particular
features or aspects of the items or topics in question, as necessary. Again, while some sites make
this kind of extraction easier — for instance, user reviews posted to Yahoo! Movies must specify
grades for pre-defined sets of characteristics of films — more free-form text can be much harder
for computers to analyze, and indeed can pose additional challenges; for example, if quotations
are included in a newspaper article, care must be taken to attribute the views expressed in each
quotation to the correct entity.
(4) Finally, the system needs to present the sentiment information it has garnered in some reasonable
summary fashion. This can involve some or all of the following actions:
(a) aggregation of “votes” that may be registered on different scales (e.g., one reviewer uses
a star system, but another uses letter grades)
(b) selective highlighting of some opinions
(c) representation of points of disagreement and points of consensus
(d) identification of communities of opinion holders
(e) accounting for different levels of authority among opinion holders
Note that it might be more appropriate to produce a visualization of sentiment data rather than a
textual summary of it, whereas textual summaries are what is usually created in standard
topic-based multi-document summarization.
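As a rough illustration of two of the steps listed above, the following Python sketch shows (i) a crude check for the indicator terms mentioned in item (1) and (ii) one possible way to aggregate "votes" registered on different scales, as in item (4)(a). It is a minimal sketch under stated assumptions, not a method proposed in the work surveyed here: all function names are hypothetical, the linear mapping of star ratings onto [0, 1] and the evenly spaced letter-grade table are arbitrary illustrative choices, and a simple average is only one of many conceivable aggregation rules.

```python
from statistics import mean
from typing import Iterable, Tuple, Union

# Indicator terms quoted in item (1); a real system would need far more than this.
INDICATOR_TERMS = {"review", "reviews", "opinions"}

# Assumed letter-grade mapping onto [0, 1]; the even spacing is an arbitrary choice.
LETTER_GRADES = {"A": 1.0, "B": 0.75, "C": 0.5, "D": 0.25, "F": 0.0}


def looks_opinion_seeking(query: str) -> bool:
    """Return True if the query contains one of the indicator terms."""
    tokens = (tok.strip(".,;:!?\"'()") for tok in query.lower().split())
    return any(tok in INDICATOR_TERMS for tok in tokens)


def normalize_stars(stars: float, max_stars: int = 5) -> float:
    """Map a rating on a 1..max_stars scale linearly onto [0, 1]."""
    if not 1 <= stars <= max_stars:
        raise ValueError(f"star rating out of range: {stars}")
    return (stars - 1) / (max_stars - 1)


def normalize_letter(grade: str) -> float:
    """Map a letter grade onto [0, 1] using the assumed table above."""
    return LETTER_GRADES[grade.strip().upper()]


def aggregate_votes(votes: Iterable[Tuple[str, Union[float, str]]]) -> float:
    """Average votes given as (scale, value) pairs after putting them on one scale."""
    scores = []
    for scale, value in votes:
        if scale == "stars":
            scores.append(normalize_stars(float(value)))
        elif scale == "letter":
            scores.append(normalize_letter(str(value)))
        else:
            raise ValueError(f"unknown scale: {scale!r}")
    if not scores:
        raise ValueError("no votes to aggregate")
    return mean(scores)


if __name__ == "__main__":
    print(looks_opinion_seeking("digital camera reviews"))
    print(aggregate_votes([("stars", 4), ("letter", "B"), ("stars", 2)]))
```

Keyword matching of this kind would miss opinion-seeking queries that contain no indicator term, which is consistent with the observation in item (1) that query classification is difficult in general. Likewise, the choice of a common scale affects the aggregate, so any such mapping should be treated as a modeling decision rather than a given.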
1.3 Our charge and approach
Challenges (2), (3), and (4) in the above list are very active areas of research, and the bulk of this survey is
devoted to reviewing work in these three sub-fields. However, due to space limitations and the focus of the
journal series in which this survey appears, we do not and cannot aim to be completely comprehensive.
In particular, when we began to write this survey, we were directly charged to focus on information-
access applications, as opposed to work of more purely linguistic interest. We stress that the importance of
work in the latter vein is absolutely not in question.
Given our mandate, the reader will not be surprised that we describe the applications that
sentiment-analysis systems can facilitate and review many kinds of approaches to a variety of opinion-oriented
classification problems. We have also chosen to draw attention to single- and multi-document
summarization of evaluative text, especially since interesting considerations regarding graphical visualization arise.
Finally, we move beyond just the technical issues, devoting significant attention to the broader implications
that the development of opinion-oriented information-access services has: we look at questions of privacy,
manipulation, and whether or not reviews can have measurable economic impact.
1.4 Early history
Although the area of sentiment analysis and opinion mining has recently enjoyed a huge burst of research
activity, there has been a steady undercurrent of interest for quite a while. One could count early projects
on beliefs as forerunners of the area [48, 318]. Later work focused mostly on interpretation of metaphor,
narrative, point of view, affect, evidentiality in text, and related areas [121, 133, 149, 263, 308, 311, 312,
313, 314].
The year 2001 or so seems to mark the beginning of widespread awareness of the research problems and
opportunities that sentiment analysis and opinion mining raise [51, 66, 69, 79, 192, 215, 221, 235, 292, 297,
299, 307, 327, inter alia], and subsequently there have been literally hundreds of papers published on the
subject.
Factors behind this “land rush” include:
• the rise of machine learning methods in natural language processing and information retrieval;
• the availability of datasets for machine learning algorithms to be trained on, due to the blossoming
of the World Wide Web and, specifically, the development of review-aggregation web-sites; and,
of course