2002 ThumbsUporThumbsDown

Subject Headings: Sentiment Analysis Task.




This paper presents a simple unsupervised learning algorithm for classifying reviews as recommended (thumbs up) or not recommended (thumbs down). The classification of a review is predicted by the average semantic orientation of the phrases in the review that contain adjectives or adverbs. A phrase has a positive semantic orientation when it has good associations (e.g., "subtle nuances") and a negative semantic orientation when it has bad associations (e.g., "very cavalier"). In this paper, the semantic orientation of a phrase is calculated as the mutual information between the given phrase and the word "excellent" minus the mutual information between the given phrase and the word "poor". A review is classified as recommended if the average semantic orientation of its phrases is positive. The algorithm achieves an average accuracy of 74% when evaluated on 410 reviews from Epinions, sampled from four different domains (reviews of automobiles, banks, movies, and travel destinations). The accuracy ranges from 84% for automobile reviews to 66% for movie reviews.
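The scoring rule described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract does not say how the mutual information is estimated, so the sketch assumes raw co-occurrence counts are already available. The function names, the count arguments, and the small `smoothing` constant (added to avoid division by zero) are all hypothetical. Writing pointwise mutual information as log2(p(a, b) / (p(a) p(b))), the difference between the two terms reduces to a single log ratio in which the phrase's own count and the corpus size cancel.

```python
import math


def semantic_orientation(near_excellent, near_poor,
                         hits_excellent, hits_poor, smoothing=0.01):
    """PMI(phrase, "excellent") - PMI(phrase, "poor") from raw counts.

    near_excellent / near_poor: how often the phrase co-occurs with each word.
    hits_excellent / hits_poor: how often each reference word occurs overall.
    smoothing: assumed constant that keeps the ratio defined for zero counts.
    """
    return math.log2(
        ((near_excellent + smoothing) * hits_poor)
        / ((near_poor + smoothing) * hits_excellent)
    )


def classify_review(phrase_counts, hits_excellent, hits_poor):
    """Average the orientation of a review's phrases and label the review.

    phrase_counts: list of (near_excellent, near_poor) pairs, one per
    extracted adjective or adverb phrase (extraction is not shown here).
    """
    if not phrase_counts:
        raise ValueError("review has no scored phrases")
    scores = [
        semantic_orientation(n_exc, n_poor, hits_excellent, hits_poor)
        for n_exc, n_poor in phrase_counts
    ]
    average = sum(scores) / len(scores)
    label = "recommended" if average > 0 else "not recommended"
    return label, average


# Made-up counts for two phrases of one review.
label, average = classify_review([(8, 2), (3, 3)], 1000, 1000)
print(label, round(average, 2))
```

A phrase that co-occurs equally often with both reference words scores close to zero, so it is the phrases with lopsided counts that decide the sign of the average and therefore the label.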



2002 ThumbsUporThumbsDown
  • Author: Peter D. Turney
  • Title: Thumbs up or Thumbs Down?: Semantic orientation applied to unsupervised classification of reviews
  • Journal: Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
  • URL: http://acl.ldc.upenn.edu/acl2002/MAIN/pdfs/Main425.pdf
  • DOI: 10.3115/1073083.1073153
  • Year: 2002