2008 CombinationalCollaborativeFilte


Subject Headings:

Notes

Cited By

Quotes

Author Keywords

Abstract

Rapid growth in the amount of data available on social networking sites has made information retrieval increasingly challenging for users. In this paper, we propose a collaborative filtering method, Combinational Collaborative Filtering (CCF), to perform personalized community recommendations by considering multiple types of co-occurrences in social data at the same time. This filtering method fuses semantic and user information, then applies a hybrid training strategy that combines Gibbs sampling and the Expectation-Maximization (EM) algorithm. To handle the large-scale dataset, parallel computing is used to speed up model training. Through an empirical study on the Orkut dataset, we show CCF to be both effective and scalable.
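The abstract names co-occurrence modeling and EM training but gives no equations, so the following is only an illustrative sketch of the general idea, not the paper's CCF model. It fits a PLSA-style aspect model to a single community-user co-occurrence matrix with plain EM. It omits Gibbs sampling, the fusion of semantic information, and parallel training. The function names (`fit_aspect_model`, `score`), the smoothing constant, and the toy matrix `TOY_COUNTS` are all hypothetical.

```python
import random

# Hypothetical toy data: TOY_COUNTS[c][u] counts how often user u
# co-occurs with community c (for example, posts in that community).
TOY_COUNTS = [
    [3, 2, 0, 0],
    [2, 0, 0, 0],
    [0, 0, 2, 3],
    [0, 0, 3, 2],
]


def normalized(row, eps=1e-12):
    """Scale a non-negative row to sum to 1, with tiny smoothing (an assumption)."""
    total = sum(row) + eps * len(row)
    return [(x + eps) / total for x in row]


def fit_aspect_model(counts, num_topics, iterations=50, seed=0):
    """Fit P(z|c) and P(u|z) to a community x user count matrix with EM."""
    rng = random.Random(seed)
    num_c, num_u = len(counts), len(counts[0])

    # Random positive initialization; every row is a probability distribution.
    p_z_c = [normalized([rng.random() + 0.1 for _ in range(num_topics)])
             for _ in range(num_c)]
    p_u_z = [normalized([rng.random() + 0.1 for _ in range(num_u)])
             for _ in range(num_topics)]

    for _ in range(iterations):
        new_z_c = [[0.0] * num_topics for _ in range(num_c)]
        new_u_z = [[0.0] * num_u for _ in range(num_topics)]
        for c in range(num_c):
            for u in range(num_u):
                n = counts[c][u]
                if n == 0:
                    continue
                # E-step: posterior over latent topics for this (c, u) pair.
                joint = [p_z_c[c][z] * p_u_z[z][u] for z in range(num_topics)]
                total = sum(joint)
                for z in range(num_topics):
                    resp = n * joint[z] / total
                    new_z_c[c][z] += resp
                    new_u_z[z][u] += resp
        # M-step: renormalize the expected counts into probabilities.
        p_z_c = [normalized(row) for row in new_z_c]
        p_u_z = [normalized(row) for row in new_u_z]
    return p_z_c, p_u_z


def score(p_z_c, p_u_z, community, user):
    """Model estimate of P(user | community), summed over latent topics."""
    return sum(p_z_c[community][z] * p_u_z[z][user]
               for z in range(len(p_u_z)))


if __name__ == "__main__":
    p_z_c, p_u_z = fit_aspect_model(TOY_COUNTS, num_topics=2)
    # Rank communities for user 1 by the model score (highest first).
    ranking = sorted(range(len(TOY_COUNTS)),
                     key=lambda c: score(p_z_c, p_u_z, c, 1), reverse=True)
    print(ranking)
```

A recommender built on such a model would rank communities a user has not yet joined by this score. EM can settle in a local optimum, so the ranking may depend on the random seed.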

References

  • 1. Alexa Internet. http://www.alexa.com/.
  • 2. David M. Blei, Michael I. Jordan, Variational Methods for the Dirichlet Process, Proceedings of the Twenty-first International Conference on Machine Learning, p.12, July 04-08, 2004, Banff, Alberta, Canada doi:10.1145/1015330.1015439
  • 3. David M. Blei, Andrew Y. Ng, Michael I. Jordan, Latent Dirichlet Allocation, The Journal of Machine Learning Research, 3, p.993-1022, 3/1/2003 doi:10.1162/jmlr.2003.3.4-5.993
  • 4. David Cohn, Huan Chang, Learning to Probabilistically Identify Authoritative Documents, Proceedings of the Seventeenth International Conference on Machine Learning, p.167-174, June 29-July 02, 2000
  • 5. A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum Likelihood from Incomplete Data via the EM Algorithm. Journal of the Royal Statistical Society. Series B (Methodological), 39(1):1-38, 1977.
  • 6. S. Geman and D. Geman. Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6:721-741, 1984.
  • 7. Thomas Hofmann, Probabilistic Latent Semantic Indexing, Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, p.50-57, August 15-19, 1999, Berkeley, California, United States doi:10.1145/312624.312649
  • 8. A. McCallum, A. Corrada-Emmanuel, and X. Wang. The Author-recipient-topic Model for Topic and Role Discovery in Social Networks: Experiments with Enron and Academic Email. Technical Report, Computer Science, University of Massachusetts Amherst, 2004.
  • 9. D. Newman, A. Asuncion, P. Smyth, and M. Welling. Distributed Inference for Latent Dirichlet Allocation. In NIPS, 2007.
  • 10. Ellen Spertus, Mehran Sahami, Orkut Buyukkokten, Evaluating Similarity Measures: A Large-scale Study in the Orkut Social Network, Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining, August 21-24, 2005, Chicago, Illinois, USA doi:10.1145/1081870.1081956
  • 11. Mark Steyvers, Padhraic Smyth, Michal Rosen-Zvi, Thomas Griffiths, Probabilistic Author-topic Models for Information Discovery, Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 22-25, 2004, Seattle, WA, USA doi:10.1145/1014052.1014087
  • 12. Alexander Strehl, Joydeep Ghosh, Cluster Ensembles --- a Knowledge Reuse Framework for Combining Multiple Partitions, The Journal of Machine Learning Research, 3, p.583-617, 3/1/2003 doi:10.1162/153244303321897735
  • 13. Shi Zhong, Joydeep Ghosh, Generative Model-based Document Clustering: A Comparative Study, Knowledge and Information Systems, v.8 n.3, p.374-384, September 2005 doi:10.1007/s10115-004-0194-1


2008 CombinationalCollaborativeFilte

  • Author: Edward Y. Chang, Wen-Yen Chen, Dong Zhang
  • title: Combinational Collaborative Filtering for Personalized Community Recommendation
  • journal: KDD-2008 Proceedings
  • doi: 10.1145/1401890.1401909
  • year: 2008