2016 FactorizationMeetstheItemEmbedd

(Liang et al., 2016) ⇒ Dawen Liang, Jaan Altosaar, Laurent Charlin, and David M. Blei. (2016). “Factorization Meets the Item Embedding: Regularizing Matrix Factorization with Item Co-occurrence.” In: Proceedings of the 10th ACM Conference on Recommender Systems. ISBN:978-1-4503-4035-9 doi:10.1145/2959100.2959182

Subject Headings: CoFactor.

Notes

presentation https://www.slideshare.net/cheerz/factorization-meets-the-item-embedding-regularizing-matrix-factorization-with-item-cooccurrence

Cited By

Quotes

Author Keywords

Collaborative filtering; matrix factorization; item embedding; implicit feedback.

Abstract

Matrix factorization (MF) models and their extensions are standard in modern recommender systems. MF models decompose the observed user-item interaction matrix into user and item latent factors. In this paper, we propose a co-factorization model, CoFactor, which jointly decomposes the user-item interaction matrix and the item-item co-occurrence matrix with shared item latent factors. For each pair of items, the co-occurrence matrix encodes the number of users that have consumed both items. CoFactor is inspired by the recent success of word embedding models (e.g., word2vec) which can be interpreted as factorizing the word co-occurrence matrix. We show that this model significantly improves the performance over MF models on several datasets with little additional computational overhead. We provide qualitative results that explain how CoFactor improves the quality of the inferred factors and characterize the circumstances where it provides the most significant improvements.

1. INTRODUCTION

Recommender systems model users through their preferences for items. User preferences are often encoded as sets of user-item-preference triplets. For instance, “user A gave item B a 4-star rating” or, in the case of implicit data, which we focus on, “user A clicked on item B”. The task of interest is to predict missing user-item preferences given the observed triplets. Predicted preferences can then be used downstream to fuel recommendations.

The preference triplets can be seen as the sparse representation of a user-item preference matrix (or click matrix). Predicting preferences can be seen as filling in the missing entries of this matrix. Models such as matrix factorization, which decompose the preference matrix into user and item factors [10, 20], are standard for preference prediction: their performance is high [10], maximum a posteriori inference can be done efficiently with closed-form updates [9], and they can be composed to incorporate additional side information (e.g., [1, 4, 13, 23, 26]).
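As a minimal sketch of the matrix view described above, implicit triplets can be placed into a binary click matrix, and a factorization then scores every user-item pair as an inner product of latent factors. The triplets, the matrix sizes, and the random factors below are all hypothetical, and NumPy is an assumed choice; the random factors only stand in for learned ones.

```python
import numpy as np

# Hypothetical implicit-feedback triplets: (user, item, clicked).
triplets = [(0, 0, 1), (0, 2, 1), (1, 1, 1), (2, 0, 1), (2, 1, 1)]
n_users, n_items, n_factors = 3, 4, 2

# Dense click matrix for illustration; real data would use a sparse format.
Y = np.zeros((n_users, n_items))
for u, i, y in triplets:
    Y[u, i] = y

# Random factors stand in for learned ones (illustration only).
rng = np.random.default_rng(0)
theta = rng.normal(size=(n_users, n_factors))  # user latent factors
beta = rng.normal(size=(n_items, n_factors))   # item latent factors

# Predicted preference for every user-item pair, including unobserved entries.
Y_hat = theta @ beta.T
```

Each entry `Y_hat[u, i]` is the inner product of one user vector and one item vector, which is the bi-linear form that the factorization relies on.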

Encoding user preferences in a matrix and modeling it with matrix factorization is a particular modeling assumption. In this paper we explore an alternative which models item co-occurrence across users. We posit that pairs of items which are often consumed in tandem by different users are similar. This is similar to modeling a set of documents (users) as a bag of co-occurring words (items). In that context frequently co-occurring words are likely to be about the same topic. For example, in a corpus of scientific papers “planet” and “Pluto” are likely to frequently co-occur. A similar idea has been explored in recommendations for next-item prediction [24]. Item co-occurrence information is, in principle, available to matrix factorization, but it may not be easy to infer from the click matrix: matrix factorization models are bi-linear with limited modeling capacity.
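The co-occurrence counts discussed above can be derived directly from a binary click matrix. The following is a minimal sketch under assumptions: the helper name `item_cooccurrence` and the toy matrix are hypothetical, and zeroing the diagonal (ignoring an item's co-occurrence with itself) is a choice made for this illustration.

```python
import numpy as np

def item_cooccurrence(Y):
    """Count, for each pair of items, the users who consumed both."""
    B = (np.asarray(Y) > 0).astype(int)  # binarize the click matrix
    C = B.T @ B                          # C[i, j] = users with both i and j
    np.fill_diagonal(C, 0)               # assumption: drop self co-occurrence
    return C

# Hypothetical click matrix: 3 users (rows) by 4 items (columns).
Y = np.array([[1, 0, 1, 0],
              [0, 1, 0, 0],
              [1, 1, 0, 0]])
C = item_cooccurrence(Y)
```

In this toy data, items 0 and 1 are both consumed only by the third user, so `C[0, 1]` is 1, while the fourth item is never consumed and its row of `C` is all zeros.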

We propose a co-factorization model, CoFactor, which simultaneously factorizes both the click matrix and the item co-occurrence matrix. The factorization of item co-occurrence is inspired by the recent models for learning word embedding from sequences of words [14, 11]. We learn item embedding using the sets of items each user has consumed (or rated), and the co-occurrence counts of these items across users in the data.
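The exact CoFactor objective is defined in Section 2 of the paper, which is not reproduced here. Purely to illustrate what sharing item factors between two factorizations means, the sketch below adds two squared reconstruction errors, one for the click matrix and one for an item-item matrix, with both terms using the same item factors `beta`. The function name `joint_loss`, the unweighted squared losses, the separate context factors `gamma`, and the L2 penalty are assumptions for this example, not the authors' formulation.

```python
import numpy as np

def joint_loss(Y, M, theta, beta, gamma, lam=0.1):
    """Illustrative joint loss; `beta` is shared by both reconstruction terms."""
    click_err = np.sum((Y - theta @ beta.T) ** 2)  # user-item term
    cooc_err = np.sum((M - beta @ gamma.T) ** 2)   # item-item term
    penalty = lam * (np.sum(theta ** 2) + np.sum(beta ** 2) + np.sum(gamma ** 2))
    return click_err + cooc_err + penalty

# Hypothetical factors; Y and M are built to be exactly reconstructible.
rng = np.random.default_rng(0)
theta = rng.normal(size=(3, 2))  # user factors
beta = rng.normal(size=(4, 2))   # shared item factors
gamma = rng.normal(size=(4, 2))  # item context factors
Y = theta @ beta.T
M = beta @ gamma.T

loss_exact = joint_loss(Y, M, theta, beta, gamma, lam=0.0)
loss_penalized = joint_loss(Y, M, theta, beta, gamma, lam=0.1)
```

Because `beta` appears in both terms, any update to the item factors has to trade off fitting the clicks against fitting the item-item matrix, which is the sense in which the second term regularizes the first.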

We show that learning CoFactor from data can be done efficiently with coordinate updates. We use a sequence of closed-form updates which scale in the number of observed preference triplets. CoFactor outperforms matrix factorization [9] across datasets of users clicking on scientific articles, rating movies, and listening to music. We also provide exploratory results to better understand the effectiveness of our method. These results show that we outperform standard matrix factorization due to the ability of our model to exploit co-occurrence patterns for rare items (items not consumed by many users).

2. THE COFACTOR MODEL

…

References


	Author	volume	Date Value	title	type	journal	titleUrl	doi	note	year
2016 FactorizationMeetstheItemEmbedd	Laurent Charlin Dawen Liang Jaan Altosaar David M. Blei			Factorization Meets the Item Embedding: Regularizing Matrix Factorization with Item Co-occurrence				10.1145/2959100.2959182		2016