2017 DeepFMAFactorizationMachinebase

From GM-RKB

Subject Headings: DeepFM, Neural Recommender Algorithm, CTR Prediction Algorithm.

Notes

Cited By

Quotes

Abstract

Learning sophisticated feature interactions behind user behaviors is critical in maximizing CTR for recommender systems. Despite great progress, existing methods seem to have a strong bias towards low- or high-order interactions, or require expertise feature engineering. In this paper, we show that it is possible to derive an end-to-end learning model that emphasizes both low- and high-order feature interactions. The proposed model, DeepFM, combines the power of factorization machines for recommendation and deep learning for feature learning in a new neural network architecture. Compared to the latest Wide & Deep model from Google, DeepFM has a shared input to its "wide" and "deep" parts, with no need of feature engineering besides raw features. Comprehensive experiments are conducted to demonstrate the effectiveness and efficiency of DeepFM over the existing models for CTR prediction, on both benchmark data and commercial data.
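As a rough sketch of what the shared input means, the model can be written as follows. The notation is an assumption based on the standard factorization machine formulation and does not appear on this page: x is the sparse input vector over d features, w holds the first-order weights, V_i is the latent vector of feature i, and e_1, ..., e_m are the embeddings of the m fields, taken from the same latent vectors V.

```latex
\begin{align}
\hat{y} &= \mathrm{sigmoid}\left(y_{FM} + y_{DNN}\right) \\
y_{FM} &= \langle w, x \rangle + \sum_{i=1}^{d} \sum_{j=i+1}^{d} \langle V_i, V_j \rangle \, x_i x_j \\
y_{DNN} &= \mathrm{MLP}\left([e_1, e_2, \ldots, e_m]\right)
\end{align}
```

The first term of y_FM captures order-1 effects, the double sum captures order-2 interactions, and the MLP over the concatenated embeddings is what is meant to capture higher-order interactions.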

1 Introduction

4 Related Work

In this paper, a new deep neural network is proposed for CTR prediction. The most related domains are CTR prediction and deep learning in recommender systems. In this section, we discuss related work in these two domains.

CTR prediction plays an important role in recommender systems (Richardson et al., 2007; Juan et al., 2016; McMahan et al., 2013). Besides generalized linear models and FM, a few other models are proposed for CTR prediction, such as tree-based models (He et al., 2014), tensor-based models (Rendle and Schmidt-Thieme, 2010), support vector machines (Chang et al., 2010), and Bayesian models (Graepel et al., 2010).

The other related domain is deep learning in recommender systems. In Section 1 and Section 2.2, several deep learning models for CTR prediction are already mentioned, so we do not discuss them here. Several deep learning models are proposed for recommendation tasks other than CTR prediction (e.g., Covington et al., 2016; Salakhutdinov et al., 2007; van den Oord et al., 2013; Wu et al., 2016; Zheng et al., 2016; Wu et al., 2017; Zheng et al., 2017). Salakhutdinov et al. (2007), Sedhain et al. (2015), and Wang et al. (2015) propose to improve collaborative filtering via deep learning. Wang and Wang (2014) and van den Oord et al. (2013) extract content features by deep learning to improve the performance of music recommendation. Chen et al. (2016) devise a deep learning network that considers both image features and basic features of display advertising. Covington et al. (2016) develop a two-stage deep learning framework for YouTube video recommendation.

5 Conclusions

In this paper, we proposed DeepFM, a Factorization-Machine based Neural Network for CTR prediction, to overcome the shortcomings of the state-of-the-art models and to achieve better performance. DeepFM trains a deep component and an FM component jointly. It gains performance improvement from these advantages: 1) it does not need any pre-training; 2) it learns both high- and low-order feature interactions; 3) it introduces a sharing strategy of feature embedding to avoid feature engineering. We conducted extensive experiments on two real-world datasets (Criteo dataset and a commercial App Store dataset) to compare the effectiveness and efficiency of DeepFM and the state-of-the-art models. Our experimental results demonstrate that 1) DeepFM outperforms the state-of-the-art models in terms of AUC and Logloss on both datasets; 2) the efficiency of DeepFM is comparable to the most efficient deep model in the state-of-the-art.
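The joint FM and deep components with shared feature embeddings can be illustrated with a small forward pass. This is a minimal sketch under stated assumptions, not the authors' implementation: it uses only the Python standard library, assumes each field is a one-hot categorical feature (so every active feature has value 1), and uses a single hidden ReLU layer. The names `init_params` and `deepfm_forward`, the layer sizes, and the random initialization are all hypothetical choices made for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_params(field_sizes, k, hidden, seed=0):
    """Randomly initialize parameters (illustrative values, not trained)."""
    rng = random.Random(seed)
    n = sum(field_sizes)      # total number of one-hot features
    m = len(field_sizes)      # number of fields

    def rand():
        return rng.uniform(-0.1, 0.1)

    return {
        "w": [rand() for _ in range(n)],                      # first-order FM weights
        "V": [[rand() for _ in range(k)] for _ in range(n)],  # embeddings shared by both parts
        "W1": [[rand() for _ in range(m * k)] for _ in range(hidden)],
        "b1": [0.0] * hidden,
        "w2": [rand() for _ in range(hidden)],
        "b2": 0.0,
    }


def deepfm_forward(params, active):
    """active: one global feature index per field (assumed one-hot input)."""
    # The same embedding vectors feed the FM part and the deep part.
    emb = [params["V"][i] for i in active]
    k = len(emb[0])

    # FM component: first-order weights plus pairwise inner products.
    y_fm = sum(params["w"][i] for i in active)
    for a in range(len(emb)):
        for b in range(a + 1, len(emb)):
            y_fm += sum(emb[a][f] * emb[b][f] for f in range(k))

    # Deep component: one hidden ReLU layer over the concatenated embeddings.
    x = [v for e in emb for v in e]
    h = [max(0.0, sum(wi * xi for wi, xi in zip(row, x)) + bias)
         for row, bias in zip(params["W1"], params["b1"])]
    y_dnn = sum(wi * hi for wi, hi in zip(params["w2"], h)) + params["b2"]

    # Both outputs are summed before the sigmoid, so they train jointly.
    return sigmoid(y_fm + y_dnn)


# Three fields with 3, 4 and 2 categories; global indices 1, 5 and 8 are active.
params = init_params(field_sizes=[3, 4, 2], k=4, hidden=8)
prob = deepfm_forward(params, active=[1, 5, 8])
print(prob)
```

Because `emb` is read once and used by both components, a gradient step on the combined output would update the same embedding table from both the FM and the deep signal, which is the sharing strategy the conclusion refers to.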

There are two interesting directions for future study. One is exploring strategies (such as introducing pooling layers) to strengthen the ability to learn the most useful high-order feature interactions. The other is to train DeepFM on a GPU cluster for large-scale problems.

References

  • 1. Nicolas Boulanger-Lewandowski, Yoshua Bengio, and Pascal Vincent. Audio Chord Recognition with Recurrent Neural Networks. In ISMIR, pages 335-340, 2013.
  • 2. Yin-Wen Chang, Cho-Jui Hsieh, Kai-Wei Chang, Michael Ringgaard, Chih-Jen Lin, Training and Testing Low-degree Polynomial Data Mappings via Linear SVM, The Journal of Machine Learning Research, 11, p.1471-1490, 3/1/2010
  • 3. Junxuan Chen, Baigui Sun, Hao Li, Hongtao Lu, Xian-Sheng Hua, Deep CTR Prediction in Display Advertising, Proceedings of the 2016 ACM on Multimedia Conference, October 15-19, 2016, Amsterdam, The Netherlands
  • 4. Heng-Tze Cheng, Levent Koc, Jeremiah Harmsen, Tal Shaked, Tushar Chandra, Hrishi Aradhye, Glen Anderson, Greg Corrado, Wei Chai, Mustafa Ispir, Rohan Anil, Zakaria Haque, Lichan Hong, Vihan Jain, Xiaobing Liu, and Hemal Shah. Wide & Deep Learning for Recommender Systems. CoRR, abs/1606.07792, 2016.
  • 5. Paul Covington, Jay Adams, Emre Sargin, Deep Neural Networks for YouTube Recommendations, Proceedings of the 10th ACM Conference on Recommender Systems, September 15-19, 2016, Boston, Massachusetts, USA
  • 6. Thore Graepel, Joaquin Quiñonero Candela, Thomas Borchert, Ralf Herbrich, Web-scale Bayesian Click-through Rate Prediction for Sponsored Search Advertising in Microsoft's Bing Search Engine, Proceedings of the 27th International Conference on Machine Learning, p.13-20, June 21-24, 2010, Haifa, Israel
  • 7. Xinran He, Junfeng Pan, Ou Jin, Tianbing Xu, Bo Liu, Tao Xu, Yanxin Shi, Antoine Atallah, Ralf Herbrich, Stuart Bowers, Joaquin Quiñonero Candela, Practical Lessons from Predicting Clicks on Ads at Facebook, Proceedings of the Eighth International Workshop on Data Mining for Online Advertising, p.1-9, August 24-27, 2014, New York, NY, USA
  • 8. Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep Residual Learning for Image Recognition. In CVPR, pages 770-778, 2016.
  • 9. Yuchin Juan, Yong Zhuang, Wei-Sheng Chin, Chih-Jen Lin, Field-aware Factorization Machines for CTR Prediction, Proceedings of the 10th ACM Conference on Recommender Systems, September 15-19, 2016, Boston, Massachusetts, USA
  • 10. Hugo Larochelle, Yoshua Bengio, Jérôme Louradour, Pascal Lamblin, Exploring Strategies for Training Deep Neural Networks, The Journal of Machine Learning Research, 10, p.1-40, 12/1/2009
  • 11. Qiang Liu, Feng Yu, Shu Wu, Liang Wang, A Convolutional Click Prediction Model, Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, October 18-23, 2015, Melbourne, Australia
  • 12. H. Brendan McMahan, Gary Holt, D. Sculley, Michael Young, Dietmar Ebner, Julian Grady, Lan Nie, Todd Phillips, Eugene Davydov, Daniel Golovin, Sharat Chikkerur, Dan Liu, Martin Wattenberg, Arnar Mar Hrafnkelsson, Tom Boulos, Jeremy Kubica, Ad Click Prediction: A View from the Trenches, Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 11-14, 2013, Chicago, Illinois, USA
  • 13. Yanru Qu, Han Cai, Kan Ren, Weinan Zhang, Yong Yu, Ying Wen, and Jun Wang. Product-based Neural Networks for User Response Prediction. CoRR, abs/1611.00144, 2016.
  • 14. Steffen Rendle, Lars Schmidt-Thieme, Pairwise Interaction Tensor Factorization for Personalized Tag Recommendation, Proceedings of the Third ACM International Conference on Web Search and Data Mining, February 04-06, 2010, New York, New York, USA
  • 15. Steffen Rendle, Factorization Machines, Proceedings of the 2010 IEEE International Conference on Data Mining, p.995-1000, December 13-17, 2010
  • 16. Matthew Richardson, Ewa Dominowska, Robert Ragno, Predicting Clicks: Estimating the Click-through Rate for New Ads, Proceedings of the 16th International Conference on World Wide Web, May 08-12, 2007, Banff, Alberta, Canada
  • 17. Ruslan Salakhutdinov, Andriy Mnih, Geoffrey Hinton, Restricted Boltzmann Machines for Collaborative Filtering, Proceedings of the 24th International Conference on Machine Learning, p.791-798, June 20-24, 2007, Corvallis, Oregon, USA
  • 18. Suvash Sedhain, Aditya Krishna Menon, Scott Sanner, Lexing Xie, AutoRec: Autoencoders Meet Collaborative Filtering, Proceedings of the 24th International Conference on World Wide Web, May 18-22, 2015, Florence, Italy
  • 19. Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, Ruslan Salakhutdinov, Dropout: A Simple Way to Prevent Neural Networks from Overfitting, The Journal of Machine Learning Research, v.15 n.1, p.1929-1958, January 2014
  • 20. Aäron van den Oord, Sander Dieleman, Benjamin Schrauwen, Deep Content-based Music Recommendation, Proceedings of the 26th International Conference on Neural Information Processing Systems, p.2643-2651, December 05-10, 2013, Lake Tahoe, Nevada
  • 21. Xinxi Wang, Ye Wang, Improving Content-based and Hybrid Music Recommendation Using Deep Learning, Proceedings of the 22nd ACM International Conference on Multimedia, November 03-07, 2014, Orlando, Florida, USA
  • 22. Hao Wang, Naiyan Wang, Dit-Yan Yeung, Collaborative Deep Learning for Recommender Systems, Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 10-13, 2015, Sydney, NSW, Australia
  • 23. Yao Wu, Christopher DuBois, Alice X. Zheng, Martin Ester, Collaborative Denoising Auto-Encoders for Top-N Recommender Systems, Proceedings of the Ninth ACM International Conference on Web Search and Data Mining, February 22-25, 2016, San Francisco, California, USA
  • 24. Chao-Yuan Wu, Amr Ahmed, Alex Beutel, Alexander J. Smola, How Jing, Recurrent Recommender Networks, Proceedings of the Tenth ACM International Conference on Web Search and Data Mining, February 06-10, 2017, Cambridge, United Kingdom
  • 25. Yuyu Zhang, Hanjun Dai, Chang Xu, Jun Feng, Taifeng Wang, Jiang Bian, Bin Wang, Tie-Yan Liu, Sequential Click Prediction for Sponsored Search with Recurrent Neural Networks, Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, p.1369-1375, July 27-31, 2014, Québec City, Québec, Canada
  • 26. Weinan Zhang, Tianming Du, and Jun Wang. Deep Learning over Multi-field Categorical Data - A Case Study on User Response Prediction. In ECIR, 2016.
  • 27. Yin Zheng, Yu-Jin Zhang, and Hugo Larochelle. A Deep and Autoregressive Approach for Topic Modeling of Multimodal Data. IEEE Trans. Pattern Anal. Mach. Intell., 38(6):1056-1069, 2016.
  • 28. Lei Zheng, Vahid Noroozi, Philip S. Yu, Joint Deep Modeling of Users and Items Using Reviews for Recommendation, Proceedings of the Tenth ACM International Conference on Web Search and Data Mining, February 06-10, 2017, Cambridge, United Kingdom



Page: 2017 DeepFMAFactorizationMachinebase
Author: Yunming Ye, Huifeng Guo, Ruiming Tang, Zhenguo Li, Xiuqiang He
Title: DeepFM: A Factorization-machine based Neural Network for CTR Prediction
Year: 2017