2017 GBCENTGradientBoostedCategorica


Subject Headings: GB-CENT.

Notes

Cited By

Quotes

Abstract

Latent factor models and decision tree based models are widely used in tasks of prediction, ranking and recommendation. Latent factor models have the advantage of interpreting categorical features by a low-dimensional representation, while such an interpretation does not naturally fit numerical features. In contrast, decision tree based models enjoy the advantage of capturing the nonlinear interactions of numerical features, while their capability of handling categorical features is limited by the cardinality of those features. Since in real-world applications we usually have both abundant numerical features and categorical features with large cardinality (e.g. geolocations, IDs, tags etc.), we design a new model, called GB-CENT, which leverages latent factor embedding and tree components to achieve the merits of both while avoiding their demerits. With two real-world data sets, we demonstrate that GB-CENT can effectively (i.e. fast and accurately) achieve better accuracy than state-of-the-art matrix factorization, decision tree based models and their ensemble.
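
The abstract's central idea, letting one component handle categorical features and a tree component handle numerical features, can be illustrated with a toy sketch. The code below is an assumption-laden simplification and not the authors' algorithm: a per-category bias stands in for a learned latent embedding, and a single depth-1 regression stump fitted on the residuals stands in for boosted trees. All function names (`fit_category_bias`, `fit_stump`, `fit`, `predict`) and the toy data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def fit_category_bias(cats, y):
    """Per-category offset from the global mean (a stand-in for an embedding)."""
    global_mean = mean(y)
    groups = defaultdict(list)
    for cat, target in zip(cats, y):
        groups[cat].append(target - global_mean)
    bias = {cat: mean(offsets) for cat, offsets in groups.items()}
    return global_mean, bias


def fit_stump(xs, residuals):
    """Depth-1 regression tree on one numerical feature, minimizing squared error."""
    best = None
    values = sorted(set(xs))
    # Candidate thresholds are midpoints between consecutive distinct values,
    # so both sides of every split are non-empty.
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_value, right_value = mean(left), mean(right)
        sse = sum((r - left_value) ** 2 for r in left)
        sse += sum((r - right_value) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_value, right_value)
    if best is None:
        # No split is possible: predict the mean residual everywhere.
        return float("inf"), mean(residuals), 0.0
    return best[1], best[2], best[3]


def fit(cats, xs, y):
    """Fit the categorical part first, then a stump on what it leaves unexplained."""
    global_mean, bias = fit_category_bias(cats, y)
    residuals = [t - (global_mean + bias[c]) for c, t in zip(cats, y)]
    return global_mean, bias, fit_stump(xs, residuals)


def predict(model, cat, x):
    """Sum of the categorical component and the numerical tree component."""
    global_mean, bias, (threshold, left_value, right_value) = model
    base = global_mean + bias.get(cat, 0.0)  # unseen categories get zero bias
    return base + (left_value if x <= threshold else right_value)


# Toy data: the target depends additively on a category and a numerical feature.
cats = ["a", "a", "b", "b"]
xs = [1.0, 3.0, 1.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
model = fit(cats, xs, y)
predictions = [predict(model, c, x) for c, x in zip(cats, xs)]
```

On this toy data the category bias absorbs the difference between "a" and "b", and the stump absorbs the effect of the numerical feature, so the two components together reproduce the targets. A realistic version would replace the bias with learned embeddings and the stump with a sequence of boosted trees.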

References

Author: Liangjie Hong, Yue Shi, Qian Zhao
Title: GB-CENT: Gradient Boosted Categorical Embedding and Numerical Trees
DOI: 10.1145/3038912.3052668
Year: 2017