2018 OfflineABTestingforRecommenderSystems


Subject Headings: Offline A/B Testing.

Notes

Cited By

Quotes

Abstract

Online A/B testing evaluates the impact of a new technology by running it in a real production environment and testing its performance on a subset of the users of the platform. It is a well-known practice to run a preliminary offline evaluation on historical data to iterate faster on new ideas, and to detect poor policies in order to avoid losing money or breaking the system. For such offline evaluations, we are interested in methods that can compute offline an estimate of the potential uplift of performance generated by a new technology. Offline performance can be measured using estimators known as counterfactual or off-policy estimators. Traditional counterfactual estimators, such as capped importance sampling or normalised importance sampling, exhibit unsatisfying bias-variance compromises when experimenting on personalized product recommendation systems. To overcome this issue, we model the bias incurred by these estimators rather than bound it in the worst case, which leads us to propose a new counterfactual estimator. We provide a benchmark of the different estimators showing their correlation with business metrics observed by running online A/B tests on a large-scale commercial recommender system.
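The following is a minimal illustrative sketch (not from the paper) of the baseline counterfactual estimators named in the abstract: plain importance sampling, capped importance sampling, and normalised (self-normalised) importance sampling. It assumes the logged data are NumPy arrays of per-impression rewards, action propensities under the logging (production) policy, and probabilities of the same actions under the test policy; all function and variable names here are hypothetical.

import numpy as np

def ips_estimate(rewards, pi_test, pi_log):
    # Plain importance sampling: unbiased, but the variance blows up
    # when the test policy puts mass where the logging policy does not.
    weights = pi_test / pi_log
    return np.mean(weights * rewards)

def capped_ips_estimate(rewards, pi_test, pi_log, cap=10.0):
    # Capped importance sampling: clipping the weights at `cap`
    # bounds the variance at the price of a worst-case bias.
    weights = np.minimum(pi_test / pi_log, cap)
    return np.mean(weights * rewards)

def normalized_ips_estimate(rewards, pi_test, pi_log):
    # Normalised importance sampling: dividing by the empirical sum of
    # weights makes the estimate insensitive to the weights' overall scale.
    weights = pi_test / pi_log
    return np.sum(weights * rewards) / np.sum(weights)

# Toy usage on synthetic logs (hypothetical data, for illustration only):
rng = np.random.default_rng(0)
pi_log = rng.uniform(0.05, 1.0, size=10_000)   # logging-policy propensities
pi_test = rng.uniform(0.05, 1.0, size=10_000)  # test-policy probabilities
rewards = rng.binomial(1, 0.1, size=10_000)    # e.g. observed clicks
print(capped_ips_estimate(rewards, pi_test, pi_log))

Capping trades the unbounded variance of plain importance sampling for a worst-case bias; the estimator proposed in the paper aims to improve on this compromise by modelling the bias these estimators incur rather than bounding it in the worst case.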

References

Alexandre Gilotte, Clément Calauzènes, Thomas Nedelec, Alexandre Abraham, and Simon Dollé (2018). "Offline A/B Testing for Recommender Systems." In: Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining (WSDM 2018). doi:10.1145/3159652.3159687