2009 ExploreExploitSchemesforWebCont

From GM-RKB

Subject Headings: Explore-Exploit Algorithm, Explore-Exploit.

Notes

Cited By

Quotes

Abstract

We propose novel multi-armed bandit (explore/exploit) schemes to maximize total clicks on a content module published regularly on Yahoo! Intuitively, one can “explore” each candidate item by displaying it to a small fraction of user visits to estimate the item's click-through rate (CTR), and then “exploit” high CTR items in order to maximize clicks. While bandit methods that seek to find the optimal trade-off between explore and exploit have been studied for decades, existing solutions are not satisfactory for web content publishing applications where dynamic set of items with short lifetimes, delayed feedback and non-stationary reward (CTR) distributions are typical. In this paper, we develop a Bayesian solution and extend several existing schemes to our setting. Through extensive evaluation with nine bandit schemes, we show that our Bayesian solution is uniformly better in several scenarios. We also study the empirical characteristics of our schemes and provide useful insights on the strengths and weaknesses of each. Finally, we validate our results with a “side-by-side” comparison of schemes through live experiments conducted on a random sample of real user visits to Yahoo!
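
The explore/exploit intuition in the abstract can be illustrated with a minimal epsilon-greedy simulation. This is only a sketch of a basic baseline, not the Bayesian solution the paper develops, and it ignores the dynamic item sets, delayed feedback and non-stationary CTRs the abstract names as the hard part. The function name `epsilon_greedy`, the parameter `epsilon`, and the example CTR values are all hypothetical and chosen for illustration.

```python
import random


def epsilon_greedy(true_ctrs, n_visits, epsilon=0.1, seed=0):
    """Simulate showing one item per user visit; return (clicks, shows) per item.

    true_ctrs are made-up click probabilities used only to simulate feedback.
    """
    rng = random.Random(seed)
    n_items = len(true_ctrs)
    shows = [0] * n_items
    clicks = [0] * n_items

    def estimated_ctr(i):
        # Items never shown get an optimistic estimate so each is tried once.
        return clicks[i] / shows[i] if shows[i] else 1.0

    for _ in range(n_visits):
        if rng.random() < epsilon:
            # Explore: show a uniformly random item to refine its CTR estimate.
            item = rng.randrange(n_items)
        else:
            # Exploit: show the item with the highest estimated CTR so far.
            item = max(range(n_items), key=estimated_ctr)
        shows[item] += 1
        # Simulated feedback: a click occurs with the item's true CTR.
        if rng.random() < true_ctrs[item]:
            clicks[item] += 1
    return clicks, shows


clicks, shows = epsilon_greedy([0.02, 0.05, 0.10], n_visits=20000)
estimates = [c / s if s else 0.0 for c, s in zip(clicks, shows)]
print("shows per item:", shows)
print("estimated CTRs:", [round(e, 3) for e in estimates])
```

With a fixed `epsilon`, a constant fraction of visits is always spent exploring, even after the estimates have settled. Schemes that balance the trade-off more carefully aim to reduce that cost.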

References


Key: 2009 ExploreExploitSchemesforWebCont
Author: Bee-Chung Chen, Deepak Agarwal, Pradheep Elango
Title: Explore/Exploit Schemes for Web Content Optimization
DOI: 10.1109/ICDM.2009.52
Year: 2009