2011 LearningtoTradeOffBetweenExplor

(Valizadegan et al., 2011) ⇒ Hamed Valizadegan, Rong Jin, and Shijun Wang. (2011). “Learning to Trade Off Between Exploration and Exploitation in Multiclass Bandit Prediction.” In: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2011). ISBN:978-1-4503-0813-7 doi:10.1145/2020408.2020445

Subject Headings:

Notes

Cited By

Quotes

Author Keywords

Algorithms; bandit feedback; exploration vs. exploitation; multi-class classification; online learning; parameter learning; theory

Abstract

We study multi-class bandit prediction, an online learning problem in which the learner receives only partial feedback in each trial: whether the predicted class label is correct. Trading off exploration against exploitation is a well-known technique for online learning with incomplete feedback (i.e., the bandit setup). Banditron [8], a multi-class online learning algorithm for the bandit setting, maximizes the run-time gain by balancing exploration and exploitation with a fixed tradeoff parameter. The performance of Banditron can be quite sensitive to the choice of this parameter, so effective algorithms that tune it automatically are desirable. In this paper, we propose three learning strategies to automatically adjust the tradeoff parameter for Banditron. Our extensive empirical study with multiple real-world data sets verifies the efficacy of the proposed approach in learning the exploration vs. exploitation tradeoff parameter.
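
To make the role of the fixed tradeoff parameter concrete, the following is a minimal sketch of a single Banditron round. The abstract only names the algorithm, so the update rule here is an assumption based on the commonly described form of Banditron (greedy label with probability 1 - gamma, uniform exploration with probability gamma, importance-weighted perceptron-style update). The function name `banditron_step` and its arguments are hypothetical, and the sketch does not implement any of the three parameter-learning strategies the paper proposes.

```python
import random


def banditron_step(W, x, true_label, gamma, rng):
    """One Banditron-style round (illustrative sketch, not the authors' code).

    W          : k x d weight matrix as a list of lists, updated in place
    x          : feature vector of length d
    true_label : used ONLY to compute the one-bit correct/incorrect feedback
    gamma      : fixed exploration vs. exploitation tradeoff parameter in [0, 1]
    rng        : a random.Random instance
    """
    k = len(W)
    scores = [sum(w * v for w, v in zip(row, x)) for row in W]
    greedy = max(range(k), key=lambda r: scores[r])

    # Exploit the greedy label with probability 1 - gamma; otherwise
    # explore by spreading probability gamma uniformly over all k labels.
    probs = [gamma / k + (1.0 - gamma) * (r == greedy) for r in range(k)]
    predicted = rng.choices(range(k), weights=probs)[0]

    # Bandit feedback: the learner only learns whether its guess was right.
    correct = predicted == true_label

    # Importance-weighted update: reward the sampled label if it was correct,
    # and always subtract the example from the greedy label's row.
    if correct:
        scale = 1.0 / probs[predicted]
        for j, v in enumerate(x):
            W[predicted][j] += scale * v
    for j, v in enumerate(x):
        W[greedy][j] -= v

    return predicted, correct


# Usage: three classes, two features, fixed gamma.
W = [[0.0, 0.0] for _ in range(3)]
rng = random.Random(0)
predicted, correct = banditron_step(W, [1.0, 0.0], true_label=1, gamma=0.1, rng=rng)
```

In this sketch a larger gamma means more random label guesses (more information, lower immediate accuracy), while a smaller gamma relies on the current greedy prediction. This fixed choice is the one the paper aims to tune automatically.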


