Double Thompson Sampling Algorithm

From GM-RKB

A Double Thompson Sampling Algorithm is a multi-armed bandit algorithm that extends Thompson Sampling by maintaining two separate probability models for each action, with the aim of reducing variance in action selection and improving exploration efficiency.
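
The definition above can be illustrated with a minimal sketch for Bernoulli rewards. It is only one possible reading of "two separate probability models for each action" and is not taken from any cited paper: the class `DoubleBetaArm`, the function `run`, the choice to average one draw from each model, and the rule of updating a single randomly chosen model per observation are all assumptions made for illustration.

```python
import random


class DoubleBetaArm:
    """One arm holding two independent Beta posteriors (hypothetical design)."""

    def __init__(self):
        # Each model is [alpha, beta], starting from a uniform Beta(1, 1) prior.
        self.models = [[1.0, 1.0], [1.0, 1.0]]

    def sample(self, rng):
        # Assumption: score the arm by averaging one draw from each posterior.
        # The mean of two independent draws has lower variance than one draw.
        draws = [rng.betavariate(a, b) for a, b in self.models]
        return sum(draws) / len(draws)

    def update(self, reward, rng):
        # Assumption: each observation updates only one of the two models,
        # chosen uniformly at random, so the models stay distinct.
        model = rng.choice(self.models)
        model[0] += reward        # success count
        model[1] += 1 - reward    # failure count


def run(true_means, rounds, seed=0):
    """Simulate the sketch on a Bernoulli bandit; return pull counts and arms."""
    rng = random.Random(seed)
    arms = [DoubleBetaArm() for _ in true_means]
    pulls = [0] * len(true_means)
    for _ in range(rounds):
        # Thompson step: sample a score per arm and play the highest one.
        scores = [arm.sample(rng) for arm in arms]
        chosen = max(range(len(arms)), key=scores.__getitem__)
        reward = 1 if rng.random() < true_means[chosen] else 0
        arms[chosen].update(reward, rng)
        pulls[chosen] += 1
    return pulls, arms
```

Calling `run([0.1, 0.9], rounds=3000)` returns the number of pulls per arm together with the fitted arm objects. Splitting observations between the two models keeps them from collapsing into identical posteriors; other readings of the definition (for example, updating both models on every observation) are equally plausible.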



References

2023

  • (Kim et al., 2023) ⇒ W Kim, K Lee, and MC Paik. (2023). "Double Doubly Robust Thompson Sampling for Generalized Linear Contextual Bandits." In: Proceedings of the AAAI Conference on Artificial Intelligence. [1](http://ojs.aaai.org/index.php/AAAI/article/view/21556)
    • It presents the double doubly robust Thompson sampling algorithm for generalized linear contextual bandits, which extends doubly robust (DR) Thompson sampling and is reported to perform better in uncertain environments.