2008 OnlineMultiagentLearningAgainst
 
* 11. Littman, M.L.: Markov Games as a Framework for Multi-agent Reinforcement Learning. In: Proceedings of the 11th International Conference on Machine Learning (ML 1994), New Brunswick, NJ, pp. 157-163. Morgan Kaufmann, San Francisco (1994)
* 12. Nash Jr., J.F.: Equilibrium Points in N-person Games. In: Classics in Game Theory (1997)
* 13. Rob Powers, [[Yoav Shoham]]: Learning Against Opponents with Bounded Memory. In: Proceedings of the 19th International Joint Conference on Artificial Intelligence, pp. 817-822, July 30-August 05, 2005, Edinburgh, Scotland
* 14. Satinder Singh, Michael Kearns, Yishay Mansour: Nash Convergence of Gradient Dynamics in General-sum Games. In: Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence, pp. 541-548, June 30-July 03, 2000, Stanford, California
* 15. Michael L. Littman, Peter Stone: Implicit Negotiation in Repeated Games. In: Revised Papers from the 8th International Workshop on Intelligent Agents VIII, pp. 393-404, August 01-03, 2001
* 16. Thuc Vu, Rob Powers, [[Yoav Shoham]]: Learning Against Multiple Opponents. In: Proceedings of the Fifth International Joint Conference on Autonomous Agents and Multiagent Systems, May 08-12, 2006, Hakodate, Japan

Latest revision as of 14:19, 13 August 2019

Subject Headings: Multi-Agent Learning Algorithm, LoE-AIM Algorithm.

Notes

Cited By

Quotes

Keywords

Abstract

The traditional agenda in Multiagent Learning (MAL) has been to develop learners that guarantee convergence to an equilibrium in self-play or that converge to playing the best response against an opponent using one of a fixed set of known targeted strategies. This paper introduces an algorithm called Learn or Exploit for Adversary Induced Markov Decision Process (LoE-AIM) that targets optimality against any learning opponent that can be treated as a memory bounded adversary. LoE-AIM makes no prior assumptions about the opponent and is tailored to optimally exploit any adversary which induces a Markov decision process in the state space of joint histories. LoE-AIM either explores and gathers new information about the opponent or converges to the best response to the partially learned opponent strategy in repeated play. We further extend LoE-AIM to account for online repeated interactions against the same adversary with plays against other adversaries interleaved in between. LoE-AIM-repeated stores learned knowledge about an adversary, identifies the adversary in case of repeated interaction, and reuses the stored knowledge about the behavior of the adversary to enhance learning in the current epoch of play. LoE-AIM and LoE-AIM-repeated are fully implemented, with results demonstrating their superiority over other existing MAL algorithms.
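The learn-or-exploit idea in the abstract can be illustrated with a minimal sketch. This is not the authors' LoE-AIM implementation: the memory bound K, the payoff matrix, the stand-in tit-for-tat adversary, and the MIN_SAMPLES exploration threshold are all illustrative assumptions. Against an opponent whose next move depends only on the last K joint actions, the window of recent joint actions acts as the state of an adversary-induced MDP; the learner estimates the opponent's action distribution per state and either explores (state under-sampled) or exploits the best response to the learned model.

```python
import random
from collections import defaultdict

K = 1                      # assumed opponent memory bound (illustrative)
ACTIONS = ["C", "D"]       # example moves: Prisoner's Dilemma
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}
MIN_SAMPLES = 20           # hypothetical exploration threshold

# state (window of last K joint actions) -> opponent action counts
counts = defaultdict(lambda: defaultdict(int))

def choose(state):
    total = sum(counts[state].values())
    if total < MIN_SAMPLES:
        # explore: this state of the induced MDP is still under-sampled
        return random.choice(ACTIONS)
    # exploit: best response to the empirical opponent model at this state
    def expected(a):
        return sum(n * PAYOFF[(a, o)] for o, n in counts[state].items()) / total
    return max(ACTIONS, key=expected)

def tit_for_tat(state):
    # stand-in memory-1 adversary: mirrors the learner's previous move
    return state[-1][0] if state else "C"

state = ()                 # empty history window at the start of play
for _ in range(500):
    mine, theirs = choose(state), tit_for_tat(state)
    counts[state][theirs] += 1
    state = (state + ((mine, theirs),))[-K:]
```

Note that this sketch exploits myopically, one step at a time, whereas the paper's setting requires reasoning over the induced MDP's long-run value; it is meant only to show how a bounded-memory adversary turns joint histories into learnable states.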

References

Doran Chakraborty, Peter Stone. (2008). "Online Multiagent Learning Against Memory Bounded Adversaries." doi:10.1007/978-3-540-87479-9_32