Deep Net Reinforcement Learning Algorithm
A Deep Net Reinforcement Learning Algorithm is an NNet reinforcement learning algorithm that is also a deep neural net learning algorithm.
- Context:
- It can be implemented by a Deep Net Reinforcement Learning System (to solve a deep net reinforcement learning task).
- Example(s):
- The algorithm used by AlphaGo Zero.
- Policy Gradients Algorithm, (Williams, 1992).
- DQN Algorithm, (Mnih et al., 2015).
- …
- Counter-Example(s):
- See: Shallow NNet Reinforcement Learning Algorithm.
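The two named examples above train their networks toward different signals, which the following minimal sketch illustrates. It is not the authors' implementation. It assumes a softmax output layer for the policy-gradient case and shows only the gradient with respect to the output logits (a full algorithm would backpropagate this through the deep network), and it assumes the DQN next-state values come from a separate target network. All function names (`softmax`, `discounted_returns`, `reinforce_logit_gradient`, `dqn_target`) are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def discounted_returns(rewards, gamma):
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


def reinforce_logit_gradient(logits, action, ret):
    """Policy-gradient (REINFORCE-style) signal at the output logits.

    For a softmax policy, d log pi(action) / d logit_i = 1[i == action] - pi_i.
    The result is scaled by the return and points in the ascent direction.
    """
    probs = softmax(logits)
    return [ret * ((1.0 if i == action else 0.0) - p)
            for i, p in enumerate(probs)]


def dqn_target(reward, next_q_values, gamma, done):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a').

    next_q_values is assumed to come from a target network (hypothetical input).
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)
```

In the policy-gradient case the network outputs action probabilities and the return weights the log-probability gradient. In the DQN case the network outputs action values that are regressed toward the bootstrapped target.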
References
2017
- https://davidbarber.github.io/blog/2017/11/07/Learning-From-Scratch-by-Thinking-Fast-and-Slow-with-Deep-Learning-and-Tree-Search/
- QUOTE: In current Deep Reinforcement Learning (RL) algorithms such as Policy Gradients and DQN, neural networks make action selections with no lookahead; this is analogous to System 1. Unlike human intuition, their training does not benefit from a ‘System 2’ to suggest strong policies.
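The "no lookahead" point in the quote can be illustrated with a small sketch of value-based action selection: the network is evaluated once on the current state and the action is read off its outputs, with no search over future states. This is an illustrative toy, not the method of any cited paper. The two-layer network, the `params` layout, and the function names are assumptions made for the example.

```python
import random


def relu(values):
    return [max(0.0, v) for v in values]


def linear(weights, bias, x):
    """Dense layer; weights holds one row per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def q_values(params, state):
    """Single forward pass: state -> hidden layer -> one value per action."""
    hidden = relu(linear(params["W1"], params["b1"], state))
    return linear(params["W2"], params["b2"], hidden)


def select_action(params, state, epsilon=0.0, rng=random):
    """Epsilon-greedy choice using only the current state's Q-values."""
    if rng.random() < epsilon:
        return rng.randrange(len(params["b2"]))  # explore uniformly
    q = q_values(params, state)
    return max(range(len(q)), key=q.__getitem__)  # exploit: argmax over actions


# Hand-picked toy weights (hypothetical), 2 state features and 3 actions.
params = {
    "W1": [[1.0, 0.0], [0.0, 1.0]],
    "b1": [0.0, 0.0],
    "W2": [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
    "b2": [0.0, 0.0, 0.0],
}

print(select_action(params, [2.0, 1.0]))  # Q-values are [2.0, 1.0, 1.5]
```

A tree-search method would instead expand successor states before committing to an action, which is the "System 2" role the quote describes.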
2016a
- (Heinrich & Silver, 2016) ⇒ Johannes Heinrich, and David Silver. (2016). “Deep Reinforcement Learning from Self-play in Imperfect-information Games.” In: Proceedings of NIPS Deep Reinforcement Learning Workshop.
- QUOTE: In this paper we introduce the first scalable end-to-end approach to learning approximate Nash equilibria without prior domain knowledge. Our method combines fictitious self-play with deep reinforcement learning. When applied to Leduc poker, Neural Fictitious Self-Play (NFSP) approached a Nash equilibrium, whereas common reinforcement learning methods diverged.
2016b
- (Mnih et al., 2016) ⇒ Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Tim Harley, Timothy P. Lillicrap, David Silver, and Koray Kavukcuoglu. (2016). “Asynchronous Methods for Deep Reinforcement Learning.” In: Proceedings of the 33rd International Conference on Machine Learning - Volume 48.
2015
- (Mnih et al., 2015) ⇒ Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, and Demis Hassabis. (2015). “Human-level Control through Deep Reinforcement Learning.” In: Nature, 518(7540).
2013
- (Mnih et al., 2013) ⇒ Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin Riedmiller. (2013). “Playing Atari with Deep Reinforcement Learning.” arXiv preprint arXiv:1312.5602.
1992
- (Williams, 1992) ⇒ Ronald J. Williams. (1992). “Simple Statistical Gradient-following Algorithms for Connectionist Reinforcement Learning.” In: Machine Learning, 8(3-4).