2018 IMPALAScalableDistributedDeepRL

From GM-RKB

Subject Headings: IMPALA Framework.

Notes

Cited By

2018

  • https://deepmind.com/blog/impala-scalable-distributed-deeprl-dmlab-30/
    • QUOTE: … In order to tackle the challenging DMLab-30 suite, we developed a new distributed agent called IMPALA that maximises data throughput using an efficient distributed architecture with TensorFlow. IMPALA is inspired by the popular A3C architecture which uses multiple distributed actors to learn the agent’s parameters. In models like this, each of the actors uses a clone of the policy parameters to act in the environment. Periodically, actors pause their exploration to share the gradients they have computed with a central parameter server that applies updates (see figure below).
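The decoupling the quote describes can be sketched as a toy producer-consumer loop. This is an illustrative sketch only (all names and the queue/threading machinery are invented for the example, not taken from the IMPALA codebase): actors clone the current policy parameters, act, and enqueue trajectories of experience, while a single learner dequeues them and applies updates centrally. The key contrast with A3C is that actors here ship trajectories, never gradients.

```python
import queue
import threading

# Toy sketch of IMPALA-style decoupled acting and learning.
# All identifiers are invented for illustration.
trajectory_queue = queue.Queue(maxsize=16)
params = {"version": 0}          # stand-in for the central policy parameters
params_lock = threading.Lock()

def actor(actor_id, num_trajectories):
    """Act with a local clone of the parameters; send experience, not gradients."""
    for step in range(num_trajectories):
        with params_lock:
            local_version = params["version"]   # clone current parameters
        # "Act" in the environment with the cloned policy (stubbed out here).
        trajectory = {
            "actor": actor_id,
            "policy_version": local_version,    # may lag the learner: off-policy
            "observations": [step],
        }
        trajectory_queue.put(trajectory)

def learner(num_updates):
    """Dequeue trajectories and apply all parameter updates centrally."""
    for _ in range(num_updates):
        traj = trajectory_queue.get()
        # Gradient computation from traj would happen here (stubbed out).
        with params_lock:
            params["version"] += 1

actors = [threading.Thread(target=actor, args=(i, 4)) for i in range(2)]
learn = threading.Thread(target=learner, args=(8,))
for t in actors:
    t.start()
learn.start()
for t in actors:
    t.join()
learn.join()
print(params["version"])  # 8: one central update per consumed trajectory
```

Because actors keep acting with slightly stale parameter clones, the consumed trajectories are off-policy with respect to the learner, which is what motivates the V-trace correction discussed in the abstract below.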

Quotes

Abstract

In this work we aim to solve a large collection of tasks using a single reinforcement learning agent with a single set of parameters. A key challenge is to handle the increased amount of data and extended training time. We have developed a new distributed agent IMPALA (Importance Weighted Actor-Learner Architecture) that not only uses resources more efficiently in single-machine training but also scales to thousands of machines without sacrificing data efficiency or resource utilisation. We achieve stable learning at high throughput by combining decoupled acting and learning with a novel off-policy correction method called V-trace. We demonstrate the effectiveness of IMPALA for multi-task reinforcement learning on DMLab-30 (a set of 30 tasks from the DeepMind Lab environment (Beattie et al., 2016)) and Atari-57 (all available Atari games in Arcade Learning Environment (Bellemare et al., 2013a)). Our results show that IMPALA is able to achieve better performance than previous agents with less data, and crucially exhibits positive transfer between tasks as a result of its multi-task approach.
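The V-trace correction mentioned in the abstract can be sketched numerically. The sketch below, assuming the truncated-importance-sampling form from the paper (targets v_s = V(x_s) + Σ_t γ^{t-s} (Π c_i) ρ_t δ_t, computed by backward recursion), is a minimal NumPy version for a single trajectory; the function name and array layout are this example's own, not the paper's reference implementation.

```python
import numpy as np

def vtrace_targets(behaviour_logp, target_logp, rewards, values,
                   bootstrap_value, gamma=0.99, rho_bar=1.0, c_bar=1.0):
    """Minimal V-trace target sketch for one length-T trajectory.

    behaviour_logp, target_logp: log mu(a_t|x_t) and log pi(a_t|x_t), shape [T]
    rewards: r_t, shape [T]
    values: V(x_t) under the current value function, shape [T]
    bootstrap_value: V(x_T) for the state following the trajectory
    rho_bar, c_bar: truncation levels for the importance weights
    """
    rhos = np.exp(target_logp - behaviour_logp)   # importance ratios pi/mu
    clipped_rhos = np.minimum(rho_bar, rhos)      # truncated rho_t
    cs = np.minimum(c_bar, rhos)                  # truncated trace coefficients c_t
    values_tp1 = np.append(values[1:], bootstrap_value)
    deltas = clipped_rhos * (rewards + gamma * values_tp1 - values)

    # Backward recursion:
    #   v_s - V(x_s) = delta_s + gamma * c_s * (v_{s+1} - V(x_{s+1}))
    acc = 0.0
    corrections = np.zeros_like(values)
    for t in reversed(range(len(values))):
        acc = deltas[t] + gamma * cs[t] * acc
        corrections[t] = acc
    return values + corrections  # V-trace targets v_s
```

In the on-policy case (pi = mu, so every ratio is 1) the targets reduce to ordinary n-step returns, which is a useful sanity check for an implementation.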

References

Lasse Espeholt, Hubert Soyer, Remi Munos, Karen Simonyan, Volodymir Mnih, Tom Ward, Yotam Doron, Vlad Firoiu, Tim Harley, Iain Dunning, Shane Legg, and Koray Kavukcuoglu (2018). "IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures." In: Proceedings of the 35th International Conference on Machine Learning (ICML 2018).