Publication record · 18.cifr/2015.mnih.dqn-atari
The theory of reinforcement learning provides a normative account, deeply rooted in psychological and neuroscientific perspectives on animal behaviour, of how agents may optimize their control of an environment. To use reinforcement learning successfully in situations approaching real-world complexity, however, agents are confronted with a difficult task: they must derive efficient representations of the environment from high-dimensional sensory inputs, and use these to generalize past experience to new situations. Here we describe a method that achieves this by combining a reinforcement learning algorithm with a deep neural network. We present a single architecture that learns control policies directly from high-dimensional sensory input using end-to-end reinforcement learning.
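The core mechanism the abstract describes, a value function approximated by a neural network and trained on temporal-difference targets from a separate, periodically updated copy of the network, can be sketched in a few lines. This is a minimal illustration in NumPy, not the paper's implementation: the tiny state/action dimensions, the one-hidden-layer network, and the learning rate are all made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tiny setting: 4-dim state, 2 discrete actions.
STATE_DIM, N_ACTIONS, HIDDEN = 4, 2, 16
GAMMA = 0.99  # discount factor

# One-hidden-layer Q-network: Q(s) = W2 @ relu(W1 @ s)
W1 = rng.normal(scale=0.1, size=(HIDDEN, STATE_DIM))
W2 = rng.normal(scale=0.1, size=(N_ACTIONS, HIDDEN))
# Target-network copy; in DQN-style training it is synced only periodically.
W1_t, W2_t = W1.copy(), W2.copy()

def q_values(s, w1, w2):
    """Forward pass: vector of action values for state s."""
    return w2 @ np.maximum(w1 @ s, 0.0)

def td_update(s, a, r, s_next, done, lr=0.05):
    """One gradient step on the squared TD error for transition (s, a, r, s')."""
    global W1, W2
    # Bootstrapped target uses the frozen target-network weights.
    target = r if done else r + GAMMA * q_values(s_next, W1_t, W2_t).max()
    h = np.maximum(W1 @ s, 0.0)          # hidden activations
    err = W2[a] @ h - target             # TD error for the chosen action
    # Backprop through the chosen action's value only (gradients taken
    # before either weight matrix is modified).
    grad_W1 = err * np.outer(W2[a] * (h > 0), s)
    W2[a] -= lr * err * h
    W1 -= lr * grad_W1
    return err

# Repeated updates on one fixed terminal transition drive the TD error down.
s = rng.normal(size=STATE_DIM)
first = abs(td_update(s, 0, 1.0, s, True))
for _ in range(500):
    last = abs(td_update(s, 0, 1.0, s, True))
```

The frozen copy (`W1_t`, `W2_t`) is the detail that matters: computing the bootstrap target from slowly updated weights decorrelates the regression target from the parameters being trained, which is one of the stabilizers the full method relies on (alongside experience replay, omitted here).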
DQN struggles on sparse-reward games that require long-horizon planning (e.g., Montezuma's Revenge). The authors suggest hierarchical RL, model-based planning, and memory-augmented architectures as remedies, and flag extensions to continuous action spaces and real-world robotics as future work.