Practice Double Deep Q-Learning with Dueling Network Architecture for Breakout

This is a fork of fg91/Deep-Q-Learning. I modified the original code slightly and trained it on Breakout. The maximum evaluation score was 804 (GIF shown above).
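For context, the two techniques in the title can be summarized in a few lines. Below is a minimal sketch of the double-DQN target with a dueling value/advantage head, written with NumPy; the `online_q` and `target_q` functions are hypothetical stand-ins for the TensorFlow networks in DQN.ipynb, not the actual implementation:

```python
import numpy as np

def dueling_q(value, advantage):
    # Dueling head: Q(s, a) = V(s) + A(s, a) - mean_a' A(s, a')
    # value: shape (batch, 1), advantage: shape (batch, num_actions)
    return value + advantage - advantage.mean(axis=1, keepdims=True)

def double_dqn_target(rewards, next_states, terminals, online_q, target_q, gamma=0.99):
    # Double DQN: the online network chooses the next action,
    # while the target network evaluates that choice.
    best_actions = np.argmax(online_q(next_states), axis=1)
    next_q = target_q(next_states)[np.arange(len(rewards)), best_actions]
    return rewards + gamma * (1.0 - terminals) * next_q
```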

Modifications to the original code

  • Modify ReplayMemory to sample from all possible indices.
  • Clip rewards ("fixed all positive rewards to be 1 and all negative rewards to be -1, leaving 0 rewards unchanged") as in Mnih et al. 2013 and Mnih et al. 2015. (This made training converge significantly faster and improved the agent's performance on Breakout; a minimal sketch follows this list.)
  • Record the evaluation score appropriately even if no evaluation game finished. (An unfinished evaluation game can happen when "the agent got stuck in a loop".)
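
The reward-clipping change boils down to one line. The sketch below is only an illustration, with a hypothetical `clip_reward` helper rather than code taken from DQN.ipynb:

```python
import numpy as np

def clip_reward(reward):
    # Fix all positive rewards to 1 and all negative rewards to -1,
    # leaving 0 rewards unchanged (as in Mnih et al. 2013/2015).
    return float(np.sign(reward))
```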

Training Result

Requirements

  • tensorflow-gpu
  • gym
  • gym[atari] (version 0.10.5 or higher, so that BreakoutDeterministic-v4 is available)
  • imageio
  • scikit-image

Try it yourself:

If you want to test the trained network (which achieves a score of 804), simply run the notebook DQN.ipynb.

If you want to train the network yourself, set TRAIN = True in the first cell of DQN.ipynb and run the notebook.
