README

This repository includes PyTorch implementations of Q-Learning and policy gradient for solving the OpenAI Gym CarRacing-v0 environment: https://gym.openai.com/envs/CarRacing-v0/

More details about the environment are in its source file: https://github.com/openai/gym/blob/master/gym/envs/box2d/car_racing.py

Requirements

Tested with:

Ubuntu 18.04
NVIDIA RTX 2070 card
CUDA 10.2
cuDNN 7.6.5

pip install -r requirements.txt

How to play the game

./play.sh

Q-Learning

TODO:

Describe action spaces
Add more insights (e.g. increasing the size of the experience buffer helps)
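
Until the notes above are written, here is a minimal sketch of what an experience buffer with a configurable size could look like. The class name `ReplayBuffer`, its methods, and the transition layout are assumptions for illustration, not the code used in this repository.

```python
import random
from collections import deque


class ReplayBuffer:
    """Hypothetical fixed-capacity store of (state, action, reward, next_state, done)."""

    def __init__(self, capacity):
        # A deque with maxlen drops the oldest transition once it is full,
        # so `capacity` is the knob to increase when a larger buffer is wanted.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement from the stored transitions.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)
```

A larger `capacity` keeps older transitions available for sampling, at the cost of more memory per stored frame.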

How to train

TODO

How to test

TODO

Pretrained models

Models, configurations and outputs are available on Google Drive.

Configuration changes

Model name	Main configuration improvement(s)
model_basic_openai_stop_expl	Stop after 50 consecutive negative rewards
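
The stopping rule in the table above can be sketched as a small counter. This is only an illustration under assumptions: the class name `NegativeRewardStopper` and the `patience` parameter are hypothetical, and the repository may implement the rule differently.

```python
class NegativeRewardStopper:
    """Hypothetical helper: signals a stop after `patience` consecutive negative rewards."""

    def __init__(self, patience=50):
        self.patience = patience
        self.count = 0

    def reset(self):
        # Call at the start of every episode.
        self.count = 0

    def update(self, reward):
        # Any non-negative reward breaks the streak.
        if reward < 0:
            self.count += 1
        else:
            self.count = 0
        return self.count >= self.patience


# Usage with a made-up reward stream standing in for environment steps.
stopper = NegativeRewardStopper(patience=3)
for step, reward in enumerate([1.0, -0.1, -0.1, 2.0, -0.1, -0.1, -0.1, -0.1]):
    if stopper.update(reward):
        print(f"stopping at step {step}")
        break
```

In a training loop, `update` would be called with the reward returned by each environment step, and the episode would end as soon as it returns `True`.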

Results

Model name	Training episodes	Average test score (10 runs *)	Short example	Notes
model_basic_openai_stop_expl	450	690		Fails on tight curves, though not every time. Can sometimes rejoin the track from the grass.
model_basic_openai_stop_expl	500	723		Almost perfect driving; the score is limited by cautious use of the gas. Fails on tight curves in rare cases.
model_basic_openai_stop_expl	550	N/A	N/A	Uses more gas, but drives completely wrong after only 50 more training episodes.

(*) Around 100 test runs would be needed for the average to be reliable.
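
To make the footnote concrete, the sketch below reports the standard error of the mean alongside the average, which shrinks roughly with the square root of the number of runs. The function name `summarize_scores` is an assumption, and the scores in the usage example are made-up numbers, not results from this repository.

```python
import math
import statistics


def summarize_scores(scores):
    """Return (mean, standard error of the mean) for a list of episode scores."""
    n = len(scores)
    mean = statistics.mean(scores)
    # Sample standard deviation needs at least two runs.
    sem = statistics.stdev(scores) / math.sqrt(n) if n > 1 else float("nan")
    return mean, sem


# Made-up scores for illustration only.
example_scores = [610.0, 820.5, 455.2, 790.0, 702.3, 688.8, 540.1, 815.9, 760.4, 699.0]
mean, sem = summarize_scores(example_scores)
print(f"mean={mean:.1f} +/- {sem:.1f} over {len(example_scores)} runs")
```

With the same spread between runs, going from 10 to 100 test runs would cut the standard error by about a factor of three.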

Policy gradient

TODO
