REINFORCE-ES applied to openai ant environment


mauicv/evo-ant


Evo-Ant

An evolutionary strategy applied to AntBulletEnv-v0 using the gerel REINFORCE-ES algorithm.

The environment is considered solved at 2500 points over 1000 steps. Training solves it after around 3774 generations.

(figure: learning curves)

Note: training runs 500 steps per episode, so the training target is 1250. Generation 3774 typically achieves 2500 when evaluated for 1000 steps.
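The core REINFORCE-ES update can be sketched as follows. This is a minimal illustration of the general idea (perturb parameters with Gaussian noise, score each perturbation, and move along the fitness-weighted noise), not the gerel library's actual implementation; all names and hyperparameters here are illustrative.

```python
import numpy as np

def es_step(theta, fitness_fn, rng, pop_size=50, sigma=0.1, lr=0.02):
    # Sample a population of parameter perturbations
    noise = rng.standard_normal((pop_size, theta.size))
    # Evaluate fitness of each perturbed parameter vector
    rewards = np.array([fitness_fn(theta + sigma * eps) for eps in noise])
    # Standardise rewards so the update is scale-invariant
    adv = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
    # Fitness-weighted average of the noise estimates the gradient
    grad = noise.T @ adv / (pop_size * sigma)
    return theta + lr * grad

# Toy check: maximising -||theta||^2 should shrink theta toward zero
rng = np.random.default_rng(0)
theta = np.ones(3)
for _ in range(300):
    theta = es_step(theta, lambda t: -np.sum(t ** 2), rng)
```

In the real setting, `fitness_fn` would run an episode of AntBulletEnv-v0 with the perturbed policy and return the episode reward.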


Use

To run the pre-trained solution, use:

python play.py

To train a solution, first run python test.py. This times different combinations of BATCHES and BATCH_SIZE to find the best fit for the number of processors present. It may take a while and produces output like:

BATCH_SIZE=100, BATCHES=1, time: 9.764189958572388
BATCH_SIZE=50, BATCHES=2, time: 7.418277740478516
BATCH_SIZE=25, BATCHES=4, time: 7.183583974838257
BATCH_SIZE=20, BATCHES=5, time: 6.599231243133545
BATCH_SIZE=10, BATCHES=10, time: 7.606550693511963
BATCH_SIZE=5, BATCHES=20, time: 7.6629064083099365
BATCH_SIZE=2, BATCHES=50, time: 8.292027711868286
BATCH_SIZE=1, BATCHES=100, time: 9.574803113937378
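A sweep like the one above can be reproduced in spirit with a small timing harness. This is a sketch only: `rollout` is a hypothetical stand-in for the repo's episode-rollout code, and the config values mirror the output shown.

```python
import time
from multiprocessing import Pool

POPULATION = 100  # total rollouts per generation

def rollout(seed):
    # Stand-in for a single episode rollout; the real code would run
    # the AntBulletEnv policy here.
    total = 0.0
    for i in range(1000):
        total += (seed + i) % 7
    return total

def time_config(batch_size):
    # Split the population into batches and time the full generation
    batches = POPULATION // batch_size
    start = time.time()
    with Pool() as pool:
        for b in range(batches):
            pool.map(rollout, range(b * batch_size, (b + 1) * batch_size))
    return time.time() - start

if __name__ == "__main__":
    for batch_size in (100, 50, 25, 20, 10, 5, 2, 1):
        print(f"BATCH_SIZE={batch_size}, BATCHES={POPULATION // batch_size}, "
              f"time: {time_config(batch_size)}")
```

The trade-off being measured is between per-batch overhead (many small batches) and idle workers (one huge batch that may not divide evenly across cores).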

Choose the BATCH_SIZE value that gives the smallest time and update it in the train.py file. Then run

python train.py

This will take a while unless you have a lot of CPUs. Training will create a file called ant_RES_data and save each generation to it as training progresses. You can graph training performance by running:

python graph.py
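graph.py reads ant_RES_data; its exact format isn't documented here, but learning curves like the one above are typically smoothed before plotting. A hedged sketch, assuming per-generation rewards can be loaded as a list:

```python
import numpy as np

def moving_average(rewards, window=50):
    # Smooth noisy per-generation rewards with a sliding-window mean
    rewards = np.asarray(rewards, dtype=float)
    if len(rewards) < window:
        return rewards
    kernel = np.ones(window) / window
    return np.convolve(rewards, kernel, mode="valid")
```

The smoothed series would then be plotted against generation number to show the trend toward the 1250-point training target.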

To run a particular generation from ant_RES_data, say generation 10, use:

python play.py --dir=ant_RES_data --generation=10
