
jkterry1/Butterfly-Baselines


This repo includes baseline learning code for all the PettingZoo Butterfly environments (except Prison, which is a toy debugging environment), based on parameter-shared PPO via Stable-Baselines3 and SuperSuit.

To train all four Butterfly environments for five runs each:

sh train_all.sh

The included environments and versions are pistonball_v4, knights_archers_zombies_v7, prospector_v4, and cooperative_pong_v3.

To train individual environments:

python train.py --env-name=pistonball_v4 --n-runs=5 --n-evaluations=100 --timesteps=2000000 --num-cpus=8 --num-eval-cpus=4 --num-vec-envs=4

The above example trains pistonball_v4 for 5 runs of 2,000,000 timesteps each, with 100 evaluations per run. It uses 8 CPUs for training, 4 additional CPUs for the evaluations, and 4 parallel environments per CPU, and saves the evaluation results to data/ENV_NAME/run_x.
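To run the same command for several environments from Python, a minimal sketch could look like the following. The helper `build_train_command` is hypothetical and not part of this repo. Only the flag names come from the example above; reusing the Pistonball values as defaults for the other environments is an assumption, and the sketch does not claim to mirror what train_all.sh does.

```python
import shlex

# The four environments named in this README.
ENVS = [
    "pistonball_v4",
    "knights_archers_zombies_v7",
    "prospector_v4",
    "cooperative_pong_v3",
]


def build_train_command(env_name, n_runs=5, n_evaluations=100,
                        timesteps=2_000_000, num_cpus=8,
                        num_eval_cpus=4, num_vec_envs=4):
    """Return the argv list for one train.py call (hypothetical helper).

    Defaults copy the pistonball_v4 example; they are an assumption
    for the other environments.
    """
    return [
        "python", "train.py",
        f"--env-name={env_name}",
        f"--n-runs={n_runs}",
        f"--n-evaluations={n_evaluations}",
        f"--timesteps={timesteps}",
        f"--num-cpus={num_cpus}",
        f"--num-eval-cpus={num_eval_cpus}",
        f"--num-vec-envs={num_vec_envs}",
    ]


if __name__ == "__main__":
    # Dry run: print each command instead of executing it.
    for env in ENVS:
        print(shlex.join(build_train_command(env)))
```

To actually launch a run, the returned list can be passed to `subprocess.run(cmd, check=True)` from the repo root.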

To modify other hyperparameters (e.g. learning rate, activation function, network size), edit config/ENV_NAME.json. Note that the current values are the result of a hyperparameter tuning search with Optuna via RL Baselines3 Zoo.
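Editing the file by hand works fine. For scripted sweeps, a small sketch like the one below may help. It assumes the config is a flat JSON object; the helper `override_config` and the key name `learning_rate` are illustrative assumptions, so check the actual keys in config/ENV_NAME.json before using it.

```python
import json
from pathlib import Path


def override_config(path, **overrides):
    """Load a JSON config, apply top-level overrides, and write it back.

    Hypothetical helper: assumes the file holds a single flat JSON object.
    Returns the updated dictionary.
    """
    path = Path(path)
    config = json.loads(path.read_text())
    config.update(overrides)  # replaces existing keys, adds new ones
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

For example, `override_config("config/pistonball_v4.json", learning_rate=1e-4)` would rewrite that one entry, assuming such a key exists. Keep a copy of the tuned values first, since the file is overwritten in place.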

To plot learning curves and evaluations for an environment from the data folder:

python plot.py --env-name=pistonball_v4 --n-runs=10
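Before plotting, it can be useful to check how many runs are actually present, since --n-runs should presumably not exceed that number. The sketch below only relies on the data/ENV_NAME/run_x naming stated in this README; the helper `find_runs` is hypothetical, and whether each run_x is a file or a directory is not assumed.

```python
from pathlib import Path


def find_runs(env_name, data_root="data"):
    """Return the run_* entries under data_root/env_name, sorted by name.

    Hypothetical helper; yields an empty list if nothing has been saved yet.
    """
    return sorted(Path(data_root, env_name).glob("run_*"))
```

For example, `len(find_runs("pistonball_v4"))` gives the number of saved runs for Pistonball when called from the repo root.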

To plot learning curves for all four environments:

sh plot_all.sh
