TD-Gammon

I use an HTTP server running on the guest machine (Ubuntu) to receive commands and pass them to the gnubg program.
This makes it possible to send commands from the host machine (macOS in my case).

The file bridge.py must be executed on the guest machine (the machine where gnubg is installed).

On Ubuntu:

gnubg -t -p /path/to/bridge.py

This runs gnubg with the command-line interface instead of the graphical interface (-t), evaluates the given Python file, and then exits (-p).
For a list of gnubg parameters, run gnubg --help.

The Python script bridge.py creates an HTTP server running on localhost:8001.
To change the host or the port, edit the following lines in bridge.py:

if __name__ == "__main__":
    HOST = 'localhost' # <-- YOUR HOST HERE
    PORT = 8001  # <-- YOUR PORT HERE
    run(host=HOST, port=PORT)
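
As a rough illustration of what a command-receiving server of this kind can look like, here is a minimal sketch built on Python 3's standard http.server module. It is not the actual bridge.py: the handle_command function, the plain-text POST body, and the JSON reply are all assumptions made for the example, and the real script forwards commands to gnubg rather than echoing them.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def handle_command(command):
    # Hypothetical stand-in: the real bridge would hand the command to gnubg.
    # Here the command is simply echoed back with a status field.
    return {"command": command, "status": "ok"}


class CommandHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # The request body is assumed to contain the raw command text.
        length = int(self.headers.get("Content-Length", 0))
        command = self.rfile.read(length).decode("utf-8")
        body = json.dumps(handle_command(command)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence the default per-request logging.
        pass


def run(host="localhost", port=8001):
    # Blocks and serves requests until the process is interrupted.
    server = HTTPServer((host, port), CommandHandler)
    server.serve_forever()
```

Calling run(host=HOST, port=PORT) with this sketch would block and answer each POST request with a small JSON document.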

The file td_gammon/gnubg/gnubg_backgammon.py sends messages/commands to gnubg and parses the response.
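
The client side of such a bridge can be sketched as follows. This is only an illustration, not the code in gnubg_backgammon.py: the function names build_command_request and send_command, the root URL path, the plain-text payload, and the JSON response format are assumptions made for the example.

```python
import json
import urllib.request


def build_command_request(host, port, command):
    """Build a POST request carrying a gnubg command as plain text."""
    url = "http://{}:{}/".format(host, port)
    return urllib.request.Request(url, data=command.encode("utf-8"), method="POST")


def send_command(host, port, command, timeout=10):
    """Send the command and decode the response body (assumed to be JSON)."""
    request = build_command_request(host, port, command)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With gnubg and the bridge running, a call such as send_command("localhost", 8001, "new game") would return whatever the server replies, decoded into a Python object.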

Usage

Run python /path/to/main.py --help for a list of parameters.

Train TD-Network

To train a neural network with a single hidden layer of 40 units for 100000 games/episodes, saving the model every 10000 episodes, run the following command:

(tdgammon) $ python /path/to/main.py train --save_path ./saved_models/exp1 --save_step 10000 --episodes 100000 --name exp1 --type nn --lr 0.1 --hidden_units 40

Run python /path/to/main.py train --help for a list of parameters available for training.
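
The --lr flag is the learning rate of the temporal-difference update. To give an idea of what one such update does, here is a minimal sketch of a single TD(lambda) step. It assumes a linear value function instead of the 40-unit network trained by the command above, and the function name td_lambda_step as well as the gamma and lam defaults are illustrative choices, not values taken from this repository.

```python
def td_lambda_step(weights, trace, features, next_value, reward,
                   lr=0.1, gamma=1.0, lam=0.7):
    """Apply one TD(lambda) update to a linear value function V(s) = w . x."""
    value = sum(w * x for w, x in zip(weights, features))
    # TD error: how much the next estimate disagrees with the current one.
    td_error = reward + gamma * next_value - value
    # Decay the eligibility trace and add the gradient of V(s), which is x.
    trace = [gamma * lam * e + x for e, x in zip(trace, features)]
    # Move every weight in proportion to its eligibility.
    weights = [w + lr * td_error * e for w, e in zip(weights, trace)]
    return weights, trace, td_error
```

In a full training loop, this step would be applied after every move, with the trace reset to zeros at the start of each game.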

Evaluate Agent(s)

To evaluate already trained models, you have two options: let two models play against each other, or let one model play against gnubg.
Run python /path/to/main.py evaluate --help for a list of parameters available for evaluation.

Agent vs Agent

To evaluate two models against each other, specify the path where each model is saved together with its number of hidden units.

(tdgammon) $ python /path/to/main.py evaluate --episodes 50 --hidden_units_agent0 40 --hidden_units_agent1 40 --type nn --model_agent0 path/to/saved_models/agent0.tar --model_agent1 path/to/saved_models/agent1.tar

Agent vs gnubg

To evaluate a model against gnubg, first run gnubg with the bridge.py script as input.
On Ubuntu (or wherever gnubg is installed):

gnubg -t -p /path/to/bridge.py

Then run (to play against gnubg at beginner level for 50 games):

(tdgammon) $ python /path/to/main.py evaluate --episodes 50 --hidden_units_agent0 40 --type nn --model_agent0 path/to/saved_models/agent0.tar vs_gnubg --difficulty beginner --host GNUBG_HOST --port GNUBG_PORT

The number of hidden units (--hidden_units_agent0) must match that of the loaded model (--model_agent0).

Web Interface

You can play against a trained agent via a web GUI:

(tdgammon) $ python /path/to/main.py gui --host localhost --port 8002 --model path/to/saved_models/agent0.tar --hidden_units 40 --type nn

Then navigate to http://localhost:8002 in your browser.

Run python /path/to/main.py gui --help for a list of parameters available for the web GUI.

Plot Wins

Evaluating the agent during training can take a long time, especially against gnubg at the world_class difficulty. Instead, you can load all the models saved in a folder and evaluate each one (saved at a different point during training) against one or more opponents.
All models in the directory must be of the same type (i.e., the network structure must be the same for every model in the folder).
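
The idea of walking through a folder of saved models in training order can be sketched as below. This is an illustration only: the helper names list_checkpoints and evaluate_all are hypothetical, and the sketch assumes that each checkpoint is a .tar file whose name ends with the training step (for example exp1_10000.tar), which may not match the names this project actually writes.

```python
import re
from pathlib import Path


def list_checkpoints(folder, pattern="*.tar"):
    """Return checkpoint paths ordered by the last integer in their file name."""
    def step_of(path):
        numbers = re.findall(r"\d+", path.stem)
        # Files without any number sort first.
        return int(numbers[-1]) if numbers else -1
    return sorted(Path(folder).glob(pattern), key=step_of)


def evaluate_all(folder, evaluate):
    """Call evaluate(path) for each checkpoint and collect the results in order."""
    return [(path.name, evaluate(path)) for path in list_checkpoints(folder)]
```

Here evaluate stands for any function that loads one model and returns a score such as a win rate, so the resulting list can be plotted against training progress.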

To plot the wins against gnubg, run on Ubuntu (or where gnubg is installed):

gnubg -t -p /path/to/bridge.py

In the example below, the trained models are evaluated against a random opponent and against gnubg at two difficulty levels, beginner and advanced:

(tdgammon) $ python /path/to/main.py plot --save_path /path/to/saved_models/myexp --hidden_units 40 --episodes 10 --opponent random,gnubg --dst /path/to/experiments --type nn --difficulty beginner,advanced --host GNUBG_HOST --port GNUBG_PORT
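
One plausible reading of the comma-separated --opponent and --difficulty values is that every difficulty applies to gnubg only, while other opponents are used once. The sketch below expands the flags under that assumption. The function name expand_opponents and the exact pairing rule are hypothetical, and the real script may combine the values differently.

```python
def expand_opponents(opponents, difficulties):
    """Turn comma-separated flag values into (opponent, difficulty) pairs."""
    names = [o.strip() for o in opponents.split(",") if o.strip()]
    levels = [d.strip() for d in difficulties.split(",") if d.strip()]
    matchups = []
    for name in names:
        if name == "gnubg":
            # Assumption: gnubg is evaluated once per requested difficulty.
            matchups.extend(("gnubg", level) for level in levels)
        else:
            # Assumption: other opponents ignore the difficulty setting.
            matchups.append((name, None))
    return matchups
```

Under this reading, the command above would produce three evaluation series per saved model: one against the random opponent and two against gnubg.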

To visualize the plots:

(tdgammon) $ tensorboard --logdir=runs/path/to/experiment/ --host localhost --port 8001

Run python /path/to/main.py plot --help for a list of parameters available for plotting.

Backgammon OpenAI Gym Environment

For a detailed description of the environment: gym-backgammon.

Bibliography, sources of inspiration, related works

TD-Gammon and Temporal Difference Learning:
GNU Backgammon: https://www.gnu.org/software/gnubg/
Rules of Backgammon:
- www.bkgm.com/rules.html
- https://en.wikipedia.org/wiki/Backgammon
- Starting Position: http://www.bkgm.com/gloss/lookup.cgi?starting+position
- https://bkgm.com/faq/
Install GNU Backgammon on Ubuntu:
How to use python to interact with gnubg: [Bug-gnubg] Documentation: Looking for documentation on python scripting
Other Implementation of the Backgammon OpenAI Gym Environment:
- https://github.com/edusta/gym-backgammon
Other Implementation of TD-Gammon:
How to setup your VMWare Fusion images to use static IP addresses on Mac OS X
- https://gist.github.com/pjkelly/1068716/6d19faa0122c0e1efe350e818bb8f4e8687ea1ab
PyTorch Tensorboard: https://pytorch.org/docs/stable/tensorboard.html

License

MIT
