JobController is the software responsible for creating jobs and for notifying the rest of the software when a job completes.
Expects the following environment variables to be set when running:
- "QUEUE_HOST": IP address of the message broker
- "QUEUE_USER": The username for the message broker
- "QUEUE_PASSWORD": The password for the message broker
- "FIA_IP": IP address of the FIA-API
- "DB_IP": IP address of the database
- "DB_USERNAME": Username for database
- "DB_PASSWORD": Password for database
- "REDUCE_USER_ID": The ID used when interacting with Ceph
- "RUNNER_SHA": The SHA256 of the runner container on the GitHub container registry, which will be used to complete jobs on the cluster
- "KUBECONFIG": (Optional) Path to the kubeconfig file
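As an illustration of how these variables might be checked at start-up, here is a minimal sketch using only the standard library. The function name `load_settings` and the fail-fast behaviour are assumptions made for this example, not JobController's actual implementation.

```python
import os
from typing import Mapping, Optional

# Variable names taken from the list above.
REQUIRED_VARS = (
    "QUEUE_HOST",
    "QUEUE_USER",
    "QUEUE_PASSWORD",
    "FIA_IP",
    "DB_IP",
    "DB_USERNAME",
    "DB_PASSWORD",
    "REDUCE_USER_ID",
    "RUNNER_SHA",
)
OPTIONAL_VARS = ("KUBECONFIG",)


def load_settings(env: Mapping[str, str] = os.environ) -> dict[str, Optional[str]]:
    """Hypothetical helper: collect the documented variables from `env`.

    Raises RuntimeError naming every required variable that is unset or empty.
    """
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    settings: dict[str, Optional[str]] = {name: env[name] for name in REQUIRED_VARS}
    for name in OPTIONAL_VARS:
        settings[name] = env.get(name)  # None when the optional variable is unset
    return settings
```

Reporting all missing variables at once, rather than failing on the first, tends to make a misconfigured deployment quicker to fix.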
When a job is created, it will have a volume mounted at /output,
which is the correct Ceph folder for the job's output.
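The container run command later in this document mounts a source path of the form /ceph/<instrument>/RBNumbers/RB<experiment number> at /output. A minimal sketch of building that source path from the "instrument" and "experiment_number" fields of a message follows. The helper name `ceph_output_source` is hypothetical, and whether JobController changes the case of the instrument name is not stated here, so the sketch uses the value as given.

```python
from pathlib import PurePosixPath


def ceph_output_source(instrument: str, experiment_number: str) -> PurePosixPath:
    """Hypothetical helper: build /ceph/<instrument>/RBNumbers/RB<experiment number>."""
    # Assumption: the instrument name is used exactly as it appears in the message.
    return PurePosixPath("/ceph") / instrument / "RBNumbers" / f"RB{experiment_number}"


source = ceph_output_source("MARI", "1220474")
```

Here `str(source)` is `/ceph/MARI/RBNumbers/RB1220474`.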
To run:
pip install .
jobcontroller
To install when developing:
pip install .[dev]
To demo and test:
The easiest way to check that JobController is running and functioning correctly requires a Kubernetes cluster for it to interact with and a RabbitMQ instance with a queue for it to listen to:
- Follow these instructions to create the cluster
- Create the message broker, which is presently RabbitMQ
- Using the producer, send one of the messages in the example messages section below.
- The JobController should make a job, and the job will make pods that perform the work for the run
- The containers are stored in the container registry for the organisation on GitHub.
- Build container:
docker build . -f ./container/jobcontroller.D -t ghcr.io/fiaisis/jobcontroller
- Run container (replace contents of < > with relevant details):
docker run -it --rm --mount source=/ceph/<instrument>/RBNumbers/RB<experiment number>,target=/output --name jobcontroller ghcr.io/fiaisis/jobcontroller
- To push containers you will need to set up the correct access; you can follow this guide.
- Publish container:
docker push ghcr.io/fiaisis/jobcontroller -a
To run the tests:
pytest .
Example messages:
{"filepath": "/test/path/to/MARI0.nxs", "experiment_number": "0", "instrument": "MARI"}
{"filepath": "/test/path/to/MARI123456.nxs", "experiment_number": "1220474", "instrument": "MARI"}
{"run_number": 25581, "instrument": "MARI", "experiment_title": "Whitebeam - vanadium - detector tests - vacuum bad - HT on not on all LAB", "experiment_number": "1820497", "filepath": "/archive/25581/MAR25581.nxs", "run_start": "2019-03-22T10:15:44", "run_end": "2019-03-22T10:18:26", "raw_frames": 8067, "good_frames": 6452, "users": "Wood,Guidi,Benedek,Mansson,Juranyi,Nocerino,Forslund,Matsubara", "additional_values": {"ei": "auto", "sam_mass": 0.0, "sam_rmm": 0.0, "monovan": 0, "remove_bkg": true, "sum_runs": false, "runno": 25581}}
{"run_number": 28581, "instrument": "MARI", "experiment_title": "", "experiment_number": "2220746", "filepath": "/archive/NDXMARI/Instrument/data/cycle_23_1/MAR28581.nxs", "run_start": "2019-03-22T10:15:44", "run_end": "2019-03-22T10:18:26", "raw_frames": 8067, "good_frames": 6452, "users": "users", "additional_values": {"ei": "'auto'", "sam_mass": 0.0, "wbvan": 0, "sam_rmm": 0.0, "monovan": 0, "remove_bkg": false, "sum_runs": false, "runno": 28581, "mask_file_link": "https://raw.githubusercontent.com/mantidproject/scriptrepository/2d81c9cf70c2ee679472d99ee2e898f617c59f7a/direct_inelastic/MARI/mari_mask.xml"}}
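The example messages above are JSON objects. As a rough sketch of how a consumer might decode one, the snippet below parses a message body and checks for the three keys that appear in every example ("filepath", "experiment_number", "instrument"). Treating those keys as required, defaulting "additional_values" to an empty object, and the name `parse_message` are all assumptions for this illustration, not documented behaviour of JobController.

```python
import json
from typing import Any

# Assumption: the keys common to all example messages are the required ones.
REQUIRED_KEYS = ("filepath", "experiment_number", "instrument")


def parse_message(body: str) -> dict[str, Any]:
    """Hypothetical helper: decode a message body and check the assumed required keys."""
    message = json.loads(body)
    if not isinstance(message, dict):
        raise ValueError("Message must be a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in message]
    if missing:
        raise ValueError("Message is missing keys: " + ", ".join(missing))
    # The shorter example messages carry no "additional_values"; default to empty.
    message.setdefault("additional_values", {})
    return message


example = '{"filepath": "/test/path/to/MARI0.nxs", "experiment_number": "0", "instrument": "MARI"}'
parsed = parse_message(example)
```

After this runs, `parsed["instrument"]` is `"MARI"` and `parsed["additional_values"]` is an empty dictionary.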