The Fragalysis Backend

The Django server for Fragalysis, which uses the Django REST Framework (DRF) for the API and provides loaders for data.

See additional documentation relating to the backend on ReadTheDocs at https://fragalysis-backend.readthedocs.io/en/latest/index.html

Background

The Backend is part of the Stack, which consists of three services: -

  • a Postgres database
  • a neo4j graph database
  • the Fragalysis "stack"

The stack is formed from code resident in a number of repositories: this one, plus a number of other significant repositories.

The stack is deployed as container images to Kubernetes using Ansible playbooks that can be found in the Ansible repository. Additional development and deployment documentation can be found in the informaticsmatters/dls-fragalysis-stack-kubernetes repository.

Setting up development environment

This project uses the Poetry (https://python-poetry.org/) package management system. Required packages (along with other project settings) are specified in the pyproject.toml file, and all dependencies and their versions in the poetry.lock file. When the repository is first downloaded, create the local virtual environment by running: -

poetry install

New packages are added with: -

poetry add

The package name and version (exact or a range) can be given on the command line. Alternatively, the package can be added to the pyproject.toml file under the appropriate section manually.
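For example, to add a dependency with an explicit version range (the package name below is purely illustrative): -

poetry add "requests@^2.31"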

After a package has been added (or just to update packages defined with a range of allowed versions) run: -

poetry update

This resolves all dependencies (and their dependencies), writes the poetry.lock file and installs/updates new packages in the local venv. It's equivalent to running poetry lock && poetry install, so if you're not interested in the local environment and just want to update the lockfile, you can run just poetry lock.

Building and running (local)

The backend is a Docker container image and can be built and deployed locally using docker-compose: -

docker-compose build

To run the application (which will include deployment of the postgres and neo4j databases) run: -

docker-compose up -d

The postgres database is persisted in the data directory, outside of the repo.

You may need to provide a number of environment variables that are employed in the container image. Fragalysis configuration depends on a large number of variables, and the defaults may not be suitable for your needs.

The typical pattern with docker-compose is to provide these variables in the docker-compose.yml file and adjust their values (especially the sensitive ones) using a local .env file (see environment variables).
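As a minimal sketch (the variable names are taken from elsewhere in this document and the values are illustrative, not recommendations), a local .env file next to docker-compose.yml might look like this: -

# .env - local overrides picked up by docker-compose
DEPLOYMENT_MODE=DEVELOPMENT
CELERY_TASK_ALWAYS_EAGER=True
FRAGALYSIS_BACKEND_SENTRY_DNS=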

The backend API, for example, should be available on port 8080 of your host at http://localhost:8080/api/.

You can visit the /accounts/login endpoint to log in (assuming you have set up the appropriate environment variables for the container). This generates errors relating to the fact that the FE/Webpack can't be found. This looks alarming, but you are logged in.

The backend no longer writes .pyc files (the Dockerfile sets the environment variable PYTHONDONTWRITEBYTECODE). This, and the fact the backend code is mapped into the container, allows you to make "live" changes to the code on your host and see them reflected in the container app without having to rebuild or restart the backend container.

When you want to spin-down the deployment run: -

docker-compose down

When running locally (via docker-compose) Celery tasks are set to run synchronously, like a function call rather than as asynchronous tasks. This is controlled by the CELERY_TASK_ALWAYS_EAGER environment variable that you'll find in the docker-compose.yml file. If asynchronous Celery tasks are needed in local development, they can be launched with the additional compose file:

docker compose -f docker-compose.yml -f docker-compose.celery.yml up

There is also a convenient bash script that can be used to build and push an image to a repository; you just need to provide the Docker image namespace and a tag. With poetry and docker installed, you can run the script: -

export BE_IMAGE_TAG=1187.1
export BE_NAMESPACE=alanbchristie
./build-and-push.sh

Command-line access to the API

With the backend running you should be able to access the REST API. From the command-line you can use curl or httpie. Here, we use http to GET a response from the API root (which does not require authentication)...

http :8080/api/

The response should contain a list of endpoint names and URLs, something like this...

{
    "action-type": "http://localhost:8080/api/action-type/",
    "cmpdchoice": "http://localhost:8080/api/cmpdchoice/",
    "cmpdimg": "http://localhost:8080/api/cmpdimg/",
    [...]
    "vector3ds": "http://localhost:8080/api/vector3ds/",
    "vectors": "http://localhost:8080/api/vectors/",
    "viewscene": "http://localhost:8080/api/viewscene/"
}
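If you prefer curl, an equivalent unauthenticated request is simply: -

curl http://localhost:8080/api/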

To use much of the remainder of the API you will need to authenticate. Some endpoints allow you to use a token, obtained from the corresponding Keycloak authentication service. If you are running a local backend, a client ID exists that should work for you, assuming you have a Keycloak user identity. With a few variables: -

TOKEN_URL=keycloak.example.com/auth/realms/xchem/protocol/openid-connect/token
CLIENT_ID=fragalysis-local
CLIENT_SECRET=00000000-0000-0000-0000-000000000000
USER=someone
PASSWORD=password123

...you should be able to obtain an API token. Here we're using http and jq: -

TOKEN=$(http --form POST https://$TOKEN_URL/ \
    grant_type=password \
    client_id=$CLIENT_ID \
    client_secret=$CLIENT_SECRET \
    username=$USER \
    password=$PASSWORD | jq -r '.access_token')

The token should last for at least 15 minutes, depending on the Keycloak configuration. With the Token you should then be able to make authenticated requests to the API on your local backend.

Here's an illustration of how to use the API from the command-line by getting, adding, and deleting a CompoundIdentifierType: -

ENDPOINT=api/compound-identifier-types

http :8080/$ENDPOINT/ "Authorization:Bearer $TOKEN"
RID=$(http post :8080/$ENDPOINT/ "Authorization:Bearer $TOKEN" name="XT345632" | jq -r '.id')
http delete :8080/$ENDPOINT/$RID/ "Authorization:Bearer $TOKEN"

Logging

The backend writes log information in the container to /code/logs/backend.log. This is typically persisted between container restarts on Kubernetes with a separate volume mounted at /code/logs.

For local development using the docker-compose.yml file you'll find the logs at ./data/logs/backend.log.
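To follow the log while the stack is running you can, for example, tail the file from the host or from within the backend container: -

tail -f ./data/logs/backend.log
docker-compose exec backend tail -f /code/logs/backend.log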

Configuration (environment variables)

The backend configuration is controlled by a number of environment variables. Variables are typically defined in the project's fragalysis/settings.py, where you will also find ALL the dynamically configured variables (those that can be changed using environment variables in the deployed Pod/Container).

  • Not all variables are dynamic. For example ALLOWED_HOSTS is a static variable that is set in the settings.py file and is not intended to be changed at run-time.

Refer to the documentation in the settings.py file to understand the environment and the style guide for new variables that you need to add.
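If you want to check the value a running container has actually been given for one of the dynamic variables, you can print it from within the backend container (DEPLOYMENT_MODE is used here purely as an illustration): -

docker-compose exec backend printenv DEPLOYMENT_MODE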

Database migrations

The best approach is to spin-up the development backend (locally) using docker-compose with the custom migration compose file and then shell into Django. For example, to make new migrations called "add_job_request_start_and_finish_times" for the viewer's model run the following: -

Before starting postgres, remove any pre-existing local database (if one exists) with rm -rf ./data/postgresl

docker-compose -f docker-compose-migrate.yml up -d

Then enter the backend container with: -

docker-compose -f docker-compose-migrate.yml exec backend bash

Then, from within the backend container, make the migrations (in this case for the viewer)...

python manage.py makemigrations viewer --name "add_job_request_start_and_finish_times"

Exit the container and tear-down the deployment: -

docker-compose -f docker-compose-migrate.yml down

The migrations will be written to your clone's filesystem as the project directory is mapped into the container as a volume at /code. You just need to commit the migrations that have been written to the corresponding migrations directory.
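As a quick check before committing, you can list the newly written migration files from the host (assuming the viewer app, as in the example above): -

git status viewer/migrations/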

Sentry error logging

Sentry can be used to log errors in the backend container image.

In settings.py, this is controlled by setting the value of FRAGALYSIS_BACKEND_SENTRY_DNS, which is also exposed in the developer docker-compose file. To enable it, you need to set it to a valid Sentry DNS value.
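For local development you might, for example, set the value in your .env file; the value below is just a placeholder showing the general shape of a Sentry DNS/DSN value, not a real one: -

FRAGALYSIS_BACKEND_SENTRY_DNS=https://<key>@<organisation>.ingest.sentry.io/<project>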

Deployment mode

The stack can be deployed in one of two modes: DEVELOPMENT or PRODUCTION. The mode is controlled by the DEPLOYMENT_MODE environment variable and is used by the backend to tailor the behaviour of the application.

In PRODUCTION mode the API is typically a little more strict than in DEVELOPMENT mode.
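For example, to run a local stack with the stricter production-like behaviour you could set the following in your .env or docker-compose.yml file (illustrative only): -

DEPLOYMENT_MODE=PRODUCTION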

Forced errors ("infections")

In order to allow error paths of various elements of the stack to be tested, the developer can inject specific errors ("infections"). This is achieved by setting the environment variable INFECTIONS in the docker-compose.yml file or, for Kubernetes deployments, using the Ansible variable stack_infections.

Known errors are documented in the api/infections.py module. To induce an error (at the appropriate point in the stack) provide the infection name as the value of the INFECTIONS environment variable. You can provide more than one name by separating them with a comma.
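For example, to induce two (hypothetical) errors you would set something like the following in docker-compose.yml, using names taken from api/infections.py: -

INFECTIONS=some-infection-name,another-infection-name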

Infections are ignored in PRODUCTION mode.

Compiling the documentation

Because the documentation uses Sphinx and its autodoc module, compiling the documentation needs all the application requirements. As this is often impractical on the command-line, the most efficient way to build the documentation is from within the backend container: -

docker-compose up -d
docker-compose exec backend bash

pip install sphinx==5.3.0
pip install importlib-metadata~=4.0

cd docs
sphinx-build -b html source/ build/

The current version of Python used in the Django container image is 3.7, and this suffers from an import error relating to celery. It is fixed by using a pre-v5.0 version of importlib-metadata, as illustrated in the above example (see https://stackoverflow.com/questions/73933432/).

The code directory is mounted in the container so the compiled documentation can then be committed from the host machine.

Pre-commit

The project uses pre-commit to enforce linting of files prior to committing them to the upstream repository.

As fragalysis is a complex code-base (that's been maintained by a number of key developers) we currently limit the linting to the viewer application (see the .pre-commit-config.yaml file for details). In the future we might extend this to the entire code-base.

To get started, review the pre-commit utility and then set up your local clone by following the Installation and Quick Start sections of the pre-commit documentation.

Ideally from a Python environment...

poetry shell
poetry install --only dev

pre-commit install -t commit-msg -t pre-commit

Now the project's rules will run on every commit and you can check the state of the repository as it stands with...

pre-commit run --all-files

Design documents

As the application has evolved several design documents have been written detailing improvements. These may be useful for background reading on why decisions have been made.

The documents are stored in the /design_docs folder in the repo. These include, but are not limited to: -