platform-monitoring

Local Development

Install minikube (https://github.com/kubernetes/minikube#installation).

Authenticate the local Docker client with `gcloud auth configure-docker` (this step is also part of `make gke_login`).

Launch minikube:
```
./minikube.sh start
```
Make sure the kubectl tool uses the minikube k8s cluster:
```
minikube status
kubectl config use-context minikube
```
Load images into minikube's virtual machine:
```
./minikube.sh load-images
```
Apply minikube configuration and some k8s fixture services:
```
./minikube.sh apply
```
Create a new virtual environment with Python 3.11:
```
python -m venv venv
source venv/bin/activate
```
Install testing dependencies:
```
make setup
```
Run the unit test suite:
```
make test_unit
```
Run the integration test suite:
```
make test_integration
```
Clean up and shut down minikube:
```
./minikube.sh clean
./minikube.sh stop
```

Logs collection

Job container logs are tracked by Fluent Bit and sent to S3-compatible storage using the Amazon S3 plugin. For each running job, files are pushed to S3 every minute. S3 keys have the format `kube.var.log.containers.{pod_name}_{namespace_name}_{container_name}-{container_id}/{date}_{index}.gz`.
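As an illustration of the key layout above, here is a minimal sketch that splits a raw log key into its parts. The helper name `parse_raw_log_key`, the `RawLogKey` type, and the assumption that `{index}` is a decimal number are illustrative only and are not taken from the service's code.

```python
import re
from typing import NamedTuple


class RawLogKey(NamedTuple):
    pod_name: str
    namespace_name: str
    container_name: str
    container_id: str
    date: str
    index: int


# Pod and namespace names cannot contain underscores in Kubernetes,
# so "_" is a safe separator for the first two fields.
_RAW_KEY_RE = re.compile(
    r"^kube\.var\.log\.containers\."
    r"(?P<pod_name>[^_/]+)_"
    r"(?P<namespace_name>[^_/]+)_"
    r"(?P<container_name>[^/]+)-"
    r"(?P<container_id>[^-/]+)/"
    r"(?P<date>[^/]+)_"
    r"(?P<index>\d+)\.gz$"  # assumption: index is a decimal number
)


def parse_raw_log_key(key: str) -> RawLogKey:
    """Split a raw Fluent Bit log key into its components (hypothetical helper)."""
    match = _RAW_KEY_RE.match(key)
    if match is None:
        raise ValueError(f"not a raw log key: {key!r}")
    return RawLogKey(
        pod_name=match["pod_name"],
        namespace_name=match["namespace_name"],
        container_name=match["container_name"],
        container_id=match["container_id"],
        date=match["date"],
        index=int(match["index"]),
    )
```

The container name is matched greedily, so a name that itself contains hyphens is still separated from the container ID at the last hyphen before the slash.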

Logs compaction

Logs are compacted periodically (every hour by default). Compaction consists of two phases: merging and cleaning. In the merging phase, a certain number of log files are merged into one file, which is pushed to S3. The files that have already been merged are cleaned up only before the next merging phase, so that log files are not deleted while a user is reading them.

The S3 log reader can read both merged and raw log files and combine them into a single job log output.

Merging

Batches of raw Fluent Bit log files are merged and pushed to S3 under the `data` folder. S3 keys have the format `data/{pod_name}/{container_id}/{date}_{index}.gz`. In addition, a separate metadata file is maintained for each job under the key `metadata/{pod_name}.json`. It contains the list of all merged files and some additional statistics.
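A minimal sketch of how the two key formats and the metadata file might fit together is shown below. The function names and the `merged_files` field are assumptions made for illustration. The real metadata schema, including its statistics, is not described here and is left out.

```python
import json
from typing import Optional


def merged_log_key(pod_name: str, container_id: str, date: str, index: int) -> str:
    """Build the key of a merged log file."""
    return f"data/{pod_name}/{container_id}/{date}_{index}.gz"


def metadata_key(pod_name: str) -> str:
    """Build the key of the per-job metadata file."""
    return f"metadata/{pod_name}.json"


def add_merged_file(metadata_json: Optional[str], merged_key: str) -> str:
    """Record a newly merged file in the job metadata document.

    The "merged_files" field name is hypothetical.
    """
    metadata = json.loads(metadata_json) if metadata_json else {"merged_files": []}
    metadata["merged_files"].append(merged_key)
    return json.dumps(metadata)
```

Keeping the list of merged files in one small JSON object lets a reader discover them with a single request instead of listing the whole `data` prefix.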

Cleaning

Every time merging is done for a job, its name is pushed to the cleanup queue. This queue is simply a folder in the S3 bucket; it is needed to perform one last cleanup once the job finishes or stops producing logs. After that cleanup, the job is removed from the queue.
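The idea of a queue that is just a folder of keys can be sketched as follows. A `dict` again stands in for the bucket, and the `cleanup/` prefix and the function names are hypothetical, since the actual folder name is not given here.

```python
CLEANUP_PREFIX = "cleanup/"  # hypothetical folder name


def enqueue(bucket: dict[str, bytes], pod_name: str) -> None:
    """Mark a job as needing cleanup by writing an empty marker object."""
    bucket[CLEANUP_PREFIX + pod_name] = b""


def queued_jobs(bucket: dict[str, bytes]) -> list[str]:
    """List the jobs currently waiting for cleanup."""
    return sorted(
        key[len(CLEANUP_PREFIX):]
        for key in bucket
        if key.startswith(CLEANUP_PREFIX)
    )


def dequeue(bucket: dict[str, bytes], pod_name: str) -> None:
    """Remove a job from the queue once its cleanup is done."""
    bucket.pop(CLEANUP_PREFIX + pod_name, None)
```

Because the marker key depends only on the job name, pushing the same job after every merge is idempotent: the queue holds at most one entry per job.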

