rptaylor/kapel

KAPEL

KAPEL is container-native Kubernetes accounting, for APEL and Gratia.

  • Lightweight and stateless (data storage is handled by Prometheus).
  • Supports two publishing modes:
    • "auto" mode to publish the current month and previous month.
    • "gap" mode to (re)publish an arbitrary fixed time period.
  • Supports two data source modes:
    • Normally, pod data is retrieved from Prometheus.
    • For manual corrections, you can supply the accounting data to be published yourself.
  • Supports two data outlets:
    • "ssmsend" mode to publish records to APEL via SSM.
    • "gratia" mode to publish records to GRACC via Gratia.

Requirements

  • For ssmsend mode: X509 certificate and key to publish APEL records
    • Note: ssmsend only uses the certificate for content signing, not TLS, so the DN of the certificate does not need to match any host name. It only needs to match the "Host DN" field in GOCDB for the gLite-APEL service.
  • kube-state-metrics and Prometheus (installing both via bitnami/kube-prometheus is recommended)
    • You may wish to disable collection of some resources in .Values.kube-state-metrics.kubeResources to reduce the volume of unnecessary data. Only collection of the pods resource is required.
    • Configure the Prometheus deployment to use persistent storage so that the collected metrics are retained for long enough (e.g. at least a couple of months).
    • kube-state-metrics should be configured with honorLabels: true
      • This is more intuitive, and set by default in the bitnami/kube-prometheus chart (see #7690), but differs from the behaviour of the upstream kube-state-metrics community chart.
    • For large production deployments (examples are based on a cluster with about 125 nodes and 7000 cores):
      • Increase .Values.prometheus.querySpec.timeout (e.g. ~ 1800s) to allow long queries to succeed.
      • Apply sufficient CPU and memory resource requests and limits.
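
The Prometheus and kube-state-metrics settings listed above could be combined into a single values override for the bitnami/kube-prometheus chart. The following is a minimal sketch, not a tested configuration: only kubeResources, honorLabels and querySpec.timeout come from this README. The placement of honorLabels under serviceMonitor, the persistence, retention and resources keys, and all sizes and durations are assumptions that should be checked against the values.yaml of the chart version you install.

```yaml
# Hypothetical values override for bitnami/kube-prometheus (sketch only).
kube-state-metrics:
  kubeResources:
    pods: true          # required by KAPEL
    configmaps: false   # example of a resource you might disable
    secrets: false      # example of a resource you might disable
  serviceMonitor:
    honorLabels: true   # assumed location of the honorLabels setting

prometheus:
  querySpec:
    timeout: 1800s      # allow long accounting queries to succeed
  persistence:
    enabled: true       # assumed key: keep metrics on a persistent volume
    size: 100Gi         # illustrative size, adjust to your cluster
  retention: 90d        # assumed key: keep at least a couple of months
  resources:            # illustrative requests and limits
    requests:
      cpu: "2"
      memory: 8Gi
    limits:
      cpu: "4"
      memory: 16Gi
```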

To be accounted, pods must specify CPU resource requests and must remain registered on the cluster in the Completed state for a period of time after they finish. All pods in the specified namespace are accounted. To account for different projects in multiple namespaces, install and configure a separate KAPEL chart for each namespace.
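
As an illustration of a workload that meets these conditions, the sketch below assumes the pods are created by a Kubernetes Job. The names, namespace, image and TTL value are hypothetical. The CPU request makes the pod accountable, and ttlSecondsAfterFinished is one way to keep the finished pod registered on the cluster for a while before it is cleaned up.

```yaml
# Hypothetical Job whose pod can be accounted (sketch only).
apiVersion: batch/v1
kind: Job
metadata:
  name: example-job            # hypothetical name
  namespace: example-project   # the namespace that KAPEL is configured for
spec:
  # Keep the finished Job and its pod for one day before automatic cleanup.
  ttlSecondsAfterFinished: 86400
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: main
          image: busybox:1.36
          command: ["sh", "-c", "echo done"]
          resources:
            requests:
              cpu: "1"         # a CPU request is required for accounting
```

How long finished pods need to remain visible depends on how often Prometheus scrapes kube-state-metrics, so the one-day TTL here is only a placeholder.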

Configuration

Helm chart installation

The KAPEL Helm chart is available from this Helm repository.

See the Helm Chart README for additional information.