Status of testing of Apache Airflow Helm Chart 1.7.0rc1 #26971

Closed
15 of 25 tasks
jedcunningham opened this issue Oct 10, 2022 · 17 comments
Labels
kind:meta High-level information important to the community testing status Status of testing releases

Comments

@jedcunningham
Member

jedcunningham commented Oct 10, 2022

We have a kind request for all the contributors to the latest Apache Airflow Helm Chart 1.7.0rc1.

Could you please help us to test the RC versions of Airflow?

Please let us know in a comment whether the issue is addressed in the latest RC.

Thanks to all who contributed to the release (probably not a complete list!):
@vivek-zeta @MatthieuBlais @joshuaghezzi @ephraimbuddy @mabrikan @ihorlukianov @danielhoherd @Swalloow @BobDu @moshederri @dan-vaughan @V0lantis @potiuk @dstandish @rishkarajgi @SuperQ @csp98 @Aakcht @EliMor @raphaelauv @gmsantos @jedcunningham

@jedcunningham jedcunningham added the kind:meta High-level information important to the community label Oct 10, 2022
@danielhoherd
Contributor

Verified that executor=CeleryExecutor shows up in the airflow deployment by default:

$ helm list -n aftest
NAME   	NAMESPACE	REVISION	UPDATED                             	STATUS  	CHART        	APP VERSION
airflow	aftest   	1       	2022-10-10 11:51:10.148779 -0400 EDT	deployed	airflow-1.7.0	2.4.1
$ k -n aftest get deployment airflow-scheduler --show-labels
NAME                READY   UP-TO-DATE   AVAILABLE   AGE    LABELS
airflow-scheduler   1/1     1            1           4m2s   app.kubernetes.io/managed-by=Helm,chart=airflow-1.7.0,component=scheduler,executor=CeleryExecutor,heritage=Helm,release=airflow,tier=airflow

Also verified with KubernetesExecutor:

$ helm upgrade -n aftest airflow . --set executor=KubernetesExecutor
...lots of output...
$ k -n aftest get deployment airflow-scheduler --show-labels
NAME                READY   UP-TO-DATE   AVAILABLE   AGE     LABELS
airflow-scheduler   1/1     1            1           7m46s   app.kubernetes.io/managed-by=Helm,chart=airflow-1.7.0,component=scheduler,executor=KubernetesExecutor,heritage=Helm,release=airflow,tier=airflow
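
The same label check can be scripted. This is a minimal sketch, assuming the LABELS column from `--show-labels` is available as a comma-separated `key=value` string; `parse_labels` is a hypothetical helper, not part of kubectl or the chart.

```python
def parse_labels(labels_column: str) -> dict[str, str]:
    """Split a kubectl --show-labels LABELS column into a dict."""
    labels = {}
    for pair in labels_column.split(","):
        # partition keeps everything after the first "=" as the value
        key, _, value = pair.partition("=")
        labels[key.strip()] = value.strip()
    return labels


# LABELS value copied from the CeleryExecutor scheduler deployment output.
sample = (
    "app.kubernetes.io/managed-by=Helm,chart=airflow-1.7.0,"
    "component=scheduler,executor=CeleryExecutor,heritage=Helm,"
    "release=airflow,tier=airflow"
)
labels = parse_labels(sample)
print(labels["executor"])
```

Comparing `labels["executor"]` against the value passed to `--set executor=...` would automate the manual check.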

@jedcunningham
Member Author

I've verified the following:
#26485 - Airflow 2.4.1 by default
#25561 - Celery worker liveness probe
#24395 - Postgres subchart is vendored
#23876 - Executor docs change

@joshuaghezzi
Contributor

Verified #25732 - StatsD podAnnotations

(two screenshots)

@BobDu
Contributor

BobDu commented Oct 11, 2022

Verified #26415 - Add default flower_url_prefix in helm chart values

ingress:
  web:
    enabled: true
    hosts:
    - name: "airflow2-dev.example.com"
    ingressClassName: "nginx"
  flower:
    enabled: true
    path: "/flower"
    hosts:
      - name: "airflow2-dev.example.com"
    ingressClassName: "nginx"

with config.celery.flower_url_prefix left unset (no custom value):

# bobdudu @ BobDu in ~ 
$ kubectl -n airflow2 get cm airflow-airflow-config -o yaml              
apiVersion: v1
data:
  airflow.cfg: |-
    [celery]
    flower_url_prefix = /flower
    worker_concurrency = 16

    [celery_kubernetes_executor]
    kubernetes_queue = kubernetes

    [core]
    colored_console_log = False
    dags_folder = /opt/airflow/dags
    executor = CeleryExecutor
    load_examples = False
    remote_logging = False

    [elasticsearch]
    json_format = True
    log_id_template = {dag_id}_{task_id}_{execution_date}_{try_number}

    [elasticsearch_configs]
    max_retries = 3
    retry_timeout = True
    timeout = 30

    [kerberos]
    ccache = /var/kerberos-ccache/cache
    keytab = /etc/airflow.keytab
    principal = airflow@FOO.COM
    reinit_frequency = 3600

    [kubernetes]
    airflow_configmap = airflow-airflow-config
    airflow_local_settings_configmap = airflow-airflow-config
    multi_namespace_mode = False
    namespace = airflow2
    pod_template_file = /opt/airflow/pod_templates/pod_template_file.yaml
    worker_container_repository = 385382614844.dkr.ecr.ap-east-1.amazonaws.com/airflow2
    worker_container_tag = v0.3

    [logging]
    colored_console_log = False
    remote_logging = False

    [metrics]
    statsd_host = airflow-statsd
    statsd_on = True
    statsd_port = 9125
    statsd_prefix = airflow

    [scheduler]
    run_duration = 41460
    standalone_dag_processor = False
    statsd_host = airflow-statsd
    statsd_on = True
    statsd_port = 9125
    statsd_prefix = airflow

    [webserver]
    enable_proxy_fix = True
    rbac = True
  airflow_local_settings.py: ""
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: airflow
    meta.helm.sh/release-namespace: airflow2
  creationTimestamp: "2022-09-13T07:43:46Z"
  labels:
    app.kubernetes.io/managed-by: Helm
    chart: airflow-1.7.0
    component: config
    heritage: Helm
    release: airflow
    tier: airflow
  name: airflow-airflow-config
  namespace: airflow2
  resourceVersion: "757685511"
  uid: 86c10cea-11c4-40d5-8ec9-4a7f8a822464

No problem.
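
A check like this could be automated by parsing the rendered `airflow.cfg`. The following is a minimal sketch using Python's standard `configparser`, with an excerpt of the ConfigMap content pasted in as a string; `flower_prefix` is a hypothetical helper name, and fetching the ConfigMap itself is left out.

```python
import configparser

# Excerpt of the rendered airflow.cfg from the ConfigMap output.
RENDERED_CFG = """\
[celery]
flower_url_prefix = /flower
worker_concurrency = 16

[core]
executor = CeleryExecutor
"""


def flower_prefix(cfg_text: str) -> str:
    """Return [celery] flower_url_prefix, or "" when it is not set."""
    # interpolation=None so values containing "%" are read literally
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    return parser.get("celery", "flower_url_prefix", fallback="")


print(flower_prefix(RENDERED_CFG))
```

The returned value can then be compared with `ingress.flower.path` from the values file.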

@Aakcht
Contributor

Aakcht commented Oct 11, 2022

Tested #25283 - works as expected

@BobDu
Contributor

BobDu commented Oct 11, 2022

Verified #26598 overrideMappings

statsd:
  overrideMappings:
    # === Counters ===
    - match: "(.+)\\.(.+)_start$"
      match_metric_type: counter
      name: "airflow_job_start"
      match_type: regex
      labels:
        airflow_id: "$1"
        job_name: "$2"
    - match: "(.+)\\.(.+)_end$"
      match_metric_type: counter
      name: "airflow_job_end"
      match_type: regex
      labels:
        airflow_id: "$1"
        job_name: "$2"

# bobdu @ BobDu in ~ [17:35:34]
$ kubectl -n airflow2 get cm airflow-statsd -o yaml
apiVersion: v1
data:
  mappings.yml: |-
    mappings:
      - labels:
          airflow_id: $1
          job_name: $2
        match: (.+)\.(.+)_start$
        match_metric_type: counter
        match_type: regex
        name: airflow_job_start
      - labels:
          airflow_id: $1
          job_name: $2
        match: (.+)\.(.+)_end$
        match_metric_type: counter
        match_type: regex
        name: airflow_job_end
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: airflow
    meta.helm.sh/release-namespace: airflow2
  creationTimestamp: "2022-10-11T09:09:41Z"
  labels:
    app.kubernetes.io/managed-by: Helm
    chart: airflow-1.7.0
    component: config
    heritage: Helm
    release: airflow
    tier: airflow
  name: airflow-statsd
  namespace: airflow2
  resourceVersion: "757709465"
  uid: 8438f60b-b7d0-46be-99dc-8415fcefb800

# bobdu @ BobDu in ~
$ kubectl get --raw '/api/v1/namespaces/airflow2/services/airflow-statsd:9102/proxy/metrics'
# HELP airflow_localtaskjob_end Metric autogenerated by statsd_exporter.
# TYPE airflow_localtaskjob_end counter
airflow_localtaskjob_end 65
# HELP airflow_localtaskjob_start Metric autogenerated by statsd_exporter.
# TYPE airflow_localtaskjob_start counter
airflow_localtaskjob_start 65
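
To illustrate how the two regex mappings capture their labels, here is a rough sketch that approximates them with Python's `re` module. It is not how statsd_exporter is implemented, and its matching may differ in details. The input name `airflow.schedulerjob_start` is an assumed example, and `apply_mappings` is a hypothetical helper.

```python
import re

# Patterns from the rendered mappings.yml, written as Python regexes.
MAPPINGS = [
    (re.compile(r"(.+)\.(.+)_start$"), "airflow_job_start"),
    (re.compile(r"(.+)\.(.+)_end$"), "airflow_job_end"),
]


def apply_mappings(statsd_name: str):
    """Return (metric_name, labels) for the first matching mapping, else None."""
    for pattern, name in MAPPINGS:
        match = pattern.match(statsd_name)
        if match:
            # $1 and $2 in the mapping correspond to the two capture groups
            labels = {"airflow_id": match.group(1), "job_name": match.group(2)}
            return name, labels
    return None


# Assumed example input; real metric names depend on the Airflow deployment.
print(apply_mappings("airflow.schedulerjob_start"))
```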

@BobDu
Contributor

BobDu commented Oct 11, 2022

A note about #24496:
This is not a bug, but I think it may be worth highlighting this change in the changelog.
If you use a custom image, you must make sure .Values.airflowVersion is set correctly.
With default values, Helm chart 1.7.0 does not add the env var AIRFLOW__CELERY__RESULT_BACKEND, but if you run Airflow <= v2.3, the worker fails to start:

airflow@airflow-worker-1:/opt/airflow$ airflow celery worker
psycopg2.OperationalError: could not translate host name "postgres" to address: Name or service not known

This error message is not very user-friendly.

@MatthieuBlais
Contributor

Tested #26838, works as expected

@jedcunningham
Member Author

@BobDu, it is assumed that folks keep airflowVersion up to date with the version in their image. This has long been a thing. This is just the latest failure of this type; another, off the top of my head, is the scheduler liveness probes between 2.0 and 2.1.

It would be nice to detect when folks forget, though. Not sure how easy that'd be to do.
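
One rough way such a check might look, purely as a sketch: compare the configured airflowVersion with the version actually in the image. This assumes both are plain `X.Y.Z` strings and that the image version is supplied by the caller (for example, read from `airflow version` inside the image). `parse_version` and `version_mismatch` are hypothetical helpers, not something the chart does.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn '2.4.1' into (2, 4, 1); non-numeric suffixes are not handled."""
    return tuple(int(part) for part in text.split("."))


def version_mismatch(airflow_version: str, image_version: str) -> bool:
    """True when the major.minor parts of the two versions differ."""
    return parse_version(airflow_version)[:2] != parse_version(image_version)[:2]


# Chart 1.7.0 defaults to Airflow 2.4.1; 2.3.4 is a hypothetical custom image.
print(version_mismatch("2.4.1", "2.3.4"))
```

Comparing only major.minor is a design choice for this sketch, on the assumption that chart behavior changes track minor releases rather than patch releases.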

@mabrikan
Contributor

Tested #26423 in the RC. Working as expected.

Default value for imagePullPolicy in pod_template.yaml

$ helm list
NAME   	NAMESPACE	REVISION	UPDATED                                	STATUS  	CHART        	APP VERSION
airflow	airflow  	1       	2022-10-11 20:34:39.015555852 +0300 +03	deployed	airflow-1.7.0	2.4.1      
$ kubectl get cm airflow-airflow-config -oyaml | yq e '.data."pod_template_file.yaml"' - | yq e '.spec.containers[0].imagePullPolicy'
IfNotPresent

Changing it to Always

$ helm upgrade airflow --reuse-values --set=images.pod_template.pullPolicy=Always .
$ kubectl get cm airflow-airflow-config -oyaml | yq e '.data."pod_template_file.yaml"' - | yq e '.spec.containers[0].imagePullPolicy'
Always

@gmsantos
Contributor

gmsantos commented Oct 12, 2022

all good for #24647

Partial values file:

workers:
  resources:
    requests:
      cpu: 300m
      memory: 128Mi
    limits:
      cpu: 700m
      memory: 512Mi

Resulting pod template file:

> k exec airflow-scheduler-c95484f44-hbr2g -it -- cat /usr/local/airflow/pod_templates/pod_template_file.yaml
...
      resources:
        limits:
          cpu: 700m
          memory: 512Mi
        requests:
          cpu: 300m
          memory: 128Mi

@Aakcht
Contributor

Aakcht commented Oct 12, 2022

Also tested #24496 - looks good.

@gmsantos
Contributor

Looks good for #23711 too

On values file:

dagProcessor:
  enabled: true

> kgp -l component=dag-processor
NAME                                     READY   STATUS    RESTARTS   AGE
airflow-dag-processor-8655f894b4-fhgw6   1/1     Running   0          2m19s

@jedcunningham
Member Author

"all good for #24647"

(@gmsantos that PR was about worker annotations, not resources)

@gmsantos
Contributor

gmsantos commented Oct 12, 2022

Oops, sorry, my bad. For #24647:

  workers:
    podAnnotations:
      cluster-autoscaler.kubernetes.io/safe-to-evict: "false"

pod template file:

> k exec airflow-cookbook-scheduler-6df58df678-b8mqz -it -- cat /usr/local/airflow/pod_templates/pod_template_file.yaml 
Defaulted container "scheduler" out of: scheduler, scheduler-log-groomer, wait-for-airflow-migrations (init)

...
---
apiVersion: v1
kind: Pod
metadata:
  name: dummy-name
...
  annotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"

@csp33
Contributor

csp33 commented Oct 13, 2022

#25059 working as expected.
Global setting: (screenshot)

Specific setting: (screenshot)

Global + specific setting: (screenshot)

@jedcunningham
Member Author

The helm chart is being released! Thanks everyone for testing the RC 🍺

@potiuk potiuk added the testing status Status of testing releases label Dec 28, 2023