GrafanaAgent API does not allow to limit resources of configReloader pods #5903

Open
repjak opened this issue Dec 1, 2023 · 5 comments
Labels: bug (Something isn't working), operator (Grafana Agent Operator related), variant/operator (Related to Grafana Agent Static Operator)

Comments

repjak commented Dec 1, 2023

What's wrong?

When the Grafana Agent Operator is deployed by a Helm chart into a namespace with a ResourceQuota, the pods created by the operator are rejected by the quota, because no resources are set on the config-reloader container in the pod spec.

The API should:

  1. provide an option to specify resources for the config-reloader container, or
  2. apply the configured resources consistently to all containers in the pods created by the operator

Steps to reproduce

The error can be reproduced with the Loki Helm chart, which creates a GrafanaAgent instance.

  1. Deploy the grafana-agent-operator Helm chart into a namespace with a ResourceQuota
  2. Deploy Loki into the same namespace
  3. The loki-logs DaemonSet created by the Grafana Agent Operator fails the quota
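
For reference, a quota of roughly the following shape is enough to trigger the failure. This is a sketch, not the actual quota from the affected cluster: the name `monitoring.quota` and the `monitoring` namespace are taken from the error message in the Logs section, while the hard values are placeholders. Once a quota sets a hard value for `requests.cpu`, `requests.memory`, `limits.cpu` or `limits.memory`, every container in a new pod must specify that value, otherwise the pod is rejected.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: monitoring.quota   # name as reported in the FailedCreate event
  namespace: monitoring
spec:
  hard:
    # Placeholder values (assumptions); any value here makes the
    # corresponding field mandatory on every container in the namespace.
    requests.cpu: "2"
    requests.memory: 4Gi
    limits.cpu: "4"
    limits.memory: 8Gi
```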

System information

OVH Managed Kubernetes Service, kubernetes version 1.25.12-3
Helm charts:
- grafana/grafana-agent-operator, version: ^0.3.11
- grafana/loki, version: ^5.39.0
Helm version v3.13.2
Agent operator: docker.io/grafana/agent-operator:v0.37.4

Logs

$ kubectl describe daemonset loki-logs -n monitoring
Name:           loki-logs
Selector:       app.kubernetes.io/instance=loki,app.kubernetes.io/managed-by=grafana-agent-operator,app.kubernetes.io/name=grafana-agent,grafana-agent=loki,operator.agent.grafana.com/name=loki,operator.agent.grafana.com/type=logs
Node-Selector:  <none>
Labels:         app.kubernetes.io/instance=loki
                app.kubernetes.io/managed-by=grafana-agent-operator
                app.kubernetes.io/name=grafana-agent
                grafana-agent=loki
                operator.agent.grafana.com/name=loki
                operator.agent.grafana.com/type=logs
Annotations:    deprecated.daemonset.template.generation: 1
                meta.helm.sh/release-name: loki
                meta.helm.sh/release-namespace: monitoring
Desired Number of Nodes Scheduled: 4
Current Number of Nodes Scheduled: 0
Number of Nodes Scheduled with Up-to-date Pods: 0
Number of Nodes Scheduled with Available Pods: 0
Number of Nodes Misscheduled: 0
Pods Status:  0 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:           app.kubernetes.io/instance=loki
                    app.kubernetes.io/managed-by=grafana-agent-operator
                    app.kubernetes.io/name=grafana-agent
                    app.kubernetes.io/version=v0-37-4
                    grafana-agent=loki
                    operator.agent.grafana.com/name=loki
                    operator.agent.grafana.com/type=logs
  Annotations:      kubectl.kubernetes.io/default-container: grafana-agent
  Service Account:  loki-grafana-agent
  Containers:
   config-reloader:
    Image:      quay.io/prometheus-operator/prometheus-config-reloader:v0.67.1
    Port:       <none>
    Host Port:  <none>
    Args:
      --config-file=/var/lib/grafana-agent/config-in/agent.yml
      --config-envsubst-file=/var/lib/grafana-agent/config/agent.yml
      --watch-interval=1m
      --statefulset-ordinal-from-envvar=POD_NAME
      --reload-url=http://127.0.0.1:8080/-/reload
    Environment:
      POD_NAME:   (v1:metadata.name)
      HOSTNAME:   (v1:spec.nodeName)
      SHARD:     0
    Mounts:
      /var/lib/docker/containers from dockerlogs (ro)
      /var/lib/grafana-agent/config from config-out (rw)
      /var/lib/grafana-agent/config-in from config (ro)
      /var/lib/grafana-agent/data from data (rw)
      /var/lib/grafana-agent/secrets from secrets (ro)
      /var/log from varlog (ro)
   grafana-agent:
    Image:      grafana/agent:v0.37.4
    Port:       8080/TCP
    Host Port:  0/TCP
    Args:
      -config.file=/var/lib/grafana-agent/config/agent.yml
      -config.expand-env=true
      -server.http.address=0.0.0.0:8080
      -enable-features=integrations-next
    Limits:
      cpu:     50m
      memory:  64M
    Requests:
      cpu:      10m
      memory:   32M
    Readiness:  http-get http://:http-metrics/-/ready delay=0s timeout=3s period=5s #success=1 #failure=120
    Environment:
      POD_NAME:   (v1:metadata.name)
      HOSTNAME:   (v1:spec.nodeName)
      SHARD:     0
    Mounts:
      /var/lib/docker/containers from dockerlogs (ro)
      /var/lib/grafana-agent/config from config-out (rw)
      /var/lib/grafana-agent/config-in from config (ro)
      /var/lib/grafana-agent/data from data (rw)
      /var/lib/grafana-agent/secrets from secrets (ro)
      /var/log from varlog (ro)
  Volumes:
   config:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  loki-logs-config
    Optional:    false
   config-out:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
   secrets:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  loki-secrets
    Optional:    false
   varlog:
    Type:          HostPath (bare host directory volume)
    Path:          /var/log
    HostPathType:  
   dockerlogs:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/docker/containers
    HostPathType:  
   data:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/grafana-agent/data
    HostPathType:  
Events:
  Type     Reason        Age                From                  Message
  ----     ------        ----               ----                  -------
  Warning  FailedCreate  28m                daemonset-controller  Error creating: pods "loki-logs-xhtj9" is forbidden: failed quota: monitoring.quota: must specify limits.cpu for: config-reloader; limits.memory for: config-reloader; requests.cpu for: config-reloader; requests.memory for: config-reloader
repjak added the bug label Dec 1, 2023

repjak commented Dec 1, 2023

Adding output from $ kubectl -n monitoring describe grafanaagent loki:

Name:         loki
Namespace:    monitoring
Labels:       app.kubernetes.io/instance=loki
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=loki
              app.kubernetes.io/version=2.9.2
              helm.sh/chart=loki-5.39.0
Annotations:  meta.helm.sh/release-name: loki
              meta.helm.sh/release-namespace: monitoring
API Version:  monitoring.grafana.com/v1alpha1
Kind:         GrafanaAgent
Metadata:
  Creation Timestamp:  2023-11-30T20:44:42Z
  Generation:          2
  Resource Version:    3040719368
  UID:                 5bedee00-a8c7-4aba-a963-41be76adaebb
Spec:
  Disable Reporting:       false
  Disable Support Bundle:  false
  Enable Config Read API:  false
  Logs:
    Instance Selector:
      Match Labels:
        app.kubernetes.io/instance:  loki
        app.kubernetes.io/name:      loki
  Resources:
    Limits:
      Cpu:     50m
      Memory:  64M
    Requests:
      Cpu:               10m
      Memory:            32M
  Service Account Name:  loki-grafana-agent
Events:                  <none>


github-actions bot commented Jan 1, 2024

This issue has not had any activity in the past 30 days, so the needs-attention label has been added to it.
If the opened issue is a bug, check to see if a newer release fixed your issue. If it is no longer relevant, please feel free to close this issue.
The needs-attention label signals to maintainers that something has fallen through the cracks. No action is needed by you; your issue will be kept open and you do not have to respond to this comment. The label will be removed the next time this job runs if there is new activity.
Thank you for your contributions!

github-actions bot added the needs-attention label Jan 1, 2024
tpaschalis added the operator label Jan 9, 2024
tpaschalis (Member) commented

Just to clarify, this only applies to the Operator in static mode; the Agent's own Helm chart allows defining resources on the configReloader container normally.

github-actions bot removed the needs-attention label Jan 10, 2024

elcomtik commented Feb 1, 2024

This bothers me too. It would be great to be able to define resources for config-reloader as well.


repjak commented Feb 1, 2024

As a workaround, one can create a LimitRange with default and defaultRequest values for the namespace, but this requires permission to create LimitRange objects there.
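
A minimal sketch of such a LimitRange, assuming the `monitoring` namespace used in this issue. The object name and the values are illustrative only (they mirror the grafana-agent container's settings shown in the describe output), not recommendations. Defaults from a LimitRange are applied at pod admission to any container that omits them, which should let the config-reloader container satisfy the quota.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: container-defaults   # hypothetical name
  namespace: monitoring
spec:
  limits:
    - type: Container
      # Applied as limits to containers that do not set their own.
      default:
        cpu: 50m
        memory: 64M
      # Applied as requests to containers that do not set their own.
      defaultRequest:
        cpu: 10m
        memory: 32M
```

Note that these defaults apply to every container in the namespace that omits resources, not only to config-reloader.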

rfratto added the variant/operator label Apr 9, 2024