Thanos receiver holds on to high memory utilisation for more than 16 hours after stress tests #7165
-
Are you sure that the blocks have been deleted? Maybe you can profile your memory usage through pprof and upload it to pprof.me?
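For reference, a minimal sketch of how to grab a heap profile from a receive pod (assuming the default Thanos HTTP port 10902; the pod name is taken from the thread above):

```sh
# Forward the receive pod's HTTP port (10902 is the Thanos default) to localhost.
oc port-forward pod/obs-thanos-receive-2 10902:10902 &

# Capture the in-use heap profile; the file can be uploaded to pprof.me
# or opened locally in the interactive pprof web UI.
curl -s http://localhost:10902/debug/pprof/heap > heap.pb.gz
go tool pprof -http=:8080 heap.pb.gz
```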
-
Not sure if I ran pprof the right way; here are the steps.
Please find the attached profile report for in-use memory. Pod utilisation from `oc adm top pod obs-thanos-receive-2`:

```
NAME                   CPU(cores)   MEMORY(bytes)
obs-thanos-receive-2   0m           1225Mi
```

Looks like pod memory utilisation is much higher than the in-use allocation. Please help to have a look; I have a feeling this might be related to the image I am using or to the configuration.
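One way to narrow down the gap between the Go heap and the pod's RSS is to compare the runtime's own memory stats with the container figure; a sketch, again assuming the default metrics endpoint on port 10902:

```sh
# Heap actually in use vs. memory obtained from the OS but not yet returned;
# a large gap between heap_idle and heap_released means the Go runtime is
# holding freed memory that the kernel still counts against the pod.
oc port-forward pod/obs-thanos-receive-2 10902:10902 &
curl -s http://localhost:10902/metrics \
  | grep -E 'go_memstats_(heap_inuse|heap_idle|heap_released|sys)_bytes'
```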
-
Hi @fpetkovski, I saw the great PR you made for TSDB pruning, #4329. I have also applied the configuration for the documented behaviour:

> A Receiver will automatically decommission a tenant once new samples have not been seen for longer than the --tsdb.retention period configured for the Receiver. The tenant decommission process includes flushing all in-memory samples for that tenant to disk, sending all unsent blocks to S3, and removing the tenant TSDB from the filesystem. If a tenant receives new samples after being decommissioned, a new TSDB will be created for the tenant.

However, it seems that our receivers don't release any memory even several days after writes stop. Could you please help here?
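To double-check whether decommissioning actually removed the tenant TSDB, one can inspect the receive data directory; a sketch, assuming the data path is /var/thanos/receive (an assumption; adjust to whatever --tsdb.path is set to in your deployment):

```sh
# Each tenant has its own TSDB directory under the receive data path
# (/var/thanos/receive is assumed here); after decommissioning, the
# tenant's directory should disappear.
oc exec obs-thanos-receive-2 -- ls -la /var/thanos/receive
```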
-
Hi folks, we have memory issues with Thanos Receive horizontal autoscaling: the pods don't actually scale down even 16 hours after load testing. We have an HPA with min=2 and max=9. During load testing, the receive cluster scales up to 9 pods. As the image below implies, after the initial peak from load testing around 17:00, the scaled-up pods never get scaled down. Instead, ALL pods' memory usage remains flat afterwards and never drops; the initial 2 pods remain even higher, at around 890Mi, against a memory limit of 2.5Gi! This has made the whole setup unusable, since every time we run load testing we have to reinstall the whole Thanos stack to avoid pod crashes. (We set each memory limit to 2.5Gi because we'd like to compare how well Thanos autoscales, and we try to keep the same configs as our initial Prometheus configuration.)
We are using the Bitnami Helm chart for Thanos on an OpenShift cluster, and the image is thanos:0.33.0-debian-11-r1. The receive configuration

```
--tsdb.retention=6h
```

was set up to keep samples for only 6 hours; samples get uploaded to an S3 bucket (MinIO). We have the default hashring with 1 tenant only, sketched below.
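For completeness, a sketch of what our single-tenant default hashring file amounts to (the service names and ports are illustrative placeholders, not our exact values):

```sh
# Default hashring with every receive replica as an endpoint; with no
# "tenants" list, the ring matches all tenants (we only have one).
cat <<'EOF' > hashrings.json
[
  {
    "hashring": "default",
    "endpoints": [
      "obs-thanos-receive-0.obs-thanos-receive-headless:10901",
      "obs-thanos-receive-1.obs-thanos-receive-headless:10901"
    ]
  }
]
EOF
```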
The HPA setup:
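(The equivalent of our HPA as a sketch; the resource name and the CPU target are illustrative placeholders:)

```sh
# HPA with min=2 / max=9 on the receive StatefulSet; the 80% CPU
# target stands in for whatever the chart actually configures.
oc autoscale statefulset obs-thanos-receive --min=2 --max=9 --cpu-percent=80
```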
We are not sure which configs we are missing such that Thanos Receive holds on to high memory for a long time (16 hours) despite

```
--tsdb.retention=6h
```

In my understanding, incoming data should be written to disk every 2 hours, uploaded to object storage periodically, and removed from local disk after 6 hours, so I see no reason for the receiver to keep holding data in memory. Please correct me if I am wrong. Thanks in advance.
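To verify that the 2h blocks are in fact being cut, shipped, and pruned, one can compare the bucket contents with what is still on local disk; a sketch, assuming an objstore config file for the MinIO bucket and the default tenant ID default-tenant:

```sh
# Blocks the receiver has uploaded to object storage (bucket.yml is a
# placeholder for your MinIO objstore config).
thanos tools bucket ls --objstore.config-file=bucket.yml

# Blocks still held locally; the data path here is an assumption and
# should match whatever --tsdb.path points at.
oc exec obs-thanos-receive-2 -- ls /var/thanos/receive/default-tenant
```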