-
We run Prometheus in a sharded config (2 shards, 1 replica) in about 40 different k8s clusters that host a large production service. They're pretty big instances: each shard scrapes about 3 million time-series per scrape cycle and evaluates ~300 rules. They also remote-write their metrics to a global Thanos cluster we run. We've been running all of the zones on Prometheus 2.30.3 since February 2022. With 2.30.3, each Prometheus instance was consuming ~15 vCPUs. We recently updated two of our largest clusters to 2.44.0, and we were (pleasantly) shocked to find that CPU utilization had dropped to around 5-6 vCPUs per instance. There have obviously been a ton of changes between 2.30.3 and 2.44.0, but I'm curious which change or changes could have caused such a dramatic improvement in CPU utilization. I skimmed through the release notes for all the releases, but nothing jumped out, or I just missed it.
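For readers not familiar with sharded Prometheus, here is a minimal sketch of one common way to set it up (hashmod relabelling plus remote_write); the job name, shard label, modulus, and Thanos endpoint below are illustrative placeholders, not our actual config:

```yaml
global:
  external_labels:
    shard: "0"                           # placeholder shard label

scrape_configs:
  - job_name: kubernetes-pods            # placeholder job
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Spread discovered targets across 2 shards by hashing the target address.
      - source_labels: [__address__]
        modulus: 2
        target_label: __tmp_hash
        action: hashmod
      # Each shard keeps only the targets whose hash matches its shard index.
      - source_labels: [__tmp_hash]
        regex: "0"
        action: keep

remote_write:
  # Placeholder endpoint for the global Thanos cluster.
  - url: https://thanos-receive.example.com/api/v1/receive
```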
-
Quite a few, but you're right, we could outline those and be prouder of it (: It sounds like it deserves a blog post (: WDYT? @bboreham, would you like to give us some screenshots of heap (mem) and CPU graphs before & after? (:
-
All profiles are for 30 seconds.

shard1-pprof-before.gz shows 282 CPU-seconds, so 9.2 CPUs active. shard1-pprof-after.gz shows 218 CPU-seconds, 6.4 CPUs active. This is a bit less than the 15->6 you first mentioned, but still a decent drop.

In "before" we have 190s in scrapePool.Sync, plus 65s in background garbage collection. In "after" we have 118s in scrapePool.Sync, plus 50s in garbage collection. The detail confirms that #12048 and #12084 gave big improvements.

Even after this, nearly all the time is going into producing Labels to show in the 'dropped targets' view, which I have proposed to restrict. When using Kubernetes this issue can be avoided by filtering targets using namespaces and selectors in preference to drop rules.
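To make that last point concrete, here is an illustrative comparison; the job name, namespace, and label are made-up examples, not a recommendation for any particular setup. Discovering every pod and then dropping the unwanted ones with relabelling means Prometheus still builds Labels for every dropped target:

```yaml
scrape_configs:
  - job_name: my-app                     # example job
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Keeps only pods labelled app=my-app, but every other pod is still
      # discovered and relabelled first, so dropped targets still cost CPU.
      - source_labels: [__meta_kubernetes_pod_label_app]
        regex: my-app
        action: keep
```

whereas filtering server-side with namespaces and selectors means those targets are never returned by the Kubernetes API in the first place:

```yaml
scrape_configs:
  - job_name: my-app                     # example job
    kubernetes_sd_configs:
      - role: pod
        namespaces:
          names:
            - my-app-namespace           # example namespace
        selectors:
          - role: pod
            label: "app=my-app"          # server-side label selector
```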