Reduce number of core threads in HTTPServer to one #786

fstab · 2022-05-19T21:40:40Z

In a typical scenario with a Prometheus server scraping every 15 seconds, one thread should be enough in HTTPServer. Reduce the number of core threads from 5 to 1.

Signed-off-by: Fabian Stäber <fabian@fstab.de>

brian-brazil · 2022-05-20T05:25:07Z

This PR violates the OpenMetrics spec and should be reverted. Concurrent exposition must be supported, as it is very common, for example with an HA pair. This PR will break users, failing scrapes and adding artifacts to their graphs.

fstab · 2022-05-20T07:51:50Z

Thanks for keeping an eye on PRs! In this case I think there is just a misunderstanding with the weird terminology of ThreadPoolExecutor:

corePoolSize: the number of threads to keep in the pool, even if they are idle
maximumPoolSize: the maximum number of threads to allow in the pool

This PR sets the corePoolSize to 1, but leaves the maximumPoolSize at 5. It's still possible to run 5 requests in parallel, but the executor won't keep 5 threads running when they're idle.
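
The two parameters quoted above can be observed directly on a `java.util.concurrent.ThreadPoolExecutor`. The following is a minimal sketch, not the HTTPServer code: the class name `CorePoolSizeDemo`, the 120-second keep-alive, and the unbounded `LinkedBlockingQueue` are assumptions made for illustration. Per the JDK documentation, threads beyond `corePoolSize` are started only when the work queue cannot accept a task, so the queue type decides whether the pool ever grows toward `maximumPoolSize`.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CorePoolSizeDemo {

    /**
     * Submits `tasks` blocking tasks (tasks must be >= 1) and returns
     * {pool size, number of queued tasks} while the first task is still running.
     */
    static int[] observe(int tasks) {
        // Assumption: unbounded work queue and 120 s keep-alive, chosen for illustration only.
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                1, 5, 120, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        CountDownLatch firstStarted = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < tasks; i++) {
                executor.execute(() -> {
                    firstStarted.countDown();
                    awaitQuietly(release); // keep the worker thread busy
                });
            }
            awaitQuietly(firstStarted);
            return new int[] { executor.getPoolSize(), executor.getQueue().size() };
        } finally {
            release.countDown(); // let all tasks finish
            executor.shutdown();
        }
    }

    static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        int[] result = observe(5);
        // prints: poolSize=1 queued=4
        System.out.println("poolSize=" + result[0] + " queued=" + result[1]);
    }
}
```

With an unbounded queue, the four extra tasks wait in the queue instead of triggering new threads. A queue that refuses tasks (for example a `SynchronousQueue`) would make the same constructor arguments grow the pool up to `maximumPoolSize`.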

brian-brazil · 2022-05-20T09:40:43Z

Thanks for the clarification, that makes perfect sense to do.

@dave2wave

### Motivation

prometheus client 0.16.0 contains some improvements that we can benefit from. Thanks to @dave2wave and @michaeljmarshall for the reminder and for pointing this out.

> [ENHANCEMENT] Reduce the number of core threads in HTTPServer from 5 to 1. The HTTPServer will still start up to 5 threads on demand if there are parallel requests, but it will use only 1 thread as long as requests are sequential (prometheus/client_java#786).
>
> [ENHANCEMENT] Optimize metric name sanitization: Replace the regular expression with a hard-coded optimized algorithm to improve performance (prometheus/client_java#777). Thanks @fwbrasil

See https://github.com/prometheus/client_java/releases

### Modifications

Bump prometheus client version from 0.15.0 to 0.16.0

### Documentation

Check the box below or label this PR directly.

Need to update docs?

- [x] `doc-not-needed` dependency updates, no need doc

prometheus/client_java#786 reduced the core pool size of the HTTP executor to 1. This had adverse effects in our environment, leading to connectivity issues on the metrics port. This patch overrides that behaviour, reverting to 5 persistent threads in the executor pool.

ctrlaltluc · 2022-07-21T07:26:54Z

@fstab we have a setup using the Prometheus client including this change, and we started to experience some inconveniences.

Our setup is Kafka on Kubernetes, running prometheus/jmx_exporter as an agent attached to the Kafka process. Our scrape duration is about 12 seconds.
The Kafka brokers are fronted by Envoy reverse proxies, which have periodic health-checks (with 1s timeouts) calling the /-/healthy endpoint provided by the JMX exporter.

After upgrading the Prometheus client to include this fix, we started experiencing health-check network timeouts. Investigations showed us that the latency when calling /-/healthy increases from 10-60ms (with the core pool size of 5) to 10ms-15s (with the core pool size of 1). Since this change is the only one causing these timeouts, we attribute this increase to the overhead of creating a new thread, if the pool has no available one to serve /-/healthy.

Naturally, we thought about increasing the timeout, which is one of the options besides rolling back the Prometheus Java client version to 0.16.
I just wanted to raise this here, because an increase in tail latency from under 100ms to 15s seems pretty big.

fstab · 2022-07-22T13:58:18Z

@ctrlaltluc thanks a lot for pointing this out. This was a bug. Apparently the description of corePoolSize and maximumPoolSize is misleading, as explained in this Blog post.

I pushed a fix, switching to a cached thread pool executor.
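
The effect of a cached thread pool can be checked with a small experiment. This is a sketch under assumptions, not the code of the fix: `CachedPoolDemo` and `allRunInParallel` are hypothetical names, and `Executors.newCachedThreadPool()` stands in for whatever executor the fix actually configures. A cached pool creates new threads as needed and reuses idle ones, so five blocking tasks should all start at once; a single-thread executor is included for contrast.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class CachedPoolDemo {

    /** Returns true if `tasks` blocking tasks all start running within 2 seconds. */
    static boolean allRunInParallel(ExecutorService executor, int tasks) {
        CountDownLatch started = new CountDownLatch(tasks);
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < tasks; i++) {
                executor.execute(() -> {
                    started.countDown();
                    try {
                        release.await(); // hold the thread until the check is done
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            // True only if every task got its own thread before the timeout.
            return started.await(2, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            release.countDown(); // unblock all tasks
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        // prints: cached: true
        System.out.println("cached: " + allRunInParallel(Executors.newCachedThreadPool(), 5));
        // prints: single: false (after the 2 second timeout)
        System.out.println("single: " + allRunInParallel(Executors.newSingleThreadExecutor(), 5));
    }
}
```

Note that a cached pool has no upper bound on the number of threads, which is a trade-off to weigh against the queueing delay of a fixed small pool.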

ctrlaltluc · 2022-07-26T08:59:20Z

@fstab nice guy, Sun!

This explains the behavior of /-/healthy timing out on our side too. The health-checks were served, but only once the core thread managed to get the task from the queue.

Thanks!

Reduce number of core threads in HTTPServer to one

c0bf6d9

Signed-off-by: Fabian Stäber <fabian@fstab.de>

fstab merged commit 2f31b96 into master May 19, 2022

fstab deleted the core-threads branch May 19, 2022 21:44

fstab mentioned this pull request May 19, 2022

[drafting] collector#collect performance improve #782

Closed

michaeljmarshall mentioned this pull request Jul 14, 2022

Bump prometheus client version from 0.5.0 to 0.15.0 apache/pulsar#13785

Merged

1 task

shoothzj mentioned this pull request Jul 14, 2022

Bump prometheus client version from 0.15.0 to 0.16.0 apache/pulsar#16591

Merged

1 task
