Reverting custom thread pool from #53 in watch mode #81

jglick · 2018-10-19T01:44:34Z

Observed to cause anomalous behavior during a RestartableJenkinsRule test in the ant plugin after updating dependencies in jenkinsci/ant-plugin#32. I think this is because the thread pool is not getting shut down, so two copies of the build get loaded after restart and mayhem ensues. Most likely this would not happen in production systems, which should be throwing out the plugin class loaders across restarts. Anyway as of #63 we make far fewer check calls, so the rationale for avoiding the shared Timer in #53 is obsolete.

svanoort

I have really mixed feelings about this change, because:

- We have already hit issues in the past with overloading the Timer thread pool from Pipeline and blocking other activities as a result
- This thread pool has timeouts for threads so it can shrink if not used -- if we find we need to increase the Timer thread pool in Core to deal with extra load, it is unfortunately not so configured (perhaps a mistake on our part). I actually suspect that the Timer pool probably would benefit from a keep-alive time and allowing the pool to shrink and grow, since it contributes a fairly large static memory footprint (assuming 1 MB stack size and 10 threads, 10 MB even if only 1-2 tasks at a time are usually running). This may sound small, but we're increasingly trying to tune Jenkins for one-shot or instance-per-team use and it adds up. The cost is sometimes creating new threads if the pool size has shrunk too small (maybe use a system property for minimum size).
- It seems like this could be trivially (and more correctly) handled via a Terminator or explicit shutdown hook.
- It's not clear that the original issue is confirmed to come from this cause in the first place.
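
For readers unfamiliar with the keep-alive behavior mentioned in the second point, here is a minimal sketch of a scheduled pool whose idle threads are allowed to exit. It uses only standard `java.util.concurrent` calls; the class name, method name, and sizes are hypothetical and are not taken from the plugin or from Jenkins core.

```java
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; not the plugin's or core's actual configuration.
class ShrinkablePoolSketch {
    /** Builds a scheduled pool whose idle threads, core threads included, may exit after the keep-alive time. */
    static ScheduledThreadPoolExecutor newShrinkablePool(int coreThreads, long keepAlive, TimeUnit unit) {
        ScheduledThreadPoolExecutor pool = new ScheduledThreadPoolExecutor(coreThreads);
        // Set a positive keep-alive first: allowCoreThreadTimeOut(true) rejects a zero keep-alive.
        pool.setKeepAliveTime(keepAlive, unit);
        pool.allowCoreThreadTimeOut(true);
        return pool;
    }

    public static void main(String[] args) throws Exception {
        ScheduledThreadPoolExecutor pool = newShrinkablePool(10, 1, TimeUnit.SECONDS);
        pool.submit(() -> {}).get(); // starts one worker thread on demand
        System.out.println("threads after one task: " + pool.getPoolSize());
        Thread.sleep(2500); // idle for longer than the keep-alive time
        System.out.println("threads after idling: " + pool.getPoolSize());
        pool.shutdown();
    }
}
```

The trade-off is the one named above: a pool that has shrunk must create threads again when load returns.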

jglick · 2018-10-24T13:12:34Z

We have already hit issues in the past with overloading the Timer thread pool from Pipeline

Because it was being used for purposes which now it is not.

It's not clear that the original issue is confirmed to come from this cause in the first place.

It is pretty clear to me. The test failed until I introduced this fix, then it passed.

jglick · 2018-10-24T14:08:53Z

it is unfortunately not so configured (perhaps a mistake on our part)

Possibly. I seem to recall issues enabling that. Could be revisited.

assuming 1 MB stack size and 10 threads, 10 MB even if only 1-2 tasks at a time are usually running

Exactly why it is undesirable to introduce additional thread pools for plugin use when they are not strictly necessary: reusing an existing pool minimizes overhead.

jglick · 2018-10-24T14:10:46Z

this could be trivially (and more correctly) handled via a Terminator or explicit shutdown hook

I am willing to do that if it helps get jenkinsci/ant-plugin#32 out the door, though I still think this PR is preferable, at least once #84 is reverted.
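
A rough sketch of what the explicit-shutdown alternative could look like, under stated assumptions: the class and method names are hypothetical, the pool size is arbitrary, and a plain JVM shutdown hook stands in for a Jenkins termination callback. In Jenkins itself the `shutDown` method is the kind of method one might mark with `@Terminator`, which is not shown here so the example stays self-contained.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledThreadPoolExecutor;

// Hypothetical illustration of a lazily created pool with an explicit shutdown path.
class PoolLifecycleSketch {
    private static ScheduledThreadPoolExecutor threadPool;

    /** Lazily creates the pool; a call made after shutDown() creates a fresh one. */
    static synchronized ScheduledExecutorService threadPool() {
        if (threadPool == null) {
            threadPool = new ScheduledThreadPoolExecutor(2); // arbitrary size for the sketch
        }
        return threadPool;
    }

    /** Stops the pool and forgets it, so a restarted instance cannot keep using stale threads. */
    static synchronized void shutDown() {
        if (threadPool != null) {
            threadPool.shutdownNow();
            threadPool = null;
        }
    }

    public static void main(String[] args) {
        ScheduledExecutorService first = threadPool();
        // Plain-JVM stand-in for a termination callback (assumption, not Jenkins API).
        Runtime.getRuntime().addShutdownHook(new Thread(PoolLifecycleSketch::shutDown));
        shutDown();
        System.out.println("first pool shut down: " + first.isShutdown());
    }
}
```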

jglick · 2018-10-24T16:05:47Z

On hold because of #84.

svanoort · 2018-10-26T11:32:16Z

It is pretty clear to me. The test failed until I introduced this fix, then it passed.

Ah, okay, well that would have been good to know in the PR description.

Exactly why it is undesirable to introduce additional thread pools for plugin use when they are not strictly necessary: reusing an existing pool minimizes overhead.

I don't disagree with generally consolidating pools, just want to make sure we don't have thread-pool exhaustion risks. Revising that pool implementation would sort that out just fine.

jglick · 2018-10-26T13:27:24Z

that would have been good to know in the PR description

Sorry, thought that was clear from the PR description mentioning the upstream PR, and that PR depending on this one. Should have been more explicit.

want to make [sure] we don't have thread-pool exhaustion risks

Of course. The history: originally we did use Timer for this purpose; we did encounter thread-pool exhaustion. But that was because we were flooding the pool with lots of requests to check (each of which involves a few remote calls)—for every running sh step, there would be a task scheduled at least every 15s, and when the process was actively emitting output, this could be much more frequent. So they were all piling up on top of each other as soon as you ran any significant load. Besides introducing per-check timeouts, one fix was to introduce a separate thread pool, which in a loaded system could still become saturated, but at least unrelated parts of Jenkins needing Timer were unaffected.

When watch mode is active, this all changes. We do still run (bounded) tasks for every running sh step, but only once every 5m, as a way to make sure the build eventually aborts in case the watcher goes south. So even with lots of activity, we are unlikely to be hogging the system thread pool. Under this scenario, launching an extra couple dozen threads that are almost always idle is wasteful.
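
To put numbers on the two intervals mentioned above, a small worked calculation follows. The figure of 100 running steps is purely illustrative, and the 15s interval is only a floor for polling mode, since checks could be scheduled more often while a process emits output.

```java
// Hypothetical helper for the arithmetic only; not part of the plugin.
class CheckRateSketch {
    /** Number of scheduled checks per hour for a given count of running steps and check interval. */
    static long checksPerHour(int runningSteps, long intervalSeconds) {
        return runningSteps * 3600L / intervalSeconds;
    }

    public static void main(String[] args) {
        int steps = 100; // illustrative load, not a measured figure
        long polling = checksPerHour(steps, 15);   // one check per step every 15s
        long watching = checksPerHour(steps, 300); // one check per step every 5m
        System.out.println("polling floor: " + polling + " checks/hour");  // prints 24000
        System.out.println("watch mode: " + watching + " checks/hour");    // prints 1200
        System.out.println("reduction factor: " + (polling / watching));   // prints 20
    }
}
```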

jglick · 2021-11-22T20:56:37Z

src/main/java/org/jenkinsci/plugins/workflow/steps/durable_task/DurableTaskStep.java

@@ -210,7 +211,10 @@ public FormValidation doCheckLabel(@QueryParameter String label) {
    public static long REMOTE_TIMEOUT = Integer.parseInt(System.getProperty(DurableTaskStep.class.getName() + ".REMOTE_TIMEOUT", "20"));

    private static ScheduledThreadPoolExecutor threadPool;
-    private static synchronized ScheduledThreadPoolExecutor threadPool() {
+    private static synchronized ScheduledExecutorService threadPool() {
+        if (USE_WATCHING) {


Note that this feature flag remains off by default.
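
The hunk above is cut off after the `if`, so the rest of the method is not visible here. As a hedged sketch of the general pattern (choose a shared scheduler when a flag is set, otherwise lazily create a dedicated pool), something like the following would work. `SHARED` is a hypothetical stand-in for a JVM-wide scheduler such as Jenkins' Timer, the flag is passed as a parameter rather than read from a static field, and the pool size is illustrative.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledThreadPoolExecutor;

// Hypothetical illustration; not the actual body of DurableTaskStep.threadPool().
class ExecutorSelectionSketch {
    // Stand-in for a shared, JVM-wide scheduler (assumption).
    static final ScheduledExecutorService SHARED = Executors.newScheduledThreadPool(2);

    private static ScheduledThreadPoolExecutor dedicated;

    /** Returns the shared scheduler in watch mode, otherwise a lazily created dedicated pool. */
    static synchronized ScheduledExecutorService threadPool(boolean useWatching) {
        if (useWatching) {
            return SHARED; // infrequent, bounded tasks: no need for extra threads
        }
        if (dedicated == null) {
            dedicated = new ScheduledThreadPoolExecutor(25); // illustrative size
        }
        return dedicated;
    }

    public static void main(String[] args) {
        System.out.println("watch mode uses shared pool: " + (threadPool(true) == SHARED));
        System.out.println("polling mode uses shared pool: " + (threadPool(false) == SHARED));
        SHARED.shutdownNow();
        threadPool(false).shutdownNow();
    }
}
```

Returning the `ScheduledExecutorService` interface rather than the concrete executor class is what lets one method hand back either implementation.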

Revert "Use a dedicated thread pool rather than Timer.get for DurableTaskStep.Execution.check, since it has often been observed to block for a while." This reverts commit c9c1364.

103ba99

jglick requested review from abayer, rsandell, dwnusbaum, joseblas and svanoort October 19, 2018 01:44

This was referenced Oct 19, 2018

[JENKINS-54133] Encode console notes on master for JEP-210 compatibility jenkinsci/ant-plugin#32

Merged

Use the standard Timer rather than our own ScheduledExecutorService jenkinsci/durable-task-plugin#84

Merged

jglick added a commit to jglick/ant-plugin that referenced this pull request Oct 19, 2018

Picking up jenkinsci/durable-task-plugin#84 & jenkinsci/workflow-durable-task-step-plugin#81.

f5c9b51

dwnusbaum previously approved these changes Oct 22, 2018

svanoort previously requested changes Oct 22, 2018

jglick added the on-hold label Oct 24, 2018

jglick commented Oct 29, 2018

src/main/java/org/jenkinsci/plugins/workflow/steps/durable_task/DurableTaskStep.java (outdated)

jglick mentioned this pull request Oct 29, 2018

Shut down threadPool when Jenkins shuts down #87

Merged

Merge branch 'master' into standard-thread-pool

1b6a34a

Now active only in USE_WATCHING mode.

jglick changed the title ~~Reverting custom thread pool from #53~~ Reverting custom thread pool from #53 in watch mode Apr 12, 2019

jglick removed the on-hold label Apr 12, 2019

jglick requested a review from dwnusbaum April 12, 2019 16:23

jglick added 2 commits June 3, 2019 20:13

Merge branch 'master' into standard-thread-pool

d2def1b

Merge branch 'master' of https://github.com/jenkinsci/workflow-durable-task-step-plugin into standard-thread-pool

bd90be9

jglick requested a review from car-roll November 22, 2021 20:55

jglick added the bug label Nov 22, 2021

jglick commented Nov 22, 2021

car-roll approved these changes Nov 23, 2021

car-roll merged commit f832bc1 into jenkinsci:master Nov 23, 2021

jglick deleted the standard-thread-pool branch November 23, 2021 14:01
