New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reverting custom thread pool from #53 in watch mode #81
Conversation
…TaskStep.Execution.check, since it has often been observed to block for a while." This reverts commit c9c1364.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have really mixed feelings about this change, because:
- We have already hit issues in the past with overloading the Timer thread pool from Pipeline and blocking other activities as a result
- This thread pool has timeouts for threads so it can shrink if not used -- if we find we need to increase the Timer thread pool in Core to deal with extra load, it is unfortunately not so configured (perhaps a mistake on our part). I actually suspect that the
Timer
pool probably would benefit from a keep-alive time and allowing the pool to shrink and grow, since it contributes a fairly large static memory footprint (assuming 1 MB stack size and 10 threads, 10 MB even if only 1-2 tasks at a time are usually running). This may sound small, but we're increasingly trying to tune Jenkins for one-shot or instance-per-team use and it adds up. The cost is sometimes creating new threads if the pool size has shrunk too small (maybe use a system property for minimum size). - It seems like this could be trivially (and more correctly) handled via a
Terminator
or explicit shutdown hook. - It's not clear that the original issue is confirmed to come from this cause in the first place.
Because it was being used for purposes which now it is not.
It is pretty clear to me. The test failed until I introduced this fix, then it passed. |
Possibly. I seem to recall issues enabling that. Could be revisited.
Exactly why it is undesirable to introduce additional thread pools for plugin use when they are not strictly necessary: reusing an existing pool minimizes overhead. |
I am willing to do that if it helps get jenkinsci/ant-plugin#32 out the door, though I still think this PR is preferable, at least once #84 is reverted. |
On hold because of #84. |
Ah, okay, well that would have been good to know in the PR description.
I don't disagree with generally consolidating pools, just want to make we don't have thread-pool exhaustion risks. Revising that pool implementation would sort that out just fine. |
Sorry, thought that was clear from the PR description mentioning the upstream PR, and that PR depending on this one. Should have been more explicit.
Of course. The history: originally we did use When watch mode is active, this all changes. We do still run (bounded) tasks for every running |
src/main/java/org/jenkinsci/plugins/workflow/steps/durable_task/DurableTaskStep.java
Outdated
Show resolved
Hide resolved
Now active only in USE_WATCHING mode.
@@ -210,7 +211,10 @@ public FormValidation doCheckLabel(@QueryParameter String label) { | |||
public static long REMOTE_TIMEOUT = Integer.parseInt(System.getProperty(DurableTaskStep.class.getName() + ".REMOTE_TIMEOUT", "20")); | |||
|
|||
private static ScheduledThreadPoolExecutor threadPool; | |||
private static synchronized ScheduledThreadPoolExecutor threadPool() { | |||
private static synchronized ScheduledExecutorService threadPool() { | |||
if (USE_WATCHING) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this feature flag remains off by default.
Observed to cause anomalous behavior during a
RestartableJenkinsRule
test in theant
plugin after updating dependencies in jenkinsci/ant-plugin#32. I think this is because the thread pool is not getting shut down, so two copies of the build get loaded after restart and mayhem ensues. Most likely this would not happen in production systems, which should be throwing out the plugin class loaders across restarts. Anyway as of #63 we make far fewercheck
calls, so the rationale for avoiding the sharedTimer
in #53 is obsolete.