ThreadPool concurrency refactoring #2220

wjordan · 2020-04-10T22:20:56Z

Description

This PR refactors some tricky concurrency parts of TestThreadPool, in order to make the unit tests more stable (all tests pass on jruby), faster (no more 1-second sleeps), and a bit more readable. Many of the tests now use a subclass of ThreadPool that overrides #<< to wait on the existing @not_full condition variable, to ensure work begins executing on a worker thread before continuing the test.
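The test-side idea can be sketched in isolation with a toy pool. This is a rough illustration, not Puma's code: `ToyPool`, `WaitingPool`, and `backlog` are hypothetical names. Only `@not_full`, `#<<`, and `#with_mutex` echo names from this PR, and their bodies here are assumptions.

```ruby
# Toy stand-in for a thread pool; NOT Puma's ThreadPool implementation.
class ToyPool
  def initialize
    @mutex = Mutex.new
    @not_empty = ConditionVariable.new # signaled when work is queued
    @not_full = ConditionVariable.new  # signaled when the worker takes work
    @todo = []
    @worker = Thread.new { work_loop }
  end

  # Re-entrant helper (assumed shape): yield directly if this thread
  # already holds the mutex, otherwise lock around the block.
  def with_mutex(&block)
    @mutex.owned? ? yield : @mutex.synchronize(&block)
  end

  def <<(job)
    with_mutex do
      @todo << job
      @not_empty.signal
    end
  end

  # Number of queued jobs not yet taken by the worker (hypothetical helper).
  def backlog
    with_mutex { @todo.size }
  end

  private

  def work_loop
    loop do
      job = @mutex.synchronize do
        @not_empty.wait(@mutex) while @todo.empty?
        taken = @todo.shift
        @not_full.signal # tell any waiting producer the job was dequeued
        taken
      end
      job.call # run the job outside the lock
    end
  end
end

# Test-only subclass: #<< returns only after the worker has dequeued the job.
class WaitingPool < ToyPool
  def <<(job)
    with_mutex do
      super(job) # safe to nest because with_mutex is re-entrant
      @not_full.wait(@mutex) until @todo.empty?
    end
  end
end

pool = WaitingPool.new
results = Queue.new
pool << -> { results << :ran }
puts pool.backlog # => 0, the worker already holds the job
puts results.pop  # => ran
```

Because the wait happens on a condition variable rather than in a fixed `sleep`, a test built this way proceeds as soon as the worker has the job and does not depend on timing.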

There are also some small changes to the ThreadPool class itself:

Add #with_mutex to make it easier to chain multiple @mutex.synchronize-wrapped methods together, and to extend existing mutex-wrapped methods in unit tests.

Wait for initially-spawned threads to enter the wait loop before returning from #initialize. This fixes a subtle edge-case concurrency bug mostly affecting unit-test reliability, demonstrated by the following simple test that fails on master:

puma/test/test_thread_pool.rb

Lines 239 to 242 in 8d61027

    
    def test_waiting_on_startup
      pool = new_pool(1, 2)
      assert_equal 1, pool.waiting
    end

Fix a concurrency bug in #trim where a trim could still be requested even when enough work to utilize all threads has been queued but not yet picked up by workers (e.g., waiting > 0 but waiting - todo.size == 0). (I believe this was the cause of some jruby test flakiness in some trim unit tests.)
Refactor the main wait-loop in spawn_thread to make it a bit simpler and easier to read and reason about.
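The startup wait and the trim guard described in the list above can be sketched with a toy model. This is an assumption-laden illustration and not the code in lib/puma/thread_pool.rb: `StartupPool` and `trim_worthwhile?` are hypothetical names, and the guard only encodes the `waiting - todo.size` arithmetic from the description.

```ruby
# Toy model only; NOT Puma's ThreadPool.
class StartupPool
  def initialize(min)
    @mutex = Mutex.new
    @not_empty = ConditionVariable.new # signaled when work is queued
    @not_full = ConditionVariable.new  # signaled when a worker becomes idle
    @todo = []
    @waiting = 0
    @spawned = 0
    @mutex.synchronize do
      min.times do
        spawn_thread
        # Do not return until the new worker has reached its wait loop.
        @not_full.wait(@mutex) until @waiting == @spawned
      end
    end
  end

  def waiting
    @mutex.synchronize { @waiting }
  end

  def <<(job)
    @mutex.synchronize do
      @todo << job
      @not_empty.signal
    end
  end

  private

  # Must be called with @mutex held.
  def spawn_thread
    @spawned += 1
    Thread.new do
      loop do
        job = @mutex.synchronize do
          @waiting += 1
          @not_full.signal # announce that this worker is idle
          @not_empty.wait(@mutex) while @todo.empty?
          @waiting -= 1
          @todo.shift
        end
        job.call # run the job outside the lock
      end
    end
  end
end

# Assumed trim guard: trimming only makes sense if at least one worker
# would still be idle after the queued work has been picked up.
def trim_worthwhile?(waiting, todo_size)
  (waiting - todo_size) > 0
end

pool = StartupPool.new(2)
puts pool.waiting            # => 2, both workers are idle once new returns
puts trim_worthwhile?(1, 1)  # => false, the idle worker is about to be used
puts trim_worthwhile?(2, 1)  # => true, one worker would stay idle
```

The `trim_worthwhile?(1, 1)` case corresponds to the situation in the description where `waiting > 0` but `waiting - todo.size == 0`.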

Your checklist for this pull request

I have reviewed the guidelines for contributing to this repository.
I have added an entry to History.md if this PR fixes a bug or adds a feature. If it doesn't need an entry to HISTORY.md, I have added [changelog skip] to the pull request title.
I have added appropriate tests if this PR fixes a bug or adds a feature.
My pull request is 100 lines added/removed or less so that it can be easily reviewed.
If this PR doesn't need tests (docs change), I added [ci skip] to the title of the PR.
If this closes any issues, I have added "Closes #issue" to the PR description or my commit messages.
I have updated the documentation accordingly.
All new and existing tests passed, including Rubocop.

- Wait for threads to enter waiting loop on ThreadPool startup
- Simplify #spawn_thread inner threadpool loop
- Refactor TestThreadPool to make tests faster and more stable

nateberkopec · 2020-04-12T08:18:59Z

History.md

@@ -36,6 +36,7 @@
  * Simplify `Runner#start_control` URL parsing (#2111)
  * Removed the IOBuffer extension and replaced with Ruby (#1980)
  * Update `Rack::Handler::Puma.run` to use `**options` (#2189)
+  * ThreadPool concurrency refactoring (#2220)


I think the trim bugfix could also be mentioned

nateberkopec · 2020-04-12T08:20:13Z

I'll have to find some time to review this proper, but the test changes are gorgeous 😍

nateberkopec · 2020-04-14T23:28:26Z

lib/puma/thread_pool.rb

@@ -99,20 +102,13 @@ def spawn_thread
        while true
          work = nil

-          continue = true
-
          mutex.synchronize do


Clever change and much more readable

nateberkopec · 2020-04-14T23:36:27Z

test/test_thread_pool.rb

-        end
-        Thread.pass until finish
+        pool.signal
+        sleep


typo or are you triggering a context switch?

Well, "fix" might overstate it, but we were able to get them passing by backporting test skips that are currently in Puma master upstream. With these skips, not only does tag `4.3.3` pass in full, but so does our custom branch, so we can be pretty confident we haven't broken anything vs. stock Puma.

Some notes on test runs:

- Invoke with `bundle exec rake test --trace`
- Ctrl-C'ing the tests usually left a zombified version running in the background, which would *cause failures on subsequent runs*. After killing a test run, ensure that no test processes remain in `jps -lm`
- `TestThreadPool` tests are flaky in this release, though this was a known issue, and has been recently fixed here: puma#2220

ThreadPool concurrency refactoring

e1daf1b

wjordan force-pushed the thread_pool_refactor branch from 8d61027 to e1daf1b on April 10, 2020 22:23

nateberkopec added maintenance bug labels Apr 12, 2020

nateberkopec reviewed Apr 12, 2020

nateberkopec added the waiting-for-review label Apr 12, 2020

Merge branch 'master' into thread_pool_refactor

71413e2

nateberkopec merged commit b16d8cb into puma:master Apr 14, 2020

olivierbellone mentioned this pull request Jul 2, 2021

Fix deadlock issue in thread pool #2656

Merged
