Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Setup time for parallelize worker testing is excessive for singe test case runs #36807

Closed
dhh opened this issue Jul 29, 2019 · 12 comments · Fixed by #36873
Closed

Setup time for parallelize worker testing is excessive for singe test case runs #36807

dhh opened this issue Jul 29, 2019 · 12 comments · Fixed by #36873
Assignees
Milestone

Comments

@dhh
Copy link
Member

dhh commented Jul 29, 2019

When running tests with parallelized workers, we seem to be doing the full database setup/schema load/teardown/fixtures loading even when only running a single test suite rather than all the suites.

On a 10-core machine, that means doing that 10 times (one database per core), which takes a long time, especially when you're just trying to iterate over a single test case. When parallelize workers is turned off, there's no setup/teardown being done. Just resetting the fixtures.

For my 10-core iMac Pro, running a 2-test test suite takes 0.4s without parallelize workers and 6s with parallelize workers.

When I run the entire suite, it's clearly better to run with parallelize workers (13s vs 38s!), but when running a single suite, it's a pain.

It seems like we should differentiate between "./bin/rails test", which should run with full parallelization, and "./bin/rails test tests/model/case.rb", which should not.

@dhh dhh added this to the 6.0.0 milestone Jul 29, 2019
@dhh
Copy link
Member Author

dhh commented Jul 29, 2019

cc @jhawthorn

@dhh
Copy link
Member Author

dhh commented Jul 29, 2019

Thinking this might be as simple as avoiding the parallelization path when the test runner is being asked to run just one case. You might still be paying the high tax early on in your app development on a many core machine, but you could always tweak the default settings in test_helper.rb while that's the case.

@ghost
Copy link

ghost commented Jul 29, 2019

What happened to Convention over configuration?

@dhh
Copy link
Member Author

dhh commented Jul 29, 2019

It's exactly the refinement of the convention that this issue is addressing: run parallelized workers with setup cost when the pay off is worth it, don't when it's not, and let the user decide when it's ambiguous (tiny app / many cores).

@ghost
Copy link

ghost commented Jul 29, 2019

Wouldn't the sensible convention for a newly generated app be to skip parallelization in either case?

@dhh
Copy link
Member Author

dhh commented Jul 29, 2019

No, because the setup cost with a tiny schema is likely to be tiny too. And that would then require changing configuration once the tip-over point is reached, which is likely to be forgotten. The hurt here is specifically with large app / many cores / just running a few test cases in an iterative loop.

@ghost
Copy link

ghost commented Jul 29, 2019

Okay, seems logical, since it's difficult to compute the tip-over point, it would have to be remembered by a human to change it. Sorry for interrupting.

@jhawthorn jhawthorn self-assigned this Jul 29, 2019
@jhawthorn
Copy link
Member

jhawthorn commented Jul 29, 2019

I've also run into slow setup on machines with many cores. I want to investigate only re-initializing the test databases when the schema changes (which hopefully makes parallelization setup fast enough to be a benefit even for a single suite). My initial thinking is to store something to track this in ar_internal_metadata. I'll try to test the idea this week.

@dhh
Copy link
Member Author

dhh commented Jul 29, 2019

That would be even better, @jhawthorn ❤️

@alexpooley
Copy link

We can lazily setup each database instead of having it ready up front. In practice this means you only run the after fork calls after popping the first job from the queue.

Partial solution at PR #36809

I was unable to test on 10 cores. Donations welcome 😄

@jhawthorn
Copy link
Member

jhawthorn commented Aug 9, 2019

This should hopefully be much better now with #36873 (also backported to 6-0-stable in 4f912de). The first run of the test suite in parallel (and each first after a schema change) will create the databases and load the schema, successive runs will reuse the previously created databases.

@dhh
Copy link
Member Author

dhh commented Aug 9, 2019 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants