New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
#11949 trial: Pass parallel trial worker index as env variable #11950
base: trunk
Are you sure you want to change the base?
Conversation
Currently tests run under parallel trial do not know about each other. This makes it difficult to coordinate access to shared resources (e.g. database) that take a long time to construct. For example, consider a case of 10000 tests that access a database. Currently tests can only be run in sequence, each using the same database and cleaning up after itself. Running tests in parallel would involve constructing 10000 separate databases which is infeasible. However, if each test knew the index of the parallel Twisted trial worker it runs under, then only small number of databases would need to be created and tests would use them like in serial case. E.g. when running tests using trial -j16, only 16 databases would need to be created. This is done by exposing environment variable TWISTED_TRIAL_PARALLEL_INDEX to the tests run under the parallel runner.
please review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many thanks for the changes.
They look good. Great PR.
I left only a few minor comments.
self.assertEqual(os.pathsep.join(sys.path), environments[0]["PYTHONPATH"]) | ||
parallel_indexes = set(e["TWISTED_TRIAL_PARALLEL_INDEX"] for e in environments) | ||
self.assertEqual(set(["0", "1", "2", "3"]), parallel_indexes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just a small comment
parrallel_indexes vs parallelIndexes for pedantic code standard
and also check that PYTHONPATH is present in all environments
self.assertEqual(os.pathsep.join(sys.path), environments[0]["PYTHONPATH"]) | |
parallel_indexes = set(e["TWISTED_TRIAL_PARALLEL_INDEX"] for e in environments) | |
self.assertEqual(set(["0", "1", "2", "3"]), parallel_indexes) | |
pythonpathEnvs = [e["PYTHONPATH" for e in environments] | |
self.assertEqual([os.pathsep.join(sys.path)] * 4, pythonpathEnvs) | |
parallelIndexes = set(e["TWISTED_TRIAL_PARALLEL_INDEX"] for e in environments) | |
self.assertEqual(set(["0", "1", "2", "3"]), parallelIndexes) |
@@ -246,3 +246,7 @@ then deletes that schema in the tearDown function, your tests will behave in an | |||
unpredictable fashion as they tromp upon each other if they have their own | |||
schema. And this won't actually indicate a real error in your code, merely a | |||
testing-specific race-condition. | |||
|
|||
Trial provides `TWISTED_TRIAL_PARALLEL_INDEX` environment variable to the tests when run in parallel. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that this documentation is good enough.
Just a curiosity and for future reference,
Do you know how other testing frameworks are dealing with this issue?
Do they also set an env variable?
Is there some convention for the variable name?
Are there any other things that can help with shared resources?
I have approved the changes and the PR looks good to me. I see in #11949 that Glyph and Jean Paul are not happy with the way this is implemented. I will leave this unmerged waiting for a consensus on how to get this done. Regards |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed on #11949 , I don't think that this is a good API surface for us to support, and I think the documentation is insufficient. I'd like to see more explanation of whether this is actually necessary or not yet.
There was already some discussion on the ticket I don't see addressed in the review.
Sorry about that. thanks for the follow up here. |
No worries. We should discuss stuff on tickets more, but so many are just perfunctory CI / types / test housekeeping these days that it is an understandable habit to fall in to :) |
Scope and purpose
Fixes #11949
Currently tests run under parallel trial do not know about each other. This makes it difficult to coordinate access to shared resources (e.g. database) that take a long time to construct.
For example, consider a case of 10000 tests that access a database. Currently tests can only be run in sequence, each using the same database and cleaning up after itself. Running tests in parallel would involve constructing 10000 separate databases which is infeasible. However, if each test knew the index of the parallel Twisted trial worker it runs under, then only small number of databases would need to be created and tests would use them like in serial case. E.g. when running tests using trial -j16, only 16 databases would need to be created.
This is done by exposing environment variable
TWISTED_TRIAL_PARALLEL_INDEX to the tests run under the parallel runner.