Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#11949 trial: Pass parallel trial worker index as env variable #11950

Open
wants to merge 2 commits into
base: trunk
Choose a base branch
from

Conversation

p12tic
Copy link
Contributor

@p12tic p12tic commented Sep 3, 2023

Scope and purpose

Fixes #11949

Currently tests run under parallel trial do not know about each other. This makes it difficult to coordinate access to shared resources (e.g. database) that take a long time to construct.

For example, consider a case of 10000 tests that access a database. Currently tests can only be run in sequence, each using the same database and cleaning up after itself. Running tests in parallel would involve constructing 10000 separate databases which is infeasible. However, if each test knew the index of the parallel Twisted trial worker it runs under, then only small number of databases would need to be created and tests would use them like in serial case. E.g. when running tests using trial -j16, only 16 databases would need to be created.

This is done by exposing environment variable
TWISTED_TRIAL_PARALLEL_INDEX to the tests run under the parallel runner.

Currently tests run under parallel trial do not know about each other.
This makes it difficult to coordinate access to shared resources (e.g.
database) that take a long time to construct.

For example, consider a case of 10000 tests that access a database.
Currently tests can only be run in sequence, each using the same
database and cleaning up after itself. Running tests in parallel would
involve constructing 10000 separate databases which is infeasible.
However, if each test knew the index of the parallel Twisted trial
worker it runs under, then only small number of databases would need to
be created and tests would use them like in serial case. E.g. when
running tests using trial -j16, only 16 databases would need to be
created.

This is done by exposing environment variable
TWISTED_TRIAL_PARALLEL_INDEX to the tests run under the parallel runner.
@p12tic
Copy link
Contributor Author

p12tic commented Sep 3, 2023

please review

adiroiban
adiroiban previously approved these changes Sep 4, 2023
Copy link
Member

@adiroiban adiroiban left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many thanks for the changes.

They look good. Great PR.

I left only a few minor comments.

Comment on lines +266 to +268
self.assertEqual(os.pathsep.join(sys.path), environments[0]["PYTHONPATH"])
parallel_indexes = set(e["TWISTED_TRIAL_PARALLEL_INDEX"] for e in environments)
self.assertEqual(set(["0", "1", "2", "3"]), parallel_indexes)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just a small comment

parrallel_indexes vs parallelIndexes for pedantic code standard

and also check that PYTHONPATH is present in all environments

Suggested change
self.assertEqual(os.pathsep.join(sys.path), environments[0]["PYTHONPATH"])
parallel_indexes = set(e["TWISTED_TRIAL_PARALLEL_INDEX"] for e in environments)
self.assertEqual(set(["0", "1", "2", "3"]), parallel_indexes)
pythonpathEnvs = [e["PYTHONPATH" for e in environments]
self.assertEqual([os.pathsep.join(sys.path)] * 4, pythonpathEnvs)
parallelIndexes = set(e["TWISTED_TRIAL_PARALLEL_INDEX"] for e in environments)
self.assertEqual(set(["0", "1", "2", "3"]), parallelIndexes)

@@ -246,3 +246,7 @@ then deletes that schema in the tearDown function, your tests will behave in an
unpredictable fashion as they tromp upon each other if they have their own
schema. And this won't actually indicate a real error in your code, merely a
testing-specific race-condition.

Trial provides `TWISTED_TRIAL_PARALLEL_INDEX` environment variable to the tests when run in parallel.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this documentation is good enough.

Just a curiosity and for future reference,

Do you know how other testing frameworks are dealing with this issue?

Do they also set an env variable?
Is there some convention for the variable name?

Are there any other things that can help with shared resources?

@adiroiban
Copy link
Member

I have approved the changes and the PR looks good to me.

I see in #11949 that Glyph and Jean Paul are not happy with the way this is implemented.

I will leave this unmerged waiting for a consensus on how to get this done.

Regards

Copy link
Member

@glyph glyph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed on #11949 , I don't think that this is a good API surface for us to support, and I think the documentation is insufficient. I'd like to see more explanation of whether this is actually necessary or not yet.

@glyph glyph dismissed adiroiban’s stale review September 4, 2023 19:42

There was already some discussion on the ticket I don't see addressed in the review.

@adiroiban
Copy link
Member

@glyph

There was already some discussion on the ticket I don't see addressed in the review.

Sorry about that.
The first notification was for the PR ... and I did the review without checking the ticket.
My bad. I will check the ticket discussion before doing a review.

thanks for the follow up here.

@glyph
Copy link
Member

glyph commented Sep 5, 2023

The first notification was for the PR ... and I did the review without checking the ticket. My bad. I will check the ticket discussion before doing a review.

No worries. We should discuss stuff on tickets more, but so many are just perfunctory CI / types / test housekeeping these days that it is an understandable habit to fall in to :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Parallel Twisted trial workers should know an index to differentiate between each other
4 participants