Add support for minimization #184

Closed
alex opened this issue Jun 2, 2019 · 6 comments

@alex commented Jun 2, 2019

Once you have tests that fail only when run in a particular order, it'd be incredibly useful to have some sort of support for automatically determining the minimum set of tests required to trigger the failure.

@adamchainz (Member)

Yes, this is a bit related to #17. Perhaps the set of tests could be cached in .pytest_cache somehow, and then this could be bisected on. The problem with bisection is that in the worst test isolation failures, cutting the list of tests in the middle fixes them, since it's a test in the first half leaking state into the second half.

Fixing #17 would at least allow manual bisection.

@thbde (Contributor) commented May 10, 2020

I remember an approach from the context of bisecting for multiple errors: in the bisection step, you can reorganize the partitions to find a suitable split.

For example, assume we have a (full) list of tests F. A test f_x fails only if another test f_p is executed before it, with both f_x and f_p in F. Therefore, we want to identify a subset that contains f_x and f_p.
We start with a bisection and create 2 lists, each with half of the elements:
H1, H2 where H1 + H2 = F
Now, there are 4 cases (assuming F fails):

  • H1 fails, H2 fails => this most likely means there is more than one issue. We can just continue with H1 and ignore H2 (it can be investigated later)
  • H1 fails, H2 is ok => continue with H1
  • H1 is ok, H2 fails => continue with H2
  • H1 is ok, H2 is ok => continue with a new partition

So in the last case both H1 and H2 are ok, meaning f_x is in one list and f_p in the other. Therefore, we create new partitions. We split H1 and H2 in half (and get quarters):
Q1, Q2, Q3, Q4 where Q1 + Q2 = H1 and Q3 + Q4 = H2

We create all possible pairs of these 4 lists and use these pairs as input to the previous step:

  • (Q1+Q3, Q2+Q4)
  • (Q1+Q4, Q2+Q3)
  • We can skip (Q1+Q2, Q3+Q4), because we tested it already.

Now, there must be at least one pairing where f_x and f_p are both in the same half of the selected tests. Continue with the usual bisection until we cannot reduce any further.

It could happen that all these pairs are ok. Then we would have at least one additional test f_p2 that must also be executed before f_x. We could either split our partitions further and rearrange them, or at some point switch to a less efficient algorithm and remove subsets of a certain size from F. Worst case: all tests in F before the last test f_l must be executed to make f_l fail. We can only verify this worst case by testing, for each test, the subset of F with that single test removed.
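To make the loop concrete, here is a rough Python sketch of the halving/repartitioning step described above. fails() is just a placeholder for running the given test IDs in order (e.g. in a fresh pytest subprocess) and reporting whether the run fails; nothing here is existing pytest-randomly code.

```python
from itertools import combinations


def minimize(tests, fails):
    """Shrink `tests` to a smaller ordered list that still fails.

    `tests` is an ordered list of test IDs known to fail as a whole;
    `fails(subset)` runs that subset in order and returns True on failure.
    """
    while len(tests) > 1:
        mid = len(tests) // 2
        h1, h2 = tests[:mid], tests[mid:]
        if fails(h1):            # cases 1 and 2: keep H1
            tests = h1
            continue
        if fails(h2):            # case 3: keep H2
            tests = h2
            continue
        # Case 4: both halves pass, so the interacting tests were separated.
        # Split into quarters and try the cross pairings (Q1+Q3, Q1+Q4, ...).
        q = [h1[: len(h1) // 2], h1[len(h1) // 2:],
             h2[: len(h2) // 2], h2[len(h2) // 2:]]
        for i, j in combinations(range(4), 2):
            if {i, j} in ({0, 1}, {2, 3}):
                continue         # these are just H1 and H2 again
            candidate = q[i] + q[j]   # i < j, so the original order is kept
            if len(candidate) < len(tests) and fails(candidate):
                tests = candidate
                break
        else:
            # No half-sized rearrangement fails: more than two tests are
            # involved, so a finer split or a slower strategy is needed.
            break
    return tests
```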

@adamchainz (Member)

That sounds like a great process, @thbde. I think it's a challenge to implement since the test process would need restarting repeatedly for each arrangement of the partitions. We'd need to preserve a bunch of state in .pytest_cache and encourage the user to keep reinvoking pytest --randomly-minimize-failure (or whatever the flag is), with extra output to describe where we're at. Or perhaps pytest could reinvoke itself at the end of the run, though that sounds a bit hacky.
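For the state-between-invocations part, pytest's built-in cache (config.cache.get()/config.cache.set(), which is what backs .pytest_cache) could hold the candidate order. A minimal sketch, with the flag name taken from the comment above purely as a placeholder and the cache keys invented for illustration:

```python
# Plugin sketch (e.g. in a conftest.py); the flag name and cache keys are
# placeholders, but config.cache.get()/set() is pytest's real cache API.


def pytest_addoption(parser):
    parser.addoption(
        "--randomly-minimize-failure",  # hypothetical flag, name TBD
        action="store_true",
        help="Re-run the previously recorded failing order, shrinking it each run.",
    )


def pytest_collection_modifyitems(config, items):
    if not config.getoption("--randomly-minimize-failure"):
        return
    # Reorder/filter the collected items to match the candidate saved by the
    # previous run (hook ordering relative to pytest-randomly's own shuffle
    # is left aside here).
    candidate = config.cache.get("randomly/minimize-candidate", None)
    if candidate:
        by_id = {item.nodeid: item for item in items}
        items[:] = [by_id[nid] for nid in candidate if nid in by_id]


def pytest_sessionfinish(session, exitstatus):
    config = session.config
    if not config.getoption("--randomly-minimize-failure"):
        return
    # A real implementation would compute the next candidate here (bisect or
    # repartition based on whether this run failed); the sketch just records
    # the outcome.
    config.cache.set("randomly/minimize-last-exitstatus", int(exitstatus))
```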

@mrbean-bremen (Member) commented Feb 14, 2023

Just stumbled over this issue, and FWIW, some time ago I wrote a small plugin that does something like this (though not as involved as described).

It does not use random ordering (though you can run it together with pytest-randomly, so that the first run will be random) but reversed ordering: it runs the tests forwards and backwards, finds tests that fail in only one of the runs, bisects the results, and repeats until a dependency is found. Each test run is done in a separate process to minimize dependencies between runs.
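The separate-process part is roughly this kind of thing (not the actual plugin code, just the general idea; -p no:randomly keeps pytest-randomly from reshuffling the order being tested):

```python
import subprocess
import sys


def fails(test_ids):
    """Run exactly these test IDs, in this order, in a fresh pytest process."""
    if not test_ids:
        return False
    result = subprocess.run(
        [sys.executable, "-m", "pytest", "-q", "-p", "no:randomly", *test_ids],
        capture_output=True,
    )
    return result.returncode != 0


# e.g. compare the forward and reversed order of a previously collected list:
# forward_fails = fails(test_ids)
# reverse_fails = fails(list(reversed(test_ids)))
```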

@adamchainz (Member)

Thanks for the note.

Since my last comment, I've also used the tool detect-test-pollution. It works well with pytest-randomly, as you can provide it a file with a list of test IDs. So pytest-randomly can be used to find a problematic order, output the list of IDs, and then you can work on minimizing it with detect-test-pollution.
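For example, the exact shuffled order for a given seed can be captured with plain collection and written to such a file (assuming pytest-randomly's shuffle is applied at collection time; check detect-test-pollution's own docs for the exact file format and flags it expects):

```python
import subprocess
import sys


def shuffled_test_ids(seed):
    """Collect test node IDs in the order pytest-randomly would run them for `seed`."""
    out = subprocess.run(
        [sys.executable, "-m", "pytest", "--collect-only", "-q",
         f"--randomly-seed={seed}"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    # `-q --collect-only` prints one node ID per line before the summary line.
    return [line for line in out.splitlines() if "::" in line]


# with open("testids.txt", "w") as f:
#     f.write("\n".join(shuffled_test_ids(1234)) + "\n")
```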

I'm gonna close this issue since external tools exist.

@mrbean-bremen (Member)

Thanks - detect-test-pollution looks interesting, I wasn't aware of it!

pytest-find-dependencies is a bit more restricted, as it always runs all tests and tries to find dependencies there, though it also works well with pytest-randomly, which shuffles the initial test run. I actually use it in CI to ensure tests stay independent.
