Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make the --filter option shorter for PHPUnit #1548

Merged
merged 1 commit into from Sep 4, 2021

Conversation

maks-rafalko
Copy link
Member

This is just an attempt to make the --filter shorter. It doesn't fix #1545 though.

Explanation:

Difference between master and this PR with built --filter:

- /App\\Tests\\Functional\\Api\\Document\\DocumentTemplateTest\:\:test_it_can_update_document_template with data set "Returns error when at leas one of the business programs not in the document type'\''s list"
+ /DocumentTemplateTest\:\:test_it_can_update_document_template with data set "Returns error when at leas one of the business programs not in the document type'\''s list"

We can get rid of all the App\\Tests\\Functional\\Api\\Document\\ and similar prefixes, leaving just ClassTest::test_case_method notation (or ClassTest::test_case_method with data set "#1" if data provider is used).

Why not just test_case_method1|test_case_method2? Because the less unique the filter, the more tests will be executed (which is worse from performance POV). If we have 1000 files with test_case_method1, then all the 1000 tests will be executed. Prefixing test_case_method1 with ClassTest:: make it hopefully unique.

On my real project test, this PR saves 13,000 characters from 50,000 and saves the performance (runs the same amount of tests, not more).

@maks-rafalko maks-rafalko added this to the 0.25.0 milestone Aug 7, 2021
@sanmai
Copy link
Member

sanmai commented Aug 7, 2021

Here's what we can do:

  • We can have several optimization strategies for --filter. For example, we can go the way you propose here, or we can use the original approach, or we can go test_case_method1|test_case_method2.
  • Next we can run a cost/benefit analysis for each of the strategies. For example, if the original strategy leads to a regex shorter than, say, 300 characters, we could probably just use it and not consider anything else.
  • We know how much time it takes to run every test, and we can tell how much time it would take to run the most optimal set of tests. Next we can check if there's a substantial difference between different remaining strategies. For example, if the "shortest" strategy such as test_case_method1|test_case_method2 leads to a test duration within 1% of the optimal time, we can probably just use the shortest.
  • If we didn't find the optimal strategy, we can optimize by finding a strategy with the shortest resulting regex (we can probably predict its length) that has the least amount of the deviation from the optimum.
  • If we didn't find a suitable strategy and, for suitable strategies, the assumed length is longer than 34000 characters, we can forfeit the approach.

Do you think this will be too much trouble to make it worth it?

@theofidry
Copy link
Member

theofidry commented Aug 11, 2021

@sebastianbergmann would there be an alternative solution? For example setting --filter as a possible array and if there is multiple entries then all must match?

@sebastianbergmann
Copy link

Just to make sure I understand your motivation correctly: you would like to define a list of tests to run? Then, I think, finally implementing sebastianbergmann/phpunit#3387 would be the way to go. That, however, would be a new feature that would only be available in PHPUnit >= 10.

@theofidry
Copy link
Member

That, however, would be a new feature that would only be available in PHPUnit >= 10

That's totally fine: whichever solution is implemented in PHPUnit will be in a newer version and infection will most likely always have a wider range of PHPUnit versions to support.

I think @maks-rafalko solutions is a good one as it tries to limit the issue but it's a tiny patch at best. @sanmai is also a viable solution but it feels (IMO) like adding a lot of brittle (it tries to find a compromise) code for something that IMO should just be possible without workarounds.

@maks-rafalko @sanmai maybe another patch would be to simply remove --filter and add a warning in the case where the length could not be reduced enough. It will result in a big performance degradation in those cases but IMO it is better than a failure/false result

@maks-rafalko
Copy link
Member Author

maks-rafalko commented Sep 4, 2021

@sanmai @theofidry I've added your ideas for further implementing here #1545 (comment), and will keep that issue open to try improve this filtering farther. For now will merge this small patch and test it on my daily job project with the upcoming 0.25.0 release

@maks-rafalko maks-rafalko merged commit cb952c6 into master Sep 4, 2021
@maks-rafalko maks-rafalko deleted the feature/shorter-filter branch September 4, 2021 09:23
@theofidry
Copy link
Member

👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Infection does not run any tests if --filter is too big
4 participants