
Benchmark for JUnitTestCaseSorter::uniqueByTestFile #1177

Merged
merged 38 commits into infection:0.16 on Mar 24, 2020

Conversation

Member

@sanmai sanmai commented Mar 17, 2020

This PR:

  • Adds a benchmark for JUnitTestCaseSorter, based on live data
  • Adds optional bucket sort
  • Covered by tests
  • Is isset() required for PHP 7.4?
  • Standard benchmark profile

Looking at the Big O notation alone, bucket sort seems to be more effective in general, but for a smaller number of items and/or a larger number of buckets Quicksort is more effective, again judging by the notation only.

Here's a Big O graph for 23 buckets:

[Big O graph for 23 buckets]

Blue is for quicksort. Orange is for bucket sort.

For the 25 buckets we're aiming at, bucket sort becomes theoretically more effective after 15 elements. Considering that we don't do the second stage of intra-bucket sorting, this effectiveness boundary might be even lower for us, yet it is still some way off.
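As a rough illustration of that crossover, here is a minimal sketch (in Python purely for illustration; the project itself is PHP) comparing two simplified cost models, ignoring constant factors: roughly n·log2(n) comparisons for Quicksort versus roughly n + k operations for a bucket sort with k buckets and no intra-bucket sorting. The function names and the exact model are assumptions, not the project's code.

```python
import math

def quicksort_cost(n: int) -> float:
    # Simplified comparison-sort cost model: n * log2(n), constants ignored
    return n * math.log2(n)

def bucket_sort_cost(n: int, k: int) -> float:
    # Simplified single-pass bucket sort with k buckets and
    # no second-stage intra-bucket sorting: n + k, constants ignored
    return n + k

# Under this toy model with k = 25 buckets, Quicksort is cheaper for
# very small n, while bucket sort wins as n grows.
for n in (5, 15, 30):
    print(n, quicksort_cost(n), bucket_sort_cost(n, 25))
```

The exact crossover point depends heavily on the ignored constant factors, so this only illustrates the shape of the comparison, not the precise boundary.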

If we consider the distribution of overly tested mutations, we can see that our bucket sort will be applied in roughly 40% of cases for Psalm.

Bucket selection algorithm

Since the effectiveness of the algorithm is directly related to the number of buckets, and since our data is heavily skewed towards a mode at around 1/3 of a second, using a fixed number of buckets does not help.

[Test time distribution histogram]

Instead, we gradually lower the precision of our timings, starting at 1/8 of a second and later coarsening to 4 seconds. We also pre-sort the bucket array for the first second's worth of buckets. By computing bucket numbers directly we try to avoid looping over candidate bucket numbers.
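For illustration, the bucket numbers in the distribution table are consistent with a direct calculation like the following (a Python sketch, not the project's actual PHP implementation; `bucket_index` is a hypothetical name):

```python
def bucket_index(seconds: float) -> int:
    """Variable-precision bucketing: 1/8-second granularity below
    4 seconds, then 4-second granularity. Coarse bucket numbers are
    multiples of 32, so the two ranges never collide."""
    if seconds < 4.0:
        return int(seconds * 8)   # fine buckets 0..31
    return 32 * int(seconds / 4)  # coarse buckets 32, 64, 96, ...
```

For example, a 0.26-second test lands in bucket 2, a 1.06-second test in bucket 8, and a 45.48-second test in bucket 352, matching the table below.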

Example of a bucket distribution:

| Bucket | Test times |
|-------:|------------|
| 0 | 0.01, 0.08, 0.06 ... (everything under 1/8 of a second goes here) |
| 1 | (nothing fell here; skipping such buckets hereafter) |
| 2 | 0.26, 0.31, 0.28, 0.36, 0.35 |
| 4 | 0.56, 0.53 |
| 5 | 0.75 |
| 7 | 0.99, 0.95, 0.94 |
| 8 | 1.06 |
| 10 | 1.25 |
| 11 | 1.45, 1.44 |
| 14 | 1.8, 1.86 |
| 16 | 2.11, 2.05 |
| 18 | 2.28 |
| 19 | 2.45 |
| 23 | 2.93 |
| 24 | 3.03 |
| 25 | 3.17, 3.21 |
| 28 | 3.62, 3.53 |
| 30 | 3.75 |
| 32 | 7.81, 5.93, 4.42, 4.37, 4.22, 6.09, 4.27, 4.4, 6.75, 7.11, 4.93, 6.79, 6.1, 4.95, 6.66, 7.88, 7.69, 5.24, 5.8, 5.58, 5.99 |
| 64 | 8.57, 9.34, 8.92, 8.97, 8.22, 11.77, 8.61, 9.98, 9.13 |
| 96 | 15.25, 12.75, 12.48, 12.9, 13.7, 15.06, 14.85, 14.22, 15.08, 12.22 |
| 128 | 19.57, 17.9, 17.94, 18.89 |
| 160 | 20.73, 23.94, 22.72, 20.55, 20.31 |
| 192 | 26.77 |
| 224 | 29.58 |
| 256 | 32.26, 32.39 |
| 352 | 45.48, 46.91 |
Further optimization begs the following questions:

  • Should we care about tests longer than one second? E.g. if not, we could keep only a handful of buckets under one second and leave everything above one second unsorted.
  • Aren't we overfitting our optimization procedure to Psalm alone? It might be better to verify our approach against a larger dataset.
  • Is this really the bottleneck we should care about? (No doubt this was a fun exercise.)

Member

@theofidry theofidry left a comment


A few nitpicks; looks good to me otherwise.

@theofidry
Member

I would also love us to add a benchmark to profile the mutations -> mutant process, but I don't think it should hold up this PR either.

@sanmai
Member Author

sanmai commented Mar 20, 2020

I'd love to test it with the new benchmarks, but this isn't particularly easy because we test only so many mutations, not the whole range. I'll look into it a bit later anyway.

@sanmai sanmai marked this pull request as ready for review March 20, 2020 03:05
@maks-rafalko maks-rafalko added this to the 0.16.0 milestone Mar 20, 2020
Member

@maks-rafalko maks-rafalko left a comment


Interesting PR. I can't say I understood every single line here (which makes me think it will probably be hard for the average contributor to support this code without digging into the theory), but it seems like a very smart improvement.

  1. Does Psalm have many mutations with >15 tests that cover the mutated line?
  2. Is this diagram for Psalm?
  3. What is the performance win in % for Psalm?

@sanmai
Member Author

sanmai commented Mar 23, 2020

  1. 40% of Psalm's mutations have more than 15 tests. This is the graph.
  2. This graph is for Big O for quicksort and bucket sort.
  3. I cannot run a direct benchmark on Psalm because Blackfire requires an impossible amount of RAM that I don't have. Our standard benchmarks show no particular improvement, but no regression either. This might be because of insufficient depth.

The third point is the major one for me. Even if local benchmarks show an improvement, the global improvement might be below the noise threshold. I'd like to see the numbers, and to show them, but it appears to be hard for one reason or another.

sanmai and others added 3 commits March 23, 2020 17:47
@sanmai sanmai changed the base branch from master to 0.16 March 23, 2020 09:13
@sanmai sanmai merged commit c538af7 into infection:0.16 Mar 24, 2020
@sanmai sanmai deleted the pr/2020-03/test-timings branch March 24, 2020 00:13