[Performance] Add files to coverage whitelist instead of the whole directories when --filter or --git-diff-filter are used #1543

Merged
merged 3 commits into from Aug 5, 2021

@maks-rafalko maks-rafalko commented Aug 4, 2021

Problem

Collecting coverage data is an expensive operation.

Currently, when we build the phpunit.xml config for the initial test run, we add the following coverage whitelist (if one doesn't already exist):

<phpunit>
    <coverage>
        <include>
            <directory>src/</directory>
            <directory>example/</directory>
        </include>
    </coverage>
</phpunit>

This means that all files in the src/ and example/ directories are processed by the coverage driver (be it xdebug, pcov or phpdbg).

Collecting code coverage is costly, which is why this change tries to reduce the number of files added to the coverage whitelist by replacing the <directory> tag with N <file> tags where possible.

One such case is when we use the --filter=src/path/to/File1.php,src/path/to/File2.php option, or the --git-diff-filter=AM option, which internally resolves to --filter=X,Y,Z.

When the filter is used, we 100% know that we will mutate only particular files, so we need the code coverage only for them.

After this change, the following command

infection --filter=src/path/to/File1.php,src/path/to/File2.php

creates the following phpunit.xml for initial tests run that collects coverage:

<phpunit>
    <coverage>
        <include>
            <file>src/path/to/File1.php</file>
            <file>src/path/to/File2.php</file>
        </include>
    </coverage>
</phpunit>
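
The directory-versus-file switch described above can be sketched as follows. This is a Python illustration only, not Infection's PHP implementation: the function name `build_coverage_include` is hypothetical, and it assumes the filter value is a comma-separated list of complete file paths.

```python
import xml.etree.ElementTree as ET


def build_coverage_include(source_dirs, filter_value=""):
    """Build a <coverage> element (hypothetical helper, for illustration).

    With a non-empty filter, one <file> tag is emitted per filtered path;
    otherwise one <directory> tag is emitted per source directory.
    """
    coverage = ET.Element("coverage")
    include = ET.SubElement(coverage, "include")

    # Assumption: the filter is a comma-separated list of full file paths.
    filtered_files = [p.strip() for p in filter_value.split(",") if p.strip()]

    if filtered_files:
        for path in filtered_files:
            ET.SubElement(include, "file").text = path
    else:
        for directory in source_dirs:
            ET.SubElement(include, "directory").text = directory

    return coverage


element = build_coverage_include(
    ["src/", "example/"],
    "src/path/to/File1.php,src/path/to/File2.php",
)
ET.indent(element, space="    ")  # requires Python 3.9+
print(ET.tostring(element, encoding="unicode"))
```

Calling the helper without a filter falls back to the directory-based whitelist, which mirrors the behaviour before this change.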

This will dramatically decrease the time needed for collecting coverage data since we reduce the number of processed files.

Some numbers

Tests (without Infection) with src folder in coverage.include:

............................................                  2423 / 2423 (100%)

Time: 03:11.112, Memory: 587.62 MB

OK (2423 tests, 9861 assertions)
Generating code coverage report ... done [00:12.564]

Tests (without Infection) with 5 files in coverage.include:

............................................                  2423 / 2423 (100%)

Time: 01:12.771, Memory: 34.00 MB

OK (2423 tests, 9861 assertions)
Generating code coverage report ... done [00:00.041]

Difference:

- Time: 03:11.112, Memory: 587.62 MB
+ Time: 01:12.771, Memory: 34.00 MB

So it saves about 2 of the 3 minutes.
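
For reference, the saving can be worked out from the two PHPUnit time stamps above. The snippet is a small Python sketch; the helper name `to_seconds` is made up for this illustration.

```python
def to_seconds(stamp):
    """Convert a PHPUnit 'MM:SS.mmm' time string to seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + float(seconds)


before = to_seconds("03:11.112")  # src folder in coverage.include
after = to_seconds("01:12.771")   # 5 files in coverage.include
saved = before - after

print(f"saved {saved:.1f}s ({saved / before:.0%} of the original run)")
# prints: saved 118.3s (62% of the original run)
print(f"memory ratio: {587.62 / 34.00:.1f}x")
# prints: memory ratio: 17.3x
```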

Another case is Infection run with the --filter option on a real project.

master vs this PR:

infection -j4 --only-covered --filter=src/Recorder/RecorderCapabilities.php --ignore-msi-with-no-mutations --show-mutations  --log-verbosity=all --only-covering-test-cases

- Time: 8s. Memory: 28.00MB
+ Time: 2s. Memory: 26.00MB

It is 4x faster for the filtered file set.


Notes:

This PR will benefit developers who use Infection with the --filter or --git-diff-filter options to run MT only for changed/added files (which I personally highly recommend, as it allows using Infection on a project regardless of its size).

@maks-rafalko maks-rafalko added DX Developer Experience Performance labels Aug 4, 2021
@@ -70,7 +70,6 @@ public function collectFiles(
$finder->notPath($excludeDirectory);
}

// Generator here to make sure these files used only once
yield from $finder;
Member Author

@sanmai FYI: I didn't find any test or case that would be broken by this change. I think we added the generator just in case, to catch the situation where the Finder is used twice. But in this PR, using it twice is exactly what is needed.

Member

Might as well change the signature to @return array<...>
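
To illustrate the point of this thread: a generator can be consumed only once, so a file list that has to be read twice must be materialized first. The sketch below is a Python analogue with a hypothetical `collect_files` function, not the project's PHP code. PHP generators are likewise single-pass, although the exact behaviour on a second iteration differs between the two languages.

```python
def collect_files(paths):
    # A generator: items are produced lazily and can be consumed only once.
    yield from paths


files = collect_files(["src/A.php", "src/B.php"])
first_pass = list(files)
second_pass = list(files)  # the generator is already exhausted here

# Materializing the result once allows any number of passes.
materialized = list(collect_files(["src/A.php", "src/B.php"]))
first_again = list(materialized)
second_again = list(materialized)

print(first_pass, second_pass)
print(first_again == second_again)
```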

@@ -48,7 +48,7 @@
final class PathReplacer
{
private Filesystem $filesystem;
private ?string $phpUnitConfigDir = null;
Member Author

Useless default value, since this property is initialized in the constructor.

Member

@sidz sidz left a comment

LGTM 👍

Member

@sanmai sanmai left a comment

That's just great!

@maks-rafalko maks-rafalko added this to the 0.25.0 milestone Aug 5, 2021
@maks-rafalko maks-rafalko merged commit 193ab76 into master Aug 5, 2021
@maks-rafalko maks-rafalko deleted the feature/filtered-coverage-files branch August 5, 2021 09:09
@maks-rafalko
Member Author

Thank you @sidz and @sanmai
