Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ClassMapGenerator: fix two bugs (long heredocs + markers in text) #10050

Merged

Conversation

jrfnl
Copy link
Contributor

@jrfnl jrfnl commented Aug 11, 2021

ClassMapGenerator: add tests for "long heredoc" bug

... to proof the existence of the bug and demonstrate the effect.

Note: in the test the backtrack limit is being lowered (and restored back to the default afterwards) to prevent the tests needing ridiculously huge test fixture files.

ClassMapGenerator: add test for "marker in text" bug

In PHP < 7.3, the heredoc/nowdoc marker was allowed to occur in the text, as long as it did not occur at the very start of the line.

This was also not handled correctly.

Ref: https://www.php.net/manual/en/migration73.incompatible.php#migration73.incompatible.core.heredoc-nowdoc

ClassMapGenerator: fix the regex

By using a look ahead assertion to match "new line - maybe whitespace - marker", the negative performance impact of the .* is significantly mitigated and backtracing will be severely limited.

This fixes the bug as reported in #10037.

The bug was discovered due to a PHP 8.1 "passing null to non-nullable" deprecation notice being thrown, but is not a PHP 8.1 bug.

In actual fact, this issue affected all PHP versions and could lead to incomplete classmaps when the code base contained files with huge heredocs/nowdocs.

The regex change (not completely) incidentally also fixes an issue with markers in a heredoc/nowdoc not being correctly handled. This bug could lead to "classes" being added to the class map which aren't actually classes.

Fixes #10037

@Seldaek Please let me know if this should go to an earlier branch. Happy to rebase.

... to proof the existence of the bug and demonstrate the effect.

Note: in the test the backtrack limit is being lowered (and restored back to the default afterwards) to prevent the tests needing ridiculously huge test fixture files.
In PHP < 7.3, the heredoc/nowdoc marker was allowed to occur in the text, as long as it did not occur at the very start of the line.

This was also not handled correctly.

Ref: https://www.php.net/manual/en/migration73.incompatible.php#migration73.incompatible.core.heredoc-nowdoc
By using a look ahead assertion to match "new line - maybe whitespace - marker", the negative performance impact of the `.*` is significantly mitigated and backtracing will be severely limited.

This fixes the bug as reported in 10037.

The bug was discovered due to a PHP 8.1 "passing null to non-nullable" deprecation notice being thrown, but is not a PHP 8.1 bug.

In actual fact, this issue affected all PHP versions and could lead to incomplete classmaps when the code base contained files with huge heredocs/nowdocs.

The regex change (not completely) incidentally also fixes an issue with markers in a heredoc/nowdoc not being correctly handled. This bug could lead to "classes" being added to the class map which aren't actually classes.

Fixes 10037

ini_set('pcre.backtrack_limit', '30000');
$result = ClassMapGenerator::createMap(__DIR__ . '/Fixtures/pcrebacktracelimit');
ini_restore('pcre.backtrack_limit');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this use a try/finally, so in case of exceptions the value gets properly restored?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this retrieve the previous value and re-set it instead of restoring the initial value, which might not be the expected one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@staabm Good question. I'd done it like this to ensure that, even if the assertion fails, restore will still happen, but yes, in case of exceptions it may not.

I'm not a fan of using try/catch in test methods. Moving it to the tearDown() would be more appropriate IMO, or even moving the method to its own test class and then having the ini changes in setUp() and tearDown().

What do you think ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this retrieve the previous value and re-set it instead of restoring the initial value, which might not be the expected one?

Considered that too (actually started with that when I was writing the test), but couldn't find any other tests doing anything with ini_set(), let alone with this configuration setting, so I figured I may as well use the PHP native restore option. Happy to change that though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO it's fine as is. Low chances of problems, and tests only, so not really worth a debate :)

@Seldaek Seldaek added this to the 2.1 milestone Aug 12, 2021
@Seldaek Seldaek added the Bug label Aug 12, 2021
@Seldaek Seldaek merged commit 4da1a2d into composer:master Aug 12, 2021
@Seldaek
Copy link
Member

Seldaek commented Aug 12, 2021

Thanks, given the rather exotic aspect of the fix, I don't think it's worth backporting to 1.x, so it's fine targetting 2.1 👍🏻

@jrfnl jrfnl deleted the feature/10037-classmapgenerator-bug-fixes branch August 12, 2021 14:42
@jrfnl
Copy link
Contributor Author

jrfnl commented Aug 12, 2021

@Seldaek Thanks for merging this. While maybe exotic, it was definitely an interesting issue to look into ;-)

@Seldaek
Copy link
Member

Seldaek commented Aug 12, 2021

Yeah the more exotic bugs are usually the more interesting ones. It's not everyday you discover age-old bugs in widely used software :)

@jrfnl
Copy link
Contributor Author

jrfnl commented Aug 12, 2021

Yup, when I looked at the history I saw this particular regex hadn't been touched since 2012... 😂 (and was introduced to move away from the Tokenizer, which makes sense to me for all sorts of reasons)

@jrfnl
Copy link
Contributor Author

jrfnl commented Aug 12, 2021

Just out of curiousity - were there ever any issues reported about classes missing from the classmap where the cause was never previously found, which might just be related to this issue ?
Or about the classmap containing non-existent classes ?

@Seldaek
Copy link
Member

Seldaek commented Aug 13, 2021

I can't remember any such issue in the recent years. Early on as we ironed out issues sure but I guess such long heredocs really are that rare.

ondrejmirtes added a commit to phpstan/phpstan-src that referenced this pull request Aug 25, 2021
mglaman pushed a commit to mglaman/phpstan-src that referenced this pull request Oct 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bug: long heredoc causes missing classes in classmap generation
3 participants