New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rewrite php file cleaning step to be less regex intensive and support extreme cases better #10107
Conversation
2334421
to
f813089
Compare
@Seldaek Happy to take a look, but may be a little before I get to it. |
@mvorisek btw this also addresses the problem you raised in the other PR as it does all the cleaning in one pass it can't mistake a heredoc inside a string or the like. |
I got an email notification here, but not seeing the mention here now re Hack autoloading, we don't actually use Composer's autoloading, but I think we depend on composer not erroring if it sees .php files that contain syntax errors (especially if they start with There's currently two common approaches to autoloading in hack:
We hope to entirely move to the built-in autoloader soon and stop using a composer plugin, but usability is a bit too much of a concern at the moment as it requires external services (watchman) to be configured. |
@fredemmott I deleted my comment because I realized what the code I linked to was about and it had nothing to do with HHVM :) Thanks for the clarifications tho, that should be no problem tho with the new parser I imagine. |
Are the backtrack limits really an issue? If they can be fixed, one combined |
I see - segfault with regex introduced in #10072 with PHP 5.6 only. Long time ago, I have seen some issues with possessive quantifiers (like #10106 (comment) is interesting, but I am not sure if high step count always mean a slower performance. |
The approach in master also silently fails on PHP 7.1 and PHP 7.2 so this new approach is needed and welcome. |
Please link (or open a new) PHP issue for it. PHP guys might know what is causing the problems. |
Also, I wonder how much the test file is compressible as the raw size is ~1 MB which might slow git clones a little bit. Replacing the long texts with some repeated paragraphs might hugely decreate it's compressed size. |
I've talked to @Seldaek about it, it's the same file that's causing the problem in this PR. |
I mean what exactly is causing php regex engine to fail, eg. how to tweak the regex, not the input data. |
… extreme cases better, fixes composer#10106
f813089
to
408ccb8
Compare
I removed the file, it didn't really add a ton of value given the new solution does not have a regex, it can't really fail due to the input size. |
Fixes #10106
The only downside here as far as I can tell is a slight slowdown, I see a
composer dump-autoload -o --profile
currently taking around 1.3x as long as it did before. The upside is that it's passing the whole test suite while the old code doesn't.This means it's fine as is IMO, but if someone wants to poke at it with a profiler to try to extract some last bits of perf, that'd be great! I already did what I could to remove obvious unnecessary allocation and co but didn't have time to profile.
/cc @jrfnl this last issue was finally the one drop too many and I got rid of the whole thing, happy to get your review here if you have time