ClassMapGenerator: stabilize the heredoc/nowdoc stripping #10072
Changes from 6 commits
b66b23a
40bd4b0
c44be99
f6c446b
3f79e59
6ab1b6a
d8054d1
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -246,7 +246,22 @@ private static function findClasses($path) | |
} | ||
|
||
// strip heredocs/nowdocs | ||
$contents = preg_replace('{<<<[ \t]*([\'"]?)(\w+)\\1(?:\r\n|\n|\r)(?:.*(?=[\r\n]+[ \t]*\\2))[\r\n]+[ \t]*\\2(?=\s*[;,.)])}s', 'null', $contents); | ||
$contents = preg_replace('{ | ||
# opening heredoc/nowdoc delimiter (word-chars) | ||
<<<[ \t]*([\'"]?)(\w+)\\1 | ||
|
||
# needs to be followed by a newline | ||
(?:\r\n|\n|\r) | ||
# the meat of it, matching line by line until end delimiter | ||
(?: | ||
# a valid line is optional white-space (possessive match) not followed by the end delimiter, then anything goes for the rest of the line | ||
[\t ]*+(?!\\2 [\t \r\n]*[;,.)])[^\r\n]* | ||
# end of line(s) | ||
[\r\n]+ | ||
|
||
)* | ||
# end delimiter | ||
[\t ]* \\2 (?=[\t \r\n]*[;,.)]) | ||
https://3v4l.org/i2YXM
Ah yes now that we do consume the content lines correctly it is probably not needed anymore, thanks.
Ok removing the whole thing fails some tests, but with \b it works, so I'm hopeful this is a good enough solution.
|
||
}x', 'null', $contents); | ||
|
||
// strip strings | ||
$contents = preg_replace('{"[^"\\\\]*+(\\\\.[^"\\\\]*+)*+"|\'[^\'\\\\]*+(\\\\.[^\'\\\\]*+)*+\'}s', 'null', $contents); | ||
// strip leading non-php code if needed | ||
|
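To see how the rewritten pattern behaves, here is a minimal sketch that wraps it in a standalone function. The function name `stripHeredocs` and the sample input are hypothetical and not part of the PR. The pattern is copied from the diff as shown, so it still uses the `[;,.)]` lookahead rather than the `\b` variant mentioned in the review thread.

```php
<?php
// Hypothetical wrapper around the pattern from the diff; not Composer's actual method.
function stripHeredocs(string $contents): string
{
    return preg_replace('{
        # opening heredoc/nowdoc delimiter (word-chars)
        <<<[ \t]*([\'"]?)(\w+)\\1
        # needs to be followed by a newline
        (?:\r\n|\n|\r)
        # body: consume one line at a time until the end delimiter
        (?:
            # leading white-space is matched possessively, so the engine cannot
            # backtrack into it to sneak past the end-delimiter check
            [\t ]*+(?!\\2 [\t \r\n]*[;,.)])[^\r\n]*
            # end of line(s)
            [\r\n]+
        )*
        # end delimiter
        [\t ]* \\2 (?=[\t \r\n]*[;,.)])
    }x', 'null', $contents);
}

// Sample input (an assumption for illustration): a heredoc whose body looks
// like a class declaration, closed by an indented delimiter.
$code = "<?php\n\$text = <<<EOT\nclass Fake {}\n  EOT;\nclass Real {}\n";

echo stripHeredocs($code);
```

With this input the whole heredoc, including the `class Fake {}` line, collapses to `null`, so only `class Real {}` is left for the later class-detection step.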
All the regexes must be combined into one; otherwise the opening part of a heredoc inside a string will cause badly wrong replacements.
Ex:
Look, this code has worked fine for 10 years now on normal code, barring the one bug we recently discovered. There may be ways to abuse it for sure, as it's not lexing completely, but in the worst case it means your classes don't get found. I don't think it's worth adding more complexity to it to fix hypotheticals.
Comments, simple strings and heredocs/nowdocs should really be replaced at once. It is easy and the only correct way.
OK then let's wrap this PR up, and once it's merged feel free to give that a shot, if you can make it a reasonable looking solution I'm not strictly against it, but IMO it's not strictly needed in practice, no matter how correct in theory.
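The single-pass idea raised in this thread could be sketched roughly as follows. This is not code from the PR: the function name `stripLiteralsAndComments`, the sample input, and the exact alternation are assumptions made for illustration. The heredoc branch assumes the `\b` end-delimiter check mentioned in the review. The sketch skips `#[` so that PHP 8 attributes are not treated as comments, and it ignores edge cases such as `?>` ending a line comment.

```php
<?php
// Hypothetical single-pass stripper: the leftmost construct wins, so a string
// that merely contains "<<<EOT" is consumed as a string, not as a heredoc.
function stripLiteralsAndComments(string $contents): string
{
    $pattern = '{
        # heredoc/nowdoc, consumed line by line
        <<<[ \t]*([\'"]?)(\w+)\\1(?:\r\n|\n|\r)
        (?:[\t ]*+(?!\\2 \b)[^\r\n]*[\r\n]+)*
        [\t ]*\\2 \b
        # double-quoted string
      | "[^"\\\\]*+(?:\\\\.[^"\\\\]*+)*+"
        # single-quoted string
      | \'[^\'\\\\]*+(?:\\\\.[^\'\\\\]*+)*+\'
        # line comment, but not a PHP 8 attribute
      | (?://|\#(?!\[))[^\r\n]*
        # block comment
      | /\*.*?\*/
    }xs';

    return preg_replace_callback($pattern, static function (array $m): string {
        // Comments become a single space; literals become null.
        return ($m[0][0] === '/' || $m[0][0] === '#') ? ' ' : 'null';
    }, $contents);
}

// Sample input (an assumption): a heredoc opener and closer hidden inside
// two ordinary strings, with a real class declaration in between.
$code = implode("\n", [
    '<?php',
    '$open = "<<<EOT',
    '";',
    'class Foo {}',
    '$close = "',
    'EOT;";',
    '',
]);

echo stripLiteralsAndComments($code);
```

On this contrived input a heredoc-first pass would start matching at the `<<<EOT` inside the first string and swallow `class Foo {}` along with it, whereas the single alternation replaces both strings with `null` and leaves the class declaration in place. Whether such input occurs often enough in real code to justify the extra complexity is the open question in the discussion above.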