New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ClassMapGenerator: stabilize the heredoc/nowdoc stripping #10072
Changes from all commits
b66b23a
40bd4b0
c44be99
f6c446b
3f79e59
6ab1b6a
d8054d1
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -19,8 +19,14 @@ class FailHeredocWhitespace | |
} | ||
WHITESPACE . <<< MARKERINTEXT | ||
In PHP < 7.3, the docblock marker could occur in the text as long as it did not occur at the very start of the line. | ||
class FailHeredocMarkerInText | ||
{ | ||
} | ||
But, what are you blind McFly, it's there. How else do you explain that wreck out there? Doc, Doc. Oh, no. You're alive. Bullet proof vest, how did you know, I never got a chance to tell you. About all that talk about screwing up future events, the space time continuum. Okay, alright, I'll prove it to you. | ||
MARKERINTEXT | ||
. MARKERINTEXT | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just FYI @jrfnl I had to add a dot here (could've been anything really) because the test the way it was before was invalid PHP syntax, so I don't think it was testing any real case. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @Seldaek Except that it was and that test has now been broken. As per the first line of the heredoc:
The test, as it was originally, will lint as valid in PHP < 7.3, but as a parse error in PHP 7.3. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Okay, so I presume that stripping heredocs containing a marker in PHP < 7.3 is now broken again ? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh 🤦🏻♂️ sorry. I do wonder though if it makes sense to test for this, given it'll break on all supported PHP versions.. Is it really reasonable to have an expectation that we handle such code correctly? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. And yes, it is broken as we do not check for the end delimiter being followed by some meaningful character anymore (i.e. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @jrfnl sorry just to make sure before I can make another release, do you agree with my assessment above? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @Seldaek Thanks for asking for my opinion, but to answer your question: I stand by the solution I originally pulled here. Part of mastering regex is knowing when it's not the right solution and the fact that supporting both pre-PHP 7.3 as well as post-PHP 7.3 syntax correctly was neigh impossible (yes, I ran into that too) was a good enough reason for me to look at an alternative solution. Believe me when I say, I understand enjoying the challenge of a good regex puzzle, but in this case, regex is the problem instead of the solution. Happy to tweak the original solution some more (like adding the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK thanks for the feedback. Looking at your original fix here again though, it seems it also suffers from the same pitfalls as the regex version did when it was checking for the heredoc marker in text. i.e. 40bd4b0#diff-1180d0c164d5a8cb3ecb242c7e607ee86bd3a6a78e7c7c38d59ec317dc6eff44R323 has to lookahead for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, to support the "marker in text" case, you will always need a look-ahead. The
Well, for open source software: yes, nearly all will support PHP cross-version. For proprietary software: not so much and there is still a lot of legacy software out there. All the same, it's still an edge case. |
||
class FailHeredocMarkerInText2 | ||
{ | ||
} | ||
Look at my driver's license, expires 1987. Look at my birthday, for crying out load I haven't even been born yet. And, look at this picture, my brother, my sister, and me. Look at the sweatshirt, Doc, class of 1984. Why do you keep following me around? Hey beat it, spook, this don't concern you. | ||
MARKERINTEXT . <<<"DOUBLEQUOTES" | ||
class FailHeredocDoubleQuotes | ||
|
@@ -89,4 +95,55 @@ public function test_simple_string() | |
{ | ||
return 'class FailSimpleString {}'; | ||
} | ||
|
||
public function test_unicode_heredoc() | ||
{ | ||
return array(1, 2, <<<öéçив必 | ||
class FailUnicode | ||
{ | ||
} | ||
öéçив必, 3, 4); | ||
} | ||
|
||
public function test_wrapped_in_curly_brackets() | ||
{ | ||
return ${<<<FOO | ||
class FailCurlyBrackets | ||
{ | ||
} | ||
FOO}; | ||
} | ||
|
||
public function test_wrapped_in_angle_brackets() | ||
{ | ||
return [<<<FOO | ||
class FailAngleBrackets | ||
{ | ||
} | ||
FOO]; | ||
} | ||
} | ||
|
||
// Issue #10067. | ||
abstract class First { | ||
public function heredocDuplicateMarker(): void { | ||
echo <<<DUPLICATE_MARKER | ||
|
||
DUPLICATE_MARKER; | ||
} | ||
} | ||
|
||
abstract class Second extends First { | ||
public function heredocDuplicateMarker(): void { | ||
echo <<<DUPLICATE_MARKER | ||
|
||
DUPLICATE_MARKER; | ||
} | ||
} | ||
|
||
abstract class Third extends First { | ||
public function heredocMarkersOnlyWhitespaceBetween(): void { | ||
echo <<<DUPLICATE_MARKER | ||
DUPLICATE_MARKER; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All regexes must be combined in one, otherwise starting part of heredoc in a string will replace things very wrongly.
Ex:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Look this code has worked fine for 10 years now on normal code, barring the one bug we recently discovered.. There may be ways to abuse it for sure as it's not lexing completely, but in the worst case is means your classes don't get found. I don't think it's worth adding more complexity to it to fix hypotheticals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comments, simple strings and heredocs/nowdocs should really be replaced at once, it is easy and the only correct way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK then let's wrap this PR up, and once it's merged feel free to give that a shot, if you can make it a reasonable looking solution I'm not strictly against it, but IMO it's not strictly needed in practice, no matter how correct in theory.