Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix quotes in single-line comments breaking parameters #3497

Closed
wants to merge 5 commits into from

Conversation

TPHRyan
Copy link

@TPHRyan TPHRyan commented Mar 27, 2019

An odd number of quotes in comments would cause getUnquotedStatementFragments to start returning the inside of strings, rather than the outside.
Ignore comment lines to solve this problem.

Q A
Type bug
BC Break no
Fixed issues Could not find existing issue

Summary

This PR ensures that comments are ignored when breaking a query into fragments. Without this PR, comments with quotes (including apostrophes in words) can arbitrarily break parameter substitution.
While ideal production code does not include comments, test code can, and will confuse people if something breaks because they're present.

When trying to test a query locally, I got an SQLSTATE error indicating there was a mismatch between positional parameters and the parameters provided.
Inspecting the query just before execution, I noted that not all of the parameters had been substituted, it seemed that the ones that were later in the query had not.
After stepping through, I found the culprit to be getUnquotedSegmentFragments, later on in the query it was returning short fragments such as "variable_foo" etc. (I recognised these as names of json fields that I was accessing)
I realised that the pattern was such that it broke my comments into fragments, and then encountered the final quote in a comment and continued searching for its pair, then treating every following opening quote as a closing one!

This is fixable by adding syntax for comments to the beginning of the RegEx, so they are skipped if found.
I also needed to add the "-" character to the list of banned characters preceding a quote, so that it would be interrupted by a comment. This doesn't seem to prevent important edge cases, but I'm happy if one is found to prove me wrong.

Copy link
Member

@Ocramius Ocramius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This kind of change requires test additions to back it

@Ocramius
Copy link
Member

Please also check https://travis-ci.org/doctrine/dbal/jobs/511822586

An odd number of quotes in comments would cause getUnquotedStatementFragments to start returning the inside of strings, rather than the outside.
Ignore comment lines to solve this problem.
@TPHRyan TPHRyan force-pushed the ignore-comments-in-fragments branch from 10605be to 186035e Compare March 27, 2019 03:44
Parameters would previously break when you have an odd number of single quotes in a comment and then sandwich parameters between strings. This test ensures this functionality is fixed.
@TPHRyan
Copy link
Author

TPHRyan commented Mar 27, 2019

Apologies, I could not find contributer guidelines from this repo.

I have added a regression test for this bugfix and fixed formatting issues.

@Ocramius
Copy link
Member

Apologies, I could not find contributer guidelines from this repo.

No worries, that's what CI is for on CS :-)

Ocramius
Ocramius previously approved these changes Mar 27, 2019
@Ocramius Ocramius requested a review from morozov March 27, 2019 13:52
@Ocramius Ocramius added this to the 2.10.0 milestone Mar 27, 2019
It is possible a comment '--' in a string might break the parser (reverse the parity of following quotes) so this case tests for that.
@TPHRyan
Copy link
Author

TPHRyan commented Mar 29, 2019

All requested changes have now been made.

@morozov
Copy link
Member

morozov commented Mar 29, 2019

@TPHRyan the build is unhappy. Please fix the failures.

@morozov morozov requested a review from Majkl578 March 29, 2019 21:13
@@ -250,11 +251,12 @@ private static function getUnquotedStatementFragments($statement)
self::ESCAPED_DOUBLE_QUOTED_TEXT . '|' .
self::ESCAPED_BACKTICK_QUOTED_TEXT . '|' .
self::ESCAPED_BRACKET_QUOTED_TEXT;
$expression = sprintf('/((.+(?i:ARRAY)\\[.+\\])|([^\'"`\\[]+))(?:%s)?/s', $literal);

$expression = sprintf('/(?:[\s]*--.*?\n)|((.+(?i:ARRAY)\\[.+\\])|([^-\'"`\\[]+))(?:%s)?/s', $literal);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an issue (#3472) caused by making this regular expression more complex (#3277). I don't know of a proper solution but it seems like making it even more complex may make things even worse.

Base automatically changed from master to 4.0.x January 22, 2021 07:44
@morozov
Copy link
Member

morozov commented Jul 25, 2021

As far as the parser is concerned, this is fixed in #4397.

@morozov morozov closed this Jul 25, 2021
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 26, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants