feat: Support escaped string literals (PostgreSQL) #502

Merged
merged 9 commits into from May 25, 2022

Conversation

Contributor

@ovr ovr commented May 18, 2022

Hello!

This is a draft that implements PostgreSQL's escaped string syntax.

https://www.postgresql.org/docs/8.3/sql-syntax-lexical.html#SQL-SYNTAX-STRINGS

PostgreSQL also accepts "escape" string constants, which are an extension to the SQL standard. An escape string constant is specified by writing the letter E (upper or lower case) just before the opening single quote, e.g. E'foo'. (When continuing an escape string constant across lines, write E only before the first opening quote.) Within an escape string, a backslash character (\) begins a C-like backslash escape sequence, in which the combination of backslash and following character(s) represents a special byte value. \b is a backspace, \f is a form feed, \n is a newline, \r is a carriage return, \t is a tab.
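
As a rough illustration of the escape rules quoted above, here is a minimal sketch of decoding the body of an E'...' literal. The function name `unescape_body` is hypothetical and is not taken from this PR. The sketch assumes the surrounding quotes have already been removed, handles only the five escapes listed plus "take any other character literally", and ignores the octal, hex, and Unicode forms that PostgreSQL also supports.

```rust
/// Hypothetical helper (not from the PR): decodes the body of an E'...'
/// literal whose surrounding quotes have already been stripped.
fn unescape_body(body: &str) -> String {
    let mut out = String::with_capacity(body.len());
    let mut chars = body.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        // A backslash starts an escape sequence; look at the next character.
        match chars.next() {
            Some('b') => out.push('\u{0008}'), // backspace
            Some('f') => out.push('\u{000C}'), // form feed
            Some('n') => out.push('\n'),
            Some('r') => out.push('\r'),
            Some('t') => out.push('\t'),
            // Any other character (including \ and ') is taken literally.
            Some(other) => out.push(other),
            // Assumption: a dangling trailing backslash is kept as is.
            None => out.push('\\'),
        }
    }
    out
}
```

With this sketch, the body `s1 \n s1` decodes to a string containing a real newline, while `s2 \\n s2` decodes to a backslash followed by the letter n.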

Thanks

Signed-off-by: Dmitry Patsura <talk@dmtry.me>
Contributor Author

ovr commented May 18, 2022

@alamb can you take a look to verify that this is the correct approach? Thanks.

coveralls commented May 18, 2022

Pull Request Test Coverage Report for Build 2379093604

  • 69 of 97 (71.13%) changed or added relevant lines in 4 files are covered.
  • 601 unchanged lines in 6 files lost coverage.
  • Overall coverage decreased (-0.8%) to 89.639%

Changes Missing Coverage        Covered Lines   Changed/Added Lines   %
tests/sqlparser_postgres.rs     24              25                    96.0%
src/parser.rs                   4               7                     57.14%
src/ast/value.rs                11              19                    57.89%
src/tokenizer.rs                30              46                    65.22%

Files with Coverage Reduction   New Missed Lines   %
tests/sqlparser_redshift.rs     1                  98.11%
tests/sqlparser_snowflake.rs    2                  96.43%
tests/sqlparser_postgres.rs     16                 97.71%
tests/sqlparser_common.rs       69                 97.01%
src/ast/mod.rs                  156                78.21%
src/parser.rs                   357                82.97%

Totals
Change from base Build 2328175823: -0.8%
Covered Lines: 8392
Relevant Lines: 9362

💛 - Coveralls

Collaborator

alamb commented May 19, 2022 via email

@ovr ovr marked this pull request as ready for review May 20, 2022 14:33
Collaborator

@alamb alamb left a comment

Thank you @ovr -- I like this PR; very nice.

The only comment I think needs to be addressed prior to merge is the one on EscapeEscapedStringLiteral.

src/ast/value.rs Outdated

impl<'a> fmt::Display for EscapeEscapedStringLiteral<'a> {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        let mut is_escaped = true;
Collaborator

I don't see is_escaped ever set to false -- I would expect it would start with is_escaped = true and then is_escaped would be set to false after each character was written

Collaborator

How would this work with: First \n second \\ third \n fourth \

Contributor Author

> I don't see is_escaped ever set to false -- I would expect it would start with is_escaped = true and then is_escaped would be set to false after each character was written

I will remove it; it's useless.

It's not a correct value, because the last \ should be escaped.
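
To make the round-trip concern in this thread concrete, here is a minimal sketch of a Display wrapper that writes a decoded string back out as the body of an escaped literal. `EscapedBody` is a hypothetical type, not the `EscapeEscapedStringLiteral` from the PR, and the set of characters it escapes is an assumption chosen for illustration. Every backslash is doubled on output, so a value ending in a single backslash is rendered with an escaped backslash rather than one that would swallow the closing quote.

```rust
use std::fmt;

/// Hypothetical wrapper (not the PR's type): formats a decoded string as
/// the body of an E'...' literal, re-escaping special characters.
struct EscapedBody<'a>(&'a str);

impl<'a> fmt::Display for EscapedBody<'a> {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        for c in self.0.chars() {
            match c {
                // Quote and backslash must be escaped so the literal
                // cannot terminate early or escape its closing quote.
                '\'' => f.write_str(r"\'")?,
                '\\' => f.write_str(r"\\")?,
                '\n' => f.write_str(r"\n")?,
                '\t' => f.write_str(r"\t")?,
                '\r' => f.write_str(r"\r")?,
                other => write!(f, "{}", other)?,
            }
        }
        Ok(())
    }
}
```

A caller could then write something like `format!("E'{}'", EscapedBody(value))` to produce the full literal.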

src/parser.rs Outdated
@@ -496,6 +496,10 @@ impl<'a> Parser<'a> {
                    expr: Box::new(self.parse_subexpr(Self::PLUS_MINUS_PREC)?),
                })
            }
            Token::EscapedStringLiteral(_) if dialect_of!(self is PostgreSqlDialect) => {
Collaborator

💯 for conditionalizing on postgres dialect

src/tokenizer.rs Outdated

        let mut s = String::new();
        chars.next(); // consume the opening quote

        // slash escaping
Collaborator

FWIW, this example from Stack Overflow looks like it might be a nice way to avoid macro overhead (and thus code bloat): https://stackoverflow.com/questions/58551211/how-do-i-interpret-escaped-characters-in-a-string

Contributor Author

@ovr ovr May 24, 2022

It's not the same, because this function tries to find and escape the string within the query. It looks for a single quote, which may be escaped or not escaped (the latter marks the end of the string).

In our case, strings are wrapped in single quotes, i.e. e'str'.
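
The difference described here, finding the closing quote while skipping escaped ones, can be sketched roughly as follows. `read_escaped_body` is a hypothetical helper and not the tokenizer code from this PR. It assumes the iterator is positioned on the opening quote (after the E prefix), returns the raw body with escape sequences left undecoded, and does not handle the doubled-quote form `''`.

```rust
/// Hypothetical helper (not the PR's tokenizer code): reads the raw body of
/// an escaped string literal, starting at the opening quote.
fn read_escaped_body(chars: &mut impl Iterator<Item = char>) -> Result<String, String> {
    if chars.next() != Some('\'') {
        return Err("expected opening quote".to_string());
    }
    let mut raw = String::new();
    while let Some(c) = chars.next() {
        match c {
            // An unescaped quote ends the literal.
            '\'' => return Ok(raw),
            // A backslash escapes the next character, so a following
            // quote does not end the literal.
            '\\' => {
                raw.push('\\');
                match chars.next() {
                    Some(escaped) => raw.push(escaped),
                    None => break,
                }
            }
            other => raw.push(other),
        }
    }
    // Input ran out before an unescaped closing quote was found.
    Err("unterminated escaped string literal".to_string())
}
```

Under these assumptions, an input such as `'Foo\'` is rejected as unterminated, because its only candidate closing quote is escaped.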


#[test]
fn parse_escaped_literal_string() {
    let sql = r#"SELECT E's1 \n s1', E's2 \\n s2', E's3 \\\n s3', E's4 \\\\n s4', E'\''"#;
Collaborator

I also recommend some negative tests like ' Foo\'

Contributor Author

It's not a valid value because the last quote is escaped, so there is no single quote left to close the string expression. f46b07e

ovr and others added 6 commits May 23, 2022 13:47
Collaborator

@alamb alamb left a comment

Looks good -- thank you @ovr !

@alamb alamb merged commit 2c0886d into sqlparser-rs:main May 25, 2022