Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parse exponent literal as number #768

Merged
merged 1 commit into from Dec 28, 2022

Conversation

Jefffrey
Copy link
Contributor

Resolve part of #610

Allows parsing of exponent literals as numbers, provided the dialect does not allow an identifier to being with a number (Hive dialect). Will need more thinking on how to handle that specific case.

@@ -144,6 +144,7 @@ pub fn all_dialects() -> TestedDialects {
Box::new(RedshiftSqlDialect {}),
Box::new(MySqlDialect {}),
Box::new(BigQueryDialect {}),
Box::new(SQLiteDialect {}),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@@ -617,6 +618,36 @@ impl<'a> Tokenizer<'a> {
return Ok(Some(Token::Period));
}

// Parse exponent as number
if chars.peek() == Some(&'e') || chars.peek() == Some(&'E') {
let mut char_clone = chars.peekable.clone();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this copy needed?

Given chars is already peekable I don't see why it can't be used directly

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I needed a way to peek more than just the next char, since is only valid exponent if e followed by optional sign and an actual number. Found easiest way was to simply clone the iter and use that, and if found not to be an exponent and safely discard it and continue regular behaviour with original iter.

Token::Comma,
Token::Whitespace(Whitespace::Space),
Token::Number(String::from("1e-10"), false),
Token::make_word("a", None),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found this very strange that a new token is formed without whitespace after a number. I expected that this is a token error but this implementation agrees with postgres 🤯

postgres=# select 12e-10a;
      a       
--------------
 0.0000000012
(1 row)

postgres=# select 12e-10 a;
      a       
--------------
 0.0000000012
(1 row)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise

postgres=# select 1e-10-10;
   ?column?    
---------------
 -9.9999999999
(1 row)

postgres=# select 1e-10 -10;
   ?column?    
---------------
 -9.9999999999
(1 row)

🤯

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this behaviour is part of what bit me when trying to implement for Hive dialect 😅

@alamb
Copy link
Collaborator

alamb commented Dec 28, 2022

Thank you @Jefffrey

@coveralls
Copy link

Pull Request Test Coverage Report for Build 3757065788

  • 65 of 69 (94.2%) changed or added relevant lines in 2 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.04%) to 86.373%

Changes Missing Coverage Covered Lines Changed/Added Lines %
src/tokenizer.rs 43 45 95.56%
tests/sqlparser_common.rs 22 24 91.67%
Totals Coverage Status
Change from base Build 3720056662: 0.04%
Covered Lines: 12835
Relevant Lines: 14860

💛 - Coveralls

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants