Count characters instead of bytes #529

Merged
merged 3 commits into sqlparser-rs:main on Jul 7, 2022

michael-2956
Contributor

For queries such as SELECT "なにか" FROM Y WHERE "なにか" = 'test; that use non-ASCII characters in column names, computing positions as byte counts (which is what len() returns) reports wrong column numbers.

For instance, the query above produces the error Unterminated string literal at Line: 1, Column 47, while the correct column number is 35. The gap of 12 comes from the six kana characters, each of which occupies 3 bytes in UTF-8 but is a single character.

The fix is to use .chars().count() instead of len(). With that replacement the reported column is correct.
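
To illustrate the difference, here is a minimal standalone sketch. It is not the tokenizer code from this PR: the helper names column_by_bytes and column_by_chars are made up for the example, and the prefix is simply the query text up to and including the opening quote of the unterminated literal.

```rust
/// Hypothetical helper: column of the last character of `prefix`,
/// measured in UTF-8 bytes (what `str::len` returns).
fn column_by_bytes(prefix: &str) -> usize {
    prefix.len()
}

/// Hypothetical helper: the same column measured in characters
/// (Unicode scalar values), which is what an editor would show here.
fn column_by_chars(prefix: &str) -> usize {
    prefix.chars().count()
}

fn main() {
    // Query text up to and including the quote that opens the
    // unterminated string literal.
    let prefix = "SELECT \"なにか\" FROM Y WHERE \"なにか\" = '";

    // Each kana is 3 bytes in UTF-8, so the byte count overshoots by 12.
    println!("by bytes: {}", column_by_bytes(prefix)); // 47
    println!("by chars: {}", column_by_chars(prefix)); // 35
}
```

Note that chars() iterates over Unicode scalar values, not grapheme clusters, so text with combining marks could still be counted differently from what a user sees on screen.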

Collaborator

@alamb alamb left a comment

Thank you for the contribution @michael-2956 !

Can you please add a test to this PR, so that this fix doesn't get broken by a future change without us knowing?

@coveralls

coveralls commented Jun 28, 2022

Pull Request Test Coverage Report for Build 2629916400

  • 12 of 12 (100.0%) changed or added relevant lines in 1 file are covered.
  • 412 unchanged lines in 5 files lost coverage.
  • Overall coverage increased (+0.1%) to 89.89%

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| tests/sqlparser_mysql.rs | 1 | 99.8% |
| src/ast/query.rs | 17 | 86.61% |
| tests/sqlparser_common.rs | 27 | 97.15% |
| src/tokenizer.rs | 71 | 89.27% |
| src/parser.rs | 296 | 83.39% |

Totals Coverage Status
Change from base Build 2535757386: 0.1%
Covered Lines: 8873
Relevant Lines: 9871

💛 - Coveralls

@michael-2956 michael-2956 requested a review from alamb July 7, 2022 14:00
Collaborator

@alamb alamb left a comment

LGTM -- thank you @michael-2956

@alamb alamb merged commit c2ccc80 into sqlparser-rs:main Jul 7, 2022
mobuchowski pushed a commit to mobuchowski/sqlparser-rs that referenced this pull request Aug 3, 2022
* Count characters instead of bytes

* cargo fmt

* add tests to PR sqlparser-rs#529