[parser] Disallow numeric separator in unicode scape sequences #10468

Merged
merged 7 commits into from Sep 23, 2019

Conversation

ivandevp
Contributor

Q                         A
Fixed Issues?             Fixes #10460
Patch: Bug Fix?           Yes
Major: Breaking Change?   No
Minor: New Feature?       No
Tests Added + Pass?       Yes
Any Dependency Changes?   No
License                   MIT

@ivandevp ivandevp changed the title from "[parser] Disallow numeric separator in unicode scape sequences (#10460)" to "[parser] Disallow numeric separator in unicode scape sequences" Sep 19, 2019
@babel-bot
Collaborator

babel-bot commented Sep 19, 2019

Build successful! You can test your changes in the REPL here: https://babeljs.io/repl/build/11657/

@existentialism existentialism added the PR: Bug Fix 🐛, Spec: Numeric Separator, and pkg: parser labels Sep 19, 2019
@@ -897,6 +897,19 @@ export default class Tokenizer extends LocationParser {

let total = 0;

// Called from readHexChar
Contributor

Besides readCodePoint, readHexChar could also be called from readRadixNumber. Did you check whether 0xf_f can still be parsed successfully?

We could add an extra allowNumSeparator: boolean to readHexChar

readHexChar(len: number, throwOnInvalid: boolean, allowNumSeparator: boolean): number | null

and set allowNumSeparator = true when calling from readCodePoint.

Contributor Author

Good idea.


if (unicode.includes("_")) {
const numSeparatorPos = this.input.indexOf("_");
this.raise(
Contributor

@JLHwung JLHwung Sep 19, 2019

I don't think we should throw an error like this if the user has not enabled the numericSeparator parser plugin. Consider moving the error logic inside the this.hasPlugin("numericSeparator") branch.
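
The gating suggested here can be read as a small decision table. The following is only a sketch of that reading, not the parser's code: classifyUnderscore and its three result strings are hypothetical names, and "generic-invalid-token" stands in for whatever error the tokenizer would have raised anyway when the plugin is off.

```javascript
// Hypothetical helper (not part of Babel): decides what should happen when
// an underscore is met while reading digits.
//   hasNumericSeparatorPlugin: result of this.hasPlugin("numericSeparator")
//   allowNumSeparator: whether the current context (number literal vs.
//                      escape sequence) permits separators
function classifyUnderscore(hasNumericSeparatorPlugin, allowNumSeparator) {
  if (!hasNumericSeparatorPlugin) {
    // Plugin disabled: no separator-specific error, the underscore is just
    // an unexpected character handled by the existing error path.
    return "generic-invalid-token";
  }
  // Plugin enabled: either skip the separator or report the dedicated error.
  return allowNumSeparator ? "skip-separator" : "separator-not-allowed-error";
}
```

With this shape, the separator-specific message can only ever be produced for users who opted into the plugin.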

Contributor Author

You're right.

const unicode = this.input.slice(this.state.pos, this.state.pos + len);

if (unicode.includes("_")) {
const numSeparatorPos = this.input.indexOf("_");
Member

This doesn't work when there is another _ somewhere in the input before the number, e.g.

_;

"\x1_2"

It would be better to move this check inside the for loop right after, where there already are the other errors about numeric separators.
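
To make the failure concrete, here is a small standalone sketch using the example input from this comment. The variable names (escapeStart, wrongPos, rightPos) are illustrative and do not come from the patch; escapeStart plays the role of this.state.pos at the first hex digit.

```javascript
// The reviewer's example: an identifier `_` appears before the string.
const input = '_;\n"\\x1_2"';

// Position of the first hex digit after `\x` (stand-in for this.state.pos).
const escapeStart = input.indexOf("\\x") + 2;
const len = 2;

// Same slice as in the patch: the two characters expected to be hex digits.
const unicode = input.slice(escapeStart, escapeStart + len);

// Searching the whole input reports the unrelated identifier at index 0.
const wrongPos = input.indexOf("_");

// Searching from the escape's start reports the separator itself.
const rightPos = input.indexOf("_", escapeStart);

console.log({ unicode, wrongPos, rightPos });
```

Checking each character inside the existing digit loop, as suggested, avoids the search altogether because the loop already knows the current position.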

Contributor Author

Got it.

@@ -917,6 +921,13 @@ export default class Tokenizer extends LocationParser {
this.raise(this.state.pos, "Invalid or unexpected token");
}

if (allowNumSeparator) {
Member

Shouldn't it only throw if it is not allowed?

Contributor Author

Oh... My bad. I'll fix it right now.

Member

@nicolo-ribaudo nicolo-ribaudo left a comment

Thanks!

if (!allowNumSeparator) {
this.raise(
this.state.pos,
"Numeric separators are not allowed inside unicode escape sequences",
Contributor

@JLHwung JLHwung Sep 23, 2019

Numeric separators are also not allowed in hex escape sequences:

var c = \x1_0
Suggested change
- "Numeric separators are not allowed inside unicode escape sequences",
+ "Numeric separators are not allowed inside unicode escape sequences or hex escape sequences",

Could you also add a test case on hex escape sequences?

Contributor Author

Sure.

@@ -1250,9 +1262,13 @@ export default class Tokenizer extends LocationParser {

// Used to read character escape sequences ('\x', '\u').

- readHexChar(len: number, throwOnInvalid: boolean): number | null {
+ readHexChar(
Contributor

readHexChar is only used for \x and \u, so we don't have to expose allowNumSeparator to its caller.

Contributor Author

🤔 Good catch. So now allowNumSeparator is exposed to readInt's callers instead, and it is set to false when readInt is called from readHexChar.
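
The resulting arrangement can be sketched roughly as follows. This is a simplified illustration rather than the merged code: MiniTokenizer is a hypothetical class, the method signatures are reduced (no throwOnInvalid or forceLen), the numericSeparator plugin is assumed to be enabled, and validation of separator placement (leading, trailing, doubled) is left out. The error text is the one suggested in the review above.

```javascript
// Illustrative mini-tokenizer; method names mirror the discussion, bodies are hypothetical.
class MiniTokenizer {
  constructor(input) {
    this.input = input;
    this.pos = 0;
  }

  // Reads digits in `radix`. `len` is the exact digit count required,
  // or null to read as many digits as possible.
  readInt(radix, len, allowNumSeparator = true) {
    const start = this.pos;
    let total = 0;
    let count = 0;
    while (this.pos < this.input.length && (len === null || count < len)) {
      const ch = this.input[this.pos];
      if (ch === "_") {
        if (!allowNumSeparator) {
          throw new SyntaxError(
            "Numeric separators are not allowed inside unicode escape sequences or hex escape sequences"
          );
        }
        this.pos++; // skip the separator, it contributes no digit
        continue;
      }
      const digit = parseInt(ch, radix);
      if (Number.isNaN(digit)) break; // not a digit in this radix
      total = total * radix + digit;
      count++;
      this.pos++;
    }
    if (count === 0 || (len !== null && count !== len)) {
      this.pos = start; // nothing usable was read
      return null;
    }
    return total;
  }

  // Number literals such as 0xf_f: separators are allowed.
  readRadixNumber(radix) {
    return this.readInt(radix, null, true);
  }

  // Escape sequences (\x, \u): the flag is fixed to false here,
  // so callers of readHexChar never see it.
  readHexChar(len) {
    return this.readInt(16, len, false);
  }
}
```

Under these assumptions, new MiniTokenizer("f_f").readRadixNumber(16) reads the digits after a 0x prefix and yields 255, while new MiniTokenizer("1_0").readHexChar(2) throws the separator error.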

@lock lock bot added the outdated label (a closed issue/PR that is archived due to age; recommended to make a new issue) Dec 23, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Dec 23, 2019
Labels
outdated, pkg: parser, PR: Spec Compliance 👓, Spec: Numeric Separator
Development

Successfully merging this pull request may close these issues.

Numeric separators should be disallowed in unicode escape sequences
5 participants