Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

highlight line should handle case where size == 0 #785

Open
rubys opened this issue Jan 13, 2021 · 4 comments
Open

highlight line should handle case where size == 0 #785

rubys opened this issue Jan 13, 2021 · 4 comments

Comments

@rubys
Copy link
Contributor

rubys commented Jan 13, 2021

Reproduction instructions:

irb(main):003:0> Parser::CurrentRuby.parse('foo++')
(string):1:6: error: unexpected token $end
(string):1: foo++
(string):1:
Traceback (most recent call last):
...

Note the blank line where there should be either a caret or a tilde indicating the spot of the error.

The code in question is in Parser::Diagnostic::render_line.

I'd submit a pull request, but it isn't clear to me whether the code should be putting out a caret or a tilde in this case. I have verified that column_range.begin_pos is the spot immediately after the end of the line in this case.

@iliabylich
Copy link
Collaborator

$end is a special token that has begin = end = EOF, this is why it has zero source range.

Some other languages also don't render it:

$ node -e 'foo**'
[eval]:1
foo**


SyntaxError: Unexpected end of input
    at new Script (vm.js:88:7)
    at createScript (vm.js:263:10)
    at Object.runInThisContext (vm.js:311:10)
    at Object.<anonymous> ([eval]-wrapper:10:26)
    at Module._compile (internal/modules/cjs/loader.js:1138:30)
    at evalScript (internal/process/execution.js:94:25)
    at internal/main/eval_string.js:23:3

$ perl -e '++'
syntax error at -e line 1, at EOF
Execution of -e aborted due to compilation errors.

but some languages do:

$ python -c 'foo++'
  File "<string>", line 1
    foo++
        ^
SyntaxError: invalid syntax

$ echo "class ++" > test.java && javac test.java
test.java:1: error: <identifier> expected
class ++
     ^
test.java:1: error: reached end of file while parsing
class ++
        ^
2 errors

I think it's ok to change it. A few options:

  1. we could emit $eof token with a EOF...EOF+1 range, but I can't estimate implications of this change. Potentially it can break libraries that depend on parser.
  2. we could check for error_token_id == 0 in Parser::Base#on_error and if it's zero add +1 to location.end, but I guess it's not a part of the Racc stable API, i.e. it can change anytime.
  3. we could emit a different error message for $eof token, but it seems to be a breaking change.

Option 1 seems to be the best if nobody relies on this behavior. @mbj @palkan do you remember any places in your code using it? For rubocop we'll know it from CI 😄

@rubys Feel free to send a PR (if not I'll do it myself in a few days). lexer.rl should return [ false, [ '$eof'.freeze, range(eof, eof + 1) ] ] in #advance method, also it can break some unit tests, so expectations should be updated too.

@rubys
Copy link
Contributor Author

rubys commented Jan 13, 2021

I actually don't believe that any change to the tokens currently emitted is needed. At the moment, render_line is called with the following values:

range = #<Parser::Source::Range (string) 5...5>
range_end = false

Given this information, all that needs to be decided is:

  • what character to use. either caret and tilde seem to be reasonable choices
  • what position to place this character. For this input, either column 5 or column 6 (counting the first character as one) seem to be reasonable choices. I'd actually lean towards column 6 given that the current output contains the following line:
(string):1:6: error: unexpected token $end

@palkan
Copy link
Contributor

palkan commented Jan 14, 2021

do you remember any places in your code using it?

Nope 🙂

@iliabylich
Copy link
Collaborator

@rubys Yes, but then how can you know when this custom rule should be applied? If we apply it to all zero-length ranges we can easily miss some bugs for other diagnostic messages. Same for EOF...EOF ranges in general. This additional logic should be applied only to unexpected token $eof error.

what character to use. either caret and tilde seem to be reasonable choices

I'd use caret.

what position to place this character. For this input, either column 5 or column 6 (counting the first character as one) seem to be reasonable choices. I'd actually lean towards column 6 given that the current output contains the following line:

Agree, so for foo++ we should get

foo++
     ^

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants