(Ruby) Character literal notation not supported for some cases #2950

Closed
Hirse opened this issue Jan 6, 2021 · 5 comments · Fixed by #2969
Labels: bug, good first issue, help welcome, language

Comments

@Hirse
Contributor

Hirse commented Jan 6, 2021

Describe the issue
Only some cases of Character Literals seem to be supported.
Support is missing for the following groups of cases (in decreasing importance):

  • Single slash character (?/): breaks the following lines because the slash is treated as the start of a regex
  • Escaped backslash character (?\\)
  • Non-ASCII character (?あ)
  • Unicode using curly-bracket notation (?\u{1AF9})
  • Control and Meta characters (?\C-a)

[Screenshot: Ruby Character Literals - Actual]

Which language seems to have the issue?
ruby

Are you using highlight or highlightAuto?
highlight

Sample Code to Reproduce

c = ?a       #=> "a"
c = ?abc     #=> SyntaxError
c = ?\n      #=> "\n"
c = ?\s      #=> " "
c = ?\\      #=> "\\"
c = ?\u{41}  #=> "A"
c = ?\C-a    #=> "\x01"
c = ?\M-a    #=> "\xE1"
c = ?\M-\C-a #=> "\x81"
c = ?\C-\M-a #=> "\x81", same as above
c = ?あ      #=> "あ"


c = ?/          #=> "/"
c = ?\123       # octal bit pattern, where nnn is 1-3 octal digits ([0-7])
c = ?\xA1       # hexadecimal bit pattern, where nn is 1-2 hexadecimal digits ([0-9a-fA-F])
c = ?\uAF09     # Unicode character, where nnnn is exactly 4 hexadecimal digits ([0-9a-fA-F])
c = ?\cx        # control character, where x is an ASCII printable character
c = ?\c\M-x     # meta control character, where x is an ASCII printable character
c = ?\c?        # delete, ASCII 7Fh (DEL)
c = ?\C-?       # delete, ASCII 7Fh (DEL)

Expected behavior
[Screenshot: Ruby Character Literals - Expected]

Additional context
Seen in this StackExchange/CodeGolf answer: https://codegolf.stackexchange.com/a/217367/25026

->(n,g=->c,d{(1..n).map{|i|" "*(n-i)+d+" "*2*(n+i-1)+c}},l=g[?/,e=?\\].reverse){[" "*n+?_*n*2,g[e,?/],l[0..-2],l[-1].sub(/ +(?=\/)/,?_*n*2)]}


@Hirse added the bug, help welcome, and language labels Jan 6, 2021
@joshgoebel
Member

joshgoebel commented Jan 6, 2021

Good find. How did you get the expected behavior shot? :) Is there a PR forthcoming? :)

I think all of these look fixable at first glance.

@Hirse
Contributor Author

Hirse commented Jan 6, 2021

Agreed, most should be a simple regex change, though I don't quite understand why the single unicode char (あ) is not supported currently.

I might look into a PR later, but for the expected image I manually hacked the html. :D

@joshgoebel
Member

joshgoebel commented Jan 6, 2021

I don't quite understand why the single unicode char (あ) is not supported currently.

It's not in the regex?

        begin: /\B\?(\\\d{1,3}|\\x[A-Fa-f0-9]{1,2}|\\u[A-Fa-f0-9]{4}|\\?\S)\b/

I am not sure that \S works with fancy UTF-8 stuff.

@Hirse
Contributor Author

Hirse commented Jan 6, 2021

\S seems to match unicode:
https://regexr.com/5jodu

@joshgoebel
Member

Ah, but that character does not form a word boundary so \b will abort the match.
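To make the \b point concrete, here is a small check in plain JavaScript, the engine highlight.js grammars run on. The pattern is the one quoted earlier in this thread; the helper name `matchCurrent` and the sample strings are illustrative assumptions, not part of highlight.js. In JavaScript, \b is defined in terms of \w, which is ASCII-only by default, so there is no word boundary between あ, / or \ and a following space or end of line.

```javascript
// The `begin` pattern quoted earlier in this thread, copied verbatim.
const CURRENT_CHAR_LITERAL =
  /\B\?(\\\d{1,3}|\\x[A-Fa-f0-9]{1,2}|\\u[A-Fa-f0-9]{4}|\\?\S)\b/;

// Hypothetical helper: returns the matched literal text, or null if nothing matches.
function matchCurrent(rubySource) {
  const m = CURRENT_CHAR_LITERAL.exec(rubySource);
  return m ? m[0] : null;
}

// \S accepts the character, but \w (which \b is built on) does not.
console.log(/\S/.test("あ")); // true
console.log(/\w/.test("あ")); // false

// String.raw keeps the Ruby backslashes exactly as written.
const samples = [
  String.raw`c = ?a`,  // word character: matches
  String.raw`c = ?\n`, // escape ending in a word character: matches
  "c = ?あ",           // non-word character: the trailing \b fails
  "c = ?/",            // same problem
  String.raw`c = ?\\`, // same problem
];

for (const line of samples) {
  console.log(line, "=>", matchCurrent(line));
}
```

The first two samples match; the last three return null, which lines up with the unsupported cases listed in the report.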

If we're going to fix this we need variants, not one big regex. :)
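As a rough illustration of what "variants" could look like, here is a sketch that splits the single pattern into one `begin` per family of literals and drops the trailing \b. It is an assumption for discussion, not the code merged in #2969: the object only mimics the shape of a highlight.js mode with a `variants` array, and `CHAR_LITERAL_MODE`, `matchVariant`, the `className` value, and the exact patterns are all hypothetical.

```javascript
// Sketch only: an assumed set of variants, not the patterns merged upstream.
// Each entry covers one family of Ruby character literals.
const CHAR_LITERAL_MODE = {
  className: "string", // assumed scope name, for illustration
  variants: [
    { begin: /\B\?\\[0-7]{1,3}/ },            // octal:   ?\123
    { begin: /\B\?\\x[A-Fa-f0-9]{1,2}/ },     // hex:     ?\xA1
    { begin: /\B\?\\u\{[A-Fa-f0-9]{1,6}\}/ }, // unicode: ?\u{41}
    { begin: /\B\?\\u[A-Fa-f0-9]{4}/ },       // unicode: ?\uAF09
    // control and meta prefixes, longest alternatives first
    { begin: /\B\?(\\M-\\C-|\\M-\\c|\\c\\M-|\\C-\\M-|\\M-|\\C-|\\c)[\x20-\x7e]/ },
    { begin: /\B\?\\?\S/ },                   // any other single character
  ],
};

// Hypothetical stand-in for the highlighter: join the variants into one
// alternation so that, at a given position, the first variant that fits wins.
// The "u" flag lets \S consume a whole code point.
const COMBINED = new RegExp(
  CHAR_LITERAL_MODE.variants.map((v) => v.begin.source).join("|"),
  "u"
);

// Returns the matched literal text, or null if nothing matches.
function matchVariant(rubySource) {
  const m = COMBINED.exec(rubySource);
  return m ? m[0] : null;
}

for (const line of ["c = ?/", "c = ?あ", String.raw`c = ?\u{41}`, String.raw`c = ?\M-\C-a`]) {
  console.log(line, "=>", matchVariant(line));
}
```

One trade-off of dropping the trailing anchor: `?abc`, which Ruby rejects with a SyntaxError, would still have `?a` highlighted. Whether that matters for a highlighter is a judgment call.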

@joshgoebel added the good first issue label Jan 7, 2021
joshgoebel pushed a commit that referenced this issue Jan 22, 2021
* Fixed character literals + Added tests
* Added test cases to handle \u{nnnn ...} where nnnn is 1-6 hex digits