Ripper: incompatibility for uppercase UTF-8 constant names in aliases #3457

Open
noahgibbs opened this issue Feb 15, 2024 · 1 comment

@noahgibbs

In aliases and many other cases, CRuby's Ripper emits different lexer tokens depending on the symbol's name. For instance, a name starting with an uppercase letter emits :@const instead of :@ident.

TruffleRuby does this correctly for 7-bit ASCII constant names like "A", but not for non-ASCII Unicode uppercase constant names like "Ñ".

CRuby:

irb(main):001:0> require "ripper"
=> false
irb(main):002:0> Ripper.sexp_raw("alias :foo :Ñ")
=>
[:program,
 [:stmts_add,
  [:stmts_new],
  [:alias, [:symbol_literal, [:symbol, [:@ident, "foo", [1, 7]]]], [:symbol_literal, [:symbol, [:@const, "Ñ", [1, 12]]]]]]]

TruffleRuby:

irb(main):001:0> require "ripper"
=> false
irb(main):002:0> Ripper.sexp_raw("alias :foo :Ñ")
=>
[:program,
 [:stmts_add,
  [:stmts_new],
  [:alias, [:symbol_literal, [:symbol, [:@ident, "foo", [1, 7]]]], [:symbol_literal, [:symbol, [:@ident, "Ñ", [1, 12]]]]]]]
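
The comparison above can be scripted so the same check runs on both implementations. This is only a minimal sketch: `alias_target_event` is a hypothetical helper name (not part of Ripper), and it assumes the `Ripper.sexp_raw` tree has exactly the shape shown in the outputs above. The non-ASCII name is written as an escape so the snippet does not depend on the source encoding.

```ruby
require "ripper"

# Hypothetical helper (not part of Ripper): returns the scanner event
# that Ripper reports for the aliased (second) name in `alias :foo :NAME`.
# Assumes the tree shape shown in the transcripts above:
#   [:program, [:stmts_add, [:stmts_new], [:alias, new_sym, old_sym]]]
def alias_target_event(name)
  sexp = Ripper.sexp_raw("alias :foo :#{name}")
  alias_node = sexp[1][2]   # [:alias, new_sym, old_sym]
  old_sym = alias_node[2]   # [:symbol_literal, [:symbol, [event, text, pos]]]
  old_sym[1][1][0]          # the event, e.g. :@ident or :@const
end

# "\u00D1" is U+00D1 (uppercase N with tilde), the name used in the report.
["bar", "A", "\u00D1"].each do |name|
  puts "#{name.inspect} -> #{alias_target_event(name).inspect}"
end
```

Going by the report, the last name is the one where the two implementations are expected to disagree; the first two should match.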
@noahgibbs changed the title from "Ripper: incompatibility for uppercase UTF-8 letters" to "Ripper: incompatibility for uppercase UTF-8 constant names in aliases" on Feb 15, 2024
@eregon
Member

eregon commented Feb 16, 2024

Thanks for the report.
We use the same C code as CRuby for Ripper.
So this is probably a bug in id_type/rb_str_symname_type/rb_enc_symname_type, sym_type, or something similar.

Possibly related to #3407, which is also about identifier types, but probably not, because rb_enc_symname_type seems to be implemented in C (code from CRuby).

In general, the Ripper C extension uses far too many internals and is quite slow, with tons of upcalls, so we'd like to get rid of it and replace it with Prism::RipperCompat :)
I think it's best not to use Ripper on TruffleRuby in the Prism test suite. If there is a difference with CRuby, it's almost surely a bug, and we'd want the same behavior as CRuby for Prism::RipperCompat.

@eregon eregon added the cexts label Feb 16, 2024