Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(JavaScript) Number in class or variable name breaks highlight #3921

Open
jdbruxelles opened this issue Nov 10, 2023 · 6 comments
Open

(JavaScript) Number in class or variable name breaks highlight #3921

jdbruxelles opened this issue Nov 10, 2023 · 6 comments
Labels
bug help welcome Could use help from community language

Comments

@jdbruxelles
Copy link

Describe the issue

I have a JS lib for i18n named si18n.js, and its class name is Si18n. So, when I highlight this keyword, only the part before "n" is correctly highlighted. In JavaScript, class name can contain numbers, but cannot start with a number. The parser must consider the part after numbers in class and variable name. It's already OK for functions name.

Any function or class name starting with uppercase letter AND contains number(s) AND ends with letter(s) are concerned.

Which language seems to have the issue?

JavaScript without auto-detection

Are you using highlight or highlightAuto?

I use highlight.

Sample Code to Reproduce

import Si18n from "si18n"; // Not OK
let loc = new Si18n();     // Not OK
let Si18n = "test";        // Not OK
let si18n = "test";        // OK
let SI18N = "test";        // OK
let SI18 = "test";         // OK
let si18ns92 = "test";     // OK
let SI_18_N = "test";      // OK
function si18n() {}        // OK
function Si18n() {}        // OK

Demo 1 - Demo 2

Expected behavior

Because class or variable name can contain numbers, a name like Si18n should be entirely highlighted. See the example above highlighted by GitHub.

@jdbruxelles jdbruxelles added bug help welcome Could use help from community language labels Nov 10, 2023
@joshgoebel
Copy link
Member

Any function or class name starting with uppercase letter AND contains number(s) AND ends with letter(s) are concerned.

I'm not sure what this is a definition of exactly - or if it's merely your preference. I just quickly googled both CamelCase and PythonCase and didn't find any real mention of numbers at all. Our highlighting rules are based on JS classes being idiomatically PythonCase. I believe we allow numbers after the full name, such as HvacEncoder32 or Base64.

I think you're suggesting that ANY lower-case letter could be replaced with a number?

Could you point me to any popular articles that discuss this being a valid understanding of PythonCase or it being idiomatic JS? Obviously it's a VALID class name, but that doesn't make it idiomatic.

@jdbruxelles
Copy link
Author

jdbruxelles commented Nov 11, 2023

or if it's merely your preference.

No

Maybe I'm not being clear with my words, I've provided two links (with the same content) for the demo.

What I'm trying to say is that you can put numbers in the middle of a string of letters because it's possible and valid (in JS). Admittedly, it's not often done, but it's valid. GitHub, VS Code (Monaco), Codemirror, etc. manage this very well. Why not highlight.js?

@joshgoebel
Copy link
Member

joshgoebel commented Nov 11, 2023

GitHub, VS Code (Monaco), Codemirror, etc.

We're unlike those other highlighters in that we support auto-detection - which create the possibility of one grammar messing with the auto-detection off another. Wider rules increase the risk of false positives across language boundaries. A wider rule is more likely to auto-detect the wrong language. Right now all grammars that highlight class names use the same simple rule (that doesn't work for internal numbers)...

To balance auto-detect we'd need to look at all grammars that have a PascalCase class matcher and see if numbers were ALSO valid in those grammars... making this change everywhere such names are valid would be necessary, not just fixing JS/TS.

Otherwise, the plan is to deprecate auto-detect with v12 - and then making changes to a grammar no longer needs to take all the other grammars into account.

@jdbruxelles
Copy link
Author

Thank you for your explanation. I understand the concerns about making changes that could affect other languages and potentially lead to false positives in auto-detection. I understand the complexities involved in implementing this change globally, especially considering the auto-detection feature.

If making a broader change to the highlighting rules is not viable or might introduce unintended issues across grammars, I respect the project's decision. It's good to know that you're considering these factors carefully (not because of preferences 😉).

I'm looking forward to seeing how the project evolves in the future, whether that involves addressing this issue or considering alternatives to improve highlighting in JavaScript. Thanks for your time and attention to this matter.

@joshgoebel
Copy link
Member

I'm open to leaving this open, just if we fixed it today (pre v12) someone wouldn't need to have looked at each of the grammars that has a smilier rule... In fact I'm starting to think perhaps we should just have a few camelCase, PythonCase, etc rules and add them as built-in modes - then every language could consume them...

Then we'd only need to be worrying about exception cases. Of course that begs the question of whether the "default" rule would includes numbers or not... which would likely require some small amount of looking into the other grammars involved.

@jdbruxelles
Copy link
Author

I appreciate your willingness to keep the question open, so I'm reopening it. I look forward to seeing how all of this evolves. Thanks again for your consideration and efforts.

@jdbruxelles jdbruxelles reopened this Nov 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug help welcome Could use help from community language
Projects
None yet
Development

No branches or pull requests

2 participants