Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(R) Numbers a not highlighted in multiple circumstances #3194

Closed
klmr opened this issue May 17, 2021 · 4 comments · Fixed by #3195
Closed

(R) Numbers a not highlighted in multiple circumstances #3194

klmr opened this issue May 17, 2021 · 4 comments · Fixed by #3195
Labels
bug help welcome Could use help from community language

Comments

@klmr
Copy link
Contributor

klmr commented May 17, 2021

Describe the issue

  1. Numbers aren’t highlighted at the very start of the document
  2. Numbers aren’t highlighted after  <-  (and possibly after other constructs)
  3. -> is declared as illegal (to distinguish it from Haskell) despite being legal in R; this might be an OK heuristic, but the documentation makes it sound as if illegal is more than just a gentle nudge in the weighting, and this wouldn’t be appropriate.

Which language seems to have the issue?

R

Are you using highlight or highlightAuto?

highlight

Sample Code to Reproduce

1
a <- 1

Expected behavior

<span class="hljs-number">1</span>
a &lt;- <span class="hljs-number">1</span>

I’ve got a PR to fix these issues and also implements operators and punctuation highlighting at the same time. But, alas, it now fails to highlight numbers directly after operators/punctuation, because the rule for numbers eats the preceding character. I thought we had a solution for this that didn’t require look-behind, but that solution was removed in 0e9fb35 (ostensibly this shouldn’t have caused a behavioural change, so it’s possible that the previous version already had the errors described above).

Here’s the patch I used to add operators (which fixes problem (2)) and allow numbers at the beginning of the document, but this now causes numbers directly after operators to fail highlighting.
diff --git a/src/languages/r.js b/src/languages/r.js
index a4348678..5cb90f3e 100644
--- a/src/languages/r.js
+++ b/src/languages/r.js
@@ -144,32 +140,42 @@ export default function(hljs) {
           {
             // Special case: only hexadecimal binary powers can contain fractions.
             match: [
-              /[^a-zA-Z0-9._]/, // not part of an identifier
+              /[^a-zA-Z0-9._]|^/, // not part of an identifier
               /0[xX][0-9a-fA-F]+\.[0-9a-fA-F]*[pP][+-]?\d+i?/
             ]
           },
           {
             match: [
-              /[^a-zA-Z0-9._]/, // not part of an identifier
+              /[^a-zA-Z0-9._]|^/, // not part of an identifier
               /0[xX][0-9a-fA-F]+(?:[pP][+-]?\d+)?[Li]?/
             ]
           },
           {
             match: [
-              /[^a-zA-Z0-9._]/, // not part of an identifier
+              /[^a-zA-Z0-9._]|^/, // not part of an identifier
               /(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?[Li]?/
             ]
           }
         ]
       },
       {
-        // infix operator
-        begin: '%',
-        end: '%'
+        className: 'operator',
+        variants: [
+          {
+            // custom infix operator
+            begin: '%',
+            end: '%'
+          },
+          {
+            // Base operators minus replacement functions and punctuation,
+            // plus aliases.
+            match: /[=!<>:]=|\|\||&&|:::?|<-|<<-|[-+*\/?!$&|:<=>@^~]|->|->>|\*\*/
+          }
+        ]
       },
-      // relevance boost for assignment
       {
-        begin: regex.concat(SIMPLE_IDENT, "\\s+<-\\s+")
+        className: 'punctuation',
+        match: /\[\[|[(){}[\]]/
       },
       {
         // escaped identifier
@klmr klmr added bug help welcome Could use help from community language labels May 17, 2021
@joshgoebel
Copy link
Member

joshgoebel commented May 17, 2021

I thought we had a solution for this that didn’t require look-behind, but that solution was removed in 0e9fb35

beforeMatch was look-ahead also, it also would have consumed the prior character and you'd have the exactly same problem. We have no true look-behind solution (outside of callbacks which we prefer not to use).

You might try two variants of number using multi-match (I'm not sure how ugly it will be):

variants:
[
 { match: [
    operator_re
    regex.either(number_types)
   ],
  scope: { 1: "operator", 2: "number" }
 },
 { match: [
    not_identifier
    regex.either(number_types)
   ]},

]

IE, combining the operator into the number matcher.

@joshgoebel
Copy link
Member

-> is declared as illegal (to distinguish it from Haskell) despite being legal in R;

That would probably be incorrect then, though sometimes context matters... like perhaps it's invalid in a certain scope, but valid in others...

@klmr
Copy link
Contributor Author

klmr commented May 18, 2021

OK, I think I’ve got it. PR coming in 3…2…1…

@joshgoebel
Copy link
Member

Talk about anti-climatic. :)

klmr added a commit to klmr/highlight.js that referenced this issue May 18, 2021
This change adds highlighting for operators and punctuation, and fixes
the issues described in highlightjs#3194.
joshgoebel pushed a commit that referenced this issue May 19, 2021
This change adds highlighting for operators and punctuation, and fixes
the issues described in #3194.

* Give R a relevance boost from arrow-assign
* Make `<-` less of a signal boost for R
* Rebalance relevance of common syntactic constructs
* Fix Vala having too much relevance for `^#` (meta/comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug help welcome Could use help from community language
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants