fix(clj) Fix broken/inconsistent highlighting for $, Numbers, namespaced maps, HINT mode. Add character, regex and punctuation modes. #3397

Merged
merged 13 commits into from Nov 10, 2021
10 changes: 9 additions & 1 deletion CHANGES.md
@@ -5,7 +5,15 @@ Grammars:
- fix(python) Fix recognition of numeric literals followed by keywords without whitespace (#2985) [Richard Gibson][]
- enh(swift) add SE-0290 unavailability condition (#3382) [Bradley Mackey][]
- enh(java) add `sealed` and `non-sealed` keywords (#3386) [Bradley Mackey][]
- fix(clojure) `comment` macro catches more than it should [Björn Ebbinghaus][]
- fix(clojure) Several issues with Clojure highlighting (#3397) [Björn Ebbinghaus][]
- fix(clojure) `comment` macro catches more than it should (#3395)
- fix(clojure) `$` in symbol breaks highlighting
- fix(clojure) Add complete regex for number detection
- enh(clojure) Add character mode for character literals
- fix(clojure) Inconsistent namespaced map highlighting
- enh(clojure) Add `regex` mode to regex literal
- fix(clojure) Remove inconsistent/broken highlighting for metadata
- enh(clojure) Add `punctuation` mode for commas.

[Richard Gibson]: https://github.com/gibson042
[Bradley Mackey]: https://github.com/bradleymackey
56 changes: 38 additions & 18 deletions src/languages/clojure.js
@@ -8,8 +8,8 @@ Category: lisp

/** @type LanguageFn */
export default function(hljs) {
const SYMBOLSTART = 'a-zA-Z_\\-!.?+*=<>&#\'';
const SYMBOL_RE = '[' + SYMBOLSTART + '][' + SYMBOLSTART + '0-9/;:]*';
const SYMBOLSTART = 'a-zA-Z_\\-!.?+*=<>&\'';
const SYMBOL_RE = '[#]?[' + SYMBOLSTART + '][' + SYMBOLSTART + '0-9/;:$#]*';
const globals = 'def defonce defprotocol defstruct defmulti defmethod defn- defn defmacro deftype defrecord';
const keywords = {
$pattern: SYMBOL_RE,
@@ -45,20 +45,44 @@ export default function(hljs) {
'lazy-seq spread list* str find-keyword keyword symbol gensym force rationalize'
};

const SIMPLE_NUMBER_RE = '[-+]?\\d+(\\.\\d+)?';

const SYMBOL = {
begin: SYMBOL_RE,
relevance: 0
};
const NUMBER = {
className: 'number',
begin: SIMPLE_NUMBER_RE,
relevance: 0
scope: 'number',
relevance: 0,
variants: [
{match: /[-+]?0[xX][0-9a-fA-F]+N?/}, // hexadecimal // 0x2a
{match: /[-+]?0[0-7]+N?/}, // octal // 052
{match: /[-+]?[1-9][0-9]?[rR][0-9a-zA-Z]+N?/}, // variable radix from 2 to 36 // 2r101010, 8r52, 36r16
{match: /[-+]?[0-9]+\/[0-9]+N?/}, // ratio // 1/2
{match: /[-+]?[0-9]+((\.[0-9]*([eE][+-]?[0-9]+)?M?)|([eE][+-]?[0-9]+M?|M))/}, // float // 0.42 4.2E-1M 42E1 42M
{match: /[-+]?([1-9][0-9]*|0)N?/}, // int (don't match leading 0) // 42 42N
]
};
const CHARACTER = {
scope: 'character',
variants: [
{match: /\\o[0-3]?[0-7]{1,2}/}, // Unicode Octal 0 - 377
{match: /\\u[0-9a-fA-F]{4}/}, // Unicode Hex 0000 - FFFF
{match: /\\(newline|space|tab|formfeed|backspace|return)/}, // special characters
{match: /\\\S/} // any non-whitespace char
Member:
This is such a tiny match... can we do any lookahead to qualify it? I'm imagining false positives with any language that allows \SOMETHING style tags... If we can't qualify it, I think we should make it relevance: 0.

Member:
Oh wait, it's [literal \][non space] (even broader than I first read it)... yeah, I think we should add relevance: 0.

Contributor Author:
It is a literal and can appear anywhere, so you are right about relevance: 0.
I didn't put any thought into what the relevance of this or the other modes should be. What is "typically Clojure"?

It would be a nice exercise to generate possible values for modes and match them against other languages to calculate the relevance automatically. 🤔

Contributor Author:
I added relevance: 0.

That said, I am not sure how many programming languages support the full first Unicode plane. \⠓ is a valid Clojure literal.

]
}
const REGEX = {
scope: 'regex',
begin: /#"/,
end: /"/
}
const STRING = hljs.inherit(hljs.QUOTE_STRING_MODE, {
illegal: null
});
const COMMA = {
scope: 'punctuation',
match: /,/,
relevance: 0
}
const COMMENT = hljs.COMMENT(
';',
'$',
@@ -71,15 +95,10 @@ export default function(hljs) {
begin: /\b(true|false|nil)\b/
};
const COLLECTION = {
begin: '[\\[\\{]',
begin: "\\[|(#::?" + SYMBOL_RE + ")?\\{",
end: '[\\]\\}]',
relevance: 0
};
const HINT = {
className: 'comment',
begin: '\\^' + SYMBOL_RE
};
const HINT_COL = hljs.COMMENT('\\^\\{', '\\}');
const KEY = {
className: 'symbol',
begin: '[:]{1,2}' + SYMBOL_RE
@@ -100,10 +119,11 @@ export default function(hljs) {
starts: BODY
};
const DEFAULT_CONTAINS = [
COMMA,
LIST,
CHARACTER,
REGEX,
STRING,
HINT,
HINT_COL,
COMMENT,
KEY,
COLLECTION,
@@ -138,17 +158,17 @@ export default function(hljs) {
];
BODY.contains = DEFAULT_CONTAINS;
COLLECTION.contains = DEFAULT_CONTAINS;
HINT_COL.contains = [ COLLECTION ];

return {
name: 'Clojure',
aliases: [ 'clj', 'edn' ],
illegal: /\S/,
contains: [
COMMA,
LIST,
CHARACTER,
REGEX,
STRING,
HINT,
HINT_COL,
COMMENT,
KEY,
COLLECTION,
Expand Down
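The effect of the `SYMBOL_RE` and `COLLECTION.begin` changes can be checked outside highlight.js with plain regular expressions. The sketch below is only an illustration: it assumes the first `SYMBOLSTART`/`SYMBOL_RE` pair in the diff is the removed version and the second pair is the added one (the +/- markers are not visible here), and the helper names `isWholeSymbol` and `collectionPrefix` are hypothetical, not part of the grammar.

```javascript
// Patterns copied from the clojure.js diff above.
// The "old"/"new" labels are an assumption about which lines were removed/added.
const OLD_SYMBOLSTART = 'a-zA-Z_\\-!.?+*=<>&#\'';
const OLD_SYMBOL_RE = '[' + OLD_SYMBOLSTART + '][' + OLD_SYMBOLSTART + '0-9/;:]*';

const NEW_SYMBOLSTART = 'a-zA-Z_\\-!.?+*=<>&\'';
const NEW_SYMBOL_RE = '[#]?[' + NEW_SYMBOLSTART + '][' + NEW_SYMBOLSTART + '0-9/;:$#]*';

// Hypothetical helper: true when the whole input is consumed as one symbol.
function isWholeSymbol(symbolRe, text) {
  return new RegExp('^(?:' + symbolRe + ')$').test(text);
}

// Same string concatenation as the new COLLECTION.begin in the diff.
const COLLECTION_BEGIN = new RegExp('\\[|(#::?' + NEW_SYMBOL_RE + ')?\\{');

// Hypothetical helper: the text consumed by the first COLLECTION.begin match, or null.
function collectionPrefix(text) {
  const m = COLLECTION_BEGIN.exec(text);
  return m ? m[0] : null;
}

console.log(isWholeSymbol(OLD_SYMBOL_RE, 'a$b')); // false: `$` is not in the old character class
console.log(isWholeSymbol(NEW_SYMBOL_RE, 'a$b')); // true
console.log(collectionPrefix('#:person{:first "Han"}')); // '#:person{'
```

highlight.js compiles these strings with its own flags and combines them with the other modes, so this only shows what each pattern can match in isolation.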
5 changes: 5 additions & 0 deletions test/markup/clojure/character.expect.txt
@@ -0,0 +1,5 @@
<span class="hljs-character">\A</span>
<span class="hljs-character">\a</span>
<span class="hljs-character">\formfeed</span>
<span class="hljs-character">\u00DF</span>
<span class="hljs-character">\o303</span>
5 changes: 5 additions & 0 deletions test/markup/clojure/character.txt
@@ -0,0 +1,5 @@
\A
\a
\formfeed
\u00DF
\o303
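As a rough cross-check of this fixture, the four `CHARACTER` variants from the diff can be tried in order against each literal. This is a sketch under assumptions: `matchCharacter` is a hypothetical helper, and trying the variants in listed order at the start of the input only approximates how highlight.js picks among them.

```javascript
// Variant patterns copied from the CHARACTER mode in the clojure.js diff, in the same order.
const CHARACTER_VARIANTS = [
  /\\o[0-3]?[0-7]{1,2}/,                                 // octal escape
  /\\u[0-9a-fA-F]{4}/,                                   // four-digit hex escape
  /\\(newline|space|tab|formfeed|backspace|return)/,     // named characters
  /\\\S/                                                 // backslash plus any non-whitespace char
];

// Hypothetical helper: text consumed at the start of `text` by the first
// variant that matches there, or null if none does.
function matchCharacter(text) {
  for (const re of CHARACTER_VARIANTS) {
    const m = new RegExp('^(?:' + re.source + ')').exec(text);
    if (m) return m[0];
  }
  return null;
}

for (const literal of ['\\A', '\\a', '\\formfeed', '\\u00DF', '\\o303']) {
  console.log(literal, '->', matchCharacter(literal));
}

// The fallback variant is as broad as the review thread says:
console.log(matchCharacter('\\section')); // '\s'
```

The last call shows why the reviewers asked for `relevance: 0`: a backslash followed by any non-space character matches, including the start of a `\SOMETHING` style tag from another language.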
18 changes: 9 additions & 9 deletions test/markup/clojure/deps_edn.expect.txt
@@ -1,14 +1,14 @@
{<span class="hljs-symbol">:aliases</span> {<span class="hljs-symbol">:export</span> {<span class="hljs-symbol">:exec-fn</span> stelcodes.dev-blog.generator/export},
<span class="hljs-symbol">:repl</span> {<span class="hljs-symbol">:extra-deps</span> {cider/cider-nrepl {<span class="hljs-symbol">:mvn/version</span> <span class="hljs-string">&quot;0.25.2&quot;</span>},
nrepl/nrepl {<span class="hljs-symbol">:mvn/version</span> <span class="hljs-string">&quot;0.8.3&quot;</span>}},
<span class="hljs-symbol">:extra-paths</span> [<span class="hljs-string">&quot;dev&quot;</span>],
{<span class="hljs-symbol">:aliases</span> {<span class="hljs-symbol">:export</span> {<span class="hljs-symbol">:exec-fn</span> stelcodes.dev-blog.generator/export}<span class="hljs-punctuation">,</span>
<span class="hljs-symbol">:repl</span> {<span class="hljs-symbol">:extra-deps</span> {cider/cider-nrepl {<span class="hljs-symbol">:mvn/version</span> <span class="hljs-string">&quot;0.25.2&quot;</span>}<span class="hljs-punctuation">,</span>
nrepl/nrepl {<span class="hljs-symbol">:mvn/version</span> <span class="hljs-string">&quot;0.8.3&quot;</span>}}<span class="hljs-punctuation">,</span>
<span class="hljs-symbol">:extra-paths</span> [<span class="hljs-string">&quot;dev&quot;</span>]<span class="hljs-punctuation">,</span>
<span class="hljs-symbol">:main-opts</span> [<span class="hljs-string">&quot;-m&quot;</span>
<span class="hljs-string">&quot;nrepl.cmdline&quot;</span>
<span class="hljs-string">&quot;--middleware&quot;</span>
<span class="hljs-string">&quot;[cider.nrepl/cider-middleware]&quot;</span>
<span class="hljs-string">&quot;--interactive&quot;</span>]},
<span class="hljs-symbol">:webhook</span> {<span class="hljs-symbol">:exec-fn</span> stelcodes.dev-blog.webhook/listen}},
<span class="hljs-symbol">:deps</span> {http-kit/http-kit {<span class="hljs-symbol">:mvn/version</span> <span class="hljs-string">&quot;2.5.3&quot;</span>},
org.clojure/clojure {<span class="hljs-symbol">:mvn/version</span> <span class="hljs-string">&quot;1.10.1&quot;</span>},
stasis/stasis {<span class="hljs-symbol">:mvn/version</span> <span class="hljs-string">&quot;2.5.1&quot;</span>}},
<span class="hljs-string">&quot;--interactive&quot;</span>]}<span class="hljs-punctuation">,</span>
<span class="hljs-symbol">:webhook</span> {<span class="hljs-symbol">:exec-fn</span> stelcodes.dev-blog.webhook/listen}}<span class="hljs-punctuation">,</span>
<span class="hljs-symbol">:deps</span> {http-kit/http-kit {<span class="hljs-symbol">:mvn/version</span> <span class="hljs-string">&quot;2.5.3&quot;</span>}<span class="hljs-punctuation">,</span>
org.clojure/clojure {<span class="hljs-symbol">:mvn/version</span> <span class="hljs-string">&quot;1.10.1&quot;</span>}<span class="hljs-punctuation">,</span>
stasis/stasis {<span class="hljs-symbol">:mvn/version</span> <span class="hljs-string">&quot;2.5.1&quot;</span>}}<span class="hljs-punctuation">,</span>
<span class="hljs-symbol">:paths</span> [<span class="hljs-string">&quot;src&quot;</span> <span class="hljs-string">&quot;resources&quot;</span>]}
13 changes: 9 additions & 4 deletions test/markup/clojure/globals_definition.expect.txt
@@ -5,21 +5,26 @@
<span class="hljs-comment">; function</span>
(<span class="hljs-keyword">defn</span> <span class="hljs-title">clojure-function</span> [args]
(<span class="hljs-name"><span class="hljs-built_in">let</span></span> [string <span class="hljs-string">&quot;multiline\nstring&quot;</span>
regexp #<span class="hljs-string">&quot;regexp&quot;</span>
number <span class="hljs-number">100</span>,<span class="hljs-number">000</span>
regexp <span class="hljs-regex">#&quot;regexp&quot;</span>
number <span class="hljs-number">100000</span>
booleans [<span class="hljs-literal">false</span> <span class="hljs-literal">true</span>]
keyword <span class="hljs-symbol">::the-keyword</span>]
<span class="hljs-comment">;; this is comment</span>
(<span class="hljs-name"><span class="hljs-built_in">if</span></span> <span class="hljs-literal">true</span>
(<span class="hljs-name"><span class="hljs-built_in">-&gt;&gt;</span></span>
(<span class="hljs-name"><span class="hljs-built_in">list</span></span> [vector] {<span class="hljs-symbol">:map</span> map} #{&#x27;set})))))

#:person{<span class="hljs-symbol">:first</span> <span class="hljs-string">&quot;Han&quot;</span>
<span class="hljs-symbol">:last</span> <span class="hljs-string">&quot;Solo&quot;</span>
<span class="hljs-symbol">:ship</span> #:ship{<span class="hljs-symbol">:name</span> <span class="hljs-string">&quot;Millenium Falcon&quot;</span>}}
#::{<span class="hljs-symbol">:a</span> <span class="hljs-number">1</span><span class="hljs-punctuation">,</span> <span class="hljs-symbol">:b</span> <span class="hljs-number">2</span>}

<span class="hljs-comment">; global</span>
(<span class="hljs-keyword">def</span> <span class="hljs-title">some-var</span>)
<span class="hljs-comment">; another one</span>
(<span class="hljs-keyword">def</span> <span class="hljs-title">alternative-var</span> <span class="hljs-string">&quot;132&quot;</span>)
<span class="hljs-comment">; defonce</span>
(<span class="hljs-keyword">defonce</span> ^<span class="hljs-symbol">:private</span> <span class="hljs-title">another-var</span> #<span class="hljs-string">&quot;foo&quot;</span>)
(<span class="hljs-keyword">defonce</span> ^<span class="hljs-symbol">:private</span> <span class="hljs-title">another-var</span> <span class="hljs-regex">#&quot;foo&quot;</span>)

<span class="hljs-comment">; private function</span>
(<span class="hljs-keyword">defn-</span> <span class="hljs-title">add</span> [x y] (<span class="hljs-name"><span class="hljs-built_in">+</span></span> x y))
@@ -58,4 +63,4 @@

<span class="hljs-comment">;; create a couple shapes and get their area</span>
(<span class="hljs-keyword">def</span> <span class="hljs-title">myCircle</span> (<span class="hljs-name">Circle.</span> <span class="hljs-number">10</span>))
(<span class="hljs-keyword">def</span> <span class="hljs-title">mySquare</span> (<span class="hljs-name">Square.</span> <span class="hljs-number">5</span> <span class="hljs-number">11</span>))
(<span class="hljs-keyword">def</span> <span class="hljs-title">mySquare</span> (<span class="hljs-name">Square.</span> <span class="hljs-number">5</span> <span class="hljs-number">11</span>))
7 changes: 6 additions & 1 deletion test/markup/clojure/globals_definition.txt
@@ -6,14 +6,19 @@
(defn clojure-function [args]
(let [string "multiline\nstring"
regexp #"regexp"
number 100,000
number 100000
booleans [false true]
keyword ::the-keyword]
;; this is comment
(if true
(->>
(list [vector] {:map map} #{'set})))))

#:person{:first "Han"
:last "Solo"
:ship #:ship{:name "Millenium Falcon"}}
#::{:a 1, :b 2}

; global
(def some-var)
; another one
34 changes: 0 additions & 34 deletions test/markup/clojure/hint_col.expect.txt

This file was deleted.

34 changes: 0 additions & 34 deletions test/markup/clojure/hint_col.txt

This file was deleted.

69 changes: 69 additions & 0 deletions test/markup/clojure/number.expect.txt
@@ -0,0 +1,69 @@
<span class="hljs-comment">; integer</span>
<span class="hljs-number">00</span>
<span class="hljs-number">42</span>
<span class="hljs-number">+42</span>
<span class="hljs-number">-42</span>

<span class="hljs-comment">; BigInt</span>
<span class="hljs-number">42N</span>
<span class="hljs-number">0N</span>
<span class="hljs-number">+42N</span>
<span class="hljs-number">-42N</span>

<span class="hljs-comment">; octal</span>
<span class="hljs-number">052</span>
<span class="hljs-number">00N</span>
<span class="hljs-number">+052</span>
<span class="hljs-number">-00N</span>

<span class="hljs-comment">; hex</span>
<span class="hljs-number">0x2a</span>
<span class="hljs-number">0x0N</span>
<span class="hljs-number">+0x2a</span>
<span class="hljs-number">-0x0N</span>

<span class="hljs-comment">; radix</span>
<span class="hljs-number">2r101010</span>
<span class="hljs-number">8r52</span>
<span class="hljs-number">16r2a</span>
<span class="hljs-number">36r16</span>
<span class="hljs-number">-2r101010</span>
<span class="hljs-number">+36r16</span>

<span class="hljs-comment">; radix BigInt</span>
<span class="hljs-number">2r101010N</span>
<span class="hljs-number">8r52N</span>
<span class="hljs-number">16r2aN</span>
<span class="hljs-number">36r16N</span>
<span class="hljs-number">+8r52N</span>
<span class="hljs-number">-16r2aN</span>

<span class="hljs-comment">;; ratios</span>
<span class="hljs-number">1/2</span>
<span class="hljs-number">-1/2</span>
<span class="hljs-number">+123/224</span>

<span class="hljs-comment">;; floats</span>
<span class="hljs-number">42.0</span>
<span class="hljs-number">-42.0</span>
<span class="hljs-number">+42.0</span>
<span class="hljs-number">42.</span>
<span class="hljs-number">+42.</span>
<span class="hljs-number">-42.</span>

<span class="hljs-comment">; BigDecimal</span>
<span class="hljs-number">42.0M</span>
<span class="hljs-number">-42M</span>
<span class="hljs-number">42.M</span>
<span class="hljs-number">42M</span>

<span class="hljs-comment">; with Exponent</span>
<span class="hljs-number">42.0E2</span>
<span class="hljs-number">-42.0E+9</span>
<span class="hljs-number">42E-0</span>
<span class="hljs-number">+42E-0</span>

<span class="hljs-number">42.0E2M</span>
<span class="hljs-number">42E+9M</span>
<span class="hljs-number">-42E+9M</span>
<span class="hljs-number">+42.0E2M</span>
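
To see which `NUMBER` variant is responsible for a given literal in this fixture, the patterns can be applied one by one. This is a minimal sketch, not how highlight.js evaluates them: the variant names and the `classifyNumber` helper are hypothetical labels, and anchoring each pattern to the whole input is a simplification chosen for the illustration.

```javascript
// Variant patterns copied from the NUMBER mode in the clojure.js diff, in the same order.
// The names are labels for this sketch only.
const NUMBER_VARIANTS = [
  ['hexadecimal', /[-+]?0[xX][0-9a-fA-F]+N?/],
  ['octal', /[-+]?0[0-7]+N?/],
  ['radix', /[-+]?[1-9][0-9]?[rR][0-9a-zA-Z]+N?/],
  ['ratio', /[-+]?[0-9]+\/[0-9]+N?/],
  ['float', /[-+]?[0-9]+((\.[0-9]*([eE][+-]?[0-9]+)?M?)|([eE][+-]?[0-9]+M?|M))/],
  ['int', /[-+]?([1-9][0-9]*|0)N?/]
];

// Hypothetical helper: name of the first variant that matches the whole input, or null.
function classifyNumber(text) {
  for (const [name, re] of NUMBER_VARIANTS) {
    if (new RegExp('^(?:' + re.source + ')$').test(text)) return name;
  }
  return null;
}

for (const literal of ['0x2a', '052', '2r101010', '1/2', '42.0E2M', '42M', '42N', '42.']) {
  console.log(literal, '->', classifyNumber(literal));
}

// A comma is whitespace in Clojure, so this is not one numeric literal:
console.log(classifyNumber('100,000')); // null
```

The last case relates to the fixture change from `100,000` to `100000` in `globals_definition.txt`: no single variant covers a literal with a comma inside it.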