Navigation Menu

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(elixir) fix regular expression detection #3207

Merged
merged 8 commits into from Jun 2, 2021

Conversation

joshgoebel
Copy link
Member

Resolves #3206 and other cleanups.

Changes

  • clean up grammar a bit, add more regex sigil examples
  • add regex sigils that classify as regex
  • add char.escape
  • remove dead code

Checklist

  • Added markup tests, or they don't apply here because...
  • Updated the changelog at CHANGES.md

"when",
"while",
"with|0"
];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that this PR merely reformats the list, but it caught my attention because there are a lot of keywords in it that I have never seen in Elixir. Also let my starts by saying that I am not sure if there is a formal definition of a "keyword". I'm using that term rather intuitively.

As far as I know, those are not Elixir keywords at all:

  • begin
  • break (Inspect.Algebra.break/1 is a function in the standard lib, but it has nothing to do with breaking out of loops like it might do in other languages)
  • defined
  • ensure
  • include
  • module
  • next (OptionParser.next/2 is a function in the standard lib, it has nothing to do with skipping to another iteration in a loop like it might do in other languages)
  • redo
  • retry
  • return
  • until
  • while

I'm unsure about:

  • self. It exists, but IMO it is a normal macro, not a keyword. I wouldn't expect it to be colored differently than other normal macros (like rem, round, trunc etc).
  • then. It's a new macro added in Elixir 1.12 together with tap, I don't think those are keywords.
  • with|0. I'm not sure what the pipe and zero mean here. with on its own is definitely a keyword.

Missing IMO:

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alias is also listed twice in the list.

{
begin: /\{/,
end: /\}/
begin: '~r' + '(?=' + SIGIL_DELIMITERS + ')',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this also mark regex modifiers as part of the regex? E.g.

~r(hello)
~r(hello)i
~r(hello)ui
~r(hello)U

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, but we should add that. What are all the valid modifiers in Elixir?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for sigils r/R, combinations of those letters:

uismxfU

for sigils: w/W, combinations of those letters:

sac

Treating any group of lower and/or uppercase letters immediately after the closing delimiter as a modifier and thus part of the sigil could also be enough.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added modifiers to ~r/~R. I'll save w for another day/PR since it's not already broken out. Really if we're going to customize these any further we perhaps need a simple data table and then programmatically auto-generate all the sigil rules from that table. Are R and W really the only ones with modifiers?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are R and W really the only ones with modifiers?

At the moment yes, but the language syntax already allows for any sigil to have any modifier that it wants. That actually makes me realize that anyone can define their own custom sigil with their own modifiers 😅.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Problems for another day. :) Does the PR look workable for now you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it hard to judge the code, but from playing around with tools/developer.html I discovered that there is a problem with uppercase R sigil. It still needs to support escaping the closing delimiter.

For example for those two pairs the output HTML should literally only differ by a single letter (r <-> R), but right now it tries to end the R sigil too early:

Regex.match?(~r|foo\|bar|, "foo")
Regex.match?(~R|foo\|bar|, "foo")


Regex.match?(~r(hello( there\)*!)u, "hello!")
Regex.match?(~R(hello( there\)*!)u, "hello!")

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because I thought uppercase sigils didn't allow escaping. I guess escaping the end sigil character is the exception to the rule?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is the exception. Quoting the docs about the sigil R:

It returns a regular expression pattern without interpolations and without escape characters. Note it still supports escape of Regex tokens (such as escaping + or ?) and it also requires you to escape the closing sigil character itself if it appears on the Regex.

Copy link
Contributor

@angelikatyborska angelikatyborska left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot 💜

@joshgoebel joshgoebel merged commit 3e87daa into highlightjs:main Jun 2, 2021
@joshgoebel
Copy link
Member Author

@angelikatyborska Thanks for all the help.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

(Elixir) Function capture breaks further code highlighting
3 participants