Add support for Arturo language #2259

Merged
merged 67 commits into from Oct 26, 2022
Commits
4402331
add support for arturo language
RickBarretto Oct 24, 2022
2d77534
add arturo language's tests
RickBarretto Oct 24, 2022
d8e8cde
merge simple rules
RickBarretto Oct 24, 2022
30888ed
remove `string_end` function, it's breaking tests
RickBarretto Oct 24, 2022
a0ff99b
add file header
RickBarretto Oct 25, 2022
fc625b6
add ArturoLexer's docstring
RickBarretto Oct 25, 2022
ec79974
alias should be only lowecase
RickBarretto Oct 25, 2022
31dc1cd
remove whitespace
RickBarretto Oct 25, 2022
e81bbd5
uncategorized should be a `Text`, not a `Name`
RickBarretto Oct 25, 2022
805f1f6
no need to group it
RickBarretto Oct 25, 2022
cfb9386
add lazy quantifier
RickBarretto Oct 25, 2022
c7987f2
no need to group it
RickBarretto Oct 25, 2022
e3fc6fb
add lazy quantifier
RickBarretto Oct 25, 2022
a5a0f89
remove additional indentation
RickBarretto Oct 25, 2022
485b840
replace with `words` function
RickBarretto Oct 25, 2022
ddb7923
remove additional state
RickBarretto Oct 25, 2022
cb6a6b7
remove the high amount of trivial states
RickBarretto Oct 25, 2022
2c08d22
fix interpolation rule
RickBarretto Oct 25, 2022
7b683c4
replace inefficient match
RickBarretto Oct 25, 2022
d80b451
move `__all__` to top
RickBarretto Oct 25, 2022
914ccb2
include `strings` on `root`
RickBarretto Oct 25, 2022
2a5b4d6
replace hand-made regex with `words` function
RickBarretto Oct 25, 2022
8335e36
remove unnecessary state
RickBarretto Oct 25, 2022
bce43b9
update goldens
RickBarretto Oct 25, 2022
aa87683
add missing `r` string prefix
RickBarretto Oct 25, 2022
2ab3cf7
add missing commas
RickBarretto Oct 25, 2022
fab7818
remove trivial states
RickBarretto Oct 25, 2022
cea6951
cleanup code
RickBarretto Oct 25, 2022
67b45a9
change versionadded
RickBarretto Oct 25, 2022
749a7a3
move predicates to `builtin_functions`
RickBarretto Oct 25, 2022
528a730
rename `builtin_functions`,
RickBarretto Oct 25, 2022
1095047
forget `|` when match String type
RickBarretto Oct 25, 2022
8161842
remove strings' additional states
RickBarretto Oct 25, 2022
3ded7d5
replace regex for `words` function
RickBarretto Oct 25, 2022
15f451d
remove `string-content-multi-line` state
RickBarretto Oct 25, 2022
f110042
add a test for interpolation on multiline strings
RickBarretto Oct 25, 2022
9443273
fix `words` position
RickBarretto Oct 25, 2022
ccfe10f
cleanup code
RickBarretto Oct 25, 2022
98be0ff
remove `strings` state
RickBarretto Oct 25, 2022
97f7878
remove `comments` state
RickBarretto Oct 25, 2022
304452d
cleanup code
RickBarretto Oct 25, 2022
ac37bdf
add a forget `\w` rule at the start of a number
RickBarretto Oct 25, 2022
11fea4c
update goldens
RickBarretto Oct 25, 2022
f938e5b
update mapfiles
RickBarretto Oct 25, 2022
01d22d7
fix W119
RickBarretto Oct 25, 2022
e96cbcc
fix wrong operator
RickBarretto Oct 25, 2022
57c31e7
add Arturo's website and repository links
RickBarretto Oct 25, 2022
f4c48cd
update goldens
RickBarretto Oct 25, 2022
e60c6c8
remove redundant end-of-string
RickBarretto Oct 26, 2022
52ef9c8
change hyperlinks to reST format.
RickBarretto Oct 26, 2022
5ab2b42
replace generic `Text` with better tokens
RickBarretto Oct 26, 2022
50799a7
add missing operators
RickBarretto Oct 26, 2022
4c94701
Add switch structure
RickBarretto Oct 26, 2022
65dab55
cleanup code
RickBarretto Oct 26, 2022
621eff9
add array index constant
RickBarretto Oct 26, 2022
7ab88de
add builtin pseudos `init` and `this`
RickBarretto Oct 26, 2022
14c3c62
add labeled form to another constants
RickBarretto Oct 26, 2022
c3b2b44
import `Error` token
RickBarretto Oct 26, 2022
5cbd13b
update decorators
RickBarretto Oct 26, 2022
94d4d6d
update goldens
RickBarretto Oct 26, 2022
c972771
remove `constants` state
RickBarretto Oct 26, 2022
4619361
cleanup code
RickBarretto Oct 26, 2022
ea97a4f
add greedy pattern for multiline-strings
RickBarretto Oct 26, 2022
41d76a0
update goldens
RickBarretto Oct 26, 2022
8026942
fix character overlap
RickBarretto Oct 26, 2022
eea1c9f
fix digit overlap
RickBarretto Oct 26, 2022
493e1d9
remove `operators` state
RickBarretto Oct 26, 2022
1 change: 1 addition & 0 deletions pygments/lexers/_mapping.py
@@ -30,6 +30,7 @@
'AppleScriptLexer': ('pygments.lexers.scripting', 'AppleScript', ('applescript',), ('*.applescript',), ()),
'ArduinoLexer': ('pygments.lexers.c_like', 'Arduino', ('arduino',), ('*.ino',), ('text/x-arduino',)),
'ArrowLexer': ('pygments.lexers.arrow', 'Arrow', ('arrow',), ('*.arw',), ()),
'ArturoLexer': ('pygments.lexers.arturo', 'Arturo', ('arturo', 'art'), ('*.art',), ()),
'AscLexer': ('pygments.lexers.asc', 'ASCII armored', ('asc', 'pem'), ('*.asc', '*.pem', 'id_dsa', 'id_ecdsa', 'id_ecdsa_sk', 'id_ed25519', 'id_ed25519_sk', 'id_rsa'), ('application/pgp-keys', 'application/pgp-encrypted', 'application/pgp-signature')),
'AspectJLexer': ('pygments.lexers.jvm', 'AspectJ', ('aspectj',), ('*.aj',), ('text/x-aspectj',)),
'AsymptoteLexer': ('pygments.lexers.graphics', 'Asymptote', ('asymptote', 'asy'), ('*.asy',), ('text/x-asymptote',)),
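
Each `_mapping.py` entry is a tuple of (module, display name, aliases, filename patterns, MIME types), so the new line registers the aliases `arturo` and `art` and the pattern `*.art`. As a rough illustration of how such an entry can be queried, here is a minimal standard-library sketch; the helper names `find_by_alias` and `find_by_filename` are hypothetical and this is not Pygments' actual lookup code.

```python
from fnmatch import fnmatch

# One entry copied from the hunk above. Field order:
# (module, display name, aliases, filename patterns, MIME types).
LEXERS = {
    'ArturoLexer': ('pygments.lexers.arturo', 'Arturo',
                    ('arturo', 'art'), ('*.art',), ()),
}


def find_by_alias(alias):
    """Return the lexer class name registered under `alias`, or None."""
    for class_name, (_mod, _name, aliases, _files, _mimes) in LEXERS.items():
        if alias.lower() in aliases:
            return class_name
    return None


def find_by_filename(filename):
    """Return the lexer class name whose patterns match `filename`, or None."""
    for class_name, (_mod, _name, _aliases, patterns, _mimes) in LEXERS.items():
        if any(fnmatch(filename, pattern) for pattern in patterns):
            return class_name
    return None


print(find_by_alias('art'))
print(find_by_filename('hello.art'))
```

Because the aliases are stored in lowercase, the sketch lowercases the query before comparing.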
264 changes: 264 additions & 0 deletions pygments/lexers/arturo.py
@@ -0,0 +1,264 @@
"""
pygments.lexers.arturo
~~~~~~~~~~~~~~~~~~~~~~

Lexer for the Arturo language.

:copyright: Copyright 2006-2022 by the Pygments team, see AUTHORS.
:license: BSD, see LICENSE for details.
"""

import re

from pygments.lexer import bygroups, default, DelegatingLexer, do_insertions,\
include, RegexLexer, this, using, words
from pygments.token import Comment, Error, Generic, Keyword, Name, Number, \
Operator, Other, Punctuation, String, Text

from pygments.util import ClassNotFound, get_bool_opt

__all__ = [
'ArturoLexer'
]
class ArturoLexer(RegexLexer):
"""
For Arturo source code

See `Arturo's GitHub <https://github.com/arturo-lang/arturo>`_
and `Arturo's Website <http://arturo-lang.io/>`_.

.. versionadded:: 2.14.0
"""

name = 'Arturo'
aliases = ['arturo', 'art']
filenames = ['*.art']

def handle_annotated_strings(self, match):
"""Adds syntax from other languages inside annotated strings

match args:
1:open_string,
2:exclamation_mark,
3:lang_name,
4:space_or_newline,
5:code,
6:close_string
"""
from pygments.lexers import get_lexer_by_name

# Header's section
yield match.start(1), String.Double , match.group(1)
yield match.start(2), String.Interpol, match.group(2)
yield match.start(3), String.Interpol, match.group(3)
yield match.start(4), Text.Whitespace, match.group(4)

lexer = None
if self.handle_annotateds:
try:
lexer = get_lexer_by_name(match.group(3).strip())
except ClassNotFound:
pass
code = match.group(5)

if lexer is None:
yield match.start(5), String, code
else:
yield from do_insertions([], lexer.get_tokens_unprocessed(code))

yield match.start(6), String.Double, match.group(6)



tokens = {
'root': [
(r';.*?$', Comment.Single),
(r'^((\s#!)|(#!)).*?$', Comment.Hashbang),

# Constants
(words(('false', 'true', 'maybe'), # boolean
suffix=r'\b'),
Name.Constant ),
(words(('this', 'init'), # class related
prefix=r'\b', # keywords
suffix=r'\b\??:?'),
Name.Builtin.Pseudo ),
(r'`.`', String.Char ), # character
(r'\\\w+\b\??:?', # array index
Name.Property ),
(r'#\w+', Name.Constant ), # color
(r'\b[0-9]+\.[0-9]+', # float
Number.Float ),
(r'\b[0-9]+', Number.Integer ), # integer
(r'\w+\b\??:', Name.Label ), # label
# Note: Literals can be labeled too
(r'\'(?:\w+\b\??:?)', # literal
Keyword.Declaration ),
(r'\:\w+', Keyword.Type ), # type
# Note: Attributes can be labeled too
(r'\.\w+\??:?', Name.Attribute ), # attributes

# Switch structure
(r'(\()(.*?)(\)\?)',
bygroups(Punctuation, using(this), Punctuation)),

# Single Line Strings
(r'"', String.Double, 'inside-simple-string'),
(r'»', String.Single, 'inside-smart-string' ),
(r'«««', String.Double, 'inside-safe-string' ),
(r'\{\/', String.Single, 'inside-regex-string'),

# Multi Line Strings
(r'\{\:', String.Double, 'inside-curly-verb-string'),
(r'(\{)(\!)(\w+)(\s|\n)([\w\W]*?)(^\})',
handle_annotated_strings),
(r'\{', String.Single, 'inside-curly-string' ),
(r'\-{3,}', String.Single, 'inside-eof-string' ),

include('builtin-functions'),

# Operators
(r'[()[\],]', Punctuation),
(words((
'->', '==>', '|', '::',
'@', '#', '$', '&', '!', '!!', './'
)), Name.Decorator), # sugar syntax
(words((
'<:', ':>', ':<', '>:', '<\\', '<>', '<', '>',
'ø', '∞',
'+', '-', '*', '~', '=', '^', '%', '/', '//',
'==>', '<=>', '<==>',
'=>>', '<<=>>', '<<==>>',
'-->', '<->', '<-->',
'=|', '|=', '-:', ':-',
'_', '.', '..', '\\'
)), Operator
),

(r'\b\w+', Name),
(r'\s+', Text.Whitespace),
(r'.+$', Error),
],

'inside-interpol': [
(r'\|', String.Interpol, '#pop'),
(r'[^|]+', using(this)),
],
'inside-template': [
(r'\|\|\>', String.Interpol, '#pop'),
(r'[^|]+', using(this)),
],
'string-escape': [
(words(('\\\\', '\\n', '\\t','\\"')), String.Escape)
],

'inside-simple-string': [
include('string-escape'),
(r'\|', String.Interpol, 'inside-interpol'), # Interpolation
(r'\<\|\|', String.Interpol, 'inside-template'), # Templates
(r'"', String.Double, '#pop'), # Closing Quote
(r'[^|"]+', String) # String Content
],
'inside-smart-string': [
include('string-escape'),
(r'\|', String.Interpol, 'inside-interpol'), # Interpolation
(r'\<\|\|', String.Interpol, 'inside-template'), # Templates
(r'\n', String.Single, '#pop'), # Closing Quote
(r'[^|\n]+', String) # String Content
],
'inside-safe-string': [
include('string-escape'),
(r'\|', String.Interpol, 'inside-interpol'), # Interpolation
(r'\<\|\|', String.Interpol, 'inside-template'), # Templates
(r'»»»', String.Double, '#pop'), # Closing Quote
(r'[^|»]+', String) # String Content
],
'inside-regex-string': [
(r'\\[sSwWdDbBZApPxucItnvfr0]+', String.Escape),
(r'\|', String.Interpol, 'inside-interpol'), # Interpolation
(r'\<\|\|', String.Interpol, 'inside-template'), # Templates
(r'\/\}', String.Single, '#pop'), # Closing Quote
(r'[^|\/]+', String.Regex), # String Content
],
'inside-curly-verb-string': [
include('string-escape'),
(r'\|', String.Interpol, 'inside-interpol'), # Interpolation
(r'\<\|\|', String.Interpol, 'inside-template'), # Templates
(r'\:\}', String.Double, '#pop'), # Closing Quote
(r'[^|<:]+', String) # String Content
],
'inside-curly-string': [
include('string-escape'),
(r'\|', String.Interpol, 'inside-interpol'), # Interpolation
(r'\<\|\|', String.Interpol, 'inside-template'), # Templates
(r'\}', String.Single, '#pop'), # Closing Quote
(r'[^|<}]+', String), # String Content
],
'inside-eof-string': [
include('string-escape'),
(r'\|', String.Interpol, 'inside-interpol'), # Interpolation
(r'\<\|\|', String.Interpol, 'inside-template'), # Templates
(r'\Z', String.Single, '#pop'), # Closing Quote
(r'[^|<]+', String), # String Content
],

'builtin-functions': [
(words((
'all', 'and', 'any', 'ascii', 'attr', 'attribute',
'attributeLabel', 'binary', 'block', 'char', 'contains',
'database', 'date', 'dictionary', 'empty', 'equal', 'even',
'every', 'exists', 'false', 'floating', 'function', 'greater',
'greaterOrEqual', 'if', 'in', 'inline', 'integer', 'is',
'key', 'label', 'leap', 'less', 'lessOrEqual', 'literal',
'logical', 'lower', 'nand', 'negative', 'nor', 'not',
'notEqual', 'null', 'numeric', 'odd', 'or', 'path',
'pathLabel', 'positive', 'prefix', 'prime', 'set', 'some',
'sorted', 'standalone', 'string', 'subset', 'suffix',
'superset', 'symbol', 'true', 'try', 'type', 'unless', 'upper',
'when', 'whitespace', 'word', 'xnor', 'xor', 'zero',
), prefix=r'\b', suffix=r'\b\?'), Name.Builtin),
(words((
'abs', 'acos', 'acosh', 'acsec', 'acsech', 'actan', 'actanh',
'add', 'after', 'alphabet', 'and', 'angle', 'append', 'arg',
'args', 'arity', 'array', 'as', 'asec', 'asech', 'asin',
'asinh', 'atan', 'atan2', 'atanh', 'attr', 'attrs', 'average',
'before', 'benchmark', 'blend', 'break', 'builtins1',
'builtins2', 'call', 'capitalize', 'case', 'ceil', 'chop',
'chunk', 'clear', 'close', 'cluster', 'color', 'combine',
'conj', 'continue', 'copy', 'cos', 'cosh', 'couple', 'csec',
'csech', 'ctan', 'ctanh', 'cursor', 'darken', 'dec', 'decode',
'decouple', 'define', 'delete', 'desaturate', 'deviation',
'dictionary', 'difference', 'digest', 'digits', 'div', 'do',
'download', 'drop', 'dup', 'e', 'else', 'empty', 'encode',
'ensure', 'env', 'epsilon', 'escape', 'execute', 'exit', 'exp',
'extend', 'extract', 'factors', 'false', 'fdiv', 'filter',
'first', 'flatten', 'floor', 'fold', 'from', 'function',
'gamma', 'gcd', 'get', 'goto', 'hash', 'help', 'hypot', 'if',
'in', 'inc', 'indent', 'index', 'infinity', 'info', 'input',
'insert', 'inspect', 'intersection', 'invert', 'join', 'keys',
'kurtosis', 'last', 'let', 'levenshtein', 'lighten', 'list',
'ln', 'log', 'loop', 'lower', 'mail', 'map', 'match', 'max',
'maybe', 'median', 'min', 'mod', 'module', 'mul', 'nand',
'neg', 'new', 'nor', 'normalize', 'not', 'now', 'null', 'open',
'or', 'outdent', 'pad', 'panic', 'path', 'pause',
'permissions', 'permutate', 'pi', 'pop', 'pow', 'powerset',
'powmod', 'prefix', 'print', 'prints', 'process', 'product',
'query', 'random', 'range', 'read', 'relative', 'remove',
'rename', 'render', 'repeat', 'replace', 'request', 'return',
'reverse', 'round', 'sample', 'saturate', 'script', 'sec',
'sech', 'select', 'serve', 'set', 'shl', 'shr', 'shuffle',
'sin', 'sinh', 'size', 'skewness', 'slice', 'sort', 'split',
'sqrt', 'squeeze', 'stack', 'strip', 'sub', 'suffix', 'sum',
'switch', 'symbols', 'symlink', 'sys', 'take', 'tan', 'tanh',
'terminal', 'to', 'true', 'truncate', 'try', 'type', 'union',
'unique', 'unless', 'until', 'unzip', 'upper', 'values', 'var',
'variance', 'volume', 'webview', 'while', 'with', 'wordwrap',
'write', 'xnor', 'xor', 'zip'
), prefix=r'\b', suffix=r'\b'), Name.Builtin)
],

}

def __init__(self, **options):
self.handle_annotateds = get_bool_opt(options, 'handle_annotateds', True)
RegexLexer.__init__(self, **options)
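
The annotated-string rule in `root` passes its match to `handle_annotated_strings`, which emits one token per capture group. The following is a minimal sketch of that grouping using only the standard `re` module rather than Pygments itself. The helper name `split_annotated` and the plain-string labels are illustrative assumptions, not part of this PR. `RegexLexer` compiles its patterns with `re.MULTILINE` by default, which is what allows `^\}` to match a closing brace at the start of a line.

```python
import re

# Same pattern as the annotated-string rule in the lexer above.
ANNOTATED = re.compile(r'(\{)(\!)(\w+)(\s|\n)([\w\W]*?)(^\})', re.MULTILINE)

# Illustrative labels standing in for Pygments token types (assumption).
LABELS = ("open", "bang", "lang", "space", "code", "close")


def split_annotated(text):
    """Return (offset, label, value) triples for the first annotated string."""
    match = ANNOTATED.search(text)
    if match is None:
        return []
    # Group numbers start at 1, matching the docstring of the real callback.
    return [(match.start(i), label, match.group(i))
            for i, label in enumerate(LABELS, start=1)]


sample = "{!sql\nSELECT 1\n}"
parts = split_annotated(sample)
for offset, label, value in parts:
    print(offset, label, repr(value))
```

Every triple begins with an integer offset taken from `match.start(i)`, which is the `(index, tokentype, value)` shape a Pygments callback is expected to yield.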