EQL: delegating parsing to ANTLR #2196

Open
homedirectory opened this issue Feb 23, 2024 · 0 comments · May be fixed by #2209

Comments

@homedirectory
Contributor

homedirectory commented Feb 23, 2024

Description

EQL has an "operational" grammar, which is implemented in the form of Java fluent interfaces.

The objective is to improve the EQL grammar and the AST structure in order to:

  • Facilitate evolvability: introducing new querying capabilities, which means supporting new
    expressions, should be easy and reliable in terms of expression correctness.
  • Make the AST structure better suited for processing by the EQL semantic analyser. Only valid
    ASTs should be able to result from EQL expressions. It would also be beneficial to get an
    already built AST from a fluent API method chain, instead of a sequence of tokens as it is now.

The current approach is manual: EQL's operational grammar is maintained by extending existing
fluent interfaces and introducing new ones, which in turn need to be implemented to emit the
relevant tokens. This approach has worked well in practice, but each time a new querying capability
is needed, a very careful informal analysis is required to make sure that changed or newly
introduced constructs cannot make it possible to create invalid EQL expressions.
Also, there is no easy way to guarantee that the resultant AST strictly conforms to the
amended operational grammar.
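
To make the idea of an "operational" grammar concrete, here is a minimal sketch of how fluent interfaces can encode grammar states and emit tokens. All names (`AfterSelect`, `TokenEmitter`, the token spellings) are hypothetical and chosen for illustration; this is not the actual EQL API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; this is NOT the real EQL fluent API.

// Grammar state after "select(...)": only "where()" or "model()" may follow.
interface AfterSelect {
    AfterWhere where();
    List<String> model();
}

// Grammar state after "where()" or "and()": a property must follow.
interface AfterWhere {
    AfterProp prop(String name);
}

// Grammar state after a property: a comparison operator must follow.
interface AfterProp {
    AfterComparison eq(Object value);
}

// Grammar state after a complete comparison: chain with "and()" or finish.
interface AfterComparison {
    AfterWhere and();
    List<String> model();
}

// One class implements every state and appends one token per method call.
final class TokenEmitter implements AfterSelect, AfterWhere, AfterProp, AfterComparison {
    private final List<String> tokens = new ArrayList<>();

    private TokenEmitter(String entity) {
        tokens.add("SELECT(" + entity + ")");
    }

    static AfterSelect select(String entity) {
        return new TokenEmitter(entity);
    }

    @Override public AfterWhere where() { tokens.add("WHERE"); return this; }
    @Override public AfterProp prop(String name) { tokens.add("PROP(" + name + ")"); return this; }
    @Override public AfterComparison eq(Object value) { tokens.add("EQ(" + value + ")"); return this; }
    @Override public AfterWhere and() { tokens.add("AND"); return this; }

    // Terminates the chain and returns an immutable copy of the emitted tokens.
    @Override public List<String> model() { tokens.add("MODEL"); return List.copyOf(tokens); }
}
```

In this sketch, `TokenEmitter.select("Vehicle").where().prop("key").eq("A1").model()` compiles, while `select("Vehicle").where().model()` does not, because `AfterWhere` exposes no `model()` method. The grammar lives only in the return types, which is why every new construct has to be checked by hand against all existing interfaces.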

One well-established tool that supports development of programming languages is ANTLR.
ANTLR accepts a grammar in a specific format and generates a parser for the language described by the given grammar.

Parsers are usually used with plain text as input, and employing them to parse a fluent API is not a
typical task. Nevertheless, parsers generated by ANTLR can be used for this purpose. The main
difference lies in substituting the output of the lexer, which is also generated by ANTLR and
serves as the entry point of the parsing process. Instead of feeding EQL as plain text to the
generated lexer, we should construct the lexer's output ourselves, which is very similar to what we
have now -- a sequence of tokens that result from the fluent API method call chain.
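
The following sketch illustrates parsing a pre-built token sequence rather than plain text. To stay self-contained it uses a small hand-written recursive-descent parser as a stand-in for an ANTLR-generated one, and the token and AST types (`Tok`, `Query`, `Comparison`) are hypothetical. With the ANTLR 4 Java runtime, one possible route is to wrap the token list in a `ListTokenSource` and a `CommonTokenStream` and pass that stream to the generated parser; whether the linked work does exactly this is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical token model; not the real EQL or ANTLR types.
enum TokenType { SELECT, WHERE, PROP, EQ, AND, MODEL }

record Tok(TokenType type, String text) {}

// Minimal AST for the toy grammar:
//   query      : SELECT (WHERE comparison (AND comparison)*)? MODEL ;
//   comparison : PROP EQ ;
record Comparison(String prop, String value) {}
record Query(String entity, List<Comparison> conditions) {}

// Hand-written parser standing in for a generated one. Its input is a token
// list built by the caller, so no lexer and no plain text are involved.
final class TokenListParser {
    private final List<Tok> tokens;
    private int pos = 0;

    TokenListParser(List<Tok> tokens) {
        this.tokens = tokens;
    }

    Query parseQuery() {
        String entity = expect(TokenType.SELECT).text();
        List<Comparison> conditions = new ArrayList<>();
        if (peek() == TokenType.WHERE) {
            pos++;
            conditions.add(parseComparison());
            while (peek() == TokenType.AND) {
                pos++;
                conditions.add(parseComparison());
            }
        }
        expect(TokenType.MODEL);
        if (pos != tokens.size()) {
            throw new IllegalStateException("Unexpected token after MODEL at position " + pos);
        }
        return new Query(entity, List.copyOf(conditions));
    }

    private Comparison parseComparison() {
        String prop = expect(TokenType.PROP).text();
        String value = expect(TokenType.EQ).text();
        return new Comparison(prop, value);
    }

    // Returns the type of the next token, or null at end of input.
    private TokenType peek() {
        return pos < tokens.size() ? tokens.get(pos).type() : null;
    }

    // Consumes the next token if it has the expected type; fails otherwise.
    private Tok expect(TokenType type) {
        if (peek() != type) {
            throw new IllegalStateException(
                "Expected " + type + " at position " + pos + " but found " + peek());
        }
        return tokens.get(pos++);
    }
}
```

Because the parser checks the whole sequence against the grammar, an invalid sequence such as `SELECT WHERE MODEL` is rejected with an exception instead of producing a malformed AST, and a valid one yields an already built `Query` object.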

#2178 lays the groundwork for integration with ANTLR.

Expected outcome

  • Introduction of new querying capabilities would only require modification of the established EQL
    grammar, which would result in automatic regeneration of a parser for the language.
  • AST structure that is well-aligned with the grammar.
  • Embedding into Java with a path for embedding into other languages (e.g., JavaScript).
@homedirectory homedirectory self-assigned this Feb 23, 2024
homedirectory added a commit that referenced this issue Feb 23, 2024
homedirectory added a commit that referenced this issue Feb 23, 2024
…building it

This gives us more control over the BNF representation which will assist
with the transformation into ANTLR grammar format (among other ones).
homedirectory added a commit that referenced this issue Feb 23, 2024
…rameterisations of the same terminal

Since we don't include the parameterisation of terminals in the ANTLR
grammar we need to eliminate duplicate tokens.
homedirectory added a commit that referenced this issue Feb 23, 2024
homedirectory added a commit that referenced this issue Feb 26, 2024
homedirectory added a commit that referenced this issue Feb 28, 2024
…pojo-bl

Also, don't generate a listener, which is not needed.
homedirectory added a commit that referenced this issue Feb 28, 2024
- Replaced the intermediate Tokens class by EqlSentenceBuilder.
- Temporarily commented out the old code which won't compile anymore and
  is staged for removal.
homedirectory added a commit that referenced this issue Feb 28, 2024
… "case when" to a later stage

Instead of doing it in the fluent API implementation, we will do it
during the compilation of expressions.
homedirectory added a commit that referenced this issue Mar 15, 2024
01es added a commit that referenced this issue Mar 18, 2024
homedirectory added a commit that referenced this issue Mar 18, 2024
@homedirectory homedirectory linked a pull request Mar 18, 2024 that will close this issue
01es added a commit that referenced this issue Mar 22, 2024
01es added a commit that referenced this issue Mar 24, 2024
01es added a commit that referenced this issue Mar 28, 2024
01es added a commit that referenced this issue Apr 2, 2024
01es added a commit that referenced this issue Apr 10, 2024
homedirectory added a commit that referenced this issue Apr 11, 2024
The transitive dependency has a different version, coming from the
GraphQL dependency. In TG applications the transitive was being picked
up instead of the direct one.
01es added a commit that referenced this issue Apr 17, 2024
01es added a commit that referenced this issue Apr 17, 2024
01es added a commit that referenced this issue Apr 17, 2024
01es added a commit that referenced this issue Apr 26, 2024
01es added a commit that referenced this issue May 21, 2024