CSharp and Java produce different results for identical input, identical tokens #3112

Closed
kaby76 opened this issue Mar 6, 2021 · 7 comments

@kaby76
Contributor

kaby76 commented Mar 6, 2021

Antlr 4.9.1 parsers for grammars-v4/idl/IDL.g4 with the input shown below are producing different results. To be clear, the parsers should produce identical output, but they do not. The list of tokens is identical between Java and CSharp so the lexing is identical. There are no actions/semantic predicates in IDL.g4, and I have removed the custom error listener generated by my program for both targets, so it cannot be attributed to those possible issues. I believe there is a problem in one or the other Antlr4 runtimes.

I also tested this across the other targets. After renaming the parser symbol "exports", which conflicts with certain targets, the parsers produce different results: for Dart and Go, the input is accepted; for JavaScript and Python3, the input is rejected.

The error reported by the parser is:

line 3:2 mismatched input 'interface' expecting {'@', 'typedef', 'custom', 'struct', 'native', 'eventtype', 'enum', 'home', 'exception', 'const', 'module', 'union', 'abstract', 'typeprefix', 'typeid', 'component', 'bitset', 'bitmask', '@annotation'}

To reproduce, use dotnet-antlr v2.2.0.

module HelloApp
{
  interface Hello
  {
  string sayHello();
  oneway void shutdown();
  };
};
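
The comparison described above (identical token lists, different parse outcomes) can be sketched as follows. This is only an illustration and not the tooling used for this report: it assumes each target's driver output has already been captured as plain text, and the names `compare_dumps`, `target_a_tokens`, `target_b_tokens`, `target_a_result`, and `target_b_result` are hypothetical, as is the sample data.

```python
import difflib


def compare_dumps(name_a, dump_a, name_b, dump_b):
    """Return unified-diff lines between two text dumps (empty list if identical)."""
    return list(
        difflib.unified_diff(
            dump_a.splitlines(),
            dump_b.splitlines(),
            fromfile=name_a,
            tofile=name_b,
            lineterm="",
        )
    )


# Hypothetical captured output: one token text per line, in an assumed format.
target_a_tokens = "module\nHelloApp\n{\ninterface\nHello\n{\n"
target_b_tokens = "module\nHelloApp\n{\ninterface\nHello\n{\n"

# Hypothetical parse outcomes: one target accepts the input, the other rejects it.
target_a_result = "accepted"
target_b_result = "rejected: line 3:2 mismatched input 'interface'"

token_diff = compare_dumps("a.tokens", target_a_tokens, "b.tokens", target_b_tokens)
result_diff = compare_dumps("a.result", target_a_result, "b.result", target_b_result)

# An empty token diff with a non-empty result diff points at the parser, not the lexer.
print("tokens identical:", not token_diff)
print("parse results identical:", not result_diff)
```

If the token diff is empty while the result diff is not, the lexers agree and the discrepancy lies in the parsing stage of one of the runtimes.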
@ericvergnaud
Contributor

What is 'dotnet-antlr v2.2.0' ?

@kaby76
Contributor Author

kaby76 commented Mar 6, 2021

@ericvergnaud It's a tool that reads poms and generates a driver for an Antlr grammar for each target. It's geared toward grammars-v4. A makefile is provided to give a common, simple build and run, so there is no need to remember how to call each toolchain. Version 2.2.0 generates code for the six targets I mentioned above. The tool is written in C# and currently uses essentially Console.WriteLine calls to generate the output files, but I plan to use ST to generate drivers according to supplied templates.

Attached is the code produced for idl. Please understand it's a work in progress and there are bugs, and the code produced could be better.
idl-fixed.zip

@kaby76
Contributor Author

kaby76 commented Mar 6, 2021

CSharp appears fixed in 4.9.2-snapshot (i.e., helloworld.idl parses without error), as well as Python3. Will check the remaining targets now.

@kaby76
Contributor Author

kaby76 commented Mar 6, 2021

@ericvergnaud I checked this on the latest snapshot, and the bug is fixed in all targets; it was likely the state set problem #3075. I will close this out, but it would be good to see a 4.9.2 release sooner rather than later. I hope I can figure out a package management scheme for all targets from a CI build artifact. --Ken

@kaby76 kaby76 closed this as completed Mar 6, 2021
@ericvergnaud
Contributor

@parrt indeed we have a bunch of critical bug fixes in JS and C# runtimes
any appetite for a quick release?

@parrt parrt added this to the 4.9.2 milestone Mar 6, 2021
@parrt
Member

parrt commented Mar 6, 2021

@ericvergnaud Sure, it's fairly fresh in my mind. Could you tag the things that got merged for this release as 4.9.2? I have a script that generates release notes from anything tagged with that milestone. I have created a draft 4.9.2 release.

@ericvergnaud
Contributor

ericvergnaud commented Mar 7, 2021 via email

kaby76 referenced this issue in antlr/grammars-v4 Apr 14, 2021
…ter, operator, extern, namespace, delete, signed, and, or, not, union, struct, typedef, typename, bool, template. Some are not checked by the Antlr tool, but most are. The symbols are renamed in the usual manner--with a trailing underscore. (2) There are two grammars that define unused symbols (sharc, NULL; lcalendar, bool). These are probably errors in the grammars.