Skip to content

C++ target: escape ? character to prevent accidental trigraphs #2805

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 22, 2020

Conversation

felixn
Copy link
Contributor

@felixn felixn commented Apr 19, 2020

If grammars contain something like ?? as a lexer rule (e.g. CONDITION_CONVERSION: '??';), the generated C++ code will contain something like "'??'" in _literalNames. This can cause compiler warnings or even unintended behaviour, because sequences starting with two ? characters are interpreted as trigraphs.

Using an override of shouldUseUnicodeEscapeForCodePointInDoubleQuotedString in CppTarget
shouldUseUnicodeEscapeForCodePointInDoubleQuotedString in Target was changed from private static to protected to allow overriding

(It's possible to avoid this issue by using C++17 (drops support for trigraphs) and -Wno-trigraphs, but escaping the ? character is a universal fix).

@felixn felixn changed the title escape ? character in c++ codegen to prevent accidental trigraphs C++ target: escape ? character to prevent accidental trigraphs Apr 19, 2020
@felixn
Copy link
Contributor Author

felixn commented May 1, 2020

Ping @mike-lischke

@mike-lischke
Copy link
Member

Can't you switch off trigraphs in every C++ compiler? This is a feature so rarely used that it seems unnecessary to waste CPU cycles to provide a fix in ANTLR4. But if you deem it important enough then we can take it in.

@felixn
Copy link
Contributor Author

felixn commented Jun 21, 2020

Hi Mike,
it looks like it's only possible to disable trigraphs by selecting C++17 or higher or by selecting a "std" mode instead of a "gnu" mode. See
https://clang.llvm.org/docs/UsersManual.html#differences-between-various-standard-modes
or
https://gcc.gnu.org/onlinedocs/cpp/Initial-processing.html

Either way, my suggestion (implemented in this PR) is for ANTLR to never generate trigraphs - so we don't force the user to use certain compiler settings / disable certain warning.
@parrt - what's your take on this?

(using an override of shouldUseUnicodeEscapeForCodePointInDoubleQuotedString in CppTarget)
@parrt
Copy link
Member

parrt commented Jun 21, 2020

I'll defer to @mike-lischke

@mike-lischke
Copy link
Member

A fix for a really marginal problem, but OK.

@felixn
Copy link
Contributor Author

felixn commented Jun 22, 2020

Thanks for the positive feedback.
Could you please trigger the merge? I don't have write access.
(Same with #2806)

@parrt parrt merged commit 7a6db6a into antlr:master Jun 22, 2020
@parrt parrt added this to the 4.9 milestone Nov 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants