Grammar backtrace #98

Closed
sunjay opened this issue Feb 16, 2017 · 5 comments · Fixed by #736

Comments

@sunjay
Contributor

sunjay commented Feb 16, 2017

Currently, if you want to figure out what the parser was expecting when it failed, you call the parser's expected() method, which tells you which rules could have come next and provides a position that can be referenced in the source.

Unfortunately, when debugging a grammar, that information alone is often not enough to figure out what went wrong. Especially in large grammars, the next expected token does not give you the full picture, because that token may be used in many places.

You can make an educated guess from the given source position, but it is hard to know for sure if your assumptions can be trusted.

It would be convenient if, when returning an error, the parser could generate a "backtrace" through the grammar which would inform us of exactly which path through the rules failed.

This is not a cheap calculation, so it should be reserved for error cases, and possibly placed behind a feature flag so that the code is not even compiled unless the user opts in through a features section in their Cargo.toml.
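A minimal sketch of how such a feature-gated backtrace could look. Everything here is hypothetical and not part of pest's API: the feature name `grammar-backtrace`, the `ParseError` and `Tracker` types, and the idea that generated rule code calls `enter`/`exit` around each rule. The point is only that the rule stack is neither stored nor maintained unless the feature is enabled.

```rust
/// Hypothetical error type; pest's real error type looks different.
#[derive(Debug)]
pub struct ParseError {
    /// Byte offset where parsing failed.
    pub pos: usize,
    /// Rules that were active at the failure, outermost first.
    /// Only compiled in when the (assumed) feature is enabled.
    #[cfg(feature = "grammar-backtrace")]
    pub backtrace: Vec<&'static str>,
}

/// Hypothetical helper that generated rule code would call into.
#[derive(Default)]
pub struct Tracker {
    #[cfg(feature = "grammar-backtrace")]
    stack: Vec<&'static str>,
}

impl Tracker {
    /// Called when a rule starts; a no-op without the feature.
    pub fn enter(&mut self, _rule: &'static str) {
        #[cfg(feature = "grammar-backtrace")]
        self.stack.push(_rule);
    }

    /// Called when a rule finishes, whether it matched or not.
    pub fn exit(&mut self) {
        #[cfg(feature = "grammar-backtrace")]
        self.stack.pop();
    }

    /// Builds the error, snapshotting the rule stack only if the feature is on.
    pub fn fail(&self, pos: usize) -> ParseError {
        ParseError {
            pos,
            #[cfg(feature = "grammar-backtrace")]
            backtrace: self.stack.clone(),
        }
    }
}
```

With the feature disabled, `Tracker` is an empty struct and `enter`/`exit` have empty bodies, so the cost in the normal build should be negligible. The snapshot in `fail` is the expensive part, and it only runs on the error path.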

@hansihe

hansihe commented Sep 6, 2018

I gave porting a grammar from rust-peg a go, and I really like a lot of the design decisions and features.

However, without any debugging or tracing support, I find this difficult to use for more advanced grammars. Even just a trace of what rules are matched and which are not would be really helpful for debugging.

Just writing this to express interest.

@dragostis
Contributor

dragostis commented Sep 6, 2018

@hansihe, you should give https://pest-parser.github.com a try. It has an editor that helps. But yes, this is something that is needed.

By the way, if you're curious about writing an Erlang grammar in pest, I'd be more than willing to help out! ❤️

@dragostis
Contributor

I've tried writing a solution to this, and getting a decent trace is more involved than it looks. Basically, every positive and negative rule in an Error has its own trace, which leads to a lot of information.

Maybe something more pragmatic would be a way to debug parsing as it happens? But I don't know exactly how that's supposed to work.

@hansihe

hansihe commented Sep 9, 2018

I have a mostly complete grammar written in pest, and I'll push it if you want to have a look at it. There are a couple of issues I am having trouble figuring out when I don't have any tracing available.

I gave the editor you linked a try, and it pointed out some issues in the grammar that I fixed, thanks!

I hacked together some simple debugging utilities for rust-peg that helped me out a lot. They rely on a compile-time option that, when enabled, writes a trace of events to a vector (rule started, rule matched, rule failed to match, and some others). I then have some utilities for pretty-printing the trace with colors and indentation, with optional collapsing of successfully matched rules. I can imagine an interactive browser for the trace would be quite nice as well.

Maybe something similar would work well for pest, at least as a start?
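As a rough illustration of the approach described above, here is a sketch of an event trace stored in a vector and rendered with indentation, with optional collapsing of matched rules. The `Event` enum and the `render` function are made up for this example; they are not the rust-peg utilities mentioned in the comment, and colors are left out.

```rust
/// Hypothetical trace events, one per rule start and rule outcome.
#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    Enter { rule: &'static str, pos: usize },
    Matched { rule: &'static str, end: usize },
    Failed { rule: &'static str },
}

/// Renders the trace as an indented outline, one line per rule attempt.
/// With `collapse_matched`, the children of a matched rule are hidden.
pub fn render(trace: &[Event], collapse_matched: bool) -> String {
    // Pair every Enter with the index of the event that closes it.
    let mut close: Vec<Option<usize>> = vec![None; trace.len()];
    let mut open = Vec::new();
    for (i, ev) in trace.iter().enumerate() {
        match ev {
            Event::Enter { .. } => open.push(i),
            _ => {
                if let Some(start) = open.pop() {
                    close[start] = Some(i);
                }
            }
        }
    }

    let mut out = String::new();
    let mut depth = 0usize;
    let mut i = 0;
    while i < trace.len() {
        match &trace[i] {
            Event::Enter { rule, pos } => {
                let closing = close[i].map(|j| &trace[j]);
                let outcome = match closing {
                    Some(Event::Matched { end, .. }) => format!("matched {}..{}", pos, end),
                    Some(Event::Failed { .. }) => "failed".to_string(),
                    _ => "unfinished".to_string(),
                };
                out.push_str(&format!("{}{}: {}\n", "  ".repeat(depth), rule, outcome));
                match (collapse_matched, closing, close[i]) {
                    // Skip the whole subtree of a rule that matched.
                    (true, Some(Event::Matched { .. }), Some(j)) => i = j + 1,
                    _ => {
                        depth += 1;
                        i += 1;
                    }
                }
            }
            // A closing event only ends the current indentation level.
            _ => {
                depth = depth.saturating_sub(1);
                i += 1;
            }
        }
    }
    out
}
```

Collapsing matched rules keeps the output focused on the failing branches, which is usually where the grammar bug is.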

@hansihe

hansihe commented Sep 9, 2018

If you are interested, here is the WIP pest grammar for Core Erlang. There are some test files in the test_data directory (*.core).

I'm sure I have some left-recursion somewhere in the grammar, since it seems to go into an infinite loop on some inputs.

tomtau pushed a commit to tomtau/pest that referenced this issue Nov 20, 2022
based on the old PR by @dragostis: pest-parser#277

Changes that were made:
- debugger core context was refactored and extracted to a lib (so that it could be used in other frontends, e.g. editor plugins)
- CLI was extended using rustyline helpers to provide file completions, history etc.
- applied suggestions from @hansihe from the old PR (pest-parser#277 (comment)):
  1. added `ba` (add breakpoints at all rules) which is useful for stepping through the entire grammar, plus breakpoint deletions and loading input directly from readline;
  2. added command line arguments.
- changed the listener function to return a boolean, so that the debugger can signal back to a parsing thread to finish before reaching its input's EOF.
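The last change in the list can be pictured with a small sketch. This is not the debugger's actual code: `parse_with_listener` is a hypothetical stand-in for a parser that reports each rule it visits, and treating `true` as "stop now" is an assumption, since the commit message does not say which value means stop.

```rust
/// Walks a list of (rule name, position) pairs, standing in for a real parser.
/// The listener is called once per rule; returning `true` (assumed meaning:
/// "stop") ends the walk before the remaining steps are visited.
/// Returns the number of rules that were visited.
pub fn parse_with_listener<F>(steps: &[(&str, usize)], mut listener: F) -> usize
where
    F: FnMut(&str, usize) -> bool,
{
    let mut visited = 0;
    for &(rule, pos) in steps {
        visited += 1;
        if listener(rule, pos) {
            // The listener asked to finish early.
            break;
        }
    }
    visited
}
```

Returning a boolean keeps the listener a plain closure; the parsing side needs no shared flag or extra channel to learn that it should stop.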
tomtau pushed a commit to tomtau/pest that referenced this issue Nov 20, 2022

historyfile init
@tomtau tomtau linked a pull request Nov 20, 2022 that will close this issue
tomtau pushed a commit to tomtau/pest that referenced this issue Nov 20, 2022
Co-authored-by: Dragoș Tiselice <dragostiselice@gmail.com>
@tomtau tomtau closed this as completed in 8c602d8 Nov 24, 2022
3 participants