Parser Debugging and Diagnostics

Paul McGuire edited this page Oct 23, 2021 · 9 revisions

Diagnostic switches

Pyparsing provides the following diagnostic switches:

  • Diagnostics.warn_multiple_tokens_in_named_alternation
  • Diagnostics.warn_ungrouped_named_tokens_in_collection
  • Diagnostics.warn_name_set_on_empty_Forward
  • Diagnostics.warn_on_multiple_string_args_to_oneof
  • Diagnostics.enable_debug_on_named_expressions

All are disabled by default, but you can selectively enable them to get warnings when your parser uses techniques that may not give the results you expect.

To enable a switch, add code similar to the following to your parser code:

import pyparsing as pp
pp.enable_diag(pp.Diagnostics.warn_ungrouped_named_tokens_in_collection)

You can also enable all warnings by:

  • calling pp.enable_all_warnings()
  • running Python with the -Wd or -Wd:::pyparsing switch
  • running Python with the PYPARSINGENABLEALLWARNINGS environment variable set to a non-empty value

You can suppress all warnings by:

  • running Python with the -Wi:::pyparsing switch
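As a rough illustration of toggling a single switch from code, the sketch below enables warn_ungrouped_named_tokens_in_collection, records the resulting warning with the standard library's warnings module, and then turns the switch off again with pp.disable_diag. The expressions and variable names are made up for this example, and the exact warning text may differ between pyparsing versions.

```python
import warnings

import pyparsing as pp

pp.enable_diag(pp.Diagnostics.warn_ungrouped_named_tokens_in_collection)

word = pp.Word(pp.alphas)

with warnings.catch_warnings(record=True) as caught:
    # Record every warning, even one that was already shown earlier
    warnings.simplefilter("always")
    # Naming an And whose parts are also named triggers the diagnostic
    pair = (word("first") + word("second"))("pair")

for w in caught:
    print(w.category.__name__, w.message)

# Switch the diagnostic back off for the rest of the program
pp.disable_diag(pp.Diagnostics.warn_ungrouped_named_tokens_in_collection)
```

The diagnostics fire when the expression is built, not when it parses, so a switch has to be enabled before the grammar is defined.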

Diagnostics.warn_multiple_tokens_in_named_alternation

Enables warnings when a results name is defined on a MatchFirst or Or expression with one or more And subexpressions (only warns if __compat__.collect_all_And_tokens is False; this flag is set to True in pyparsing 3.0.0, and the compatibility behavior is no longer supported).

Diagnostics.warn_ungrouped_named_tokens_in_collection

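Enables warnings when a results name is set on a containing expression whose ungrouped subexpressions also have results names.

Here is an example of ungrouped named tokens in a collection: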

from pyparsing import pyparsing_common as ppc

term = ppc.identifier | ppc.number
# this expression has a results name, and the expressions it
# contains also have results names
eqn = (term("lhs") + '=' + term("rhs"))("eqn")

eqn.runTests("""\
    a = 1000
    """)

The resulting output is:

diag_examples.py:11: UserWarning: warn_ungrouped_named_tokens_in_collection: setting results name 'eqn' on And expression collides with 'rhs' on contained expression
  eqn = (term("lhs") + '=' + term("rhs"))("eqn")

a = 1000
['a', '=', 1000]
- eqn: ['a', '=', 1000]
- lhs: 'a'
- rhs: 1000

Note that all the results names are at the same level, with no hierarchy. If other expressions in this parser also used the names 'lhs' or 'rhs' in a similarly ungrouped structure, the names would clash, and by default only the last value parsed for each name would be reported.

The resolution for this warning is to Group eqn:

eqn = Group(term("lhs") + '=' + term("rhs"))("eqn")

Which gives this output:

a = 1000
[['a', '=', 1000]]
- eqn: ['a', '=', 1000]
  - lhs: 'a'
  - rhs: 1000

Now 'lhs' and 'rhs' are grouped under 'eqn', and would not be overwritten by other 'lhs' or 'rhs' names in other expressions.
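A short sketch of how the grouped results can be read back, assuming the same term and eqn definitions as above (the variable names result, lhs, and rhs are only for illustration):

```python
import pyparsing as pp
from pyparsing import pyparsing_common as ppc

term = ppc.identifier | ppc.number
eqn = pp.Group(term("lhs") + "=" + term("rhs"))("eqn")

result = eqn.parse_string("a = 1000")

# The inner names are reached through the 'eqn' group
lhs = result["eqn"]["lhs"]
rhs = result["eqn"]["rhs"]
print(lhs, rhs)

# They are no longer visible at the top level of the results
print("lhs" in result)
```

Because 'lhs' and 'rhs' live inside the group, another grouped expression elsewhere in the parser can reuse those names without overwriting these values.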

Diagnostics.warn_name_set_on_empty_Forward

Enables warnings when a results name is set on a Forward that has no contents defined.

This helps catch a Forward that has been declared and given a results name, but never assigned any contents:

expr = Forward()("recursive_expression")
# never used afterward
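For contrast, here is a minimal sketch of a Forward that does get its contents: the placeholder is declared, referenced inside another expression, defined with the <<= operator, and only then given a results name. The nested-parentheses grammar and all names in it are invented for this example.

```python
import pyparsing as pp

# Declare the placeholder so it can be referenced recursively
nested = pp.Forward()

LPAR, RPAR = pp.Suppress("("), pp.Suppress(")")
item = pp.Word(pp.alphas) | pp.Group(LPAR + nested + RPAR)

# Assign the contents; without this step the Forward stays empty
nested <<= pp.ZeroOrMore(item)

# Setting the results name after the contents have been assigned
named = nested("items")

result = named.parse_string("a (b (c d)) e")
print(result.as_list())
```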

Diagnostics.warn_on_multiple_string_args_to_oneof

Enables warnings when one_of is incorrectly called with multiple str arguments. A common mistake is to pass each choice as a separate argument:

direction = one_of("left", "right")

one_of takes additional keyword arguments, so Python will accept this call, but it generates the wrong expression. The correct form is:

direction = one_of("left right")

or

direction = one_of(["left", "right"])
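The difference can be checked with a small sketch like the one below. It relies on the second positional parameter of one_of being caseless in pyparsing 3, so in the mistaken call the string "right" is taken as a truthy flag rather than as a choice. The variable names are only for illustration.

```python
import pyparsing as pp

# Both recommended forms build the same two-way choice
from_string = pp.one_of("left right")
from_list = pp.one_of(["left", "right"])

matches = [
    from_string.parse_string("right")[0],
    from_list.parse_string("left")[0],
]
print(matches)

# The mistaken form is accepted by Python, but "right" is consumed
# as the caseless argument, so only "left" is a valid choice
mistaken = pp.one_of("left", "right")
try:
    mistaken.parse_string("right")
    mistaken_matched = True
except pp.ParseException:
    mistaken_matched = False
print(mistaken_matched)
```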

Diagnostics.enable_debug_on_named_expressions

After enabling this switch, all expressions that are defined with names using setName() are automatically enabled for parse-time debugging.
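A minimal sketch of the effect, using set_name(), the pyparsing 3 spelling of setName(). The debug attribute is inspected here only to show which expressions were affected; the expressions themselves are made up for the example.

```python
import pyparsing as pp

pp.enable_diag(pp.Diagnostics.enable_debug_on_named_expressions)

# Named after the switch is on, so debugging is enabled automatically
integer = pp.Word(pp.nums).set_name("integer")
# Never named, so it is left alone
word = pp.Word(pp.alphas)

print(integer.debug, word.debug)

# Parsing now prints match attempts for 'integer', but not for 'word'
(word + integer).parse_string("abc 123")

# Turn the switch off again so later expressions are unaffected
pp.disable_diag(pp.Diagnostics.enable_debug_on_named_expressions)
```

The switch only affects expressions named after it is enabled, so it needs to be set before the grammar is defined.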

Use setName(), setDebug(), and traceParseAction to monitor parsing behavior

TBD

Use ParseException.explain() to get more details

TBD

Use runTests() to run multiple tests and see where parsers fail to parse

TBD

Use - operator instead of + in selected places in your parser to improve parse error locations

TBD