Inconsistency between docs and behaviour concerning whitespace (pyparsing 2.4.7 -> 3.0 regression) #317

Closed
exhuma opened this issue Oct 25, 2021 · 10 comments

@exhuma

exhuma commented Oct 25, 2021

I ran into a regression in pyparsing 3.0.1 which broke one of our parsers. The change in behaviour was not listed in the changelog. So it's either a regression or an oversight in the changelog 😉

The following code worked in pyparsing 2.4.7 but it fails in pyparsing 3.0.1 (note the leading whitespace):

from pyparsing import Word, StringStart, StringEnd

P_MTARG = (
    StringStart()
    + Word("abcde")
    + StringEnd()
)
print(P_MTARG.parseString("    aaa"))

To fix the issue, I had to match the leading whitespace explicitly:

from pyparsing import Word, StringStart, Suppress, ZeroOrMore, White, StringEnd
P_MTARG = (
    StringStart()
    + Suppress(ZeroOrMore(White()))
    + Word("abcde")
    + StringEnd()
)
print(P_MTARG.parseString("    aaa"))

This is in conflict with the documentation, which states (emphasis mine):

White - also similar to Word, but matches whitespace characters. Not usually needed, as whitespace is implicitly ignored by pyparsing. However, some grammars are whitespace-sensitive, such as those that use leading tabs or spaces to indicate grouping or hierarchy. (If matching on tab characters, be sure to call parse_with_tabs on the top-level parse element.)

See https://pyparsing-docs.readthedocs.io/en/latest/HowToUsePyparsing.html#basic-parserelement-subclasses

@heltluke

Might also be responsible for errors related to matplotlib's mathtext https://matplotlib.org/stable/tutorials/text/mathtext.html

@ptmcg
Member

ptmcg commented Oct 25, 2021

> Might also be responsible for errors related to matplotlib's mathtext https://matplotlib.org/stable/tutorials/text/mathtext.html

Can you send me examples of mathtext errors? They may be related, but they might be something else.

@heltluke

heltluke commented Oct 25, 2021

Sure, sorry. A minimal example would be

import matplotlib.pyplot as plt

plt.plot(range(5), '.')
plt.title(r"$x \cdot y$")
plt.show()

which presently fails with ParseFatalException: Unknown symbol: \cdot, found '\' (at char 2), (line:1, col:3).

Happy to create a separate issue if it's indeed not at all related.

@ptmcg
Member

ptmcg commented Oct 25, 2021

Yes, this looks different, please open a new issue. I'll have to chase down the mathtext parser code too. (I thought matplotlib used a vendored pyparsing though - maybe that changed recently.)

@ptmcg
Member

ptmcg commented Oct 26, 2021

I'm going to choose to support both behaviors in the next release (3.0.2). I will revert the handling of LineStart + expr to how things worked before (so no intervening Optional(White()) will be necessary), and for those parsers that must have the expr start at the beginning of the line, they can use the new class AtLineStart, as AtLineStart(expr). This will make it much clearer that AtLineStart modifies the behavior of expr, whereas in LineStart() + expr, the two are parsed separately, allowing for whitespace.
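
As a rough illustration of the distinction described above, the two forms could be compared as follows. This is only a sketch and assumes pyparsing 3.0.2 or later, where AtLineStart is available; the `matches` helper is hypothetical and exists just for this example.

```python
from pyparsing import AtLineStart, LineStart, ParseException, Word

word = Word("abcde")

# LineStart() + expr: the two are parsed separately, so whitespace
# between the start of the line and the word is allowed.
loose = LineStart() + word

# AtLineStart(expr): expr itself has to begin in the first column.
strict = AtLineStart(word)


def matches(parser, text):
    """Hypothetical helper: True if `parser` accepts `text`, else False."""
    try:
        parser.parseString(text)
        return True
    except ParseException:
        return False


for text in ("aaa", "    aaa"):
    print(repr(text), "loose:", matches(loose, text), "strict:", matches(strict, text))
```

Under those assumptions, only the strict form should reject the indented input.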

If you are modifying your parser code to handle this change for 3.0.0 by inserting a White() expression, make it Optional(White()) or Empty() so that it will be compatible with the previous code, 3.0.0, and 3.0.2 and beyond.
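
A minimal sketch of that version-tolerant workaround, applied to the parser from the original report. Wrapping the optional whitespace in Suppress is an assumption added here (it is not part of the advice above), so that any whitespace that does get matched stays out of the results.

```python
from pyparsing import Optional, StringEnd, StringStart, Suppress, White, Word

P_MTARG = (
    StringStart()
    # Optional, so the parser also works on versions that already skip
    # leading whitespace. Suppress is an assumption of this sketch: it
    # drops the whitespace token if one is matched.
    + Suppress(Optional(White()))
    + Word("abcde")
    + StringEnd()
)

tokens = P_MTARG.parseString("    aaa").asList()
print(tokens)
```

The same grammar should also accept input without leading whitespace, since the whitespace element is optional.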

Sorry for springing this surprise on you. I think this finally gets LineStart where it should be.

@ptmcg
Member

ptmcg commented Oct 27, 2021

3.0.2 caused some other problems, please revisit after updating to 3.0.3.

@ptmcg
Member

ptmcg commented Oct 29, 2021

Have you been able to retry this with 3.0.3?

@exhuma
Author

exhuma commented Oct 30, 2021

@ptmcg I'm not at the office at the moment. I will give it a try as soon as I can.

@exhuma
Author

exhuma commented Nov 2, 2021

@ptmcg I can confirm that the issue is resolved in pyparsing>=3.0.2. In other words, the issue only exists in pyparsing==3.0.1

I tested 3.0.2, 3.0.3 and 3.0.4

@ptmcg
Member

ptmcg commented Nov 2, 2021

That's great news! I'll go ahead and close this issue then.
