Misleading debug text when encountering \r #496

Open
rowlesmr opened this issue Jul 10, 2023 · 2 comments

rowlesmr commented Jul 10, 2023

A \r in the string being parsed resets the current line in the debug output, overwriting what was already printed there.

The parsing is correct, just the explanatory text is wrong.

\r is important as a standalone character, as I need to be able to accept it as a line terminator.

from pyparsing import Opt, ParserElement, Regex

if __name__ == "__main__":

    # Only spaces and tabs are skipped as whitespace; \r and \n stay significant.
    ParserElement.set_default_whitespace_chars(" \t")
    debug = True

    # A line terminator is \r, \r\n, or \n.
    line_term = (("\r" + Opt("\n")) | "\n").set_debug(flag=debug).set_name("line_term")
    comment = (Regex("#.*(?=(\r\n?)|\n)") + line_term).set_debug(flag=debug).set_name("comment")
    string = (Regex("[a-z0-9]+") + Opt(line_term)).set_debug(flag=debug).set_name("string")
    value = (string | comment).set_debug(flag=debug).set_name("value")
    file = (value[...] + line_term[...]).set_debug(flag=debug)

    s = """#multi word comment \nval val2 \r val3\nval4  \t\n\n\r"""
    print(f"{file.parse_string(s, parse_all=True)=}")

results in (in part):

#... more stuff before
Match line_term at loc 20(1,21)
  #multi word comment 
                      ^
Matched line_term -> ['\n']
Matched comment -> ['#multi word comment ', '\n']
Matched value -> ['#multi word comment ', '\n']
Match value at loc 21(2,1)
 val3
  ^
Match string at loc 21(2,1)
 val3
  ^
Match line_term at loc 24(2,4)
 val3
     ^
Match line_term failed, ParseException raised: Expected '\r', found 'val2'  (at char 25), (line:2, col:5)
Matched string -> ['val']
Matched value -> ['val']
Match value at loc 24(2,4)
 val3
     ^
Match string at loc 25(2,5)
 val3
      ^
Match line_term at loc 29(2,9)
 val3
          ^
Matched line_term -> ['\r']
Matched string -> ['val2', '\r']
Matched value -> ['val2', '\r']
#... more stuff after
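
The echoed line in the output above contains a raw \r, which is why the text before it is no longer visible. As a minimal sketch of one way to keep such a line readable, the control characters can be escaped before the line and its caret are printed. The helpers `escape_control_chars` and `show_position` are hypothetical names used for illustration only; they are not part of pyparsing and do not describe how its debug output is implemented.

```python
def escape_control_chars(text: str) -> str:
    """Replace non-printable characters with visible escapes, e.g. CR -> \\r."""
    return "".join(
        ch if ch.isprintable() else ch.encode("unicode_escape").decode("ascii")
        for ch in text
    )


def show_position(line: str, col: int) -> str:
    """Return the escaped line with a caret under the character at 1-based col."""
    # Escape the prefix separately so the caret still lines up after
    # one-character controls have been widened to two-character escapes.
    prefix = escape_control_chars(line[: col - 1])
    return f"{escape_control_chars(line)}\n{' ' * len(prefix)}^"


# Hypothetical stand-in for the second line of the sample input.
line = "val val2 \r val3"
print(show_position(line, 10))
# val val2 \r val3
#          ^
```

With the carriage return shown as the two characters `\r`, the whole line stays on screen and the caret points at the terminator itself.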
ptmcg (Member) commented Jul 10, 2023

Interesting issue. Could you also please supply a small sample string I can use for s? Probably paste the repr of the string so that the control characters show up properly.

rowlesmr (Author) commented

One string:

s="""#multi word comment \nval val2 \r val3\nval4 \t\n\n\r"""
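
For reference, a short sketch of how the repr of this string can be produced so that the control characters are visible as escapes instead of being acted on by the terminal. It uses only the standard library; the variable names `visible` and `segments` are illustrative.

```python
s = """#multi word comment \nval val2 \r val3\nval4 \t\n\n\r"""

# repr() shows \n, \r and \t as escape sequences instead of emitting them.
visible = repr(s)
print(visible)

# str.splitlines() treats a lone \r as a line boundary, just like \n and \r\n.
segments = s.splitlines(keepends=True)
for segment in segments:
    print(repr(segment))
```

The second loop shows that "val val2 " and " val3" fall on separate lines once a standalone \r is accepted as a terminator.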
