Releases: pyparsing/pyparsing

pyparsing 3.0.5

07 Nov 17:37
  • Added return type annotations for col, line, and lineno.

  • Fixed bug in which the warn_ungrouped_named_tokens_in_collection warning was raised when assigning a results name to an original_text_for expression. (Issue #110, would raise warning in packaging.)

  • Fixed internal bug where ParserElement.streamline() would not return self if already streamlined.

  • Changed run_tests() output to default to not showing line and column numbers. If line numbering is desired, call with with_line_numbers=True. Also fixed minor bug where separating line was not included after a test failure.

pyparsing 3.0.4

30 Oct 08:32
  • Fixed bug in which Dict classes did not correctly return tokens as nested ParseResults, reported by and fix identified by Bu Sun Kim, many thanks!!!

  • Documented API-changing side-effect of converting ParseResults to use __slots__ to pre-define instance attributes. This means that code written like this (which was allowed in pyparsing 2.4.7):

    result = Word(alphas).parseString("abc")
    result.xyz = 100
    

    now raises this Python exception:

    AttributeError: 'ParseResults' object has no attribute 'xyz'
    

    To add new attribute values to ParseResults object in 3.0.0 and later, you must assign them using indexed notation:

    result["xyz"] = 100
    

    You will still be able to access this new value as an attribute or as an indexed item.
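
    Put together as a runnable sketch (assuming pyparsing 3.0.0 or later), the rejected attribute assignment and the supported indexed assignment look like this:

```python
import pyparsing as pp

result = pp.Word(pp.alphas).parse_string("abc")

try:
    result.xyz = 100      # allowed in 2.4.7, rejected once __slots__ is in place
except AttributeError as err:
    print(f"attribute assignment failed: {err}")

result["xyz"] = 100       # indexed notation attaches the new value

# The value is readable both as an attribute and as an indexed item.
print(result.xyz, result["xyz"])
```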

  • Fixed bug in railroad diagramming where the vertical limit would count all expressions in a group, not just those that would create visible railroad elements.

pyparsing 3.0.3

27 Oct 18:29
  • Fixed regex typo in one_of fix for as_keyword=True.

  • Fixed a whitespace-skipping bug, Issue #319, introduced as part of the revert of the LineStart changes. Reported by Marc-Alexandre Côté, thanks!

  • Added header column labeling > 100 in with_line_numbers - some input lines are longer than others.

pyparsing 3.0.2

27 Oct 12:14
  • Reverted change in behavior with LineStart and StringStart, which changed the interpretation of when and how LineStart and StringStart should match when a line starts with spaces. In 3.0.0, the xxxStart expressions were not really treated like expressions in their own right, but as modifiers to the following expression when used like LineStart() + expr, so that if there were whitespace on the line before expr (which would match in versions prior to 3.0.0), the match would fail.

    3.0.0 implemented this by automatically promoting LineStart() + expr to AtLineStart(expr), which broke existing parsers that did not expect expr to necessarily be right at the start of the line, but only be the first token found on the line. This was reported as a regression in Issue #317.

    In 3.0.2, pyparsing reverts to the previous behavior, but retains the new AtLineStart and AtStringStart expression classes, so that parsers can choose whichever behavior applies in their specific instance. Specifically:

    # matches expr if it is the first token on the line (allows for leading whitespace)
    LineStart() + expr
    
    # matches only if expr is found in column 1
    AtLineStart(expr)
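
    The difference can be checked with a short sketch, assuming pyparsing 3.0.2 or later. The `word` expression and the input string are illustrative only.

```python
import pyparsing as pp

word = pp.Word(pp.alphas)
indented = "   abc"   # token preceded by leading whitespace

# LineStart() + expr: expr only has to be the first token on the line.
first_token = (pp.LineStart() + word).parse_string(indented)
print(first_token.as_list())

# AtLineStart(expr): expr must begin in column 1, so the indented input fails.
try:
    pp.AtLineStart(word).parse_string(indented)
    at_line_start_matched = True
except pp.ParseException:
    at_line_start_matched = False
print(at_line_start_matched)

# With no leading whitespace, AtLineStart matches.
column_one = pp.AtLineStart(word).parse_string("abc")
print(column_one.as_list())
```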
    
  • Performance enhancement to one_of to always generate an internal Regex, even if caseless or as_keyword args are given as True (unless explicitly disabled by passing use_regex=False).
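
    A hedged sketch of the three arguments mentioned above, assuming pyparsing 3.0.2 or later. The word lists are made up, and nothing here inspects which internal expression type one_of actually builds.

```python
import pyparsing as pp

# as_keyword=True: alternatives match only as whole words.
keyword = pp.one_of("if else while", as_keyword=True)
whole_word = keyword.parse_string("if x")
print(whole_word.as_list())

try:
    keyword.parse_string("iffy")
    matched_prefix = True
except pp.ParseException:
    matched_prefix = False   # "if" embedded in "iffy" is not a keyword match
print(matched_prefix)

# caseless=True: the case of the input is ignored when matching.
verb = pp.one_of("SELECT INSERT", caseless=True)
caseless_match = verb.parse_string("select")
print(caseless_match.as_list())

# use_regex=False explicitly opts out of the internal Regex.
no_regex = pp.one_of("if else while", as_keyword=True, use_regex=False)
fallback_match = no_regex.parse_string("else")
print(fallback_match.as_list())
```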

  • IndentedBlock class now works with recursive flag. By default, the results parsed by an IndentedBlock are grouped. This can be disabled by constructing the IndentedBlock with grouped=False.

pyparsing 3.0.1

24 Oct 18:17
  • Fixed bug where Word(max=n) did not match word groups less than length 'n'. Thanks to Joachim Metz for catching this!

  • Fixed bug where ParseResults accidentally created recursive contents. Joachim Metz on this one also!

  • Fixed bug where warn_on_multiple_string_args_to_oneof warning is raised even when not enabled.

pyparsing 3.0.0

23 Oct 17:16

Version 3.0.0 -

Version 3.0.0.final -

  • Added support for python -W warning option to call enable_all_warnings() at startup. Also detects setting of PYPARSINGENABLEALLWARNINGS environment variable to any non-blank value.

  • Fixed named results returned by url to match fields as they would be parsed using urllib.parse.urlparse.
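
    As a rough illustration (assuming pyparsing 3.0.0 or later), the named results of the url expression can be compared side by side with urllib.parse.urlparse. The address below is a made-up example.

```python
from urllib.parse import urlparse

import pyparsing as pp

# Illustrative URL, not taken from the release notes.
address = "https://www.example.com:8080/docs/index.html?q=pyparsing#intro"

parsed = pp.pyparsing_common.url.parse_string(address)
reference = urlparse(address)

# Fields that carry the same name in both results.
for field in ("scheme", "path", "query", "fragment"):
    print(field, parsed[field], getattr(reference, field))

# pyparsing reports host and port as separate named results.
print(parsed["host"], parsed["port"])
```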

  • Early response to with_line_numbers was positive, with some requested enhancements:
    . added a trailing "|" at the end of each line (to show presence of trailing spaces); can be customized using eol_mark argument
    . added expand_tabs argument, to control calling str.expandtabs (defaults to True to match parseString)
    . added mark_spaces argument to support display of a printing character in place of spaces, or Unicode symbols for space and tab characters
    . added mark_control argument to support highlighting of control characters using '.' or Unicode symbols, such as "␍" and "␊".
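
    A small sketch of the eol_mark and mark_spaces arguments listed above, assuming pyparsing 3.0.0 or later. The sample text and the chosen marker characters are arbitrary.

```python
import pyparsing as pp

ppt = pp.testing
text = "ab cd  \n\tef"   # illustrative input: trailing spaces and a tab

# Default rendering: tabs expanded, "|" appended to mark each end of line.
default_view = ppt.with_line_numbers(text)
print(default_view)

# Custom end-of-line marker, with spaces shown as a printing character.
marked_view = ppt.with_line_numbers(text, eol_mark="$", mark_spaces=".")
print(marked_view)
```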

  • Modified helpers common_html_entity and replace_html_entity() to use the HTML entity definitions from html.entities.html5.

  • Updated the class diagram in the pyparsing docs directory, along with the supporting .puml file (PlantUML markup) used to create the diagram.

  • Added global method autoname_elements() to call set_name() on all locally defined ParserElements that haven't been explicitly named using set_name(), using their local variable name. Useful for setting names on multiple elements when creating a railroad diagram.

    a = pp.Literal("a")
    b = pp.Literal("b").set_name("bbb")
    pp.autoname_elements()
    

    a will get named "a", while b will keep its name "bbb".

pyparsing 3.0.0rc2

02 Oct 05:40
Pre-release
  • Added url expression to pyparsing_common. (Sample code posted by Wolfgang Fahl, very nice!)

    This new expression has been added to the urlExtractorNew.py example, to show how it extracts URL fields into separate results names.

  • Added method to pyparsing_testing to help debugging, with_line_numbers. Returns a string with line and column numbers corresponding to values shown when parsing with expr.set_debug():

    import pyparsing as pp
    ppt = pp.testing

    data = """\
       A
          100"""
    expr = pp.Word(pp.alphanums).set_name("word").set_debug()
    print(ppt.with_line_numbers(data))
    expr[...].parseString(data)
    

    prints:

                  1
         1234567890
       1:   A
       2:      100
      Match word at loc 3(1,4)
           A
           ^
      Matched word -> ['A']
      Match word at loc 11(2,7)
              100
              ^
      Matched word -> ['100']
    
  • Added new example cuneiform_python.py to demonstrate creating a new Unicode range, and writing a Cuneiform->Python transformer (inspired by zhpy).

  • Fixed issue #272, reported by PhasecoreX, when LineStart() expressions would match expressions that were not necessarily at the beginning of a line.

    As part of this fix, two new classes have been added: AtLineStart and AtStringStart.
    The following expressions are equivalent:

    LineStart() + expr      and     AtLineStart(expr)
    StringStart() + expr    and     AtStringStart(expr)
    
  • Fixed ParseFatalExceptions failing to override normal exceptions or expression matches in MatchFirst expressions. Addresses issue #251, reported by zyp-rgb.

  • Fixed bug in which ParseResults replaces a collection type value with an invalid type annotation (changed behavior in Python 3.9). Addresses issue #276, reported by Rob Shuler, thanks.

  • Fixed bug in ParseResults when calling __getattr__ for special double-underscored methods. Now raises AttributeError for non-existent results when accessing a name starting with '__'. Addresses issue #208, reported by Joachim Metz.

  • Modified debug fail messages to include the expression name to make it easier to sync up match vs success/fail debug messages.

pyparsing 3.0.0rc1

09 Sep 02:10
Pre-release
  • Railroad diagrams have been reformatted:
    . creating diagrams is easier - call

      expr.create_diagram("diagram_output.html")
    

    create_diagram() takes 3 arguments:
      . the filename to write the diagram HTML
      . optional 'vertical' argument, to specify the minimum number of items in a path to be shown vertically; default=3
      . optional 'show_results_names' argument, to specify whether results name annotations should be shown; default=False

    . every expression that gets a name using setName() gets separated out as a separate subdiagram
    . results names can be shown as annotations to diagram items
    . Each, FollowedBy, and PrecededBy elements get [ALL], [LOOKAHEAD], and [LOOKBEHIND] annotations
    . removed annotations for Suppress elements
    . some diagram cleanup when a grammar contains Forward elements
    . check out the examples make_diagram.py and railroad_diagram_demo.py

  • Type annotations have been added to most public API methods and classes.

  • Better exception messages to show full word where an exception occurred.

    Word(alphas)[...].parseString("abc 123", parseAll=True)
    

    Was:

    pyparsing.ParseException: Expected end of text, found '1'  (at char 4), (line:1, col:5)
    

    Now:

    pyparsing.exceptions.ParseException: Expected end of text, found '123'  (at char 4), (line:1, col:5)
    
  • Suppress can be used to suppress text skipped using "...".

    source = "lead in START relevant text END trailing text"
    start_marker = Keyword("START")
    end_marker = Keyword("END")
    find_body = Suppress(...) + start_marker + ... + end_marker
    print(find_body.parseString(source).dump())
    

    Prints:

    ['START', 'relevant text ', 'END']
    - _skipped: ['relevant text ']
    
  • New string constants identchars and identbodychars to help in defining identifier Word expressions

    Two new module-level strings have been added to help when defining identifiers, identchars and identbodychars.

    Instead of writing:

    import pyparsing as pp
    identifier = pp.Word(pp.alphas + "_", pp.alphanums + "_")
    

    you will be able to write:

    identifier = pp.Word(pp.identchars, pp.identbodychars)
    

    Those constants have also been added to all the Unicode string classes:

    import pyparsing as pp
    ppu = pp.pyparsing_unicode
    
    cjk_identifier = pp.Word(ppu.CJK.identchars, ppu.CJK.identbodychars)
    greek_identifier = pp.Word(ppu.Greek.identchars, ppu.Greek.identbodychars)
    
  • Added a caseless parameter to the CloseMatch class to allow for casing to be ignored when checking for close matches. (Issue #281) (PR by Adrian Edwards, thanks!)
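
    A brief sketch of the caseless parameter, assuming pyparsing 3.0.0rc1 or later. The target sequence and the lower-case input are illustrative only.

```python
import pyparsing as pp

target = "ATCATCGAATGGA"          # illustrative pattern
lowercase_input = "atcatcgaatgga"

# Default is case-sensitive: every character differs, so the match fails.
try:
    pp.CloseMatch(target).parse_string(lowercase_input)
    strict_matched = True
except pp.ParseException:
    strict_matched = False
print(strict_matched)

# caseless=True compares characters without regard to case.
relaxed = pp.CloseMatch(target, caseless=True).parse_string(lowercase_input)
print(relaxed[0], relaxed["mismatches"])
```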

  • Fixed bug in Located class when used with a results name. (Issue #294)

  • Fixed bug in QuotedString class when the escaped quote string is not a repeated character. (Issue #263)

  • parseFile() and create_diagram() methods now will accept pathlib.Path arguments.
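
    For the file-parsing half of this change, a minimal sketch using a temporary file (the file name and contents are hypothetical) might look like this:

```python
import tempfile
from pathlib import Path

import pyparsing as pp

words = pp.Word(pp.alphas)[1, ...]

with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "sample.txt"   # hypothetical input file
    source.write_text("alpha beta gamma\n", encoding="utf-8")

    # The pathlib.Path is passed directly; no str() conversion is needed.
    tokens = words.parse_file(source)

print(tokens.as_list())
```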

pyparsing_3.0.0b3

08 Aug 13:54
Pre-release
  • PEP-8 compatible names are being introduced in pyparsing version 3.0!
    All methods such as parseString have been replaced with the PEP-8
    compliant name parse_string. In addition, arguments such as parseAll
    have been renamed to parse_all. For backward-compatibility, synonyms for
    all renamed methods and arguments have been added, so that existing
    pyparsing parsers will not break. These synonyms will be removed in a future
    release.

    In addition, the Optional class has been renamed to Opt, since it clashes
    with the common typing.Optional type specifier that is used in the Python
    type annotations. A compatibility synonym is defined for now, but will be
    removed in a future release.
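
    A short sketch of the renamed API next to the compatibility synonyms, assuming a pyparsing 3.x release in which the synonyms are still present. The grammar is illustrative.

```python
import pyparsing as pp

# Opt is the new name for the Optional class.
greeting = pp.Word(pp.alphas) + pp.Opt(",") + pp.Word(pp.alphas)

# PEP-8 style method and argument names.
new_style = greeting.parse_string("Hello, World", parse_all=True)

# Pre-3.0 camelCase synonyms are accepted for backward compatibility.
old_style = greeting.parseString("Hello, World", parseAll=True)

print(new_style.as_list())
print(new_style.as_list() == old_style.as_list())
```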

  • HUGE NEW FEATURE - Support for left-recursive parsers!
    Following the method used in Python's PEG parser, pyparsing now supports
    left-recursive parsers when left recursion is enabled.

      import pyparsing as pp
      pp.ParserElement.enable_left_recursion()
    
      # a common left-recursion definition
      # define a list of items as 'list + item | item'
      # BNF:
      #   item_list := item_list item | item
      #   item := word of alphas
      item_list = pp.Forward()
      item = pp.Word(pp.alphas)
      item_list <<= item_list + item | item
    
      item_list.run_tests("""\
          To parse or not to parse that is the question
          """)
    

    Prints:

      ['To', 'parse', 'or', 'not', 'to', 'parse', 'that', 'is', 'the', 'question']
    

    Great work contributed by Max Fischer!

  • delimited_list now supports an additional flag allow_trailing_delim,
    to optionally parse an additional delimiter at the end of the list.
    Contributed by Kazantcev Andrey, thanks!
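
    A minimal sketch of the flag, assuming a pyparsing 3.x release that still provides delimited_list. The input string is made up.

```python
import pyparsing as pp

integer = pp.Word(pp.nums)
text = "1, 2, 3,"   # note the trailing delimiter

# With the flag, the trailing "," is consumed as part of the list.
lenient = pp.delimited_list(integer, allow_trailing_delim=True)
lenient_result = lenient.parse_string(text, parse_all=True)
print(lenient_result.as_list())

# Without it, the trailing "," is left over and parse_all=True fails.
strict = pp.delimited_list(integer)
try:
    strict.parse_string(text, parse_all=True)
    strict_matched = True
except pp.ParseException:
    strict_matched = False
print(strict_matched)
```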

  • Removed internal comparison of results values against b"", which
    raised a BytesWarning when run with python -bb. Fixes issue #271 reported
    by Florian Bruhin, thank you!

  • Fixed STUDENTS table in sql2dot.py example, fixes issue #261 reported by
    legrandlegrand - much better.

  • Python 3.5 will not be supported in the pyparsing 3 releases. This will allow
    for future pyparsing releases to add parameter type annotations, and to take
    advantage of dict key ordering in internal results name tracking.

Pyparsing 3.0.0b2

30 Dec 22:17
Pre-release
  • API CHANGE
    locatedExpr is being replaced by the class Located. Located has the same constructor interface as locatedExpr, but fixes bugs in the returned ParseResults when the searched expression contains multiple tokens, or has internal results names.

    locatedExpr is deprecated, and will be removed in a future release.
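
    As a rough sketch of the replacement class (assuming pyparsing 3.0.0b2 or later), Located wraps an expression and reports the start location, the matched tokens, and the end location. The `word` expression and input are illustrative.

```python
import pyparsing as pp

word = pp.Word(pp.alphas)
located = pp.Located(word).parse_string("hello")

# Positional view: start location, matched tokens, end location.
print(located.as_list())

# The same three values are available under results names.
print(located["locn_start"], located["value"].as_list(), located["locn_end"])
```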