[1.7.1] - 2020-06-07 - Ammar Ali
- Support for literals that include the unescaped delimiters {, }, and ]. These delimiters are informally supported by various regexp engines.
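As a rough illustration of what "informally supported" means, the sketch below uses only Ruby's built-in Regexp (not this gem's API) to show the three delimiters matching as plain literals. The constant names are made up for this example.

```ruby
# Plain-Ruby sketch: unescaped "{", "}" and "]" are treated as literals
# when they do not form a valid quantifier or close a set.
# The constant names are illustrative only.
OPEN_BRACE_LITERAL  = Regexp.new('a{')
CLOSE_BRACE_LITERAL = Regexp.new('a}')
CLOSE_SET_LITERAL   = Regexp.new('a]') # Ruby may print a warning for this one

puts OPEN_BRACE_LITERAL.match?('a{')
puts CLOSE_BRACE_LITERAL.match?('a}')
puts CLOSE_SET_LITERAL.match?('a]')
```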
[1.7.0] - 2020-02-23 - Janosch Müller
- Expression#each_expression and #traverse can now be called without a block
  - this returns an Enumerator and allows chaining, e.g. each_expression.select
  - thanks to Masataka Kuwabara
- MatchLength#each no longer ignores the given limit: when called without a block
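The block-less form follows a common Ruby idiom. The sketch below shows that idiom on a hypothetical SketchNode class; it is a stand-in written for this example, not the gem's Expression class, and the gem's actual implementation may differ.

```ruby
# Hypothetical stand-in for an expression tree node (not the gem's class).
class SketchNode
  attr_reader :text, :children

  def initialize(text, children = [])
    @text = text
    @children = children
  end

  # Yields every descendant depth-first. Without a block, returns an
  # Enumerator so that calls such as #select can be chained.
  def each_expression(&block)
    return enum_for(__method__) unless block

    children.each do |child|
      yield child
      child.each_expression(&block)
    end
  end
end

SKETCH_TREE = SketchNode.new('(a|b)c', [
  SketchNode.new('(a|b)', [SketchNode.new('a'), SketchNode.new('b')]),
  SketchNode.new('c')
])

# Chaining works because each_expression returned an Enumerator.
SKETCH_LEAVES = SKETCH_TREE.each_expression
                           .select { |node| node.children.empty? }
                           .map(&:text)
p SKETCH_LEAVES
```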
[1.6.0] - 2019-06-16 - Janosch Müller
- Added support for 16 new unicode properties introduced in Ruby 2.6.2 and 2.6.3
[1.5.1] - 2019-05-23 - Janosch Müller
- Fixed #options (and thus #i?, #u? etc.) not being set for some expressions:
  - this affected posix classes as well as alternation, conditional, and intersection branches
  - #options was already correct for all child expressions of such branches
  - this only made an operational difference for posix classes as they respect encoding flags
- Fixed #options not respecting all negative options in weird cases like '(?u-m-x)'
- Fixed Group#option_changes not accounting for indirectly disabled (overridden) encoding flags
- Fixed Scanner allowing negative encoding options if there were no positive options, e.g. '(?-u)'
- Fixed ScannerError for some valid meta/control sequences such as '\C-\\'
- Fixed Expression#match and #=~ not working with a single argument
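For context on the two option-related fixes, the sketch below checks how Ruby's own Regexp engine treats these option groups. It uses only the standard library; valid_regexp_source? is a helper invented for this example.

```ruby
# Returns true if Ruby's built-in Regexp engine accepts the pattern source.
# Helper name is made up for this sketch.
def valid_regexp_source?(source)
  Regexp.new(source)
  true
rescue RegexpError
  false
end

# An encoding flag may be switched on while m and x are switched off.
puts valid_regexp_source?('(?u-m-x)a')

# Encoding flags cannot be negated, so a scanner should reject this too.
puts valid_regexp_source?('(?-u)a')
```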
[1.5.0] - 2019-05-14 - Janosch Müller
- Added #referenced_expression for backrefs, subexp calls and conditionals
  - returns the Group expression that is being referenced via name or number
- Added Expression#repetitions
  - returns a Range of allowed repetitions (1..1 if there is no quantifier)
  - like #quantity but with a more uniform interface
- Added Expression#match_length
  - allows inspecting and iterating over the String lengths matched by the Expression
- Fixed Expression#clone "direction"
  - it used to dup ivars onto the callee, leaving only the clone referencing the original objects
  - this will affect you if you call #eql?/#equal? on expressions or use them as Hash keys
- Fixed #clone results for Sequences, e.g. alternations and conditionals
  - the inner #text was cloned onto the Sequence and thus duplicated
  - e.g. Regexp::Parser.parse(/(a|bc)/).clone.to_s # => (aa|bcbc)
- Fixed inconsistent #to_s output for Sequences
  - it used to return only the "specific" text, e.g. "|" for an alternation
  - now it includes nested expressions as it does for all other Subexpressions
- Fixed quantification of codepoint lists with more than one entry (\u{62 63 64}+)
  - quantifiers apply only to the last entry, so this token is now split up if quantified
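The codepoint list fix mirrors how Ruby itself treats such patterns. The sketch below uses only the built-in Regexp to show that the quantifier binds to the last codepoint; the constant name is illustrative.

```ruby
# \u{62 63 64} is "bcd"; the "+" applies to the final "d" only,
# so the pattern behaves like /bcd+/ rather than /(?:bcd)+/.
CODEPOINT_LIST_PLUS = /\u{62 63 64}+/

p 'bcddd'[CODEPOINT_LIST_PLUS]  # the trailing "d" may repeat
p 'bcdbcd'[CODEPOINT_LIST_PLUS] # the whole list does not repeat
```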
[1.4.0] - 2019-04-02 - Janosch Müller
- Added support for 19 new unicode properties introduced in Ruby 2.6.0
[1.3.0] - 2018-11-14 - Janosch Müller
- Syntax#features returns a Hash of all types and tokens supported by a given Syntax
- Thanks to Akira Matsuda
  - eliminated warning "assigned but unused variable - testEof"
[1.2.0] - 2018-09-28 - Janosch Müller
- Subexpression (branch node) includes Enumerable, allowing you to #select children etc.
- Fixed missing quantifier in Conditional::Expression methods #to_s, #to_re
- Conditional::Condition no longer lives outside the recursive #expressions tree
  - it used to be the only expression stored in a custom ivar, complicating traversal
  - its setter and getter (#condition=, #condition) still work as before
[1.1.0] - 2018-09-17 - Janosch Müller
- Added Quantifier methods #greedy?, #possessive?, #reluctant?/#lazy?
- Added Group::Options#option_changes
  - shows the options enabled or disabled by the given options group
  - as with all other expressions, #options shows the overall active options
- Added Conditional#reference and Condition#reference, indicating the determinative group
- Added Subexpression#dig, acts like Array#dig
- Fixed parsing of quantified conditional expressions (quantifiers were assigned to the wrong expression)
- Fixed scanning and parsing of forward-referring subexpression calls (e.g. \g<+1>)
- Root and Sequence expressions now support the same constructor signature as all other expressions
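For readers unsure what the three quantifier modes named above mean, the sketch below demonstrates them with Ruby's built-in Regexp only. It does not call the gem's Quantifier methods, and the constant names are made up for this example.

```ruby
# Greedy: takes as much as possible, but gives characters back if needed.
GREEDY_MATCH = 'aaa'[/a+/]
GREEDY_WITH_BACKTRACKING = 'aaa'[/a+a/]

# Reluctant (a.k.a. lazy): takes as little as possible.
LAZY_MATCH = 'aaa'[/a+?/]

# Possessive: takes as much as possible and never gives anything back,
# so the trailing "a" can no longer match.
POSSESSIVE_MATCH = 'aaa'[/a++a/]

p GREEDY_MATCH, GREEDY_WITH_BACKTRACKING, LAZY_MATCH, POSSESSIVE_MATCH
```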
[1.0.0] - 2018-09-01 - Janosch Müller
This release includes several breaking changes, mostly to character sets, #map and properties.
- Changed handling of sets (a.k.a. character classes or "bracket expressions")
  - see PR #55 / issue #47 for details
  - sets are now parsed to expression trees like other nestable expressions
  - #scan now emits the same tokens as outside sets (no longer :set, :member)
  - CharacterSet#members has been removed
  - new Range and Intersection classes represent corresponding syntax features
  - a new PosixClass expression class represents e.g. [[:ascii:]]
    - PosixClass instances behave like Property ones, e.g. support #negative?
    - #scan emits :(non)posixclass, :<type> instead of :set, :char_(non)<type>
- Changed Subexpression#map to act like regular Enumerable#map
  - the old behavior is available as Subexpression#flat_map
  - e.g. parse(/[a]/).map(&:to_s) == ["[a]"]; used to be ["[a]", "a"]
- Changed expression emissions for some escape sequences
  - EscapeSequence::Codepoint, CodepointList, Hex and Octal are now all used
  - they already existed, but were all parsed as EscapeSequence::Literal
  - e.g. \x97 is now EscapeSequence::Hex instead of EscapeSequence::Literal
- Changed naming of many property tokens (emitted for \p{...})
  - if you work with these tokens, see PR #56 for details
  - e.g. :punct_dash is now :dash_punctuation
- Changed (?m) and the likes to emit as :options_switch token (@4ade4d1)
  - allows differentiating from group-local :options, e.g. (?m:.)
- Changed name of Backreference::..NestLevel to ..RecursionLevel (@4184339)
- Changed Backreference::Number#number from String to Integer (@40a2231)
- Added support for all previously missing properties (about 250)
- Added Expression::UnicodeProperty#shortcut (e.g. returns "m" for \p{mark})
- Added #char(s) and #codepoint(s) methods to all EscapeSequence expressions
- Added #number/#name/#recursion_level to all backref/call expressions (@174bf21)
- Added #number and #number_at_level to capturing group expressions (@40a2231)
- Fixed Ruby version mapping of some properties
- Fixed scanning of some property spellings, e.g. with dashes
- Fixed some incorrect property alias normalizations
- Fixed scanning of codepoint escapes with 6 digits (e.g. \u{10FFFF})
- Fixed scanning of \R and \X within sets; they act as literals there
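The distinction between an options switch such as (?m) and a group-local options group such as (?m:.) is easiest to see in matching behavior. The sketch below uses Ruby's built-in Regexp only; the constant names are illustrative and the token names in the comments are the ones listed above.

```ruby
# Group-local options (:options token): m applies only inside the group,
# so the final "." still refuses to match a newline.
GROUP_LOCAL_M = /a(?m:.)b./

# Options switch (:options_switch token): m applies to everything that
# follows, so both dots may match a newline.
SWITCHED_M = /a(?m).b./

p GROUP_LOCAL_M.match?("a\nb\n")
p GROUP_LOCAL_M.match?("a\nbc")
p SWITCHED_M.match?("a\nb\n")
```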
[0.5.0] - 2018-04-29 - Janosch Müller
- Changed handling of Ruby versions (PR #53)
  - New Ruby versions are now supported by default
  - Some deep-lying APIs have changed, which should not affect most users:
    - Regexp::Syntax::VERSIONS is gone
    - Syntax version names have changed from Regexp::Syntax::Ruby::Vnnn to Regexp::Syntax::Vn_n_n
    - Syntax version classes for Ruby versions without regex feature changes are no longer predefined and are now only created on demand / lazily
    - Regexp::Syntax::supported? returns true for any argument >= 1.8.6
- Fixed some use cases of Expression methods #strfregexp and #to_h (@e738107)
- Added full signature support to collection methods of Expressions (@aa7c55a)
[0.4.13] - 2018-04-04 - Ammar Ali
- Added ruby version files for 2.2.10 and 2.3.7
[0.4.12] - 2018-03-30 - Janosch Müller
- Added ruby version files for 2.4.4 and 2.5.1
[0.4.11] - 2018-03-04 - Janosch Müller
- Fixed UnknownSyntaxNameError introduced in v0.4.10 if the gem's parent dir tree included a 'ruby' dir
[0.4.10] - 2018-03-04 - Janosch Müller
- Added ruby version file for 2.6.0
- Added support for Emoji properties (available in Ruby since 2.5.0)
- Added support for XPosixPunct and Regional_Indicator properties
- Fixed parsing of Unicode 6.0 and 7.0 script properties
- Fixed parsing of the special Assigned property
- Fixed scanning of InCyrillic_Supplement property
[0.4.9] - 2017-12-25 - Ammar Ali
- Added ruby version file for 2.5.0
[0.4.8] - 2017-12-18 - Janosch Müller
- Added ruby version files for 2.2.9, 2.3.6, and 2.4.3
[0.4.7] - 2017-10-15 - Janosch Müller
- Fixed a thread safety issue (issue #45)
- Some public class methods that were only reliable for internal use are now private instance methods (PR #46)
- Improved the usefulness of Expression#options (issue #43): #options and derived methods such as #i?, #m? and #x? are now defined for all Expressions that are affected by such flags.
- Fixed scanning of whitespace following (?x) (commit 5c94bd2)
- Fixed a Parser bug where the #number attribute of traditional numerical backreferences was not set correctly (commit 851b620)
[0.4.6] - 2017-09-18 - Janosch Müller
- Added Parser support for hex escapes in sets (PR #36)
- Added Parser support for octal escapes (PR #37)
- Added support for cluster types \R and \X (PR #38)
- Added support for more metacontrol notations (PR #39)
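As a reminder of what some of these syntax features match, the sketch below exercises \R, \X and a hex escape inside a set using Ruby's built-in Regexp only. It does not use this gem's Parser, and the constant names are made up for this example.

```ruby
# \R matches a generic linebreak; "\r\n" is consumed as a single unit.
LINEBREAKS = "a\r\nb\nc".scan(/\R/)

# \X matches an extended grapheme cluster; "e" plus a combining acute
# accent (U+0301) counts as one cluster.
GRAPHEMES = "e\u0301x".scan(/\X/)

# A hex escape inside a set: \x41 is "A".
HEX_IN_SET = 'A'.match?(/[\x41]/)

p LINEBREAKS, GRAPHEMES.length, HEX_IN_SET
```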
[0.4.5] - 2017-09-17 - Ammar Ali
- Thanks to Janosch Müller:
  - Support ruby 2.2.7 (PR #42)
- Added ruby version files for 2.2.8, 2.3.5, and 2.4.2
[0.4.4] - 2017-07-10 - Ammar Ali
- Thanks to Janosch Müller:
  - Add support for new absence operator (PR #33)
- Thanks to Bartek Bułat:
  - Add support for Ruby 2.3.4 version (PR #40)
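For readers unfamiliar with the absence operator (?~...), the sketch below shows its typical use with Ruby's built-in Regexp, assuming a Ruby version that supports it (2.4.1 or later). The constant names are illustrative.

```ruby
# (?~\*/) matches the longest string that does not contain "*/",
# which makes it convenient for matching a single C-style comment.
C_COMMENT = %r{/\*(?~\*/)\*/}

# For comparison, a greedy ".*" runs on to the last "*/".
GREEDY_COMMENT = %r{/\*.*\*/}

SOURCE_SNIPPET = '/* a */ b */'

p SOURCE_SNIPPET[C_COMMENT]
p SOURCE_SNIPPET[GREEDY_COMMENT]
```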
[0.4.3] - 2017-03-24 - Ammar Ali
- Added ruby version file for 2.4.1
[0.4.2] - 2017-01-10 - Ammar Ali
- Thanks to Janosch Müller:
  - Support ruby 2.4 (PR #30)
  - Improve codepoint handling (PR #27)
[0.4.1] - 2016-11-22 - Ammar Ali
- Updated ruby version file for 2.3.3
[0.4.0] - 2016-11-20 - Ammar Ali
- Added Syntax.supported? method
- Updated ruby versions for latest releases; 2.1.10, 2.2.6, and 2.3.2
[0.3.6] - 2016-06-08 - Ammar Ali
- Thanks to John Backus:
  - Remove warnings (PR #26)
[0.3.5] - 2016-05-30 - Ammar Ali
- Thanks to John Backus:
  - Fix parsing of /\xFF/n (hex:escape) (PR #24)
[0.3.4] - 2016-05-25 - Ammar Ali
- Thanks to John Backus:
  - Fix warnings (PR #19)
- Thanks to Dana Scheider:
  - Correct error in README (PR #20)
- Fixed mistyped \h and \H character types (issue #21)
- Added ancestry syntax files for latest rubies (issue #22)
[0.3.3] - 2016-04-26 - Ammar Ali
- Thanks to John Backus:
  - Fixed scanning of zero length comments (PR #12)
  - Fixed missing escape:codepoint_list syntax token (PR #14)
  - Fixed to_s for modified interval quantifiers (PR #17)
- Added a note about MRI implementation quirks to Scanner section
[0.3.2] - 2016-01-01 - Ammar Ali
- Updated ruby versions for latest releases; 2.1.8, 2.2.4, and 2.3.0
- Fixed class name for UnknownSyntaxNameError exception
- Added UnicodeBlocks support to the parser.
- Added UnicodeBlocks support to the scanner.
- Added expand_members method to CharacterSet, returns traditional or unicode property forms of shorthands (\d, \W, \s, etc.)
- Improved meaning and output of %t and %T in strfregexp.
- Added syntax versions for ruby 2.1.4 and 2.1.5 and updated latest 2.1 version.
- Added to_h methods to Expression, Subexpression, and Quantifier.
- Added traversal methods; traverse, each_expression, and map.
- Added token/type test methods; type?, is?, and one_of?
- Added printing method strfregexp, inspired by strftime.
- Added scanning and parsing of free spacing (x mode) expressions.
- Improved handling of inline options (?mixdau:...)
- Added conditional expressions. Ruby 2.0.
- Added keep (\K) markers. Ruby 2.0.
- Added d, a, and u options. Ruby 2.0.
- Added missing meta sequences to the parser. They were supported by the scanner only.
- Renamed Lexer's method to lex, added an alias to the old name (scan)
- Use #map instead of #each to run the block in Lexer.lex.
- Replaced VERSION.yml file with a constant.
- Updated README
- Update tokens and scanner with new additions in Unicode 7.0.
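Two of the Ruby 2.0 features listed above, keep markers and conditional expressions, are sketched below with Ruby's built-in Regexp only. The example does not use this gem's scanner or parser, and the constant names are made up for illustration.

```ruby
# \K (keep) discards everything matched so far from the reported match,
# so only "bar" is replaced here.
KEEP_RESULT = 'foobar'.sub(/foo\Kbar/, '!')

# Conditional expression: (?(1)>) requires ">" only if group 1 ("<")
# took part in the match.
OPTIONAL_BRACKETS = /\A(<)?\w+(?(1)>)\z/

p KEEP_RESULT
p OPTIONAL_BRACKETS.match?('<tag>')
p OPTIONAL_BRACKETS.match?('tag')
p OPTIONAL_BRACKETS.match?('<tag')
```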
[0.1.6] - 2014-10-06 - Ammar Ali
- Fixed test and gem building rake tasks and extracted the gem specification from the Rakefile into a .gemspec file.
- Added syntax files for missing ruby 2.x versions. These do not add extra syntax support, they just make the gem work with the newer ruby versions.
- Added .travis.yml to project root.
- README:
  - Removed note purporting runtime support for ruby 1.8.6.
  - Added a section identifying the main unsupported syntax features.
  - Added sections for Testing and Building
  - Added badges for gem version, Travis CI, and code climate.
- Updated README, fixing broken examples, and converting it from a rdoc file to GitHub's flavor of Markdown.
- Fixed a parser bug where an alternation sequence that contained nested expressions was incorrectly being appended to the parent expression when the nesting was exited. e.g. in /a|(b)c/, c was appended to the root.
- Fixed a bug where character types were not being correctly scanned within character sets. e.g. in [\d], two tokens were scanned; one for the backslash '\' and one for the 'd'
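The intended structure behind both fixes can be checked against Ruby's own matching behavior. The sketch below uses the built-in Regexp only; the constant names are illustrative.

```ruby
# In /a|(b)c/ the "c" belongs to the second alternative only.
# If it were attached to the root, "a" alone could not match.
NESTED_ALTERNATION = /\A(?:a|(b)c)\z/

# In [\d] the backslash and "d" form one character type, not two members.
DIGIT_SET = /[\d]/

p NESTED_ALTERNATION.match?('a')
p NESTED_ALTERNATION.match?('bc')
p NESTED_ALTERNATION.match?('ac')
p DIGIT_SET.match?('7')
p DIGIT_SET.match?('d')
```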
[0.1.5] - 2014-01-14 - Ammar Ali
- Correct ChangeLog.
- Added syntax stubs for ruby versions 2.0 and 2.1
- Added clone methods for deep copying expressions.
- Added optional format argument for to_s on expressions to return the text of the expression with (:full, the default) or without (:base) its quantifier.
- Renamed the :beginning_of_line and :end_of_line tokens to :bol and :eol.
- Fixed a bug where alternations with more than two alternatives and one of them ending in a group were being incorrectly nested.
- Improved EOF handling in general and especially from sequences like hex and control escapes.
- Fixed a bug where named groups with an empty name would return a blank token [].
- Fixed a bug where members of a parent set were being added to its last subset.
- Various code cleanups in scanner.rl
- Fixed a few mutable string bugs by calling dup on the originals.
- Made ruby 1.8.6 the base for all 1.8 syntax, and the 1.8 name a pointer to the latest (1.8.7 at this time)
- Removed look-behind assertions (positive and negative) from 1.8 syntax
- Added control (\cc and \C-c) and meta (\M-c) escapes to 1.8 syntax
- The default syntax is now the one of the running ruby version in both the lexer and the parser.
[0.1.0] - 2010-11-21 - Ammar Ali
- Initial release