Add some missing unicode properties
jaynetics committed Jun 10, 2023
1 parent 7413c52 commit 810f8d8
Showing 5 changed files with 33 additions and 2 deletions.
5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,11 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

### Fixed

- support for extpict unicode property, added in Ruby 2.6
- support for 10 unicode script/block properties added in Ruby 3.2

## [2.8.0] - 2023-04-17 - [Janosch Müller](mailto:janosch84@gmail.com)

### Added
2 changes: 1 addition & 1 deletion Gemfile
@@ -5,7 +5,7 @@ gemspec
group :development, :test do
gem 'leto', '~> 2.0'
gem 'rake', '~> 13.0'
-gem 'regexp_property_values', '~> 1.3'
+gem 'regexp_property_values', '~> 1.4'
gem 'rspec', '~> 3.10'
if RUBY_VERSION.to_f >= 2.7
gem 'benchmark-ips', '~> 2.1'
11 changes: 11 additions & 0 deletions lib/regexp_parser/scanner/properties/long.csv
@@ -7,6 +7,7 @@ age=12.0,age=12.0
age=12.1,age=12.1
age=13.0,age=13.0
age=14.0,age=14.0
age=15.0,age=15.0
age=2.0,age=2.0
age=2.1,age=2.1
age=3.0,age=3.0
@@ -97,6 +98,7 @@ emojimodifierbase,emoji_modifier_base
emojipresentation,emoji_presentation
enclosingmark,enclosing_mark
ethiopic,ethiopic
extendedpictographic,extended_pictographic
extender,extender
finalpunctuation,final_punctuation
format,format
@@ -139,6 +141,7 @@ inancientsymbols,in_ancient_symbols
inarabic,in_arabic
inarabicextendeda,in_arabic_extended_a
inarabicextendedb,in_arabic_extended_b
inarabicextendedc,in_arabic_extended_c
inarabicmathematicalalphabeticsymbols,in_arabic_mathematical_alphabetic_symbols
inarabicpresentationformsa,in_arabic_presentation_forms_a
inarabicpresentationformsb,in_arabic_presentation_forms_b
@@ -186,6 +189,7 @@ incjkunifiedideographsextensiond,in_cjk_unified_ideographs_extension_d
incjkunifiedideographsextensione,in_cjk_unified_ideographs_extension_e
incjkunifiedideographsextensionf,in_cjk_unified_ideographs_extension_f
incjkunifiedideographsextensiong,in_cjk_unified_ideographs_extension_g
incjkunifiedideographsextensionh,in_cjk_unified_ideographs_extension_h
incombiningdiacriticalmarks,in_combining_diacritical_marks
incombiningdiacriticalmarksextended,in_combining_diacritical_marks_extended
incombiningdiacriticalmarksforsymbols,in_combining_diacritical_marks_for_symbols
@@ -205,10 +209,12 @@ incyrillic,in_cyrillic
incyrillicextendeda,in_cyrillic_extended_a
incyrillicextendedb,in_cyrillic_extended_b
incyrillicextendedc,in_cyrillic_extended_c
incyrillicextendedd,in_cyrillic_extended_d
incyrillicsupplement,in_cyrillic_supplement
indeseret,in_deseret
indevanagari,in_devanagari
indevanagariextended,in_devanagari_extended
indevanagariextendeda,in_devanagari_extended_a
indingbats,in_dingbats
indivesakuru,in_dives_akuru
indogra,in_dogra
@@ -268,6 +274,7 @@ inipaextensions,in_ipa_extensions
initialpunctuation,initial_punctuation
injavanese,in_javanese
inkaithi,in_kaithi
inkaktoviknumerals,in_kaktovik_numerals
inkanaextendeda,in_kana_extended_a
inkanaextendedb,in_kana_extended_b
inkanasupplement,in_kana_supplement
@@ -276,6 +283,7 @@ inkangxiradicals,in_kangxi_radicals
inkannada,in_kannada
inkatakana,in_katakana
inkatakanaphoneticextensions,in_katakana_phonetic_extensions
inkawi,in_kawi
inkayahli,in_kayah_li
inkharoshthi,in_kharoshthi
inkhitansmallscript,in_khitan_small_script
@@ -339,6 +347,7 @@ inmyanmar,in_myanmar
inmyanmarextendeda,in_myanmar_extended_a
inmyanmarextendedb,in_myanmar_extended_b
innabataean,in_nabataean
innagmundari,in_nag_mundari
innandinagari,in_nandinagari
innewa,in_newa
innewtailue,in_new_tai_lue
@@ -457,6 +466,7 @@ joincontrol,join_control
kaithi,kaithi
kannada,kannada
katakana,katakana
kawi,kawi
kayahli,kayah_li
kharoshthi,kharoshthi
khitansmallscript,khitan_small_script
@@ -503,6 +513,7 @@ mro,mro
multani,multani
myanmar,myanmar
nabataean,nabataean
nagmundari,nag_mundari
nandinagari,nandinagari
newa,newa
newline,newline
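
Each row of `long.csv` pairs a normalized spelling (lowercase, no separators) with the canonical token name. A minimal sketch of how such a table could be consulted follows. The `normalize` and `lookup` helpers and the inlined excerpt are hypothetical, and the normalization rule (drop spaces, underscores and hyphens, then downcase) is an assumption, not necessarily what the scanner does.

```ruby
# Hypothetical excerpt of long.csv, using rows added in this commit.
LONG_CSV = <<~DATA
  extendedpictographic,extended_pictographic
  inkaktoviknumerals,in_kaktovik_numerals
  kawi,kawi
  nagmundari,nag_mundari
DATA

# Build a Hash from normalized spelling to canonical token name.
PROPERTY_MAP = LONG_CSV.each_line.map { |line| line.chomp.split(',', 2) }.to_h.freeze

# Assumed normalization: strip spaces, underscores and hyphens, then downcase.
def normalize(name)
  name.gsub(/[\s_-]/, '').downcase
end

# Returns the canonical token name as a String, or nil if the name is unknown.
def lookup(name)
  PROPERTY_MAP[normalize(name)]
end

p lookup('Nag_Mundari')          # canonical name for a script added in this commit
p lookup('In Kaktovik Numerals') # block names can be looked up the same way
p lookup('no_such_property')     # unknown names yield nil
```

Keeping the table keyed by the normalized spelling means every accepted variant of a property name resolves with a single hash lookup.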
2 changes: 2 additions & 0 deletions lib/regexp_parser/scanner/properties/short.csv
@@ -57,6 +57,7 @@ emod,emoji_modifier
epres,emoji_presentation
ethi,ethiopic
ext,extender
extpict,extended_pictographic
geor,georgian
glag,glagolitic
gong,gunjala_gondi
@@ -133,6 +134,7 @@ mtei,meetei_mayek
mult,multani
mymr,myanmar
n,number
nagm,nag_mundari
nand,nandinagari
narb,old_north_arabian
nbat,nabataean
15 changes: 14 additions & 1 deletion lib/regexp_parser/syntax/token/unicode_property.rb
@@ -59,7 +59,7 @@ module Category

Age_V3_1_0 = %i[age=13.0]

-Age_V3_2_0 = %i[age=14.0]
+Age_V3_2_0 = %i[age=14.0 age=15.0]

Age = all[:Age_V]

@@ -321,6 +321,8 @@ module Category

Script_V3_2_0 = %i[
cypro_minoan
kawi
nag_mundari
old_uyghur
tangsa
toto
@@ -667,11 +669,18 @@ module Category

UnicodeBlock_V3_2_0 = %i[
in_arabic_extended_b
in_arabic_extended_c
in_cjk_unified_ideographs_extension_h
in_cypro_minoan
in_cyrillic_extended_d
in_devanagari_extended_a
in_ethiopic_extended_b
in_kaktovik_numerals
in_kana_extended_b
in_kawi
in_latin_extended_f
in_latin_extended_g
in_nag_mundari
in_old_uyghur
in_tangsa
in_toto
@@ -690,6 +699,10 @@ module Category
emoji_presentation
]

Emoji_V2_6_0 = %i[
extended_pictographic
]

Emoji = all[:Emoji_V]

V1_9_0 = Category::All + POSIX + all[:V1_9_0]
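
The version-suffixed constants above (`Emoji_V2_6_0`, `Script_V3_2_0`, `UnicodeBlock_V3_2_0`) record the Ruby release that introduced each property. As a rough way to see which of these names the running interpreter accepts, the sketch below compiles `\p{...}` with Ruby's built-in `Regexp` and rescues the error raised for unknown names. The `property_supported?` helper is hypothetical and is not part of regexp_parser.

```ruby
# Hypothetical helper: true if the running Ruby accepts \p{name}.
# Regexp.new raises RegexpError for an unknown character property name.
def property_supported?(name)
  Regexp.new("\\p{#{name}}")
  true
rescue RegexpError
  false
end

# extpict is expected from Ruby 2.6 on; the others from Ruby 3.2 on,
# per the changelog entry in this commit.
%w[extpict kawi nagm in_kaktovik_numerals].each do |name|
  status = property_supported?(name) ? 'supported' : 'not supported'
  puts "#{name}: #{status}"
end

# On Ruby >= 2.6, emoji code points such as U+1F600 match extpict.
if property_supported?('extpict')
  puts Regexp.new('\p{extpict}').match?("\u{1F600}")
end
```

The output depends on the Ruby version in use, which is why the parser keeps these properties in per-version lists rather than one flat list.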