Make number_of_words respect CJK characters #7813

Merged
merged 15 commits into from May 22, 2020
3 changes: 2 additions & 1 deletion lib/jekyll/filters.rb
@@ -122,7 +122,8 @@ def normalize_whitespace(input)
#
# Returns the Integer word count.
def number_of_words(input)
- input.split.length
+ cjk_regex = %r(\p{Han}|\p{Katakana}|\p{Hiragana}|\p{Hangul})
+ input.scan(cjk_regex).length + input.gsub(cjk_regex, " ").split.length
end

# Join an array of things into a string by separating with commas and the
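To illustrate the behavior change, here is a minimal standalone sketch of the patched method, lifted out of `Jekyll::Filters` so it can be run with plain Ruby. The constant name `CJK_REGEX` and the sample strings are assumptions made for this example; the PR itself keeps the regex in a local variable. The old `input.split.length` counted a whitespace-free CJK run as a single word, whereas the new expression counts each Han, Katakana, Hiragana, or Hangul character as one word and splits the remaining text on whitespace as before.

```ruby
# Standalone sketch of the patched filter (assumption: run outside Jekyll).
# CJK_REGEX is a hypothetical constant name; the PR uses a local `cjk_regex`.
CJK_REGEX = %r(\p{Han}|\p{Katakana}|\p{Hiragana}|\p{Hangul})

def number_of_words(input)
  # Count every CJK character as one word, then replace those characters
  # with spaces and count the remaining whitespace-separated tokens.
  input.scan(CJK_REGEX).length + input.gsub(CJK_REGEX, " ").split.length
end

# Old behavior, kept here only for comparison.
def number_of_words_old(input)
  input.split.length
end

puts number_of_words_old("你好世界")   # 1: no whitespace, so one token
puts number_of_words("你好世界")       # 4: four Han characters
puts number_of_words("Hello world")    # 2: unchanged for non-CJK text
puts number_of_words("你好 world")     # 3: two Han characters plus one token
```

Replacing matched characters with a space rather than deleting them presumably keeps adjacent non-CJK tokens apart: with deletion, `"foo你bar"` would collapse to `"foobar"` and count as one token instead of two.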