Releases: polm/cutlet

v0.3.0: Token-aligned romaji

11 Oct 13:11

This release adds the romaji_tokens function, which takes a list of input Node objects from fugashi and returns romaji for each individual token. This allows for romaji furigana or other applications.
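
As a rough illustration of the furigana use case, the sketch below turns per-token romaji into HTML ruby markup. It does not call cutlet: the `ruby_markup` helper and the (surface, romaji) pairs are hypothetical, hand-written stand-ins for whatever token-aligned output you extract from romaji_tokens.

```python
from html import escape


def ruby_markup(pairs):
    """Render (surface, romaji) pairs as HTML ruby annotations.

    `pairs` is an assumed, simplified input shape; it is not the
    return type of cutlet's romaji_tokens.
    """
    parts = []
    for surface, romaji in pairs:
        if romaji and romaji != surface:
            # Annotate the token with its reading.
            parts.append(
                f"<ruby>{escape(surface)}<rt>{escape(romaji)}</rt></ruby>"
            )
        else:
            # Nothing useful to annotate (e.g. text that is already Latin).
            parts.append(escape(surface))
    return "".join(parts)


# Hand-written example data, not actual cutlet output.
pairs = [("猫", "neko"), ("が", "ga"), ("好き", "suki")]
print(ruby_markup(pairs))
```

Keeping the romaji aligned to individual tokens is what makes this possible; a single romanized string for the whole sentence could not be attached back to the original text.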

The next release will likely be 1.0. No major extra functionality is planned, but some methods may be made internal, and the API will otherwise be cleaned up.

Fix Odoriji

20 Oct 14:22

This release of cutlet adds basic support for odoriji, the iteration marks (such as 々) that repeat the preceding character. In some cases it's impossible to handle them correctly, but at a minimum this makes sure they won't blow up.
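
To make the idea concrete, here is a minimal sketch of naive odoriji expansion. This is not cutlet's implementation; the `expand_odoriji` function is hypothetical and only shows the general technique of copying the preceding character, with the voiced marks handled by composing a combining dakuten.

```python
import unicodedata

PLAIN_MARKS = {"々", "ゝ", "ヽ"}   # repeat the previous character as-is
VOICED_MARKS = {"ゞ", "ヾ"}        # repeat the previous kana, voiced
DAKUTEN = "\u3099"                 # combining voiced sound mark


def expand_odoriji(text: str) -> str:
    """Replace each iteration mark with a copy of the preceding character."""
    out = []
    for ch in text:
        prev = out[-1] if out else None
        if prev is None or ch not in PLAIN_MARKS | VOICED_MARKS:
            # Ordinary character, or a mark with nothing before it:
            # pass it through rather than raising an error.
            out.append(ch)
        elif ch in PLAIN_MARKS:
            out.append(prev)
        else:
            # NFC composes kana + combining dakuten where a voiced form exists.
            out.append(unicodedata.normalize("NFC", prev + DAKUTEN))
    return "".join(out)


print(expand_odoriji("時々"))    # 時時
print(expand_odoriji("いすゞ"))  # いすず
```

Plain repetition like this cannot recover sound changes: 時々 is read "tokidoki", not "tokitoki", which is one reason some cases can't be handled correctly.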

Many other improvements have been included since the last release notes, so upgrading is recommended.

Fix Kana Unk Handling

26 Jul 09:44

This release fixes the issue (#8) where hiragana or katakana words not in the
dictionary would not be converted to romaji, but reproduced as-is. Now
they are romanized, though since they're not in the dictionary this will
often fail to capture the original spelling.

A further consequence of this change is that unknown words in scripts
that aren't kana or ASCII need to be handled. By default these
characters will be converted to "?" for maximum technical compatibility.
Setting the ensure_ascii property on a Cutlet to False disables this
behavior, so that unknown characters pass through unchanged.

Example:

import cutlet
cut = cutlet.Cutlet()
# default: unknown non-ASCII characters are replaced
cut.romaji('彁')
# -> ?
# opt out: unknown characters pass through
cut.ensure_ascii = False
cut.romaji('彁')
# -> 彁

Note that besides unknown kanji, this affects non-Latin scripts like Cyrillic and Hangul.

Small improvements and bugfixes

16 Jul 11:37

Thanks to recent attention and PRs from the community, this release of cutlet has several nice improvements.

  • fixed an issue with a few pronouns
  • fixed behavior of the CLI script on Ctrl-C
  • added support for Python 3.6
  • added Kyoto to the list of exceptions
  • no longer blows up on empty strings

Thanks to @kinow, @krackers, and @kounoike for the PRs!