0.1.0

@jelmervdl released this 30 Oct 15:33

New perl_parity:bool argument for MosesPunctNormalizer that fixes differences between the latest Perl implementation and sacremoses. In a future release this will probably become the default and only behaviour. #146
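
To see what the new flag changes on your own data, you can normalize the same text with and without it. The sketch below assumes `perl_parity` is passed to the `MosesPunctNormalizer` constructor as a keyword argument; `compare_perl_parity` and its `normalizer_cls` parameter are hypothetical helpers for illustration, not part of sacremoses.

```python
def compare_perl_parity(text, normalizer_cls=None, lang="en"):
    """Return (default_output, perl_parity_output) for the same input.

    `normalizer_cls` is a hypothetical hook so the helper can be tried
    without sacremoses installed; by default it uses MosesPunctNormalizer.
    """
    if normalizer_cls is None:
        # Imported lazily so the module loads even without sacremoses.
        from sacremoses import MosesPunctNormalizer as normalizer_cls

    # Assumption: perl_parity is a constructor keyword argument.
    default_output = normalizer_cls(lang=lang).normalize(text)
    parity_output = normalizer_cls(lang=lang, perl_parity=True).normalize(text)
    return default_output, parity_output
```

Running this over a sample of your corpus and diffing the two outputs is one way to judge how much a future switch to the Perl-compatible behaviour would affect you.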

MosesTokenizer is faster thanks to precompiled regular expressions #133, #139. The same applies to MosesDetokenizer #143.
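
For readers unfamiliar with the technique, precompiling means calling `re.compile` once and reusing the resulting pattern objects instead of passing pattern strings on every call. The following is a minimal sketch of that general idea using only the standard library; the rules in `RULES` are made up for illustration and are not the ones sacremoses uses.

```python
import re

# Hypothetical substitution rules, chosen only for this example.
RULES = [
    (r"([.,!?])", r" \1 "),  # pad basic punctuation with spaces
    (r"\s+", " "),           # collapse runs of whitespace
]

# Compile each pattern once at import time rather than on every call.
COMPILED_RULES = [(re.compile(pattern), repl) for pattern, repl in RULES]


def apply_rules(text):
    """Apply every precompiled rule in order and trim the result."""
    for regex, repl in COMPILED_RULES:
        text = regex.sub(repl, text)
    return text.strip()
```

Python's `re` module keeps its own cache of recently compiled patterns, so the gain comes mainly from skipping the cache lookup on every call, which adds up in a tokenizer that applies many rules per sentence.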

A couple of bugfixes: The order of the protected_patterns list passed to MosesTokenizer.tokenize() is no longer significant. Also, use_known now works as expected in MosesTruecaser.truecase(). #121. Because these fixes change the output, I've decided to bump the version to 0.1.0 to signal a possibly breaking change.

Finally, long gone but never released: No more Python 2 support code (bye six 👋)

This is the first release of sacremoses under HPLT stewardship 🎉