Releases: hplt-project/sacremoses

0.1.0

30 Oct 15:33

New perl_parity:bool argument for MosesPunctNormalizer that fixes differences between the latest Perl implementation and sacremoses. In a future release this will probably become the default and only behaviour. #146
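For anyone who wants to see whether the new flag affects their data, here is a minimal sketch that runs the same text through both behaviours. It assumes `perl_parity` is passed as a keyword argument to the `MosesPunctNormalizer` constructor; `normalize_both_ways` is a hypothetical helper name, not part of sacremoses. The function returns `None` when sacremoses is missing or too old to know the argument.

```python
def normalize_both_ways(text, lang="en"):
    """Return (default_output, perl_parity_output) for `text`.

    Returns None if sacremoses is not installed, or if the installed
    version predates the perl_parity argument (assumed to be a
    constructor keyword, as described in the release note).
    """
    try:
        from sacremoses import MosesPunctNormalizer

        default_normalizer = MosesPunctNormalizer(lang=lang)
        parity_normalizer = MosesPunctNormalizer(lang=lang, perl_parity=True)
    except (ImportError, TypeError):
        # ImportError: sacremoses missing; TypeError: argument unknown.
        return None
    return default_normalizer.normalize(text), parity_normalizer.normalize(text)


if __name__ == "__main__":
    outputs = normalize_both_ways("«Hello»  world ,  this is a test")
    if outputs is None:
        print("sacremoses >= 0.1.0 is not available")
    else:
        default_out, parity_out = outputs
        print("default:    ", default_out)
        print("perl_parity:", parity_out)
        print("identical:  ", default_out == parity_out)
```

Running this over a sample of your own corpus should show which inputs, if any, would change once the Perl-compatible behaviour becomes the default.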

MosesTokenizer speed up thanks to precompiled regular expressions #133, #139. Same for MosesDetokenizer #143.
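As a rough illustration of the general technique (not the actual sacremoses code), the sketch below applies the same substitution rules twice: once passing pattern strings to `re.sub` on every call, and once using patterns compiled a single time at import. The rules in `RULES` are made up for the example. Python's `re` module caches recently used patterns, so the saving likely comes mostly from skipping the per-call cache lookup, which can add up when many rules run on every sentence.

```python
import re

# Hypothetical substitution rules, NOT the rules sacremoses uses.
RULES = [
    (r"\s+", " "),          # collapse runs of whitespace
    (r" ([.,!?])", r"\1"),  # drop the space before punctuation
]


def apply_uncompiled(text):
    """Pass pattern strings to re.sub on every call."""
    for pattern, replacement in RULES:
        text = re.sub(pattern, replacement, text)
    return text


# Compile each pattern once, at import time.
COMPILED_RULES = [(re.compile(pattern), repl) for pattern, repl in RULES]


def apply_precompiled(text):
    """Reuse the compiled pattern objects on every call."""
    for regex, replacement in COMPILED_RULES:
        text = regex.sub(replacement, text)
    return text


if __name__ == "__main__":
    import timeit

    sample = "Hello   world ! How  are you ?"
    # Both variants must produce the same output.
    assert apply_uncompiled(sample) == apply_precompiled(sample)
    for func in (apply_uncompiled, apply_precompiled):
        seconds = timeit.timeit(lambda: func(sample), number=100_000)
        print(f"{func.__name__}: {seconds:.3f}s")
```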

A couple of bugfixes: The order of the protected_patterns list passed to MosesTokenizer.tokenize() is no longer significant. Also, use_known now works as expected in MosesTruecaser.truecase() (#121). Since this fix changes the output, I've decided to bump the version to 0.1.0 to signal a possibly breaking change.

Finally, long gone but never released: No more Python 2 support code (bye six 👋)

This is the first release of sacremoses under HPLT stewardship 🎉