Releases: jawah/charset_normalizer

Version 1.3.6

09 Feb 00:04
e46ee12

Amend the previous release to allow prettytable 2.0
Thanks to @jayvdb #35

Version 1.3.5

08 Feb 21:39
a434ac1

Changes :

  • Miscellaneous: 🔧 Dependencies refactor; add Python 3.9 and 3.10 to the supported interpreters
  • Bugfix: 🐛 Fix error when using the package with a Python pre-release interpreter #33

Small refresh to keep the project up and running until further development (upcoming version 1.4.0).
Thanks to the many adopters.

Charset Normalizer

16 Dec 13:01
a90a899

Changes :

  • Improvement/Bugfix : False positive when searching for successive uppercase and lowercase characters. (ProbeChaos) (#31)

Charset Normalizer

16 Dec 09:22
48c2e6b

Changes :

  • Improvement : Noticeably better detection for Japanese (jp) #30

Charset Normalizer

13 Dec 13:43
b0e4e94

Changes :

  • Bugfix : Passing zero-length bytes to from_bytes (#29)

Charset Normalizer

11 Oct 13:06
cfa2fda

Changes :

  • Improvement : Expose version in package (#18)
  • Bugfix : Division by zero (#23)
  • Improvement : Prefers unicode (utf-8) when detected (#19)
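
The "prefers unicode" item can be pictured with a small standard-library sketch. This is only an illustration of the idea, not the package's implementation: the helper names `decodable` and `prefer_unicode` are hypothetical, and a strict decode is used as a stand-in for the real detection logic.

```python
from typing import List


def decodable(payload: bytes, encodings: List[str]) -> List[str]:
    """Keep only the encodings that decode the payload without error."""
    accepted = []
    for name in encodings:
        try:
            payload.decode(name)
        except (UnicodeDecodeError, LookupError):
            continue
        accepted.append(name)
    return accepted


def prefer_unicode(payload: bytes, encodings: List[str]) -> List[str]:
    """Hypothetical helper: move utf-8 to the front when it is a valid match."""
    matches = decodable(payload, encodings)
    # sorted() is stable, so the other matches keep their relative order.
    return sorted(matches, key=lambda name: name.lower().replace("_", "-") != "utf-8")
```

Several single-byte code pages will happily decode UTF-8 bytes into mojibake, which is why a tie-break in favour of utf-8 is useful when it decodes cleanly.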

Charset Normalizer

30 Sep 18:19
a2a4682

Changes :

  • Feature : Now supports the unicodedata2 backport. To benefit from it, install using pip install charset-normalizer[UnicodeDataBackport]. Python 3.7 ships with UnicodeData v11; the backport lets you upgrade to v12.
  • Feature : Added preemptive behaviour, which looks for a declared encoding in the content. Controlled by the positional parameter preemptive_behaviour, which defaults to True. The declared encoding is not taken for granted; it is tested first.
  • Improvement : Added aliases for the CharsetNormalizerMatches class: CharsetDetector, EncodingDetector and CharsetDoctor.
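
As a rough illustration of the preemptive behaviour described above, the sketch below looks for a declared encoding near the start of a payload and places it first in the list of candidates to test. It is a minimal standard-library sketch, not the package's code: the `DECLARATION` pattern, the `search_zone` size and both function names are assumptions made for the example.

```python
import codecs
import re
from typing import List, Optional

# Hypothetical pattern: matches declarations such as charset=utf-8,
# encoding="latin-1" or coding: cp1252 near the top of a payload.
DECLARATION = re.compile(
    rb"(?:charset|encoding|coding)\s*[:=]\s*[\"']?([A-Za-z0-9_\-]+)",
    re.IGNORECASE,
)


def declared_encoding(payload: bytes, search_zone: int = 2048) -> Optional[str]:
    """Return Python's normalized name for a declared encoding, or None."""
    match = DECLARATION.search(payload[:search_zone])
    if match is None:
        return None
    try:
        # codecs.lookup raises LookupError for names Python does not know.
        return codecs.lookup(match.group(1).decode("ascii")).name
    except LookupError:
        return None


def candidates_in_order(
    payload: bytes, defaults: List[str], preemptive_behaviour: bool = True
) -> List[str]:
    """Put the declared encoding first so it is tested first, not trusted blindly."""
    declared = declared_encoding(payload) if preemptive_behaviour else None
    if declared is None:
        return list(defaults)
    return [declared] + [name for name in defaults if name != declared]
```

The declared name only changes the order of the search; a wrong or misleading declaration still has to pass the same checks as any other candidate.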

Charset Normalizer

28 Sep 19:16
6ea66b2

Changes :

  • Feature : Added has_submatch, percent_chaos and percent_coherence properties on the single match object.
  • Improvement : best() method of CharsetNormalizerMatches has been rewritten for better readability.
  • Feature : Added explain boolean positional parameter to print out what actually happens when searching for a match.
  • Improvement : Detection has been globally improved.
  • Feature : You can exclude some encodings when searching for a match with the parameter cp_exclusion (a list of str), available on from_bytes, from_path and from_fp.
  • Feature : You can limit the search to some encodings when looking for a match with the parameter cp_isolation (a list of str), available on from_bytes, from_path and from_fp.
  • Feature : import charset_normalizer is enough to provide additional help when you encounter UnicodeDecodeError exception.
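
The effect of cp_isolation and cp_exclusion on the set of encodings to test can be sketched as a simple filter. This is an assumption-laden illustration rather than the package's code: `select_code_pages` and the `available` list are hypothetical, and only the two parameter names come from the notes above.

```python
from typing import Iterable, List, Optional


def select_code_pages(
    available: Iterable[str],
    cp_isolation: Optional[List[str]] = None,
    cp_exclusion: Optional[List[str]] = None,
) -> List[str]:
    """Hypothetical filter: keep only isolated names, then drop excluded ones."""
    isolation = {name.lower() for name in cp_isolation or []}
    exclusion = {name.lower() for name in cp_exclusion or []}
    selected = []
    for name in available:
        key = name.lower()
        if isolation and key not in isolation:
            continue  # an isolation list restricts the search to its members
        if key in exclusion:
            continue  # an exclusion list removes its members from the search
        selected.append(name)
    return selected
```

With both lists empty the search covers every available encoding; isolation narrows it to a whitelist, and exclusion is applied afterwards as a blacklist.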

Charset Normalizer

23 Sep 13:02
5abfb83

Changes :

  • Bugfix : from_bytes parameters steps and chunk_size were not adapted to the sequence length when the provided values did not fit the content, which could lead to misdetection on small content.
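
One way to picture the kind of adjustment this fix refers to is to clamp both parameters to the content length before sampling. The sketch below is an assumption, not the package's actual logic: `fit_sampling` is a hypothetical helper and the default values are illustrative only.

```python
from typing import Tuple


def fit_sampling(length: int, steps: int = 10, chunk_size: int = 512) -> Tuple[int, int]:
    """Hypothetical helper: shrink steps and chunk_size to fit the content."""
    if length == 0:
        return 0, 0  # nothing to sample
    # A chunk can never be larger than the content itself.
    chunk_size = max(1, min(chunk_size, length))
    # Ceiling division: number of chunks needed to cover the whole content.
    max_steps = -(-length // chunk_size)
    steps = max(1, min(steps, max_steps))
    return steps, chunk_size
```

For a 100-byte payload this yields a single 100-byte chunk instead of ten 512-byte chunks that the content cannot provide.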

Charset Normalizer

21 Sep 16:17

Changes :

  • Bugfix : Sequences shorter than 10 chars were not checked by ProbeChaos at all. (#14)
  • Bugfix : Legacy detect method inspired by chardet did not return the intended result when no match was found. (#14)