cchardet seems to be obsolete, charset_normalizer as an alternative #6819

Closed
1 task done
suspectinside opened this issue Jul 7, 2022 · 6 comments

Comments

@suspectinside

suspectinside commented Jul 7, 2022

Is your feature request related to a problem?

cchardet, from the speedups extra, is abandoned and cannot be installed with Python 3.11.

However, there is a better alternative that is actively maintained: https://github.com/Ousret/charset_normalizer

Describe the solution you'd like

Switch to https://github.com/Ousret/charset_normalizer, as it is in active development.

Describe alternatives you've considered

Just switch to https://github.com/Ousret/charset_normalizer as an appropriate alternative.

Related component

Client

Additional context

No response

Code of Conduct

  • I agree to follow the aio-libs Code of Conduct
@Dreamsorcerer
Member

We already use charset-normalizer instead of chardet. This was added by the author of that library: #5930.

That PR focused on the performance over chardet, so we'd probably want to see some comparisons against cchardet before replacing that one as well.
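A comparison like the one asked for here could be sketched roughly as follows. This is only an illustrative timing harness, not the benchmark used in #5930: the helper names `average_ms` and `load_detectors` and the sample payload are hypothetical, and it assumes both libraries expose a chardet-style `detect(bytes)` function. Libraries that are not installed are skipped.

```python
import importlib
import timeit
from typing import Callable, Dict


def average_ms(detect: Callable[[bytes], object], payload: bytes, runs: int = 20) -> float:
    """Return the mean wall-clock time of detect(payload) in milliseconds."""
    total_seconds = timeit.timeit(lambda: detect(payload), number=runs)
    return total_seconds / runs * 1000.0


def load_detectors() -> Dict[str, Callable[[bytes], object]]:
    """Collect detect() from whichever of the two libraries can be imported."""
    detectors: Dict[str, Callable[[bytes], object]] = {}
    for name in ("cchardet", "charset_normalizer"):
        try:
            module = importlib.import_module(name)
        except ImportError:
            continue  # library not installed; leave it out of the comparison
        detectors[name] = module.detect
    return detectors


if __name__ == "__main__":
    # Hypothetical sample payload; real measurements should use realistic response bodies.
    sample = ("Zażółć gęślą jaźń. " * 200).encode("utf-8")
    for name, detect in load_detectors().items():
        print(f"{name}: {average_ms(detect, sample):.2f} ms per call")
```

Results will depend heavily on the payload size, the Python version, and the machine, so any numbers are only meaningful relative to each other on the same setup.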

@Ousret Any thoughts on this?

@Ousret
Contributor

Ousret commented Jul 12, 2022

Given aiohttp's leading performance, I think that removing support for cChardet (with its 4 ms on average) would be a bit premature. Some would consider it too noticeable a change at this point.
charset-normalizer currently runs for about 30 ms on average on Python <= 3.10, and 16 ms on 3.11 and later.

To remedy the "major" issue immediately (meaning: most end users have difficulty building from source), I would recommend stripping off cChardet from the on-setup dependencies (speedup) and keeping the import as-is (maybe adding a version control check in the code?). It's becoming almost certain that the maintainer won't support it anymore.

We plan to ship v3.0 (BC) of charset-normalizer (if everything goes as planned) with an optional wheel, compiled with mypyc, that would come closer to cChardet's performance; see Ousret/charset_normalizer#182. Unfortunately, I clearly don't see it beating the uchardet binding.

You could also remove the support altogether, and explain very carefully the impact as discussed.
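The recommendation above (drop cChardet from the install-time dependencies but keep the import as-is) amounts to an optional-import fallback. A minimal sketch, assuming both libraries provide a chardet-style `detect()` that returns a dict with an `"encoding"` key; the helper names `first_importable` and `guess_encoding` are hypothetical and are not aiohttp's actual code:

```python
import importlib
from types import ModuleType
from typing import Optional, Sequence


def first_importable(names: Sequence[str]) -> Optional[ModuleType]:
    """Return the first module in `names` that can be imported, else None."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # not installed (or failed to build); try the next candidate
    return None


# Preference order: the C binding if the user installed it themselves,
# otherwise the pure-Python detector.
chardet = first_importable(["cchardet", "charset_normalizer"])


def guess_encoding(data: bytes, default: str = "utf-8") -> str:
    """Guess the encoding of `data`, falling back to `default` when unsure."""
    if chardet is None:
        return default
    # detect() may report None as the encoding when it cannot decide.
    return chardet.detect(data)["encoding"] or default
```

With this shape, users who can still build cChardet keep the faster path, and everyone else gets the fallback without an installation failure.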

@Dreamsorcerer
Member

Dreamsorcerer commented Jul 12, 2022

> To remedy the "major" issue immediately (meaning: most end users have difficulty building from source), I would recommend stripping off cChardet from the on-setup dependencies (speedup) and keeping the import as-is (maybe adding a version control check in the code?). It's becoming almost certain that the maintainer won't support it anymore.

I'm sure we'll do something like this once we actually have tests and things passing on 3.11 (and if cchardet has not been updated).

The workaround at present is to just uninstall cchardet, or to manually install the other speedup packages (aiodns, Brotli).
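To see which of these optional packages are present in an environment before applying the workaround, a quick check along these lines may help. It is only a sketch: `installed_modules` is a hypothetical helper, and it assumes the import names `cchardet`, `aiodns`, and `brotli` for the three packages.

```python
from importlib.util import find_spec
from typing import Dict, Iterable


def installed_modules(names: Iterable[str]) -> Dict[str, bool]:
    """Map each top-level module name to whether it can be found for import."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Assumed import names for the cchardet, aiodns, and Brotli packages.
    for name, present in installed_modules(["cchardet", "aiodns", "brotli"]).items():
        print(f"{name}: {'installed' if present else 'missing'}")
```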

@webknjaz
Member

webknjaz commented Aug 3, 2022

> I would recommend stripping off cChardet from the on-setup dependencies (speedup) and keeping the import as-is (maybe adding a version control check in the code?)

I came to the same conclusion and decided on the same solution right before reading this comment :)
PR incoming.

webknjaz added a commit to webknjaz/aiohttp that referenced this issue Aug 3, 2022
`cchardet` stopped being maintained a while ago so this patch removes
it from the `speedups` extra to keep it helpful. It also makes the
same adjustment in the CI under the most recent CPython versions to
keep the testing going.

Refs:
* aio-libs#6819 (comment)
* PyYoshi/cChardet#77
@webknjaz
Member

webknjaz commented Aug 3, 2022

Implemented via #6857.

@shimondoodkin

There is also https://github.com/faust-streaming/cChardet
Quick fix:
pip install faust-cchardet

5 participants