Installing toolbox on Mac - cchardet issue #83

Closed
Tinkaa opened this issue Mar 18, 2022 · 5 comments

@Tinkaa
Contributor

Tinkaa commented Mar 18, 2022

When installing toolbox from (Test) PyPI on a Mac, we get the error:
ERROR: Failed building wheel for cchardet ..
clang: error: no such file or directory: 'src/ext/uchardet/src/CharDistribution.cpp'
clang: error: no input files
error: command '/usr/bin/clang' failed with exit code 1

However, when we first pip install cchardet, and then install toolbox, everything works fine.

It also seems to work fine when installing from PyPI instead of Test PyPI.

We need to figure out what exactly causes the error and whether we can fix it, as this is a known issue. Otherwise, we should advise users to always install requirements.txt first, and include cchardet there.

cchardet is required for the HDX packages.

@turnerm
Member

turnerm commented Mar 18, 2022

Tagging @mcarans so that he's aware. Mike, this leads me to a related question that I wanted to ask you: at the moment, we are only using the very basic download functionality from the HDX Python API in aa-toolbox. However, having HDX API as a requirement leads to a lot of dependencies which I suspect are not used, including cchardet (and other large packages such as beautifulsoup). Have you considered making a "light" version of the API, just for users that want to perform simple downloads? (sorry, annoying feature request, I know...)

@mcarans

mcarans commented Mar 29, 2022

@Tinkaa In this comment on the known issue, the commenter suggests that using Python 3.7.6 instead of Python 3.6 fixed it for them. I guess you've already tried that?

The cchardet requirement is actually from HDX Python Utilities which uses the Frictionless Tabulator package with a requirement like this:
tabulator[cchardet]==1.53.5

Tabulator switched from cchardet back to chardet (due to the kind of problem you have run into), but I ran into a strange issue that made me think that cchardet worked better.

I also did a comparison of character encoding libraries. The change to make the encoding library fully configurable was implemented in the new Frictionless Framework, which replaces Tabulator, but I haven't switched over to it yet.
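To illustrate what a configurable encoding library could look like, here is a minimal sketch. It is not how Tabulator or Frictionless implement it; the function name `detect_encoding` and its `libraries` parameter are made up for this example. It only relies on the fact that both cchardet and chardet expose a `detect(bytes)` function returning a dict with an `"encoding"` key.

```python
import importlib
from typing import Optional, Sequence


def detect_encoding(
    data: bytes,
    libraries: Sequence[str] = ("cchardet", "chardet"),
) -> Optional[str]:
    """Guess the encoding of `data` with the first importable library.

    Hypothetical helper: tries each module name in order and uses the
    first one that imports. Returns None if no library is available
    or the library could not decide.
    """
    for name in libraries:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Library not installed (e.g. cchardet failed to build), try the next one.
            continue
        result = module.detect(data)
        return result.get("encoding")
    return None
```

With this shape, a platform where cchardet does not build would silently fall back to chardet, and passing `libraries=("chardet",)` would skip cchardet entirely.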

I could look at using sys.platform in the requirement and detecting Mac.
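A rough sketch of that idea, assuming the requirement list is built in Python (for example in a setup.py). The helper name `tabulator_requirement` is hypothetical; the pinned version is the one quoted above. Dropping the extra on Mac is one possible reading of "detecting Mac", not a tested fix.

```python
import sys


def tabulator_requirement(platform: str = sys.platform) -> str:
    """Return the tabulator requirement, without the cchardet extra on macOS.

    Hypothetical helper. sys.platform is "darwin" on macOS.
    """
    pinned = "==1.53.5"
    if platform == "darwin":
        # Skip the extra so pip does not try to build cchardet with clang.
        return f"tabulator{pinned}"
    return f"tabulator[cchardet]{pinned}"


# The same intent can be written declaratively with a PEP 508 environment
# marker, which pip evaluates at install time:
#   tabulator[cchardet]==1.53.5; sys_platform != "darwin"
#   tabulator==1.53.5; sys_platform == "darwin"
```

The marker form has the advantage that it also works in a static requirements list, where no Python code runs.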

@turnerm On making a light HDX Python API, I'd need to look into making more of HDX Python Utilities optional as that is where many of the dependencies come from.

@mcarans

mcarans commented Mar 30, 2022

I've had a look today at switching HDX Python Utilities to Frictionless (which appears no longer to have a configurable character encoding library and just uses chardet). It is a much bigger task than I had anticipated as the API has changed significantly from tabulator-py, but I made some progress.

I'll also look into making some dependencies optional like beautifulsoup.

@mcarans

mcarans commented Apr 13, 2022

This took much longer than expected to resolve. Sorry about that. I upgraded from OKFN's Tabulator library to their newer Frictionless library, which integrates streaming download functionality. Unfortunately, far more had changed in structure, defaults, etc. than I had anticipated.

Frictionless has no dependency on cchardet. Beautifulsoup is an optional dependency of HDX Python Utilities and HDX Python API does not bring in that dependency.

You can upgrade to HDX Python API 5.6.2 to get these changes.

@Tinkaa
Contributor Author

Tinkaa commented Apr 13, 2022

@mcarans thank you so much for putting all this work into this! It is working perfectly for me now.
