Issue with change to chardet #305
Comments
Thanks, I'll investigate.
@mcarans
@roll Thanks for fixing. I just wanted to ask about the change "Limit sample size for detection if remote": if the character that caused the issue with chardet is at the beginning of the file, will there still be a difference in behaviour between chardet and cchardet?
@mcarans
Yes, it is indeed confusing that it works with a local file but not with a remote URL. I can only presume that the sample sent to chardet somehow differs between the local file and the remote URL.
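One way to test that presumption is to run the same detector on prefixes of different lengths and compare the results. The snippet below is only a rough sketch: `detect_by_sample_size` and `naive_detect` are hypothetical helpers written for illustration, not part of Tabulator, chardet or cchardet, and the sample bytes are made up rather than taken from the ACLED file.

```python
from typing import Callable, Dict, Iterable, Optional

# A detector takes raw bytes and returns an encoding name (or None).
Detector = Callable[[bytes], Optional[str]]


def detect_by_sample_size(
    data: bytes, detect: Detector, sizes: Iterable[int]
) -> Dict[int, Optional[str]]:
    """Run `detect` on the first n bytes of `data` for each n in `sizes`."""
    return {n: detect(data[:n]) for n in sizes}


def naive_detect(sample: bytes) -> Optional[str]:
    """Stand-in detector (NOT chardet): first codec that decodes the sample."""
    for encoding in ("ascii", "utf-8", "latin-1"):
        try:
            sample.decode(encoding)
            return encoding
        except UnicodeDecodeError:
            continue
    return None


if __name__ == "__main__":
    # Made-up data: ASCII rows followed by one row with a Latin-1 byte.
    data = b"country,notes\n" * 10 + "Minsk,caf\u00e9\n".encode("latin-1")
    # A short prefix never sees the non-ASCII byte; the full data does.
    print(detect_by_sample_size(data, naive_detect, [50, len(data)]))
```

To try it with the real libraries, pass `lambda b: chardet.detect(b)["encoding"]` (or the cchardet equivalent) as `detect`, and feed it the bytes read from the local file and from the URL. If the results diverge at some prefix length, that prefix is a candidate for a cut-down example.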
@roll, it is odd that chardet and cchardet give the same results when tested on the URL outside of Tabulator:
gives:
I'm not sure how Tabulator, prior to your fix, was using chardet in such a way that it behaved differently from cchardet on the URL, so I cannot produce a cut-down example to report against chardet.
Overview
A script failed with the new Tabulator 1.38.1 and I wondered why. I narrowed it down to the change from cchardet to chardet. For this file: https://api.acleddata.com/acled/read.csv?limit=0&terms=accept&iso=112 cchardet has no issues but chardet gives:
I saw issue #265, where someone experienced the opposite: chardet works but cchardet does not. Obviously I can set things up to use cchardet, but I'd like to understand a bit better the discrepancies you've found between chardet and cchardet.
Please preserve this line to notify @roll (lead of this repository)