test_split_adaptive_aggregate_files failing on main #10721
Comments
I also encountered this in #10722.
Yes, I also find this concerning, so I definitely want to figure out what is going wrong. Side note: When we start implementing #10602, I have some ideas for how we can get rid of all of this ugly "adaptive aggregation" code without sacrificing our ability to split files.
I ran into the same TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType' when attempting to read in a partition with the Full traceback:
Thanks @jrbourbeau! I have been struggling to figure out what is going wrong beyond "The fastparquet engine is messing up somewhere with hive-partitioned data" - This is helpful.
I only saw this pop up once so far but the test_split_adaptive_aggregate_files was failing on main: https://github.com/dask/dask/actions/runs/7226871563/job/19693365591
I find this failure particularly concerning since this looks like data loss is possible even if it doesn't happen reliably. @rjzamora do you have time to poke at this?