Remove code introduced by #17644 #21359
Comments
I agree with the proposed strategy.
+1 on removing. Do we want to consider this a bug fix for 1.0.1 or 1.0.2?
After some more investigation #17644 is still needed if using
import pickle
import numpy as np
import joblib
arr = np.array([(1, 2.0)], dtype=[("myint", ">i8"), ("myfloat", ">f8")])
print(f"original dtype: {arr.dtype}")
numpy_dtype = pickle.loads(pickle.dumps(arr)).dtype
print(f"after numpy dump+load: {numpy_dtype}")
joblib.dump(arr, "/tmp/test.pkl")
joblib_dtype = joblib.load("/tmp/test.pkl").dtype
print(f"after joblib dump+load: {joblib_dtype}")
Output (on a little-endian machine, i.e. the most common case):
Side-comment: for simple dtypes e.g. ('float64', 'int64' etc ...) both
Why this matters in the context of #17644 is that
#17644 introduced code in Tree.__setstate__ to deal with pickles saved in a different endianness than that of the machine the pickle is loaded on.
#21237 showed that this fix was not working, since there is an error in Tree.__cinit__ (i.e. before __setstate__). It is probably better to remove this tricky code if it is not useful.
Side-comment: this problem is avoided with joblib 1.1, which loads arrays in native endianness (same behaviour as pickle).
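For readers unfamiliar with what "native endianness" means for a NumPy array, here is a minimal sketch of converting an array to the byte order of the current machine. The helper name `to_native_byteorder` is hypothetical and is not the code from #17644; it only illustrates the general idea using `dtype.isnative` and `dtype.newbyteorder`.

```python
import numpy as np


def to_native_byteorder(arr):
    """Return `arr` with a native byte-order dtype (hypothetical helper).

    The array is returned unchanged when its dtype is already native;
    otherwise a converted copy is made. The values are preserved.
    """
    if arr.dtype.isnative:
        return arr
    # "=" requests the byte order of the current machine; for structured
    # dtypes the change is applied to every field.
    return arr.astype(arr.dtype.newbyteorder("="))


# Build arrays with the opposite byte order of this machine ("S" = swap),
# so the example behaves the same on little- and big-endian hosts.
simple = np.array([1, 2, 3], dtype=np.dtype("i8").newbyteorder("S"))
structured = np.array(
    [(1, 2.0)],
    dtype=np.dtype([("myint", "i8"), ("myfloat", "f8")]).newbyteorder("S"),
)

for arr in (simple, structured):
    converted = to_native_byteorder(arr)
    print(f"{arr.dtype} -> {converted.dtype}")
```

This is only an illustration of the conversion itself; where such a conversion would belong (joblib, the unpickling code, or nowhere at all) is exactly what this issue is discussing.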
While I am at it, adding a test with a pickle generated on a big-endian machine would be nice.
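When no big-endian machine is at hand, one rough way to exercise similar code paths is to emulate non-native data by byte-swapping the dtype before pickling. The sketch below is an assumption about how such a check could look, not the test proposed here: a pickle really produced on a big-endian machine may differ, and the helper names `make_non_native` and `check_roundtrip` are made up for illustration.

```python
import pickle

import numpy as np


def make_non_native(values, dtype):
    """Build an array whose dtype has the opposite byte order of this machine."""
    swapped = np.dtype(dtype).newbyteorder("S")  # "S" swaps the byte order
    return np.array(values, dtype=swapped)


def check_roundtrip(arr):
    """Pickle and unpickle `arr`, checking that the values are preserved."""
    loaded = pickle.loads(pickle.dumps(arr))
    # Compare Python-level values so the check does not depend on the
    # byte order of either dtype.
    assert loaded.tolist() == arr.tolist()
    return loaded


simple = make_non_native([1, 2, 3], "i8")
structured = make_non_native([(1, 2.0)], [("myint", "i8"), ("myfloat", "f8")])

for arr in (simple, structured):
    loaded = check_roundtrip(arr)
    # Printing the dtypes shows whether the byte order changed on load.
    print(f"{arr.dtype} -> {loaded.dtype}")
```

The check only asserts that values survive the round trip; it deliberately makes no claim about which byte order the loaded array ends up with.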