Pickle portability little vs. big endian #21237
Comments
Hi! Generally pickle is not supposed to be platform independent, so this is expected behavior.
However, on this specific topic of little vs. big endian: @lesteve, did we already encounter this issue and actually do something about it in joblib?
I am guessing you have this joblib PR in mind: joblib/joblib#1181. I am not sure whether it would fix the problem reported here, but it may be worth a try (by installing the joblib development version).
The following sklearn documentation says, "Aside for a few exceptions, pickled models should be portable across architectures assuming the same versions of dependencies and Python are used. If you encounter an estimator that is not portable please open an issue on GitHub": https://scikit-learn.org/stable/modules/model_persistence.html

So I thought sklearn supports dumps across little-endian and big-endian architectures. Also, I found that the following issue fixed a similar problem with the GradientBoostingClassifier model: #17644, and we saw that the GradientBoostingClassifier model works fine. Can a similar fix be done for the other 2 models which we found are not working?
@amueller They are portable (#19561, #17644 (comment)) aside from custom C structs that we serialize where we should probably be more careful.
Yes, #17644 should have fixed it, I think, but apparently it didn't. It's the same issue.
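To make the failure mode concrete, here is a small sketch (illustrative only, not the actual scikit-learn check): a dtype deserialized from a big-endian pickle compares unequal to its native little-endian counterpart, even though the field layout is identical.

```python
import numpy as np

# Sketch of the failure mode: two dtypes that differ only in byte order
# do not compare equal, so a strict dtype check on unpickled arrays
# fails when crossing endianness.
big = np.dtype(">i8")
little = np.dtype("<i8")

assert big != little                  # unequal purely because of byte order
assert big.newbyteorder() == little   # byte-swapping the dtype restores equality
assert big.itemsize == little.itemsize == 8
```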
@sgundura can you try using joblib==1.1.0 (released yesterday) to load the pickle? That may actually fix it. My understanding is that with joblib/joblib#1181, joblib loads arrays with native endianness, which avoids the dtype mismatch that #17644 was supposed to address.

Longer story: I was able to get a similar error by generating the pickle inside an s390x (big-endian) docker image.
The next snippet loads a pickle, so only run it if you think you can trust me. It contains the pickle generated inside the s390x docker image and should reproduce the error on a little-endian machine (so very likely on your machine):

import io
import joblib
joblib.load(io.BytesIO(b"\x80\x04\x951\x02\x00\x00\x00\x00\x00\x00\x8c\x15sklearn.tree._classes\x94\x8c\x16DecisionTreeClassifier\x94\x93\x94)\x81\x94}\x94(\x8c\tcriterion\x94\x8c\x04gini\x94\x8c\x08splitter\x94\x8c\x04best\x94\x8c\tmax_depth\x94K\x01\x8c\x11min_samples_split\x94K\x02\x8c\x10min_samples_leaf\x94K\x01\x8c\x18min_weight_fraction_leaf\x94G\x00\x00\x00\x00\x00\x00\x00\x00\x8c\x0cmax_features\x94N\x8c\x0emax_leaf_nodes\x94N\x8c\x0crandom_state\x94N\x8c\x15min_impurity_decrease\x94G\x00\x00\x00\x00\x00\x00\x00\x00\x8c\x12min_impurity_split\x94N\x8c\x0cclass_weight\x94N\x8c\tccp_alpha\x94G\x00\x00\x00\x00\x00\x00\x00\x00\x8c\x0en_features_in_\x94K\x14\x8c\x0bn_features_\x94K\x14\x8c\nn_outputs_\x94K\x01\x8c\x08classes_\x94\x8c\x13joblib.numpy_pickle\x94\x8c\x11NumpyArrayWrapper\x94\x93\x94)\x81\x94}\x94(\x8c\x08subclass\x94\x8c\x05numpy\x94\x8c\x07ndarray\x94\x93\x94\x8c\x05shape\x94K\x02\x85\x94\x8c\x05order\x94\x8c\x01C\x94\x8c\x05dtype\x94h\x1eh%\x93\x94\x8c\x02i8\x94\x89\x88\x87\x94R\x94(K\x03\x8c\x01>\x94NNNJ\xff\xff\xff\xffJ\xff\xff\xff\xffK\x00t\x94b\x8c\nallow_mmap\x94\x88ub\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x95\x9a\x00\x00\x00\x00\x00\x00\x00\x8c\nn_classes_\x94\x8c\x15numpy.core.multiarray\x94\x8c\x06scalar\x94\x93\x94h)C\x08\x00\x00\x00\x00\x00\x00\x00\x02\x94\x86\x94R\x94\x8c\rmax_features_\x94K\x14\x8c\x05tree_\x94\x8c\x12sklearn.tree._tree\x94\x8c\x04Tree\x94\x93\x94K\x14h\x1a)\x81\x94}\x94(h\x1dh h!K\x01\x85\x94h#h$h%h)h,\x88ub\x00\x00\x00\x00\x00\x00\x00\x02\x95J\x01\x00\x00\x00\x00\x00\x00K\x01\x87\x94R\x94}\x94(h\tK\x01\x8c\nnode_count\x94K\x03\x8c\x05nodes\x94h\x1a)\x81\x94}\x94(h\x1dh 
h!K\x03\x85\x94h#h$h%h&\x8c\x03V56\x94\x89\x88\x87\x94R\x94(K\x03\x8c\x01|\x94N(\x8c\nleft_child\x94\x8c\x0bright_child\x94\x8c\x07feature\x94\x8c\tthreshold\x94\x8c\x08impurity\x94\x8c\x0en_node_samples\x94\x8c\x17weighted_n_node_samples\x94t\x94}\x94(hHh&\x8c\x02i8\x94\x89\x88\x87\x94R\x94(K\x03h*NNNJ\xff\xff\xff\xffJ\xff\xff\xff\xffK\x00t\x94bK\x00\x86\x94hIhSK\x08\x86\x94hJhSK\x10\x86\x94hKh&\x8c\x02f8\x94\x89\x88\x87\x94R\x94(K\x03h*NNNJ\xff\xff\xff\xffJ\xff\xff\xff\xffK\x00t\x94bK\x18\x86\x94hLhZK \x86\x94hMhSK(\x86\x94hNhZK0\x86\x94uK8K\x01K\x10t\x94bh,\x88ub\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x11\xbf\xe0\xb4\xb0 \x00\x00\x00?\xe0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00d@Y\x00\x00\x00\x00\x00\x00\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xc0\x00\x00\x00\x00\x00\x00\x00?\xb8\xe8\xf1\x057\xb5\xf0\x00\x00\x00\x00\x00\x00\x00'@C\x80\x00\x00\x00\x00\x00\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xc0\x00\x00\x00\x00\x00\x00\x00?\xd5w\x17/f\xfd\xde\x00\x00\x00\x00\x00\x00\x00=@N\x80\x00\x00\x00\x00\x00\x95,\x00\x00\x00\x00\x00\x00\x00\x8c\x06values\x94h\x1a)\x81\x94}\x94(h\x1dh h!K\x03K\x01K\x02\x87\x94h#h$h%hZh,\x88ub@I\x00\x00\x00\x00\x00\x00@I\x00\x00\x00\x00\x00\x00@B\x80\x00\x00\x00\x00\x00@\x00\x00\x00\x00\x00\x00\x00@*\x00\x00\x00\x00\x00\x00@H\x00\x00\x00\x00\x00\x00\x95!\x00\x00\x00\x00\x00\x00\x00ub\x8c\x10_sklearn_version\x94\x8c\x060.24.1\x94ub.")) On my machine (little-endian) I get an error with joblib 1.0 and no error with joblib 1.1. Error
Edit: I pushed a docker image
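What joblib 1.1 does, roughly, is load arrays in native byte order. A minimal sketch of that conversion (illustrative only, not joblib's actual code):

```python
import numpy as np

# An array as it might arrive from a big-endian pickle on a little-endian host.
arr = np.array([1, 2, 3], dtype=">i8")

# Convert to native byte order, as joblib >= 1.1 effectively does on load,
# so later dtype comparisons against native dtypes succeed.
native = arr.astype(arr.dtype.newbyteorder("="))

assert native.dtype.byteorder in ("=", "|")  # now in native byte order
assert native.tolist() == [1, 2, 3]          # values are preserved
```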
@lesteve, thanks for the suggestion. I will try updating joblib to the latest version and see if it resolves the issue.
I tried the 2 non-working models after updating joblib to 1.1.0 (and also updating sklearn to 1.0). Now they both work fine. Thanks for the help! But I am not sure whether this worked because of the new version of sklearn or of joblib, as I updated both of them.
Thanks for the feedback; I am confident that the fix comes from the joblib update.

Side-comment: in general pickles are not guaranteed to work when you update scikit-learn, see https://scikit-learn.org/stable/modules/model_persistence.html#security-maintainability-limitations.

I'll open an issue about #17644, since I am not sure it is needed any more (I don't even understand how this could fix anything, if I am being honest).
Describe the bug
We are trying to load some AI models, dumped with joblib on a little-endian machine, on AIX, which runs on Power architecture (big endian). Most of them worked, but we found 2 models that give errors. We also tried loading them on Ubuntu Linux running on Power (big endian) and got the same error. We even tried building the latest nightly build of the sklearn module and still got this error. Attaching the programs we used to dump and load.
The two models that didn't work are:
Steps/Code to Reproduce
KNearestNeighbor_Dump.py (to be run on a little-endian machine):
KNearestNeighbor_Load.py (to be run on a big-endian machine, after copying the KNearestNeighbor.joblib file generated by the above program):
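The attached dump and load programs are not reproduced in this thread. A minimal sketch of what they plausibly look like (the dataset, hyperparameters, and the combination of both scripts into one runnable snippet are assumptions for illustration):

```python
import joblib
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

# --- KNearestNeighbor_Dump.py (run on the little-endian machine) ---
X, y = load_iris(return_X_y=True)
model = KNeighborsClassifier(n_neighbors=3).fit(X, y)
joblib.dump(model, "KNearestNeighbor.joblib")

# --- KNearestNeighbor_Load.py (run on the big-endian machine,
#     after copying KNearestNeighbor.joblib over) ---
loaded = joblib.load("KNearestNeighbor.joblib")
print(loaded.predict(X[:1]))
```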
Expected Results
No error is thrown.
Actual Results
Versions