Tree pickle portability between 64-bit and 32-bit architectures #19602
Are you running the same version of scikit-learn as the one used to create the pickle?
sklearn version: 0.22.2
Could you also give information about the OS and CPU architecture where the model was trained and where it is deployed?
From the type error, one could guess that the model was pickled with 32-bit Python on Windows and loaded on 64-bit Python (e.g. on a Linux server). Is this the case? If this is confirmed, we should probably consider this a bug of scikit-learn, as @rth recently hinted in another issue or PR (that I failed to find). However, we would need to make sure that this bug has not already been fixed. @aryanxk02, can you try to check whether you can reproduce the problem with scikit-learn 0.24 or, even better, a nightly build? https://scikit-learn.org/stable/developers/advanced_installation.html#installing-nightly-builds
Currently on a Windows 64-bit operating system.
Okay, I developed the entire code (including the ML model dumped to a pickle file) and ran it on 32-bit Python through Jupyter Notebook on Windows. Sure, I will try scikit-learn 0.24 and report back.
What version / bitness of Python and OS do you use to run the flask app?

If you use the same version and bitness of Python both for training the model and for running the flask app, then I do not understand how you can get this error. Can you please run
Also, can you check that you can run
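For anyone trying to answer the bitness question above, here is a quick sketch (my addition, not from the thread) that prints the interpreter's version and bitness using only the standard library:

```python
import struct
import sys

# Pointer size in bytes: 4 on a 32-bit build, 8 on a 64-bit build
bits = struct.calcsize("P") * 8
print(f"Python {sys.version.split()[0]} ({bits}-bit)")
```

Running this in both the training notebook and the deployment environment makes a bitness mismatch immediately visible.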
An identical issue was reported earlier in #7891, when unpickling on 32-bit ARM, and in pyodide/pyodide#521. BTW, shouldn't `size_t` be unsigned? In the above linked issue I mentioned,

but I no longer remember the reasoning behind that comment. Edit: the WebAssembly VM is also little-endian, the same as x86_64, so it's not a byte-order issue.
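To see why byte order can be ruled out here, a small sketch (my addition): NumPy records byte order in the dtype, and a little-endian pickle loaded on another little-endian platform involves no byte swapping at all.

```python
import sys
import numpy as np

# 'little' on x86_64, aarch64, and in the WebAssembly VM
print(sys.byteorder)

# NumPy tracks byte order in the dtype, so byte-swapped data
# still compares equal value-by-value:
native = np.arange(3, dtype=np.int32)
swapped = native.astype(native.dtype.newbyteorder())
assert native.tobytes() != swapped.tobytes()   # raw bytes differ
assert (native == swapped).all()               # values compare equal
```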
Hello, regarding this: I just checked the Python bitness in the Jupyter Notebook where I ran the entire code, and it was 64-bit. The one I used for Flask was 32-bit.
Below are minimal steps to reproduce on 64-bit Linux with a 32-bit Docker image.

On 64-bit Linux:

```
conda create -n test-sklearn python=3.7.3
conda activate test-sklearn
pip install scikit-learn==0.24.1
mkdir tmp/
cd tmp/
```

then run `python example.py` with:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
import joblib

iris = load_iris()
clf = RandomForestClassifier().fit(iris.data, iris.target)
joblib.dump(clf, 'estimator.pkl')
```

Start a 32-bit Docker image and run:

```
apt update
apt install python3 python3-venv
python -m venv venv
source venv/bin/activate
pip install scikit-learn==0.24.1
python -c "import joblib; joblib.load('/shared/estimator.pkl')"
```

which produces the unpickling error. I'll try to investigate in more detail later.
@rth any guidance on how to fix this for Windows OS would be appreciated sir. |
Why is this closed? I'm still having this issue when saving a random forest classifier: training/saving on 64-bit Python and loading/deploying on 32-bit Python, I get the error
Nope, both interpreters should be the same: either 64-bit or 32-bit, but the same!
Indeed, it's still a valid issue. Re-opening. The general problem is that we should stop using `size_t` as a dtype in objects that get serialized, since its size depends on the architecture.
Having looked recently at #21237 (a somewhat related issue, but about different endianness rather than 64-bit vs 32-bit), I think there will very likely be more issues down the line, for example here: scikit-learn/sklearn/tree/_tree.pyx, lines 651 to 656 at 682bd05.
Both dtypes won't match and there will be an error.
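To illustrate the mismatch (my addition, with simplified hypothetical dtypes, not sklearn's real `Node` layout): the index fields follow `np.intp`, so the structured dtype of the pickled node array differs by architecture and the two dtypes simply don't compare equal.

```python
import numpy as np

# np.intp follows the pointer size: 4 bytes on 32-bit, 8 bytes on 64-bit
assert np.dtype(np.intp).itemsize in (4, 8)

# Hypothetical, simplified node dtypes as they would look on each arch:
node_64 = np.dtype([("left_child", np.int64), ("threshold", np.float64)])
node_32 = np.dtype([("left_child", np.int32), ("threshold", np.float64)])
assert node_64 != node_32  # the mismatch that surfaces as an unpickling error
```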
Hmm, yes, the endianness issue is a bit more complex. And it looks like #17644 might have created more issues than it solved. But what do you think we should do for 32/64-bit compatibility, which is probably easier to manage? Aren't there portable dtypes defined in NumPy we could use? Say, replace all `size_t` with `np.int32` (or
I think the endianness issue is (quite) orthogonal to the int precision issue: even if we had pickles from the same endianness, we would still have a problem with the handling of the precision. The reason is that the Cython `Node` struct is defined as:

```cython
ctypedef np.npy_float32 DTYPE_t          # Type of X
ctypedef np.npy_float64 DOUBLE_t         # Type of y, sample_weight
ctypedef np.npy_intp SIZE_t              # Type for indices and counters

cdef struct Node:
    # Base storage structure for the nodes in a Tree object
    SIZE_t left_child                    # id of the left child of the node
    SIZE_t right_child                   # id of the right child of the node
    SIZE_t feature                       # Feature used for splitting the node
    DOUBLE_t threshold                   # Threshold value at the node
    DOUBLE_t impurity                    # Impurity of the node (i.e., the value of the criterion)
    SIZE_t n_node_samples                # Number of samples at the node
    DOUBLE_t weighted_n_node_samples     # Weighted number of samples at the node
```

The `SIZE_t` type is `npy_intp`, whose size follows the platform's pointer size, so the dtype of a node array pickled on a 64-bit machine will not match the dtype expected on a 32-bit one. I think the simplest solution would be to put some conversion logic in a new method invoked at unpickling time (e.g. `__setstate__`). Note that when both dtypes already match, no conversion would be needed.
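The conversion logic could be sketched like this (a hypothetical helper of my own, not scikit-learn's actual implementation):

```python
import numpy as np

def align_node_dtype(node_ndarray, expected_dtype):
    """Cast a structured node array to the dtype expected on this
    architecture, field by field (hypothetical sketch)."""
    if node_ndarray.dtype == expected_dtype:
        return node_ndarray  # same architecture: nothing to do
    out = np.empty(node_ndarray.shape, dtype=expected_dtype)
    for name in expected_dtype.names:
        # NumPy casts each field here, e.g. int64 -> int32; real code
        # should guard against indices that overflow the narrower type.
        out[name] = node_ndarray[name]
    return out
```

An unpickling hook such as `__setstate__` would call a helper like this on the incoming node array before reconstructing the tree.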
For us to maintain code that enables portability between 32-bit and 64-bit systems, I think we need a new CI job for testing. The test would consist of:

Side note: NumPy has recently decided not to ship 32-bit manylinux wheels, so building a 32-bit wheel for scikit-learn may become harder.
That's a possibility, but it's quite heavy to set up and maintain. A simpler solution would be to dump a bunch of potentially problematic models (e.g. Cython-based models such as kd-trees, ball trees, RFs, GBRTs and HistGBRTs, generated by a small helper script) on a 64-bit machine once in a while and store the results in a folder of the GitHub repo (the models should be parameterized to be small, e.g. a few kB). We could then write a test that checks we can call score and get the expected accuracy, run on both the 32-bit and 64-bit CIs. This will also be useful to be aware of the (lack of) forward compatibility of the pickles between consecutive scikit-learn versions. When the class structure changes we will have to regenerate some pickle files, but at least we will know when and what we break before our users complain about it, even if we do not support pickle compat between scikit-learn versions :)
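Such a test could be sketched as follows (hypothetical helper names and layout of my own; the real test suite would differ):

```python
import joblib
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

def generate_pickle(path):
    """Run once on a 64-bit machine; the resulting small pickle
    would be committed to the repo (hypothetical sketch)."""
    X, y = load_iris(return_X_y=True)
    clf = RandomForestClassifier(n_estimators=5, random_state=0).fit(X, y)
    joblib.dump(clf, path)
    return clf.score(X, y)

def check_pickle(path, expected_score, tol=1e-6):
    """Run on every CI architecture (32-bit and 64-bit)."""
    X, y = load_iris(return_X_y=True)
    clf = joblib.load(path)
    assert abs(clf.score(X, y) - expected_score) < tol
```

The same pair of helpers would also catch forward-compatibility breaks: if a newer scikit-learn can no longer load a committed pickle, `check_pickle` fails before users report it.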
If they stop doing that, then maybe we should stop as well... But I suppose 32-bit platforms will still be supported via Debian and conda-forge packages, for instance.