
Test decision tree pickle for different endianness #21539

Merged: 4 commits into scikit-learn:main on Nov 4, 2021

Conversation

@lesteve (Member) commented Nov 3, 2021

Reference Issues/PRs

Closes #21359

What does this implement/fix? Explain your changes.

This adds a test that checks that pickles containing big-endian arrays can be loaded on a little-endian machine.

The pickles are created by changing the pickler's dispatch_table, or by reimplementing NumpyPickler.save for joblib pickles. This produces pickles whose numpy arrays are stored in big-endian byte order, which does not exactly match a pickle created on a big-endian machine. The main advantage IMO is that we have some test for #17644 ...
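A minimal sketch of the dispatch_table approach, assuming a scikit-learn version whose trees can load such pickles (the helper name `reduce_ndarray`, the iris dataset, and the final assertion are illustrative, not taken from this PR's diff):

```python
import copyreg
import io
import pickle

import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier


def reduce_ndarray(arr):
    # Pickle the array after swapping its bytes and flipping the dtype's
    # byte order, so the payload uses the non-native endianness while the
    # values stay the same.
    swapped = arr.byteswap().view(arr.dtype.newbyteorder())
    return swapped.__reduce__()


X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(random_state=0).fit(X, y)

buffer = io.BytesIO()
pickler = pickle.Pickler(buffer)
# Override how ndarrays are pickled without touching the global registry.
pickler.dispatch_table = copyreg.dispatch_table.copy()
pickler.dispatch_table[np.ndarray] = reduce_ndarray
pickler.dump(clf)

buffer.seek(0)
clf_loaded = pickle.load(buffer)
np.testing.assert_array_equal(clf.predict(X), clf_loaded.predict(X))
```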

I thought it was simpler than using qemu inside a Docker image (see #21237 (comment) for more details), as mentioned in #19602 (comment).

Doing something similar for more (or even all) estimators, as mentioned in #19602 (comment), is left for a future PR.

@lesteve changed the title from "Cross architecture tree pickle" to "Test decision tree pickle for different endianness" on Nov 3, 2021
@ogrisel (Member) left a comment

LGTM!

And +1 for a similar test (+fix) for #19602 in a follow-up PR to support pickling of tree structures that have system-specific precision levels in their integer fields.

@lesteve (Member, Author) commented Nov 4, 2021

And +1 for a similar test (+fix) for #19602 in a follow-up PR to support pickling of tree structures that have system-specific precision levels in their integer fields.

Yep, I was working on it in parallel since it touches the same code; I opened a draft PR with a similar approach: #21552. It is better to merge this one first.

@rth (Member) left a comment

Very nice! Thanks a lot @lesteve!

Maybe this should be put somewhere in the common tests? Or at least in some folder where it would make sense to run it parametrized for a few estimators (I'm also thinking of KDTree/BallTree, #21553). Otherwise LGTM.
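A hypothetical skeleton of such a parametrized test (the estimator list and test name are assumptions, and the plain round-trip below would be replaced by the byte-swapping pickler sketched earlier):

```python
import io
import pickle

import numpy as np
import pytest
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier


@pytest.mark.parametrize(
    "Estimator", [DecisionTreeClassifier, RandomForestClassifier]
)
def test_pickle_roundtrip(Estimator):
    # Fit, pickle, unpickle, and compare predictions for each estimator.
    X, y = load_iris(return_X_y=True)
    est = Estimator(random_state=0).fit(X, y)

    buf = io.BytesIO()
    pickle.dump(est, buf)
    buf.seek(0)

    est_loaded = pickle.load(buf)
    np.testing.assert_array_equal(est.predict(X), est_loaded.predict(X))
```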

@lesteve (Member, Author) commented Nov 4, 2021

Maybe this should be put somewhere in the common tests? Or at least in some folder where it would make sense to run it parametrized for a few estimators (I'm also thinking of KDTree/BallTree, #21553). Otherwise LGTM.

I do have this in mind eventually, but for now my current strategy is:

@rth (Member) commented Nov 4, 2021

OK, fair enough. Thanks!

@rth merged commit 46485a9 into scikit-learn:main on Nov 4, 2021
@lesteve deleted the cross-architecture-tree-pickle branch on November 4, 2021 13:49
glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Nov 29, 2021
samronsin pushed a commit to samronsin/scikit-learn that referenced this pull request Nov 30, 2021
glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Dec 24, 2021
mathijs02 pushed a commit to mathijs02/scikit-learn that referenced this pull request Dec 27, 2022
Successfully merging this pull request may close these issues.

Remove code introduced by #17644
3 participants