
Make sure arrays are bytes aligned in joblib pickles #1254

Merged
merged 31 commits into master from memmap-align on Feb 25, 2022

Conversation

@lesteve (Member) commented Feb 3, 2022

fix #563

alternative to #570

@codecov (bot) commented Feb 3, 2022

Codecov Report

Merging #1254 (188c089) into master (3d80506) will increase coverage by 0.01%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master    #1254      +/-   ##
==========================================
+ Coverage   93.81%   93.83%   +0.01%     
==========================================
  Files          50       50              
  Lines        7181     7267      +86     
==========================================
+ Hits         6737     6819      +82     
- Misses        444      448       +4     
Impacted Files                     Coverage Δ
joblib/numpy_pickle.py             99.14% <100.00%> (+0.15%) ⬆️
joblib/test/test_numpy_pickle.py   94.10% <100.00%> (+0.47%) ⬆️
joblib/_parallel_backends.py       92.27% <0.00%> (-0.74%) ⬇️
joblib/parallel.py                 96.02% <0.00%> (-0.54%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@ogrisel (Contributor) left a comment

First comment; I haven't actually reviewed the code change in detail.

@@ -95,6 +97,17 @@ def write_array(self, array, pickler):
             # pickle protocol.
             pickle.dump(array, pickler.file_handle, protocol=2)
         else:
+            try:
+                current_pos = pickler.file_handle.tell()
+                alignment = current_pos % 8
Review comment from a Contributor on this line:

The numpy documentation mentions that some dtypes expect 16-byte alignment (e.g. float128).

Also, since SIMD-optimized compute kernels run more efficiently (fully vectorized) when buffers are aligned to their vector instruction size, maybe we should go directly for 64-byte alignment (e.g. for AVX-512, which is currently the widest vector instruction set).

In the ARM ecosystem there are also 512-bit-wide vector instructions, e.g.:

https://www.fujitsu.com/global/products/computing/servers/supercomputer/a64fx/

But from what I read about SVE2, the vector size can vary dynamically in 128-bit (16-byte) increments.

So I have the feeling that 16-byte alignment is a necessity for safety (to avoid crashes), while 64-byte (512-bit) alignment can additionally help vectorized compute kernels run more efficiently on such memmapped buffers. Going beyond that is probably useless.
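For illustration only (not part of the PR): a minimal sketch of how to check the in-memory alignment of a numpy array's data buffer against a chosen boundary, which is what native SIMD kernels ultimately care about.

import numpy as np

def buffer_misalignment(arr, boundary):
    """Return the array's data pointer modulo `boundary` (0 means aligned)."""
    return arr.ctypes.data % boundary

a = np.zeros(1024, dtype=np.float64)
print(buffer_misalignment(a, 16), buffer_misalignment(a, 64))

For a memory-mapped file, the mapping itself starts on a page boundary and page sizes are multiples of 64 bytes, so aligning the array's byte offset within the file to 8/16/64 bytes yields the same alignment in memory.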

@lesteve lesteve changed the title from "Make sure arrays are 8 bytes aligned in joblib pickled" to "Make sure arrays are 8 bytes aligned in joblib pickles" on Feb 3, 2022
@lesteve lesteve changed the title from "Make sure arrays are 8 bytes aligned in joblib pickles" to "Make sure arrays are bytes aligned in joblib pickles" on Feb 3, 2022
@ogrisel (Contributor) left a comment

Thanks for the PR. However, I don't like the implicit padding logic being implemented redundantly at write and read time.

Instead, I think we should adopt an explicit single-byte code to store the effective padding size at write time (the try / except io.UnsupportedOperation boilerplate is omitted for readability's sake):

current_pos = pickler.file_handle.tell()
# The +1 accounts for the single padding byte code written just below.
alignment = (current_pos + 1) % NUMPY_ARRAY_ALIGNMENT_BYTES
padding_size = NUMPY_ARRAY_ALIGNMENT_BYTES - alignment
padding_bytecode = chr(padding_size).encode('ascii')
assert len(padding_bytecode) == 1  # should always hold
pickler.file_handle.write(padding_bytecode)  # always written
if padding_size != 0:
    padding = b'\x00' * padding_size
    pickler.file_handle.write(padding)

then at read time, again without the try / except io.UnsupportedOperation boilerplate:

padding_bytecode = unpickler.file_handle.read(1)
padding_size = ord(padding_bytecode.decode('ascii'))
offset += padding_size  # account for the padding when mapping the array data

Disclaimer: untested code snippet.

WDYT @lesteve?

EDIT: I think I had made a mistake in a first version of this code.
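For reference, a minimal, self-contained sketch of this write/read round-trip, using io.BytesIO to stand in for the pickle file handle; NUMPY_ARRAY_ALIGNMENT_BYTES is assumed to be 8 here and the helper names are illustrative, not joblib's API.

import io

NUMPY_ARRAY_ALIGNMENT_BYTES = 8  # assumed value for this sketch

def write_padded(fh, payload):
    """Write a 1-byte padding code, then the padding, then the payload."""
    current_pos = fh.tell()
    # +1 because the padding byte code itself is written before the padding.
    alignment = (current_pos + 1) % NUMPY_ARRAY_ALIGNMENT_BYTES
    padding_size = NUMPY_ARRAY_ALIGNMENT_BYTES - alignment
    fh.write(chr(padding_size).encode('ascii'))  # the padding byte code
    fh.write(b'\x00' * padding_size)             # the padding itself
    fh.write(payload)                            # payload now starts on an aligned offset

def read_padded(fh, n):
    """Skip the padding announced by the 1-byte code, then read n payload bytes."""
    padding_size = ord(fh.read(1).decode('ascii'))
    fh.seek(padding_size, io.SEEK_CUR)  # skip over the padding
    return fh.read(n)

buf = io.BytesIO()
buf.write(b'header')             # simulate whatever precedes the array in the file
write_padded(buf, b'array-bytes')

buf.seek(len(b'header'))
assert read_padded(buf, len(b'array-bytes')) == b'array-bytes'

Note that with this formula a position that is already aligned after the byte code still receives a full NUMPY_ARRAY_ALIGNMENT_BYTES of padding; that wastes a few bytes but keeps the data aligned.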

@lesteve (Member, Author) commented Feb 4, 2022

Instead I think we should adopt an explicit prefix-code of the padding size at write time (without the try / except io.UnsupportedOperation boilerplate for readability's sake).

That does seem cleaner indeed. The only problem I see is that for old pickles (i.e. written without the change in this PR), there is no prefix code (the array bytes start right away), so you are going to interpret the first array byte as the prefix code. Somehow there needs to be some kind of "sentinel" byte to indicate whether you have a new or an old pickle file.

@ogrisel (Contributor) commented Feb 4, 2022

That does seem cleaner indeed. The only problem I see is that for old pickles (i.e. written without the change in this PR), there is no prefix code (the array bytes start right away), so you are going to interpret the first array byte as the prefix code.

This is a valid concern indeed. Good catch.

Somehow there needs to be some kind of "sentinel" byte to indicate whether you have a new or old pickle file.

Yes, we need to put some joblib-dump format versioning info somewhere at the beginning of the pickle file. I don't think we can do that without breaking the pickle format, but since we are already breaking it, maybe this is not a problem.

Edit: Actually we already do this kind of hack in _detect_compressor. So we could extend this and store an integer version number of the joblib format.

Each time we change the format, we could increase that number and explicitly maintain code paths for backward compatibility with previous versions.
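A hedged sketch of how such a version marker could be detected, in the spirit of the existing _detect_compressor magic-prefix check; the JOBLIB_FORMAT_PREFIX constant, version value, and helper names below are hypothetical, not joblib's actual format.

import io

# Hypothetical marker: a short magic prefix followed by a one-byte integer
# format version, written before the pickled payload.
JOBLIB_FORMAT_PREFIX = b'\x93JBL'
CURRENT_FORMAT_VERSION = 1

def write_format_header(fh):
    fh.write(JOBLIB_FORMAT_PREFIX + bytes([CURRENT_FORMAT_VERSION]))

def detect_format_version(fh):
    """Return the format version, or 0 for files written before the marker existed."""
    start = fh.tell()
    prefix = fh.read(len(JOBLIB_FORMAT_PREFIX))
    if prefix == JOBLIB_FORMAT_PREFIX:
        return fh.read(1)[0]
    # Legacy file: no marker; rewind so the old code path can read it unchanged.
    fh.seek(start)
    return 0

buf = io.BytesIO()
write_format_header(buf)
buf.seek(0)
assert detect_format_version(buf) == 1
assert detect_format_version(io.BytesIO(b'old pickle bytes')) == 0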

joblib/numpy_pickle.py Outdated Show resolved Hide resolved
@lesteve (Member, Author) commented Feb 17, 2022

I think this is ready enough for a more thorough review, removing the draft status.

@lesteve lesteve deleted the memmap-align branch July 25, 2022 13:45
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request Nov 22, 2022
Release 1.2.0

- Fix a security issue where eval(pre_dispatch) could potentially run arbitrary code. Now only basic numerics are supported. joblib/joblib#1327
- Make sure that joblib works even when multiprocessing is not available, for instance with Pyodide. joblib/joblib#1256
- Avoid unnecessary warnings when workers and the main process delete the temporary memmap folder contents concurrently. joblib/joblib#1263
- Fix a memory alignment bug for pickles containing numpy arrays. This is especially important when loading the pickle with mmap_mode != None, as the resulting numpy.memmap object would not be able to correct the misalignment without performing a memory copy. This bug would cause invalid computation and segmentation faults with native code that directly accesses the underlying data buffer of a numpy array, for instance C/C++/Cython code compiled with older GCC versions or some old OpenBLAS written in platform-specific assembly. joblib/joblib#1254
- Vendor cloudpickle 2.2.0, which adds support for PyPy 3.8+.
- Vendor loky 3.3.0, which fixes several bugs including:
  - robustly and forcibly terminating worker processes in case of a crash (joblib/joblib#1269);
  - avoiding leaking worker processes in case of nested loky parallel calls;
  - reliably spawning the correct number of reusable workers.
Successfully merging this pull request may close these issues:

Alignment issue when passing arrays through mmap

5 participants