Fix #8477: Allow copies with different strides for 0-length data #8482
Since NumPy 1.23, it is possible for zero-length ndarrays to have different strides (e.g. `(0,)` and `(8,)`). If we attempt to copy from a zero-length device array to a zero-length host array where the strides differ, our compatibility check fails because it compares strides.

This commit fixes the issue by only considering strides when checking compatibility of nonzero-length arrays. I believe this to be valid because the following works normally with NumPy 1.23:

```python
import numpy as np

ary1 = np.arange(0)
ary2 = np.ndarray((0,), buffer=ary1.data)
ary3 = np.empty_like(ary2)
ary3[:] = ary2
ary3[...] = ary2
np.copyto(ary2, ary3)
```

i.e. copying zero-length arrays with different strides generally works as expected.

The included test is written so that it exercises this change in behaviour regardless of the installed NumPy version: we explicitly construct zero-length device and host arrays with differing strides. The additional sanity check ensures that the host array has the strides we expect, in case some version of NumPy does not honour explicitly requested strides. I have observed that requesting nonzero strides for a zero-length array can still result in zero strides (a separate but related behaviour), so the sanity check guards against any other unexpected behaviour of this nature. I have tested locally with NumPy 1.22 and 1.23 (pre- and post-changes to strides).

See also dask/distributed#7089, where a workaround for an observation of this issue was needed. That workaround would not be needed with the fix in this commit.
gpuci run tests
gpuci run tests
Since making this PR I also decided it would be better to test copies in all directions, so the additional commit adds other cases (host-to-device, and device-to-device).
gpuci run tests
Thanks for the patch @gmarkall, this looks like an appropriate fix for the issue described. One minor query on the testing to resolve but otherwise looks good. Thanks again!
-    if ary1sq.strides != ary2sq.strides:
+    # We check strides only if the size is nonzero, because strides are
+    # irrelevant (and can differ) for zero-length copies.
+    if ary1.size and ary1sq.strides != ary2sq.strides:
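For readers following the one-line change above, here is a minimal host-only sketch of the kind of check being modified. It uses plain NumPy arrays in place of device arrays. The function name `check_copy_compat`, its error messages, and the dtype and shape checks are assumptions for illustration, not Numba's actual internals; only the squeezed-strides comparison and the `size` guard come from the diff.

```python
import numpy as np


def check_copy_compat(ary1, ary2):
    """Hypothetical stand-in for the compatibility check touched by this PR."""
    # Drop length-1 axes before comparing, as the `ary1sq` / `ary2sq`
    # names in the diff suggest.
    ary1sq = ary1.squeeze()
    ary2sq = ary2.squeeze()
    # Assumed for illustration: dtype and shape must match.
    if ary1.dtype != ary2.dtype:
        raise TypeError(f"incompatible dtype: {ary1.dtype} vs. {ary2.dtype}")
    if ary1sq.shape != ary2sq.shape:
        raise ValueError(f"incompatible shape: {ary1.shape} vs. {ary2.shape}")
    # From the diff: strides only matter when there is data to copy.
    if ary1.size and ary1sq.strides != ary2sq.strides:
        raise ValueError(
            f"incompatible strides: {ary1.strides} vs. {ary2.strides}"
        )


# Zero-length arrays pass whatever strides NumPy happened to assign.
empty_a = np.empty((0,), dtype=np.float64)
empty_b = np.empty((0, 3), dtype=np.float64)[:, 0]
check_copy_compat(empty_a, empty_b)

# Nonzero-length arrays with different strides are still rejected.
dense = np.arange(6, dtype=np.float64)
strided = np.arange(12, dtype=np.float64)[::2]
try:
    check_copy_compat(dense, strided)
except ValueError as exc:
    print("rejected:", exc)
```

The guard uses only `ary1.size`: by the time the strides comparison is reached in this sketch, the squeezed shapes have already been found equal, so both arrays hold the same number of elements.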
NOTE: cases like:
ary1 = np.ndarray(shape=(1, 1, 1), strides=(0, 2, 4), dtype=np.float64)
ary2 = np.ndarray(shape=(), strides=(), dtype=np.float64)
are "safe" due to the squeeze above. It shouldn't be possible to reach here and need to check `ary2.size`.
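To illustrate why the squeeze makes such cases safe, here is a small NumPy-only sketch. It uses `np.zeros` rather than the explicit `strides=` constructor from the note (an assumption, made so the example does not depend on how a given NumPy version treats requested strides), and the variable names are illustrative.

```python
import numpy as np

# A (1, 1, 1) array and a 0-d array both hold exactly one element.
x = np.zeros((1, 1, 1), dtype=np.float64)
y = np.zeros((), dtype=np.float64)

# squeeze() drops every length-1 axis, so both views end up 0-dimensional
# with empty shape and strides tuples; there is nothing left to mismatch.
xsq = x.squeeze()
ysq = y.squeeze()

print(xsq.shape, xsq.strides)
print(ysq.shape, ysq.strides)
print(x.size, y.size)
```

Since the squeezed shapes match, the two arrays necessarily have the same size, which is consistent with the note that a separate check on `ary2.size` should not be needed.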
Also checked manually with:
from numba import cuda
import numpy as np
x = np.ndarray(shape = (1, 1, 1), strides = (0, 2, 4), dtype=np.float64)
y = np.ndarray(shape = (), strides = (), dtype=np.float64)
print(x.shape, x.strides, x.size)
print(y.shape, y.strides, y.size)
dx = cuda.to_device(x)
dy = cuda.to_device(y)
dx.copy_to_host(y)
dy.copy_to_host(x)
without issue.
Great, thanks for checking this too!
# Ensure that the copy succeeds in both directions
dev_array.copy_to_host(host_array)
dev_array.copy_to_device(host_array)

# Ensure that a device-to-device copy also succeeds when the strides
# differ - one way of doing this is to copy the host array across and
# use that for copies in both directions.
dev_array_from_host = cuda.to_device(host_array)
self.assertEqual(dev_array_from_host.shape, (0,))
self.assertEqual(dev_array_from_host.strides, (0,))

dev_array.copy_to_device(dev_array_from_host)
dev_array_from_host.copy_to_device(dev_array)
I guess there's no merit in checking the values of the copies as this is about testing that the copies have succeeded without error?
I think there's no merit: there are already tests that copies work, and this test is focused on an edge case in the compatibility check. Additionally, since these are empty arrays, there's not a lot to look at to check.
Thanks for confirming, I'm inclined to agree, but thought it best to check!
Thanks for the patch!
BFID
How did the buildfarm run go?
It was fine.
Fixes #8477.