New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: Fix integer overflow in in1d for mixed integer dtypes #22877 #22878
Conversation
83d5b2b
to
f7a1439
Compare
Local tests pass. Ready for review @seberg |
below_memory_constraint = ar2_range <= 6 * (ar1.size + ar2.size) | ||
# 2. Check overflows for (ar2 - ar2_min); dtype=ar2.dtype | ||
range_safe_from_overflow = ar2_range <= np.iinfo(ar2.dtype).max |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR also corrects the bounds of the overflow check. It should have really been <=
in #12065, rather than <
. I noticed this after adding some new tests.
This is that last thing that would be great to have some fix for in 1.24.1 considering that it returns bad results. I expect this is good, but I need fresher eyes. But if it looks good to others: Maybe we should get it in, and I look at it again later and just follow up if I feel a different approach is better. |
I'm thinking about this proposed change now and while it does fix the problem, it is more conservative than necessary. Recall we are looking for overflows in this calculation: basic_mask = (ar1 <= ar2_max) & (ar1 >= ar2_min)
outgoing_array[basic_mask] = isin_helper_ar[ar1[basic_mask] - ar2_min]
# After masking, the range of ar1 is guaranteed to be
# within the range of ar2:
ar1_upper = min(int(ar1_max), int(ar2_max))
ar1_lower = max(int(ar1_min), int(ar2_min))
range_safe_from_overflow &= all((
ar1_upper - int(ar2_min) <= np.iinfo(ar1.dtype).max,
ar1_lower - int(ar2_min) >= np.iinfo(ar1.dtype).min
)) does that make sense? Edit: pushed this change. |
I wonder why integers larger the uint16 are not tested, are they too big? |
Thanks @MilesCranmer. @seberg If you see anything that bothers you we can make another PR. |
1 similar comment
Thanks @MilesCranmer. @seberg If you see anything that bothers you we can make another PR. |
numpy#22878) * TST: Mixed integer types for in1d * BUG: Fix mixed dtype overflows for in1d (numpy#22877) * BUG: Type conversion for integer overflow check * MAINT: Fix linting issues in in1d * MAINT: ar1 overflow check only for non-empty array * MAINT: Expand bounds of overflow check * TST: Fix integer overflow in mixed boolean test * TST: Include test for overflow on mixed dtypes * MAINT: Less conservative overflow checks
This fixes #22877 raised by @TonyXiang8787. The bug, introduced by #12065, results in integer overflows occurring in the following line:
numpy/numpy/lib/arraysetops.py
Lines 683 to 684 in 2b9851b
in1d
.The fix is to simply test for these in advance of the
kind='table'
method being used:I also added some unittests to evaluate this behavior.
cc @seberg