Implement np.isclose #7067
Hello,
Is this still work-in-progress? @guilhermeleobas Do you plan to add this to the list of supported functions in the docs?
@gmarkall, I am waiting for a review from @stuartarchibald. The code for this function was copied from another PR (#4610).
CC @jpivarski, xref #6074
As promised in the meeting, here's my implementation for cross-checking:

import numba
import numba.extending
import numpy

@numba.njit
def _isclose_item(x, y, rtol, atol, equal_nan):
    # Scalar comparison: NaNs match only if equal_nan, infinities only if same sign.
    if numpy.isnan(x) and numpy.isnan(y):
        return equal_nan
    elif numpy.isinf(x) and numpy.isinf(y):
        return (x > 0) == (y > 0)
    elif numpy.isinf(x) or numpy.isinf(y):
        return False
    else:
        return abs(x - y) <= atol + rtol * abs(y)

@numba.extending.overload(numpy.isclose)
def isclose(a, b, rtol=1e-05, atol=1e-08, equal_nan=False):
    if (isinstance(a, numba.types.Array) and a.ndim > 0) or (
        isinstance(b, numba.types.Array) and b.ndim > 0
    ):
        # At least one argument is an array with one or more dimensions.
        def isclose_impl(a, b, rtol=1e-05, atol=1e-08, equal_nan=False):
            # FIXME: want to broadcast_arrays(a, b) here
            x = a.reshape(-1)
            y = b.reshape(-1)
            out = numpy.zeros(len(y), numpy.bool_)
            for i in range(len(out)):
                out[i] = _isclose_item(x[i], y[i], rtol, atol, equal_nan)
            return out.reshape(b.shape)
    elif isinstance(a, numba.types.Array) or isinstance(b, numba.types.Array):
        # Zero-dimensional array case.
        def isclose_impl(a, b, rtol=1e-05, atol=1e-08, equal_nan=False):
            return numpy.asarray(
                _isclose_item(a.item(), b.item(), rtol, atol, equal_nan)
            )
    else:
        # Plain scalars: no array machinery needed.
        def isclose_impl(a, b, rtol=1e-05, atol=1e-08, equal_nan=False):
            return _isclose_item(a, b, rtol, atol, equal_nan)
    return isclose_impl

The main difference is that @guilhermeleobas's implementation does many passes over the data (following NumPy's implementation) and mine does one pass, and is specialized for non-arrays if given non-arrays. It has a weakness, though: for correctness, my a, b = numpy.broadcast_arrays(a, b) before the

The tests can be made more general by adding the following two items:

    yield [atol, np.inf, -np.inf, np.nan], [0], kw
    yield [atol, np.inf, -np.inf, np.nan], 0, kw

which are just reversing the argument order of the last two tests. Then my implementation fails (for lack of lowered

Meta-question: are the "single pass over arrays" or "specialized for non-array types" performance considerations significant?
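For readers following along, the broadcasting step that the comment above says is missing can be sketched in plain NumPy, outside of Numba. This is only an illustration under assumptions: the names isclose_item and isclose_broadcast are hypothetical, inputs are coerced to float64, and nothing here is claimed to be the code in this PR. It shows one pass over the broadcast pair, so either argument order of the extra test items gives a result of the same shape.

```python
import math

import numpy as np


def isclose_item(x, y, rtol=1e-05, atol=1e-08, equal_nan=False):
    """Scalar rule from the comment above, written with the math module."""
    if math.isnan(x) and math.isnan(y):
        return equal_nan
    if math.isinf(x) and math.isinf(y):
        return (x > 0) == (y > 0)
    if math.isinf(x) or math.isinf(y):
        return False
    return abs(x - y) <= atol + rtol * abs(y)


def isclose_broadcast(a, b, rtol=1e-05, atol=1e-08, equal_nan=False):
    """Broadcast a and b to a common shape, then compare in a single pass."""
    # Assumption: real-valued inputs, coerced to float64 for simplicity.
    x, y = np.broadcast_arrays(
        np.asarray(a, dtype=np.float64), np.asarray(b, dtype=np.float64)
    )
    out = np.empty(x.shape, dtype=np.bool_)
    for idx in np.ndindex(x.shape):
        out[idx] = isclose_item(x[idx], y[idx], rtol, atol, equal_nan)
    return out


values = [1e-08, np.inf, -np.inf, np.nan]
print(isclose_broadcast(values, [0]))
print(isclose_broadcast(0, values))
```

Because the output shape comes from the broadcast pair rather than from b alone, swapping the arguments no longer changes which shapes are accepted.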
@jpivarski, your implementation compiles WAY faster than the one I did. For reference, I am using this script to benchmark both implementations. Compare both implementations with:
The compilation speed probably depends strongly on types: if the arguments to my
A "best of both worlds" might be to check for scalar arguments and drop to
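The benchmark script mentioned above is not reproduced in this thread. As a rough stand-in, compile time for a lazily compiled function can be estimated by timing the first call (which includes compilation for that argument-type signature) against a second call with the same types. The helper below is a hypothetical sketch using only the standard library; time_first_and_second_call and stand_in are names invented here, and a plain Python function is used so the snippet runs without Numba.

```python
import time


def time_first_and_second_call(func, *args):
    """Return (first_call_seconds, second_call_seconds) for func(*args)."""
    start = time.perf_counter()
    func(*args)
    first = time.perf_counter() - start

    start = time.perf_counter()
    func(*args)
    second = time.perf_counter() - start
    return first, second


def stand_in(a, b):
    # Placeholder for a jitted function that calls np.isclose.
    return abs(a - b) <= 1e-08 + 1e-05 * abs(b)


first, second = time_first_and_second_call(stand_in, 1.0, 1.0 + 1e-12)
print(f"first call: {first:.6f} s, second call: {second:.6f} s")
```

With a jitted function in place of stand_in, the difference between the two timings approximates compilation cost, and repeating the measurement with different argument types (scalars, 0-d arrays, n-d arrays) would show how strongly it depends on types.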
823f7b2 to 9ec6cd5
9c94e21 to 0d795e2
/AzurePipelines run
Azure Pipelines successfully started running 1 pipeline(s).
Although the current implementation seems correct (similar to NumPy's), it takes a while to compile. I'll try to rewrite it based on @jpivarski's implementation.
@jpivarski, can you review the code? Once #7437 gets merged, one can remove the
Sure, I'll review. As part of that, I'm checking out the code and I'm going to try running it (in Vector, which motivated my interest in it). It will take tens of minutes to set up a new environment for it.
In my test environment, I manually changed _min_llvm_version to (0, 38, 0) because I couldn't figure out how to install 0.39.0 (it's not released).
I tried it out manually, and all of the arguments (rtol, atol, equal_nan) work and give me the values I'd expect. I may be manually reproducing your test suite, but okay.
And now the motivating case: can I remove our custom implementation of isclose in Vector?
With this custom implementation removed and Numba 0.55.1, the last line fails:
>>> import numpy as np
>>> import numba as nb
>>> import vector
>>> one = vector.obj(x=1.1, y=2.2)
>>> two = vector.obj(x=1.1+1e-12, y=2.2+1e-12)
>>> (lambda x, y: x.isclose(y))(one, two)
True
>>> nb.njit()(lambda x, y: x.isclose(y))(one, two)
because the vector isclose relies on np.isclose being defined for numeric arguments. In the environment with this branch installed, the last line succeeds (returns True) because Vector is picking up on your new, lowered np.isclose.
So it works!
And the implementation looks good to me.
Thanks for the review, @jpivarski. @stuartarchibald or @gmarkall, can one of you folks review this PR when possible?
d6923fa to bd8c33a
…hange np.isclose impl. to use np.broadcast_shapes
…le and change np.isclose impl. to use np.broadcast_shapes" This reverts commit 4c42ec7.
One cannot use
Thanks @guilhermeleobas. I left some comments, but this looks pretty good.
            return np.broadcast_to(out, tup)

    else:
        def isclose_impl(a, b, rtol=1e-05, atol=1e-08, equal_nan=False):
It seems like this path can be reached for types that shouldn't be supported. In particular, since type_can_asarray supports types.Number, you could take this path with complex numbers.
It looks like np.isclose supports complex numbers:
>>> import numpy as np
>>> print(np.isclose(2, 3j))
False
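To see why the same comparison rule carries over to complex inputs, note that Python's abs() returns the modulus of a complex number, so abs(x - y) <= atol + rtol * abs(y) stays well defined. The sketch below is an assumption-laden illustration: isclose_scalar is a hypothetical name, it covers finite values only, and it does not attempt to reproduce how NumPy treats complex infinities or NaNs.

```python
def isclose_scalar(x, y, rtol=1e-05, atol=1e-08):
    """Finite-value closeness test; abs() is the modulus for complex inputs."""
    return abs(x - y) <= atol + rtol * abs(y)


# Mirrors the np.isclose(2, 3j) check above: |2 - 3j| is about 3.6, far above tolerance.
print(isclose_scalar(2, 3j))

# A tiny perturbation of a complex value stays within tolerance.
print(isclose_scalar(1 + 1j, 1 + 1j + 1e-12))
```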
x = a
y = b.reshape(-1)
out = np.zeros(len(y), np.bool_)
for i in range(len(out)):
Should this be supported with parfors as well? It seems like this should be supported with parallel=True.
So, you're saying replace range by prange when parallel=True?
As title.