Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement np.isclose #7067

Merged
merged 13 commits into from
Oct 19, 2022
Merged

Implement np.isclose #7067

merged 13 commits into from
Oct 19, 2022

Conversation

guilhermeleobas
Copy link
Contributor

As title.

@jeertmans
Copy link
Contributor

Hello,
A few months ago I also proposed an implementation of allclose and isclose but this was put in the PR Backlog milestone, see #6286. Maybe it would be worth merging my request with yours ? Or take the best one ?

@gmarkall
Copy link
Member

gmarkall commented Jun 1, 2021

Is this still work-in-progress? @guilhermeleobas Do you plan to add this to the list of supported functions in the docs?

@guilhermeleobas
Copy link
Contributor Author

@gmarkall, I am waiting for a review from @stuartarchibald. The code for this function was copied from another PR (#4610)

@gmarkall gmarkall added 3 - Ready for Review Effort - medium Medium size effort needed and removed 2 - In Progress labels Jun 1, 2021
@stuartarchibald
Copy link
Contributor

CC @jpivarski, xref #6074

@jpivarski
Copy link
Contributor

As promised in the meeting, here's my implementation for cross-checking:

@numba.njit
def _isclose_item(x, y, rtol, atol, equal_nan):
    if numpy.isnan(x) and numpy.isnan(y):
        return equal_nan
    elif numpy.isinf(x) and numpy.isinf(y):
        return (x > 0) == (y > 0)
    elif numpy.isinf(x) or numpy.isinf(y):
        return False
    else:
        return abs(x - y) <= atol + rtol * abs(y)

@numba.extending.overload(numpy.isclose)
def isclose(a, b, rtol=1e-05, atol=1e-08, equal_nan=False):
    if (isinstance(a, numba.types.Array) and a.ndim > 0) or (
        isinstance(b, numba.types.Array) and b.ndim > 0
    ):
        def isclose_impl(a, b, rtol=1e-05, atol=1e-08, equal_nan=False):
            # FIXME: want to broadcast_arrays(a, b) here
            x = a.reshape(-1)
            y = b.reshape(-1)
            out = numpy.zeros(len(y), numpy.bool_)
            for i in range(len(out)):
                out[i] = _isclose_item(x[i], y[i], rtol, atol, equal_nan)
            return out.reshape(b.shape)

    elif isinstance(a, numba.types.Array) or isinstance(b, numba.types.Array):
        def isclose_impl(a, b, rtol=1e-05, atol=1e-08, equal_nan=False):
            return numpy.asarray(
                _isclose_item(a.item(), b.item(), rtol, atol, equal_nan)
            )

    else:
        def isclose_impl(a, b, rtol=1e-05, atol=1e-08, equal_nan=False):
            return _isclose_item(a, b, rtol, atol, equal_nan)

    return isclose_impl

The main difference is that @guilhermeleobas's implementation does many passes over the data (following NumPy's implementation) and mine does one pass, and is specialized for non-arrays if given non-arrays. It has a weakness, though: for correctness, my a and b would have to be broadcasted with

a, b = numpy.broadcast_arrays(a, b)

before the out.reshape(b.shape) would be fully general. As it is, it managed to pass the tests because all of the tests have the larger shape as the second argument.

The tests can be made more general by adding the following two items:

yield [atol, np.inf, -np.inf, np.nan], [0], kw
yield [atol, np.inf, -np.inf, np.nan], 0, kw

which are just reversing the argument order of the last two tests. Then my implementation fails (for lack of lowered broadcast_arrays, #4074 (comment)) and I believe @guilhermeleobas's implementation would pass.

Meta-question: are the "single pass over arrays" or "specialized for non-array types" performance considerations significant?

@guilhermeleobas
Copy link
Contributor Author

@jpivarski, your implementation compiles WAY faster than the one I did. For reference, I am using this script to benchmark both implementations.

Compare both implementations with:

IMPL=jim pytest -rs -sv --disable-warnings --tb=short -x -v --durations=0 a.py
IMPL=guilherme pytest -rs -sv --disable-warnings --tb=short -x -v --durations=0 a.py

@jpivarski
Copy link
Contributor

The compilation speed probably depends strongly on types—if the arguments to my isclose function are scalars, then it resolves to a couple of if-statements. If the arguments are arrays, my implementation is wrong because it lacks broadcasting. (I think the array functions you use don't broadcast, but they at least fail with the right error message when the shapes don't match.)

A "best of both worlds" might be to check for scalar arguments and drop to _isclose_item for that case, which is (I'm guessing) what is compiling so quickly. It should also run faster because it won't have to create array structs to evaluate scalars—I doubt LLVM is smart enough to compile that away, but again, I'm just guessing.

@stuartarchibald
Copy link
Contributor

/AzurePipelines run

@azure-pipelines
Copy link
Contributor

Azure Pipelines successfully started running 1 pipeline(s).

@guilhermeleobas
Copy link
Contributor Author

Although the current implementation seems correct (similar to NumPy ones), it takes a while to compile. I'll try to rewrite it based on @jpivarski implementation

@guilhermeleobas
Copy link
Contributor Author

@jpivarski, can you review the code? Once #7437 gets merged, one can remove the broadcast_shapes part.

@jpivarski
Copy link
Contributor

@jpivarski, can you review the code?

Sure, I'll review. As part of that, I'm checking out the code and I'm going to try running it (in Vector, which motivated my interest in it). It will take tens of minutes to set up a new environment for it.

jpivarski
jpivarski previously approved these changes Apr 12, 2022
Copy link
Contributor

@jpivarski jpivarski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my test environment, I manually changed _min_llvm_version to (0, 38, 0) because I couldn't figure out how to install 0.39.0 (it's not released).

I tried it out manually, and all of the arguments (rtol, atol, equal_nan) work and give me the values I'd expect. I may be manually reproducing your test suite, but okay.

And now the motivating case: can I remove our custom implementation of isclose in Vector?

https://github.com/scikit-hep/vector/blob/e493e6f77d589571f85dcdf4dc4b61b8ca16f991/src/vector/_backends/numba_object.py#L95-L125

With this custom implementation removed and Numba 0.55.1, the last line fails:

>>> import numpy as np
>>> import numba as nb
>>> import vector
>>> one = vector.obj(x=1.1, y=2.2)
>>> two = vector.obj(x=1.1+1e-12, y=2.2+1e-12)
>>> (lambda x, y: x.isclose(y))(one, two)
True
>>> nb.njit()(lambda x, y: x.isclose(y))(one, two)

because the vector isclose relies on the existence of np.isclose to be defined for numeric arguments. In the environment with this branch installed, the last line succeeds (returns True) because Vector is picking up on your new, lowered np.isclose.

So it works!

And the implementation looks good to me.

@guilhermeleobas
Copy link
Contributor Author

guilhermeleobas commented Apr 13, 2022

Thanks for the review, @jpivarski.

@stuartarchibald or @gmarkall, can one of you folks review this PR when possible?

…hange np.isclose impl. to use np.broadcast_shapes
…le and change np.isclose impl. to use np.broadcast_shapes"

This reverts commit 4c42ec7.
@guilhermeleobas
Copy link
Contributor Author

One cannot use np.broadcast_shapes inside np.isclose because the former is available only on NumPy >= (1, 20)

Copy link
Contributor

@njriasan njriasan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @guilhermeleobas. I left some comments, but this looks pretty good.

numba/core/typing/arraydecl.py Outdated Show resolved Hide resolved
return np.broadcast_to(out, tup)

else:
def isclose_impl(a, b, rtol=1e-05, atol=1e-08, equal_nan=False):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like this path can be reached for types that shouldn't be supported. In particular since type_can_asarray supports types.Number you could take this path with complex numbers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like np.isclose supports complex numbers

>>> import numpy as np
>>> print(np.isclose(2, 3j))
False

numba/np/arraymath.py Outdated Show resolved Hide resolved
x = a
y = b.reshape(-1)
out = np.zeros(len(y), np.bool_)
for i in range(len(out)):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be supported with parfors as well? It seems like this should be supported with parallel=True.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, you're saying replace range by prange when parallel=True?

@gmarkall gmarkall added the needs initial review This PR needs an initial review to check the code change is well formed, documented, efficient etc. label Jun 28, 2022
@stuartarchibald stuartarchibald removed the needs initial review This PR needs an initial review to check the code change is well formed, documented, efficient etc. label Jul 5, 2022
@apmasell apmasell self-assigned this Oct 19, 2022
@apmasell apmasell added 5 - Ready to merge Review and testing done, is ready to merge and removed 3 - Ready for Review labels Oct 19, 2022
@sklam sklam added this to the Numba 0.57 RC milestone Oct 19, 2022
@sklam sklam merged commit 9b13ac7 into numba:main Oct 19, 2022
@guilhermeleobas guilhermeleobas deleted the np_isclose branch October 19, 2022 23:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5 - Ready to merge Review and testing done, is ready to merge Effort - medium Medium size effort needed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants