RuntimeWarning: divide by zero encountered in reciprocal for np.ones(3) ** -1 #18555

ogrisel · 2021-03-05T17:10:49Z

Reproducing code example:

import numpy as np

with np.errstate(invalid="raise"):
    np.ones(3) ** -1

raises:

<ipython-input-43-02fd0a095546>:2: RuntimeWarning: divide by zero encountered in reciprocal
  np.ones(3) ** -1

but the result of the evaluation is still correct: array([1., 1., 1.])...

I also tried 1 / np.ones(3), np.ones(2) ** -1, np.ones(4) ** -1 which do not raise the warning.

I get it for any numpy array of 3 items with dtype np.float32 or np.float64 but not np.complex64 for instance.
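The size and dtype checks described above can be automated with a small script. This is only a sketch: the helper names `emits_runtime_warning` and `sweep` are made up for illustration, and the result is expected to differ between platforms and builds.

```python
import warnings

import numpy as np


def emits_runtime_warning(size, dtype):
    """Return True if np.ones(size, dtype) ** -1 emits a RuntimeWarning."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        # Make sure floating point errors are reported as warnings here.
        with np.errstate(all="warn"):
            np.ones(size, dtype=dtype) ** -1
    return any(issubclass(w.category, RuntimeWarning) for w in caught)


def sweep(sizes=(2, 3, 4), dtypes=(np.float32, np.float64, np.complex64)):
    """Map (dtype name, size) to whether a RuntimeWarning was observed."""
    return {
        (np.dtype(dt).name, n): emits_runtime_warning(n, dt)
        for dt in dtypes
        for n in sizes
    }


if __name__ == "__main__":
    for key, warned in sweep().items():
        print(key, "warns" if warned else "ok")
```

On an unaffected build every entry should report no warning.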

NumPy/Python version information:

I got the problem with:

1.20.1 3.9.2 | packaged by conda-forge | (default, Feb 21 2021, 05:00:30) 
[Clang 11.0.1 ]

on macos/arm64.

I cannot reproduce it in a linux/aarch64 docker environment running on the same machine, with numpy installed from PyPI.org, with version:

1.20.1 3.9.1 (default, Dec 12 2020, 08:46:48) 
[GCC 8.3.0]

I cannot reproduce it in a linux/aarch64 docker environment running on the same machine, with numpy installed from conda-forge, with version:

1.20.1 3.9.2 | packaged by conda-forge | (default, Feb 21 2021, 05:02:26) 
[GCC 9.3.0]

So maybe this is a bug related to clang or macos/arm64? Could someone with an intel mac try to reproduce it, preferably using the numpy from conda-forge?


seberg · 2021-03-05T17:16:45Z

@ogrisel I think we might be running into #18005, i.e. clang has the terrible default of not using -ffp-exception-behavior=strict! There are already a few places in the code working around that, I think. Try setting the flag to see if that fixes it. We really should force that flag, but I am not well versed with distutils or clang, so I was never sure where to add it when I looked at it once.

ogrisel · 2021-03-05T17:56:10Z

I built numpy master with:

CFLAGS="-ffp-exception-behavior=strict" python setup.py install &> /tmp/numpy_build_log.txt

and I still get the RuntimeWarning.

Here is the build log:

https://gist.github.com/ogrisel/d104c8cf1b1c13202c204f3cc0004e38

seberg · 2021-03-05T19:28:10Z

Hmm, thought it was worth a shot, since apparently without the flag clang feels free to do optimizations that can make the floating point exceptions all wrong.
Still seems likely to be a bug with clang or the system libm on macos/arm64, but I am not certain.

ogrisel · 2021-03-07T19:48:06Z

Not a big problem because it's just a warning and the result is correct. Still, it's annoying for libraries that check that warnings are not raised in a given context.

isuruf · 2021-03-08T00:43:56Z

The fix should be the same as in #18571 for reciprocal

erykoff · 2021-03-08T00:44:45Z

Interesting. Is this also a problem with the latest Xcode compiler or just the conda compiler (which is a version behind)? I believe the problem in #17712 was not present with the latest Xcode compiler (clang 12), which presumably fixed that look-ahead exception bug.

isuruf · 2021-03-08T00:46:17Z

> Is this also a problem with the latest Xcode compiler

Yes.

> which is a version behind

Not really. The conda compiler is actually ahead. The versioning schemes are different though.

isuruf · 2021-04-28T04:00:16Z

This is not the same issue as #18571. It turns out that this is due to branch prediction going wrong in the for loop and floating point exceptions being raised for the wrong branch.
np.ones(120)[:120] ** -1 gives the warning, but np.ones(120)[:119] ** -1 doesn't.

ogrisel · 2021-04-28T08:09:46Z

Shall we report the issue upstream to LLVM (I assume) and just mark the failing TestUfuncGenericLoops::test_unary_PyUFunc_O_O_method_full[reciprocal] test (see #18143 (comment)) as XFAIL?
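A conditional XFAIL of the kind suggested above needs a platform check. The following is a minimal sketch, not what NumPy's test suite does: `is_macos_arm64` and `XFAIL_REASON` are hypothetical names, and the pytest usage is shown only as a comment so the snippet runs without pytest installed.

```python
import platform


def is_macos_arm64(system=None, machine=None):
    """Return True on macOS running natively on arm64.

    The arguments default to the current platform; they can be passed
    explicitly to check the logic on another machine.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    return system == "Darwin" and machine == "arm64"


XFAIL_REASON = "spurious divide-by-zero warning in reciprocal (numpy#18555)"

# Hypothetical usage in a test module, assuming pytest is available:
#
#   import pytest
#
#   @pytest.mark.xfail(is_macos_arm64(), reason=XFAIL_REASON)
#   def test_reciprocal_emits_no_warning():
#       ...
```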

isuruf · 2021-04-29T01:07:05Z

I don't have a small testcase to report this issue upstream. Besides I'm not even sure if it's a compiler issue or a hardware bug.

charris · 2021-05-06T21:34:11Z

Going to push this off to 1.22.0 for tracking. It seems the problem is not directly connected to NumPy.

The code in this test can raise `RuntimeWarning: divide by zero encountered in reciprocal`, but this warning should not be raised. See the upstream report: numpy/numpy#18555. In the meantime, let's ignore those warnings in this test to make it pass and avoid crashing the scikit-learn tests on those systems.
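One way to ignore only this warning inside a test is a narrow filter from the standard `warnings` module. This is a sketch of the general technique, not the actual scikit-learn change; the context manager name `ignore_spurious_reciprocal_warning` is an assumption made for illustration.

```python
import warnings
from contextlib import contextmanager

# Message text taken from the warning quoted in this issue.
SPURIOUS_MESSAGE = "divide by zero encountered in reciprocal"


@contextmanager
def ignore_spurious_reciprocal_warning():
    """Silence only the spurious reciprocal warning inside the block."""
    with warnings.catch_warnings():
        # The filter is matched against the start of the warning message,
        # so other RuntimeWarnings still go through the outer filters.
        warnings.filterwarnings(
            "ignore", message=SPURIOUS_MESSAGE, category=RuntimeWarning
        )
        yield
```

Because the filter matches on both message and category, a test suite that turns warnings into errors would still fail on any other `RuntimeWarning` raised in the same block.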

clang has an optimization bug where a vector that is only partially loaded / stored will get narrowed to only the lanes used, which can be fine in some cases. However, in numpy's `reciprocal` function a partial load is explicitly extended to the full width of the register (filled with '1's) to avoid divide-by-zero. clang's optimization ignores the explicit filling with '1's.

The changes here insert a dummy `volatile` variable. This convinces clang not to narrow the load and ignore the explicit filling of '1's.

`volatile` can be expensive since it forces loads / stores to refresh contents whenever the variable is used. numpy has its own template / macro system that'll expand the loop function below for sqrt, absolute, square, and reciprocal. Additionally, the loop can be called on a full array if there's overlap between src and dst. Consequently, we try to limit the scope that we need to apply `volatile`. Intention is it should only be needed when compiling with clang, against Apple arm64, and only for the `reciprocal` function. Moreover, `volatile` is only needed when a vector is partially loaded.

Testing: Beyond fixing the cases mentioned in the GitHub issue, the changes here also resolve several failures in numpy's test suite.

Before:

```
FAILED numpy/core/tests/test_scalarmath.py::TestBaseMath::test_blocked - RuntimeWarning: divide by zero encountered in reciprocal
FAILED numpy/core/tests/test_ufunc.py::TestUfuncGenericLoops::test_unary_PyUFunc_O_O_method_full[reciprocal] - AssertionError: FloatingPointError not raised
FAILED numpy/core/tests/test_umath.py::TestPower::test_power_float - RuntimeWarning: divide by zero encountered in reciprocal
FAILED numpy/core/tests/test_umath.py::TestSpecialFloats::test_tan - AssertionError: FloatingPointError not raised by tan
FAILED numpy/core/tests/test_umath.py::TestAVXUfuncs::test_avx_based_ufunc - RuntimeWarning: divide by zero encountered in reciprocal
FAILED numpy/linalg/tests/test_linalg.py::TestNormDouble::test_axis - RuntimeWarning: divide by zero encountered in reciprocal
FAILED numpy/linalg/tests/test_linalg.py::TestNormSingle::test_axis - RuntimeWarning: divide by zero encountered in reciprocal
FAILED numpy/linalg/tests/test_linalg.py::TestNormInt64::test_axis - RuntimeWarning: divide by zero encountered in reciprocal
8 failed, 14759 passed, 204 skipped, 1268 deselected, 34 xfailed in 69.90s (0:01:09)
```

After:

```
FAILED numpy/core/tests/test_umath.py::TestSpecialFloats::test_tan - AssertionError: FloatingPointError not raised by tan
1 failed, 14766 passed, 204 skipped, 1268 deselected, 34 xfailed in 70.37s (0:01:10)
```
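The idea of extending a partial load with '1's can be modelled in plain Python. This is an illustrative sketch, not NumPy's C code: `LANES`, `partial_load`, `partial_store` and `reciprocal_tail` are hypothetical names, and Python's `ZeroDivisionError` stands in for the hardware divide-by-zero flag.

```python
LANES = 4  # hypothetical vector width, in elements per register


def partial_load(src, start, count, fill):
    """Load `count` valid elements and pad the unused lanes with `fill`."""
    valid = src[start:start + count]
    return valid + [fill] * (LANES - count)


def partial_store(dst, start, count, vec):
    """Write back only the first `count` lanes of `vec`."""
    dst[start:start + count] = vec[:count]


def reciprocal_tail(src, start, count, fill=1.0):
    """Compute 1/x on a full-width register built from a partial load."""
    vec = partial_load(src, start, count, fill)
    # The division runs on every lane, including the padding lanes.
    return [1.0 / lane for lane in vec]


src = [1.0, 2.0, 4.0]
dst = [0.0] * len(src)
partial_store(dst, 0, len(src), reciprocal_tail(src, 0, len(src)))
```

Calling `reciprocal_tail(src, 0, 3, fill=0.0)` divides by zero in the unused lane even though every real input is non-zero. In this model, that is the effect of losing the fill value.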

Enhancement on top of workaround for clang bug in reciprocal (numpy#18555)

Numpy's FP unary loops use a partial load / store on every iteration. The partial load / store helpers each insert a switch statement to know how many elements to handle. This causes a lot of unnecessary branches to be inserted in the loops. The partial load / store is only needed on the final iteration of the loop if it isn't a full vector. The changes here break out the final iteration to use the partial load / stores. The loop has been changed to use full load / stores. Additionally, this means we don't need to conditionalize the volatile workaround in the loop.
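The restructuring described here can be sketched in plain Python by counting how often the partial path is taken. This models only the loop shape, under the assumption of a 4-lane register; the function names are hypothetical and none of this is NumPy's actual implementation.

```python
LANES = 4  # hypothetical vector width, in elements per register


def reciprocal_partial_every_iteration(src):
    """Earlier shape: every iteration goes through the partial helpers."""
    out, partial_calls = [], 0
    for start in range(0, len(src), LANES):
        count = min(LANES, len(src) - start)
        vec = src[start:start + count] + [1.0] * (LANES - count)
        partial_calls += 1
        res = [1.0 / x for x in vec]  # all lanes are computed
        out.extend(res[:count])       # only valid lanes are stored
    return out, partial_calls


def reciprocal_full_then_tail(src):
    """Later shape: full loads in the loop, one partial load for the tail."""
    out, partial_calls = [], 0
    n_full = len(src) - len(src) % LANES
    for start in range(0, n_full, LANES):
        vec = src[start:start + LANES]  # full load, no padding needed
        out.extend(1.0 / x for x in vec)
    tail = len(src) - n_full
    if tail:
        vec = src[n_full:] + [1.0] * (LANES - tail)
        partial_calls += 1
        res = [1.0 / x for x in vec]
        out.extend(res[:tail])
    return out, partial_calls
```

For an input of 10 elements the first version takes the partial path three times and the second only once, with identical results.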

* Resolve divide by zero in reciprocal #18555

* Enhancement on top of workaround for clang bug in reciprocal (#18555)

* Address Azure CI failures with older versions of clang

  - -ftrapping-math is default enabled for Numpy, but support in clang is mainly for x86_64
  - Apple Clang and Clang have different, but overlapping versions
  - Non-Apple Clang versions come from looking at when they started supporting -ftrapping-math for x86_64

  Testing was done against Apple Clang versions:

  - v11 / x86_64 - failed previously, now passes (azure failure)
  - v12+ / x86_64 - passes before and after
  - v13 / arm64 - failed before initial patch, passes after


ogrisel mentioned this issue Mar 5, 2021

DEP Deprecate 'normalize' in ridge models scikit-learn/scikit-learn#17772

Merged

seberg mentioned this issue Mar 7, 2021

BUG: Fix overflow warning on apple silicon #18571

Merged

ogrisel mentioned this issue Mar 7, 2021

BUG: Refactor complex floor_divide to avoid osx-arm64 opt warnings #17712

Closed

rgommers added the 00 - Bug label Mar 25, 2021

rgommers mentioned this issue Mar 25, 2021

numpy 1.20 on MacOSX: spurious RuntimeWarning: invalid value encountered in reciprocal conda-forge/numpy-feedstock#229

Closed

1 task

rgommers added this to the 1.20.2 release milestone Mar 25, 2021

charris modified the milestones: 1.20.2 release, 1.20.3 release Mar 27, 2021

ogrisel mentioned this issue Apr 27, 2021

Please provide universal2 wheels on macOS #18143

Closed

charris modified the milestones: 1.20.3 release, 1.22.0 release May 6, 2021

isuruf mentioned this issue Jun 25, 2021

Segmentation fault on import of scipy.integrate on Apple M1 ARM silicon scipy/scipy#13364

Closed

ogrisel mentioned this issue Sep 16, 2021

FIX ignore spurious RuntimeWarning in test on macOS M1 machines scikit-learn/scikit-learn#21070

Closed

Developer-Ecosystem-Engineering mentioned this issue Sep 22, 2021

BUG: Resolve Divide by Zero on Apple silicon + test failures #19926

Merged

charris closed this as completed in #19926 Sep 25, 2021

charris mentioned this issue Sep 25, 2021

BUG: Resolve Divide by Zero on Apple silicon + test failures (#19926) #19955

Merged

ogrisel mentioned this issue Oct 22, 2021

RuntimeWarning "invalid value encountered in reciprocal" when running tests on macOS scikit-learn/scikit-learn#21395

Closed

jakirkham mentioned this issue Jan 11, 2022

Random CI failure on MacOS: RuntimeWarning: invalid value encountered in reciprocal dask/dask#7189

Closed
