numpy 1.20 on MacOSX: spurious RuntimeWarning: invalid value encountered in reciprocal #229

Closed
1 task done
crusaderky opened this issue Feb 13, 2021 · 13 comments
@crusaderky

Reopen of #228
XREF dask/dask#7189
CC @jrbourbeau @jakirkham @isuruf

Issue:

numpy 1.20 on MacOSX randomly emits RuntimeWarnings when many small elementary operations are executed in rapid succession. In the POC below, da.linalg.norm under the hood invokes operator.sum and operator.pow many times.
The problem disappears after switching the numpy package from conda-forge to the pypi wheel, leaving everything else unaltered.

import sys
import numpy as np
import dask.array as da
from dask.array.utils import assert_eq

a = np.ones((2, 5, 2, 4, 3))
d = da.from_array(a, chunks=-1)

for run in range(5):
    for axis in range(5):
        a_r = np.linalg.norm(a, ord=-1, axis=axis, keepdims=True)
        d_r = da.linalg.norm(d, ord=-1, axis=axis, keepdims=True)
        print(f"{run=} {axis=}", file=sys.stderr)
        c_r = d_r.compute(scheduler="sync")
        assert_eq(a_r, c_r)

| OS | numpy version | result |
|---|---|---|
| Linux x64 | numpy 1.20.1 (conda-forge) | no issue |
| Windows | numpy 1.20.1 (conda-forge) | no issue |
| MacOSX Catalina (AWS mac1.metal) | numpy 1.19.5 (conda-forge) | no issue |
| MacOSX Catalina (AWS mac1.metal) | numpy 1.20.1 (pypi wheel) | no issue |
| MacOSX Catalina (AWS mac1.metal) | numpy 1.20.0 (conda-forge) | RuntimeWarnings are randomly emitted; see below |
| MacOSX Catalina (AWS mac1.metal) | numpy 1.20.1 (conda-forge) | RuntimeWarnings are randomly emitted; see below |

This has been reproduced on both Python 3.7 and 3.8.

run=0 axis=0
/usr/local/Caskroom/miniconda/base/envs/dask/lib/python3.8/site-packages/dask/core.py:121: RuntimeWarning: invalid value encountered in reciprocal
  return func(*(_execute_task(a, cache) for a in args))
run=0 axis=1
run=0 axis=2
run=0 axis=3
run=0 axis=4
run=1 axis=0
/usr/local/Caskroom/miniconda/base/envs/dask/lib/python3.8/site-packages/dask/core.py:121: RuntimeWarning: invalid value encountered in reciprocal
  return func(*(_execute_task(a, cache) for a in args))
run=1 axis=1
run=1 axis=2
run=1 axis=3
run=1 axis=4
/usr/local/Caskroom/miniconda/base/envs/dask/lib/python3.8/site-packages/dask/core.py:121: RuntimeWarning: invalid value encountered in reciprocal
  return func(*(_execute_task(a, cache) for a in args))
run=2 axis=0
/usr/local/Caskroom/miniconda/base/envs/dask/lib/python3.8/site-packages/dask/core.py:121: RuntimeWarning: invalid value encountered in reciprocal
  return func(*(_execute_task(a, cache) for a in args))
run=2 axis=1
run=2 axis=2
run=2 axis=3
run=2 axis=4
run=3 axis=0
/usr/local/Caskroom/miniconda/base/envs/dask/lib/python3.8/site-packages/dask/core.py:121: RuntimeWarning: invalid value encountered in reciprocal
  return func(*(_execute_task(a, cache) for a in args))
run=3 axis=1
run=3 axis=2
run=3 axis=3
run=3 axis=4
/usr/local/Caskroom/miniconda/base/envs/dask/lib/python3.8/site-packages/dask/core.py:121: RuntimeWarning: invalid value encountered in reciprocal
  return func(*(_execute_task(a, cache) for a in args))
run=4 axis=0
/usr/local/Caskroom/miniconda/base/envs/dask/lib/python3.8/site-packages/dask/core.py:121: RuntimeWarning: invalid value encountered in reciprocal
  return func(*(_execute_task(a, cache) for a in args))
run=4 axis=1
run=4 axis=2
run=4 axis=3
run=4 axis=4

The number of emitted warnings goes down drastically if I swap the nesting of the for run and for axis loops; in other words, if I do the exact same operations on the exact same numpy arrays 5 times in a row, I get far fewer warnings than if I frequently switch which numpy arrays I'm working on.


Environment (conda list):
$ conda list
# packages in environment at /usr/local/Caskroom/miniconda/base/envs/dask:
#
# Name                    Version                   Build  Channel
bokeh                     2.2.3            py38h50d1736_0    conda-forge
ca-certificates           2020.12.5            h033912b_0    conda-forge
certifi                   2020.12.5        py38h50d1736_1    conda-forge
click                     7.1.2              pyh9f0ad1d_0    conda-forge
cloudpickle               1.6.0                      py_0    conda-forge
cytoolz                   0.11.0           py38h5406a74_3    conda-forge
dask                      2021.2.0           pyhd8ed1ab_0    conda-forge
dask-core                 2021.2.0           pyhd8ed1ab_0    conda-forge
distributed               2021.2.0         py38h50d1736_0    conda-forge
freetype                  2.10.4               h4cff582_1    conda-forge
fsspec                    0.8.5              pyhd8ed1ab_0    conda-forge
heapdict                  1.0.1                      py_0    conda-forge
jinja2                    2.11.3             pyh44b312d_0    conda-forge
jpeg                      9d                   hbcb3906_0    conda-forge
lcms2                     2.12                 h577c468_0    conda-forge
libblas                   3.9.0                8_openblas    conda-forge
libcblas                  3.9.0                8_openblas    conda-forge
libcxx                    11.0.1               habf9029_0    conda-forge
libffi                    3.3                  h046ec9c_2    conda-forge
libgfortran               5.0.0           9_3_0_h6c81a4c_18    conda-forge
libgfortran5              9.3.0               h6c81a4c_18    conda-forge
liblapack                 3.9.0                8_openblas    conda-forge
libopenblas               0.3.12          openmp_h54245bb_1    conda-forge
libpng                    1.6.37               h7cec526_2    conda-forge
libtiff                   4.2.0                h355d032_0    conda-forge
libwebp-base              1.2.0                hbcf498f_0    conda-forge
llvm-openmp               11.0.1               h7c73e74_0    conda-forge
locket                    0.2.0                      py_2    conda-forge
lz4-c                     1.9.3                h046ec9c_0    conda-forge
markupsafe                1.1.1            py38h5406a74_3    conda-forge
msgpack-python            1.0.2            py38hd9c93a9_1    conda-forge
ncurses                   6.2                  h2e338ed_4    conda-forge
numpy                     1.20.1           py38h64deac9_0    conda-forge
olefile                   0.46               pyh9f0ad1d_1    conda-forge
openssl                   1.1.1i               h35c211d_0    conda-forge
packaging                 20.9               pyh44b312d_0    conda-forge
pandas                    1.2.2            py38hb77cc89_0    conda-forge
partd                     1.1.0                      py_0    conda-forge
pillow                    8.1.0            py38h4c06724_2    conda-forge
pip                       21.0.1             pyhd8ed1ab_0    conda-forge
psutil                    5.8.0            py38h5406a74_1    conda-forge
pyparsing                 2.4.7              pyh9f0ad1d_0    conda-forge
python                    3.8.6           h624753d_5_cpython    conda-forge
python-dateutil           2.8.1                      py_0    conda-forge
python_abi                3.8                      1_cp38    conda-forge
pytz                      2021.1             pyhd8ed1ab_0    conda-forge
pyyaml                    5.4.1            py38h5406a74_0    conda-forge
readline                  8.0                  h0678c8f_2    conda-forge
setuptools                49.6.0           py38h50d1736_3    conda-forge
six                       1.15.0             pyh9f0ad1d_0    conda-forge
sortedcontainers          2.3.0              pyhd8ed1ab_0    conda-forge
sqlite                    3.34.0               h17101e1_0    conda-forge
tblib                     1.6.0                      py_0    conda-forge
tk                        8.6.10               h0419947_1    conda-forge
toolz                     0.11.1                     py_0    conda-forge
tornado                   6.1              py38h5406a74_1    conda-forge
typing_extensions         3.7.4.3                    py_0    conda-forge
wheel                     0.36.2             pyhd3deb0d_0    conda-forge
xz                        5.2.5                haf1e3a3_1    conda-forge
yaml                      0.2.5                haf1e3a3_0    conda-forge
zict                      2.0.0                      py_0    conda-forge
zlib                      1.2.11            h7795811_1010    conda-forge
zstd                      1.4.8                hf387650_1    conda-forge

Details about conda and system ( conda info ):
$ conda info
     active environment : dask
    active env location : /usr/local/Caskroom/miniconda/base/envs/dask
            shell level : 2
       user config file : /Users/ec2-user/.condarc
 populated config files : 
          conda version : 4.9.2
    conda-build version : not installed
         python version : 3.8.5.final.0
       virtual packages : __osx=10.15.7=0
                          __unix=0=0
                          __archspec=1=x86_64
       base environment : /usr/local/Caskroom/miniconda/base  (writable)
           channel URLs : https://repo.anaconda.com/pkgs/main/osx-64
                          https://repo.anaconda.com/pkgs/main/noarch
                          https://repo.anaconda.com/pkgs/r/osx-64
                          https://repo.anaconda.com/pkgs/r/noarch
          package cache : /usr/local/Caskroom/miniconda/base/pkgs
                          /Users/ec2-user/.conda/pkgs
       envs directories : /usr/local/Caskroom/miniconda/base/envs
                          /Users/ec2-user/.conda/envs
               platform : osx-64
             user-agent : conda/4.9.2 requests/2.24.0 CPython/3.8.5 Darwin/19.6.0 OSX/10.15.7
                UID:GID : 501:20
             netrc file : None
           offline mode : False
@rgommers
Contributor

This is a little weird. A simple np.linalg.norm on an array of all 1's should not contain any invalid values. Floating-point warnings can be platform-specific, so the fact that it only shows up on conda-forge dual-arch builds doesn't mean it's a packaging issue.

@crusaderky
Author

@rgommers If you see the code and output above, np.linalg.norm does not raise the warning. I suspect this has to do with a difference in low-level implementation.

@rgommers
Contributor

That np.linalg.norm function call does raise the warning - where else would it come from? The RuntimeWarning: invalid value encountered in reciprocal warning is coming from numpy. It indicates the np.reciprocal ufunc gets invoked (probably because ord=-1 results in a 1/x type call that gets optimized via reciprocal), and the "invalid value" means a nan or inf gets produced somewhere.
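One way to check whether the "invalid value" corresponds to real bad data is to promote the floating-point flag to an exception and then inspect the result. The following is only a sketch: it calls np.reciprocal directly on an all-ones array (the array size 120 is an arbitrary choice), and uses np.errstate to turn the warning into a FloatingPointError for this block only.

```python
import numpy as np

x = np.ones(120)

# Promote the "invalid" floating-point flag from a warning to an exception,
# scoped to this block only.
with np.errstate(invalid="raise"):
    try:
        y = np.reciprocal(x)
    except FloatingPointError as exc:
        # The flag was set; recompute with the check disabled so the
        # actual values can be inspected.
        print("invalid flag raised:", exc)
        with np.errstate(invalid="ignore"):
            y = np.reciprocal(x)

# If every element is finite, the flag did not come from a NaN or inf
# in the output.
print("all finite:", bool(np.isfinite(y).all()))
```

If the flag is raised while every output element is finite, the warning is spurious rather than a sign of bad data.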

@crusaderky
Author

@rgommers I created a minimal POC that reproduces the issue without dask.
The code below triggers the warning about 80% of the time for me:

>>> import numpy
>>> _ = numpy.ones(120) ** -1
<stdin>:1: RuntimeWarning: invalid value encountered in reciprocal

With this one, too, the issue disappears when I switch from numpy 1.20 (conda-forge) to either numpy 1.20.1 (pip wheel) or numpy 1.19 (conda-forge).

The issue gets a lot less frequent with an array size of 80, completely disappears at 65, and disappears again at 150.
This is on an AWS mac1.metal instance; if you have a macbook with a faster/slower CPU I wouldn't be surprised if you found the critical spot to be higher or lower than this.
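A rough way to measure the size dependence is to repeat the operation for several array sizes and count how often a RuntimeWarning is recorded. This is a sketch only: the helper name count_reciprocal_warnings and the trial count are made up for illustration, and it runs all trials in one interpreter session, whereas the one-liner above was run from a fresh prompt, so the frequencies may differ.

```python
import warnings

import numpy as np


def count_reciprocal_warnings(size, trials=20):
    """Return how many of `trials` runs of ones(size) ** -1 emit a RuntimeWarning."""
    hits = 0
    for _ in range(trials):
        with warnings.catch_warnings(record=True) as caught:
            # "always" disables the once-per-location deduplication,
            # so every occurrence is recorded.
            warnings.simplefilter("always")
            _ = np.ones(size) ** -1
        if any(issubclass(w.category, RuntimeWarning) for w in caught):
            hits += 1
    return hits


# Sizes taken from the observations above.
for size in (65, 80, 120, 150):
    print(size, count_reciprocal_warnings(size))
```

On an unaffected build every count should be zero.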

@rgommers
Contributor

This looks like a numpy issue. I had expected this to be the M1 (ARM64) build, because that's what's different for the conda-forge build of numpy 1.20. So what's a little surprising is that your conda info output says __archspec=1=x86_64.

@crusaderky
Author

@rgommers the pypi wheel is unaffected though. Exclusively the conda-forge build exhibits the issue.

@jrbourbeau
Member

Thanks for posting a simple reproducer @crusaderky! FWIW I was able to reproduce the same RuntimeWarning on a Mac with a fresh conda env created with conda create -n test -c conda-forge python=3.8 numpy=1.20.1 (I've included the output of conda list below)

(test) ➜  ~ python -c "import numpy as np; np.ones(120) ** -1"
<string>:1: RuntimeWarning: invalid value encountered in reciprocal

I also see the same RuntimeWarning if I switch to using the defaults channel instead of conda-forge (i.e. conda create -n test -c defaults python=3.8 numpy=1.20.1). Similar to @crusaderky, the RuntimeWarning no longer appears if I instead use NumPy 1.20.1 from PyPI.

Unfortunately I don't have a sense for what a good next step would be to determine where the RuntimeWarning is originating. But thought it was worth mentioning the RuntimeWarning seems to be reproducible.

conda list:
# Name                    Version                   Build  Channel
ca-certificates           2020.12.5            h033912b_0    conda-forge
certifi                   2020.12.5        py38h50d1736_1    conda-forge
libblas                   3.9.0                8_openblas    conda-forge
libcblas                  3.9.0                8_openblas    conda-forge
libcxx                    11.0.1               habf9029_0    conda-forge
libffi                    3.3                  h046ec9c_2    conda-forge
libgfortran               5.0.0           9_3_0_h6c81a4c_18    conda-forge
libgfortran5              9.3.0               h6c81a4c_18    conda-forge
liblapack                 3.9.0                8_openblas    conda-forge
libopenblas               0.3.12          openmp_h54245bb_1    conda-forge
llvm-openmp               11.0.1               h7c73e74_0    conda-forge
ncurses                   6.2                  h2e338ed_4    conda-forge
numpy                     1.20.1           py38h64deac9_0    conda-forge
openssl                   1.1.1j               hbcf498f_0    conda-forge
pip                       21.0.1             pyhd8ed1ab_0    conda-forge
python                    3.8.6           h624753d_5_cpython    conda-forge
python_abi                3.8                      1_cp38    conda-forge
readline                  8.0                  h0678c8f_2    conda-forge
setuptools                49.6.0           py38h50d1736_3    conda-forge
sqlite                    3.34.0               h17101e1_0    conda-forge
tk                        8.6.10               h0419947_1    conda-forge
wheel                     0.36.2             pyhd3deb0d_0    conda-forge
xz                        5.2.5                haf1e3a3_1    conda-forge
zlib                      1.2.11            h7795811_1010    conda-forge

Also, hi @rgommers 👋 good to see you here : )

@rgommers
Contributor

> Also, hi @rgommers 👋 good to see you here : )

Hey @jrbourbeau, long time no see 👋

> I also see the same RuntimeWarning if I switch to using the defaults channel instead of conda-forge (i.e. conda create -n test -c defaults python=3.8 numpy=1.20.1).

So not a conda-forge issue (which was unlikely anyway). My guess is it is related to the introduction of SIMD instructions in reciprocal. The upstream bug is numpy/numpy#18555, which has a solution suggested already.

We can leave this open till it's resolved in a new NumPy release, but there's nothing to do in this repo.
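Until a fixed release is available, one possible stopgap (not something proposed in this thread) is to filter out just this message with the standard warnings module. The helper name below is hypothetical. Note that this also hides genuine "invalid value encountered in reciprocal" warnings, so it is best kept narrowly scoped.

```python
import warnings


def silence_spurious_reciprocal_warning():
    """Ignore only the RuntimeWarning whose message starts with the text below."""
    warnings.filterwarnings(
        "ignore",
        message="invalid value encountered in reciprocal",
        category=RuntimeWarning,
    )


# Scope the filter to a block so other code keeps the default behaviour.
with warnings.catch_warnings():
    silence_spurious_reciprocal_warning()
    # ... run the affected NumPy code here ...
```

Other RuntimeWarnings, such as invalid values in a division, are still reported.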

@jrbourbeau
Member

Great, thanks for linking to the upstream issue!

@ogrisel

ogrisel commented Apr 27, 2021

> @rgommers the pypi wheel is unaffected though. Exclusively the conda-forge build exhibits the issue.

There is no wheel for macOS ARM64 yet on PyPI. But the experimental nightly build ARM64 wheel has the same issue (as expected): numpy/numpy#18143 (comment)

@isuruf
Member

isuruf commented Apr 27, 2021

@ogrisel, this is for macOS x86_64.

@h-vetinari
Member

> So not a conda-forge issue (which was unlikely anyway). My guess is it is related to the introduction of SIMD instructions in reciprocal. The upstream bug is numpy/numpy#18555, which has a solution suggested already.

The upstream issue was fixed recently in numpy/numpy#19926. Once we are on 1.22, this should therefore go away.
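For code that needs to know whether it is running on a possibly affected release, a simple version check can be sketched as below. The function name is hypothetical, and treating every release from 1.20 up to (but excluding) 1.22 as potentially affected is an assumption: the thread only confirms 1.20.0 and 1.20.1 on macOS x86_64 as affected and 1.19.5 as unaffected.

```python
def numpy_may_emit_spurious_warning(version):
    """Return True if `version` (e.g. "1.20.1") is in the assumed affected range.

    Assumption: releases >= 1.20 and < 1.22 are treated as possibly affected.
    """
    # Only major and minor are needed, so suffixes like "rc1" on the
    # patch component do not matter.
    major, minor = (int(part) for part in version.split(".")[:2])
    return (1, 20) <= (major, minor) < (1, 22)


print(numpy_may_emit_spurious_warning("1.20.1"))
print(numpy_may_emit_spurious_warning("1.22.0"))
```

In practice the argument would be numpy.__version__; the check takes a plain string so it can be used without importing NumPy.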

@h-vetinari
Member

Closing this as fixed by 1.22 / #251
