Ecosystem compatibility with numpy 2.0 #26191

Open
rgommers opened this issue Apr 1, 2024 · 57 comments

@rgommers
Member

rgommers commented Apr 1, 2024

This list tracks the compatibility status of packages that depend on or support NumPy. If "compatible release on PyPI" does not say "yes" but a version number is listed: this is based on plans announced in a tracking issue or other communication by the authors of the package.

Maintainers: please feel free to edit directly (please refresh the page first to avoid overwriting edits from others!). Others who want to update things: please do comment, or feel free to ping me elsewhere.

| Package name | Compatible release on PyPI? | Min compatible version | Notes |
|---|---|---|---|
| Cython | yes | 3.0.4 | Version is an estimate, it's worked fine for quite a while |
| Pybind11 | yes | 2.12.0 | pybind/pybind11#5009 |
| Pythran | | 0.16.0 | (0.15.0 works mostly, SciPy builds with it) serge-sans-paille/pythran#2189 |
| SciPy | yes | 1.13.0 | scipy/scipy#20375 |
| Pandas | yes | 2.2.2 | pandas-dev/pandas#55519 |
| PyArrow | yes | 16.0 | apache/arrow#39532 |
| Matplotlib | yes | 3.8.4 | matplotlib/matplotlib#26778 |
| scikit-learn | yes | 1.4.2 | scikit-learn/scikit-learn#27075 |
| scikit-image | yes | 0.23.1 | scikit-image/scikit-image#7282 |
| statsmodels | yes | 0.14.2 | statsmodels/statsmodels#9194 |
| PyTorch | yes | 2.3.0 | pytorch/pytorch#107302 |
| JAX | yes | 0.4.26 | google/jax#19246 |
| TensorFlow | | | (maybe in May) has <2 upper bound for 2.16.1, requirements, lock file |
| PyWavelets | yes | 1.6.0 | PyWavelets/pywt#731 |
| AstroPy | yes | 6.1.0 | astropy/astropy#16200 |
| Dask | | | dask/dask#11066 |
| CuPy | | 14.0.0 | cupy/cupy#8306 |
| NetworkX | yes | 3.3 | networkx/networkx#7390 |
| Keras | | | |
| Numba | | 0.60 | numba/numba#9544, Discourse post with context |
| PyData Sparse | | | Should be compatible, waiting for Numba 0.60 |
| Xarray | | | pydata/xarray#8844 |
| AwkwardArray | yes | 2.6.3 | scikit-hep/awkward#3064 |
| Seaborn | yes | 0.13.2 | mwaskom/seaborn#3683 |
| SymPy | | 1.12.1 | |
| PyMC | | | depends on PyTensor |
| h5py | yes | 3.11.0 | h5py/h5py#2353 |
| PyTables | | | PyTables/PyTables#1083 |
| BioPython | | | biopython/biopython#4676 |
| scikit-bio | | 0.6.1 | scikit-bio/scikit-bio#1964 |
| Shapely | yes | 2.0.4 | shapely/shapely#1972 |
| GeoPandas | yes | 0.14.4 | geopandas/geopandas#3258 |
| Rasterio | yes | 1.3.10 | rasterio/rasterio#3024 |
| ml_dtypes | yes | 0.4.0 | jax-ml/ml_dtypes#143 |
| unyt | yes | 3.0.2 | yt-project/unyt#493 |
| yt | yes | 4.3.1 | yt-project/yt#4859 |
| numexpr | yes | 2.10.0 | pydata/numexpr#478 |
| Cartopy | yes | 0.23 | SciTools/cartopy#2339 |
| ContourPy | yes | 1.2.1 | contourpy/contourpy#371 |
| OpenCV | | | opencv/opencv-python#943 |
| MDAnalysis | | | MDAnalysis/mdanalysis#4482 |
| netCDF4 | | | Unidata/netcdf4-python#1317 |
| threadpoolctl | yes | 3.5.0 | joblib/threadpoolctl#175 |
| PyTensor | | | pymc-devs/pytensor#689 |
| Bokeh | yes | 3.4.1 | bokeh/bokeh#13835 |
| Imageio | | | imageio/imageio#1077 |
| imagecodecs | | | cgohlke/imagecodecs#100 |
| numcodecs | yes | 0.12.1 | Likely older versions too; has been stable for a while. zarr-developers/numcodecs#521 |
| tifffile | yes | 2024.4.24 | cgohlke/tifffile#252 |
| treelite | | | dmlc/treelite#560 |
| Zarr | yes | 2.18.0 | zarr-developers/zarr-python#1818 |
| XGBoost | | 2.1.0 | dmlc/xgboost#10221 |
| GDAL | | 3.9.0 | OSGeo/gdal#9751 |
| zfpy | | | LLNL/zfp#210 |
| hypothesis | yes | 6.100.2 | HypothesisWorks/hypothesis#3950 |
| Boost.Python | | 1.85.0 | boostorg/python#431 |

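
For readers who want to compare their own environment against the table, here is a minimal sketch in Python. It copies a few "yes" rows by hand into a dictionary and checks installed versions with the standard library's `importlib.metadata`. The names `MIN_COMPATIBLE`, `version_key`, and `check_environment` are made up for this illustration, and the version comparison is a deliberate simplification that ignores pre-release tags such as `rc1`.

```python
from importlib import metadata

# Minimum NumPy-2-compatible versions, copied by hand from a few "yes" rows
# of the table above. Extend as needed; this is not an official data source.
MIN_COMPATIBLE = {
    "scipy": "1.13.0",
    "pandas": "2.2.2",
    "matplotlib": "3.8.4",
    "scikit-learn": "1.4.2",
}


def version_key(version):
    """Turn '2.2.2' into (2, 2, 2) for comparison.

    Simplification: parsing stops at the first non-numeric component, so
    pre-release tags such as 'rc1' are ignored, and trailing zeros are
    dropped so that '3.3' and '3.3.0' compare equal.
    """
    key = []
    for part in version.split("."):
        if not part.isdigit():
            break
        key.append(int(part))
    while key and key[-1] == 0:
        key.pop()
    return tuple(key)


def check_environment(minimums=MIN_COMPATIBLE):
    """Return {package: status} for every package listed in `minimums`."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
            continue
        if version_key(installed) >= version_key(minimum):
            report[name] = "ok"
        else:
            report[name] = f"too old ({installed} < {minimum})"
    return report


if __name__ == "__main__":
    for name, status in check_environment().items():
        print(f"{name}: {status}")
```

A real tool would more likely use a full PEP 440 version parser; the tuple comparison here only keeps the sketch free of third-party dependencies.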
@jakevdp
Contributor

jakevdp commented Apr 1, 2024

Thanks - for JAX we are planning an 0.4.26 release in the next day or two which will be built against NumPy 2.0.0rc1.

@jakevdp
Contributor

jakevdp commented Apr 1, 2024

Also, in case you want to add it, we just released ml_dtypes v0.4.0 (https://pypi.org/project/ml-dtypes/) which is compatible with NumPy 2.0.

@jakirkham
Contributor

Thanks for putting this together Ralf! 🙏

This is incredibly helpful 🙂

@rgommers
Member Author

rgommers commented Apr 1, 2024

Thanks @jakevdp, I added the info for both JAX and ml_dtypes.

@neutrinoceros
Contributor

unyt 3.0.2 was also released a couple days ago for compat with numpy 2, in case you'd like to include it !

@neutrinoceros
Contributor

Also, here's where to track progress for cartopy : SciTools/cartopy#2339

@rgommers
Member Author

rgommers commented Apr 2, 2024

Thanks @neutrinoceros, much appreciated. Added both packages.

@ianthomas23
Contributor

ContourPy 1.2.1 has just been released on PyPI (https://pypi.org/project/contourpy/1.2.1/) with NumPy 2 compatibility, most relevant PR is contourpy/contourpy#371. It is a compulsory dependency of Matplotlib.

@jorisvandenbossche
Contributor

For PyArrow, the upcoming 16.0 release will be the first numpy-2.0-compatible release, but it is only expected in around 3 weeks at the earliest (though normally certainly before the end of the month).
For people who need PyArrow installed in an environment with numpy 2.0, it's worth noting that nightly wheels are available that already work with 2.0.

@dkbarn

dkbarn commented Apr 3, 2024

Could we add OpenCV to this list? I filed a ticket here to track progress on a numpy 2.0 build for it.

@rgommers
Member Author

rgommers commented Apr 3, 2024

Thanks @ianthomas23, @jorisvandenbossche, @dkbarn - all info added to the table.

@hawkinsp
Contributor

hawkinsp commented Apr 3, 2024

JAX released 0.4.26 on pypi, which is compatible with NumPy 2.0.

(NumPy folks: congratulations on your imminent v2 release!)

@ksunden
Contributor

ksunden commented Apr 4, 2024

mpl 3.8.4 is out, built with np 2

@neutrinoceros
Contributor

neutrinoceros commented Apr 6, 2024

yt 4.3.1 is on PyPI and built with numpy 2.0.0rc1 !

EDIT(seberg): Added to table.

@neutrinoceros
Contributor

@ianthomas23
Contributor

@jakirkham wrote:

> @rgommers could you please add...

This is not necessary, it is already in the list. There was no need to create an issue in ContourPy for this, I did all that was necessary 3 weeks ago.

Perhaps the list should be sorted in alphabetical order?

@rgommers
Member Author

Thanks everyone for the input, and @jakirkham for following up with a large set of projects.

> Perhaps the list should be sorted in alphabetical order?

I may reorganize it if it gets more unwieldy. Not purely alphabetically though; the top of the table is on purpose for packages that are lowest in the stack, because they were blocking pretty much everything else. So I'll probably want to keep such a set as "must haves, blocking for a final 2.0 release" and then the rest alphabetical.

@ianthomas23
Contributor

> Perhaps the list should be sorted in alphabetical order?
>
> I may reorganize it if it gets more unwieldy. Not purely alphabetically though; the top of the table is on purpose for packages that are lowest in the stack, because they were blocking pretty much everything else. So I'll probably want to keep such a set as "must haves, blocking for a final 2.0 release" and then the rest alphabetical.

Understood. If I go into full anal retentive mode (as I reserve the right to do occasionally) then shouldn't ContourPy, as a compulsory dependency of Matplotlib, go above it in the list?

It would be really interesting to see the list plotted as a dependency graph, perhaps with individual project bus factor although I am not sure how to best determine that empirically.

@rgommers
Member Author

> It would be really interesting to see the list plotted as a dependency graph, perhaps with individual project bus factor

Any work on this would be great to see indeed. Using tools like https://deps.dev/ or https://libraries.io/ for this to obtain reverse dependency data and then finding ways to analyze/visualize would be interesting.

> shouldn't ContourPy, as a compulsory dependency of Matplotlib, go above it in the list?

Probably not I'd say, since few projects have a direct dependency on it. "Matplotlib works" is pretty much what folks need to know. And that can itself have multiple dependencies. Here is a list: https://deps.dev/pypi/matplotlib/3.8.4/dependencies. I'd prefer to stay mostly out of the business of trying to figure out which of those 10 dependencies are actually blocking/critical.

Of course, the list probably reflects some of my knowledge/biases, it can never be objectively correct or complete. But that's not a real goal here - this is a tool for figuring out where we are in the migration and when we should be able to release 2.0 without creating major issues (or more major issues than necessary) for our users.
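
To make the "lowest in the stack first" ordering and the dependency-graph idea above concrete, here is a small sketch using Python's standard `graphlib` module. The graph is a hand-written, illustrative subset (ContourPy under Matplotlib and PyMC on top of PyTensor are taken from this thread; the other edges are assumptions added for the example), not data pulled from deps.dev or libraries.io. The helper names `bottom_up_order` and `dependents` are hypothetical.

```python
from graphlib import TopologicalSorter

# package -> set of packages it depends on.
# Illustrative subset only; a real analysis would load reverse-dependency
# data from a service such as deps.dev or libraries.io.
DEPENDS_ON = {
    "numpy": set(),
    "contourpy": {"numpy"},
    "matplotlib": {"numpy", "contourpy"},
    "pandas": {"numpy"},
    "seaborn": {"numpy", "pandas", "matplotlib"},
    "pytensor": {"numpy"},
    "pymc": {"numpy", "pytensor"},
}


def bottom_up_order(depends_on):
    """List packages so that every package appears after its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


def dependents(depends_on, package):
    """Return all packages that directly or transitively depend on `package`."""
    found = set()
    stack = [package]
    while stack:
        current = stack.pop()
        for name, deps in depends_on.items():
            if current in deps and name not in found:
                found.add(name)
                stack.append(name)
    return found


if __name__ == "__main__":
    print("bottom-up order:", bottom_up_order(DEPENDS_ON))
    # A crude "how much does this block" measure: number of dependents.
    for name in DEPENDS_ON:
        print(f"{name}: blocks {len(dependents(DEPENDS_ON, name))} package(s)")
```

Counting dependents is only a rough proxy for how critical a package is; it says nothing about bus factor, which would need contributor data.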

@jakevdp
Contributor

jakevdp commented Apr 25, 2024

> @jakevdp do you know what TensorFlow's NumPy 2 plans are? Or who we should ask?

I'm not sure what the TensorFlow release plans are. It looks like tensorflow currently pins numpy<2.0, so it will at least play nicely with numpy 2.0 post-release. @MichaelHudgins may be able to say more.

@MichaelHudgins

> @jakevdp do you know what TensorFlow's NumPy 2 plans are? Or who we should ask?
>
> I'm not sure what the TensorFlow release plans are. It looks like tensorflow currently pins numpy<2.0, so it will at least play nicely with numpy 2.0 post-release. @MichaelHudgins may be able to say more.

Thanks for the ping Jake.

TensorFlow is targeting a branch cut at the end of May. We are unsure whether numpy 2.0 support will make it, but we are going to try, and I should have better insight next week into whether the update will make the next release. As Jake said, TensorFlow has a <2.0 pin in both 2.16.1 and 2.15.1, which are our two currently supported versions, so I don't think we should immediately break if TensorFlow does not get the update done before a 2.0 release.

@jakirkham
Contributor

Great thanks Michael and Jake! 🙏

Michael, if you or someone from the TensorFlow team, could please open a TensorFlow issue when you have more details and link that in this thread, that would be very helpful 🙂

@rootsmusic

What about dpnp?

@jakirkham
Contributor

> What about dpnp?

@rootsmusic I would suggest checking their repo and, if it is unclear, raising an issue there asking about their plans.

@bryevdv

bryevdv commented Apr 28, 2024

@jakirkham AFAICT, the existing Bokeh 3.4.1 already works with NumPy 2.0. Only one unit test fails for me locally, and only because of a strict check on the text of an expected exception, which no longer shows bool_ as it did before. Manual spot-testing seems to work fine as well. Unfortunately, I have not been able to create any CI environments to do a full check: while a PyPI-installed pandas 2.2.2 works fine, the CI envs install pandas 2.2.2 from conda-forge, and that does not work (a situation I can reproduce locally, see details). If you have any advice about the conda-forge pandas, please share.

~/work/bokeh/conda bv/numpy-2.0
dev312 ❯ ipython
Python 3.12.0 | packaged by conda-forge | (main, Oct  3 2023, 08:36:57) [Clang 15.0.7 ]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.18.1 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import pandas as pd

In [2]:
Do you really want to exit ([y]/n)?

~/work/bokeh/conda bv/numpy-2.0 11s
dev312 ❯ pip uninstall pandas
Found existing installation: pandas 2.2.2
Uninstalling pandas-2.2.2:
  Would remove:
    /Users/bryan/anaconda3/envs/dev312/lib/python3.12/site-packages/pandas-2.2.2.dist-info/*
    /Users/bryan/anaconda3/envs/dev312/lib/python3.12/site-packages/pandas/*
Proceed (Y/n)?
  Successfully uninstalled pandas-2.2.2

~/work/bokeh/conda bv/numpy-2.0 11s
dev312 ❯ conda list pandas
# packages in environment at /Users/bryan/anaconda3/envs/dev312:
#
# Name                    Version                   Build  Channel
pandas-datareader         0.10.0             pyh6c4a22f_0    conda-forge
pandas-stubs              2.1.4.231227             pypi_0    pypi

~/work/bokeh/conda bv/numpy-2.0
dev312 ❯ conda install -c conda-forge  "pandas==2.2.2"
Channels:
 - conda-forge
 - defaults
Platform: osx-arm64
Collecting package metadata (repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /Users/bryan/anaconda3/envs/dev312

  added / updated specs:
    - pandas==2.2.2


The following NEW packages will be INSTALLED:

  pandas             conda-forge/osx-arm64::pandas-2.2.2-py312h88edd18_0
  python-dateutil    conda-forge/noarch::python-dateutil-2.9.0-pyhd8ed1ab_0
  python-tzdata      conda-forge/noarch::python-tzdata-2024.1-pyhd8ed1ab_0
  pytz               conda-forge/noarch::pytz-2024.1-pyhd8ed1ab_0


Proceed ([y]/n)?


Downloading and Extracting Packages:

Preparing transaction: done
Verifying transaction: done
Executing transaction: done

~/work/bokeh/conda bv/numpy-2.0 16s
dev312 ❯ ipython
Python 3.12.0 | packaged by conda-forge | (main, Oct  3 2023, 08:36:57) [Clang 15.0.7 ]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.18.1 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import pandas as pd

A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.0.0rc1 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.

If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.

Traceback (most recent call last):  File "/Users/bryan/anaconda3/envs/dev312/bin/ipython", line 10, in <module>
    sys.exit(start_ipython())
  File "/Users/bryan/anaconda3/envs/dev312/lib/python3.12/site-packages/IPython/__init__.py", line 129, in start_ipython
    return launch_new_instance(argv=argv, **kwargs)
  File "/Users/bryan/anaconda3/envs/dev312/lib/python3.12/site-packages/traitlets/config/application.py", line 1077, in launch_instance
    app.start()
  File "/Users/bryan/anaconda3/envs/dev312/lib/python3.12/site-packages/IPython/terminal/ipapp.py", line 317, in start
    self.shell.mainloop()
  File "/Users/bryan/anaconda3/envs/dev312/lib/python3.12/site-packages/IPython/terminal/interactiveshell.py", line 887, in mainloop
    self.interact()
  File "/Users/bryan/anaconda3/envs/dev312/lib/python3.12/site-packages/IPython/terminal/interactiveshell.py", line 880, in interact
    self.run_cell(code, store_history=True)
  File "/Users/bryan/anaconda3/envs/dev312/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3048, in run_cell
    result = self._run_cell(
  File "/Users/bryan/anaconda3/envs/dev312/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3103, in _run_cell
    result = runner(coro)
  File "/Users/bryan/anaconda3/envs/dev312/lib/python3.12/site-packages/IPython/core/async_helpers.py", line 129, in _pseudo_sync_runner
    coro.send(None)
  File "/Users/bryan/anaconda3/envs/dev312/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3308, in run_cell_async
    has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
  File "/Users/bryan/anaconda3/envs/dev312/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3490, in run_ast_nodes
    if await self.run_code(code, result, async_=asy):
  File "/Users/bryan/anaconda3/envs/dev312/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3550, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-1-7dd3504c366f>", line 1, in <module>
    import pandas as pd
  File "/Users/bryan/anaconda3/envs/dev312/lib/python3.12/site-packages/pandas/__init__.py", line 49, in <module>
    from pandas.core.api import (
  File "/Users/bryan/anaconda3/envs/dev312/lib/python3.12/site-packages/pandas/core/api.py", line 1, in <module>
    from pandas._libs import (
  File "/Users/bryan/anaconda3/envs/dev312/lib/python3.12/site-packages/pandas/_libs/__init__.py", line 17, in <module>
    import pandas._libs.pandas_datetime  # noqa: F401 # isort: skip # type: ignore[reportUnusedImport]
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
File ~/anaconda3/envs/dev312/lib/python3.12/site-packages/numpy/core/_multiarray_umath.py:44, in __getattr__(attr_name)
     39     # Also print the message (with traceback).  This is because old versions
     40     # of NumPy unfortunately set up the import to replace (and hide) the
     41     # error.  The traceback shouldn't be needed, but e.g. pytest plugins
     42     # seem to swallow it and we should be failing anyway...
     43     sys.stderr.write(msg + tb_msg)
---> 44     raise ImportError(msg)
     46 ret = getattr(_multiarray_umath, attr_name, None)
     47 if ret is None:

ImportError:
A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.0.0rc1 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.

If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.


---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
Cell In[1], line 1
----> 1 import pandas as pd

File ~/anaconda3/envs/dev312/lib/python3.12/site-packages/pandas/__init__.py:49
     46 # let init-time option registration happen
     47 import pandas.core.config_init  # pyright: ignore[reportUnusedImport] # noqa: F401
---> 49 from pandas.core.api import (
     50     # dtype
     51     ArrowDtype,
     52     Int8Dtype,
     53     Int16Dtype,
     54     Int32Dtype,
     55     Int64Dtype,
     56     UInt8Dtype,
     57     UInt16Dtype,
     58     UInt32Dtype,
     59     UInt64Dtype,
     60     Float32Dtype,
     61     Float64Dtype,
     62     CategoricalDtype,
     63     PeriodDtype,
     64     IntervalDtype,
     65     DatetimeTZDtype,
     66     StringDtype,
     67     BooleanDtype,
     68     # missing
     69     NA,
     70     isna,
     71     isnull,
     72     notna,
     73     notnull,
     74     # indexes
     75     Index,
     76     CategoricalIndex,
     77     RangeIndex,
     78     MultiIndex,
     79     IntervalIndex,
     80     TimedeltaIndex,
     81     DatetimeIndex,
     82     PeriodIndex,
     83     IndexSlice,
     84     # tseries
     85     NaT,
     86     Period,
     87     period_range,
     88     Timedelta,
     89     timedelta_range,
     90     Timestamp,
     91     date_range,
     92     bdate_range,
     93     Interval,
     94     interval_range,
     95     DateOffset,
     96     # conversion
     97     to_numeric,
     98     to_datetime,
     99     to_timedelta,
    100     # misc
    101     Flags,
    102     Grouper,
    103     factorize,
    104     unique,
    105     value_counts,
    106     NamedAgg,
    107     array,
    108     Categorical,
    109     set_eng_float_format,
    110     Series,
    111     DataFrame,
    112 )
    114 from pandas.core.dtypes.dtypes import SparseDtype
    116 from pandas.tseries.api import infer_freq

File ~/anaconda3/envs/dev312/lib/python3.12/site-packages/pandas/core/api.py:1
----> 1 from pandas._libs import (
      2     NaT,
      3     Period,
      4     Timedelta,
      5     Timestamp,
      6 )
      7 from pandas._libs.missing import NA
      9 from pandas.core.dtypes.dtypes import (
     10     ArrowDtype,
     11     CategoricalDtype,
   (...)
     14     PeriodDtype,
     15 )

File ~/anaconda3/envs/dev312/lib/python3.12/site-packages/pandas/_libs/__init__.py:17
     13 # Below imports needs to happen first to ensure pandas top level
     14 # module gets monkeypatched with the pandas_datetime_CAPI
     15 # see pandas_datetime_exec in pd_datetime.c
     16 import pandas._libs.pandas_parser  # isort: skip # type: ignore[reportUnusedImport]
---> 17 import pandas._libs.pandas_datetime  # noqa: F401 # isort: skip # type: ignore[reportUnusedImport]
     18 from pandas._libs.interval import Interval
     19 from pandas._libs.tslibs import (
     20     NaT,
     21     NaTType,
   (...)
     26     iNaT,
     27 )

ImportError: numpy.core.multiarray failed to import

In [2]:
Do you really want to exit ([y]/n)?

~/work/bokeh/conda bv/numpy-2.0 4m 41s
dev312 ❯ conda list pandas
# packages in environment at /Users/bryan/anaconda3/envs/dev312:
#
# Name                    Version                   Build  Channel
pandas                    2.2.2           py312h88edd18_0    conda-forge
pandas-datareader         0.10.0             pyh6c4a22f_0    conda-forge
pandas-stubs              2.1.4.231227             pypi_0    pypi

@bashtage
Contributor

@bryevdv conda-forge hasn't fully worked out how to build with NumPy 2 yet. Conda packages built for NumPy 2-compatible projects are still compiled against the default NumPy version and so will show that warning when NumPy 2 is installed in the environment.
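
Since the failure in the transcript above surfaces as an `ImportError` at import time, a CI job could catch it early with a small smoke test before running the full suite. The following is only a sketch of that idea, not something Bokeh or conda-forge is known to use; `probe_import` is a hypothetical helper name.

```python
import importlib


def probe_import(module_name):
    """Try to import a module and report the outcome.

    Returns (True, version_string) on success, or (False, error_text) if the
    import raises ImportError, which is how a binary built against NumPy 1.x
    fails when NumPy 2 is installed (see the traceback above).
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        return False, f"{type(exc).__name__}: {exc}"
    return True, getattr(module, "__version__", "unknown version")


if __name__ == "__main__":
    # Import NumPy first, then the compiled packages that depend on it.
    for name in ("numpy", "pandas"):
        ok, detail = probe_import(name)
        print(f"{name}: {'ok' if ok else 'FAILED'} ({detail})")
```

Running this in a fresh environment shows which binary package, if any, was built against an incompatible NumPy, without needing to read a long traceback.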

@keewis
Contributor

keewis commented Apr 28, 2024

hypothesis is compatible with numpy 2 since 6.100.2 (the tracking issue was HypothesisWorks/hypothesis#3950, the fix was in HypothesisWorks/hypothesis#3955).

cc @Zac-HD

@h-vetinari
Contributor

> conda-forge hasn't fully worked out how to build with NumPy 2 yet.

Actually, we have it figured out. We just need to decide how to roll it out; there are a couple of options with different trade-offs, none of them perfect.

@jorisvandenbossche
Contributor

@rgommers you can update the GeoPandas row as being released (https://github.com/geopandas/geopandas/releases/tag/v0.14.4)

@neutrinoceros
Contributor

astropy 6.1.0 is out !

@h-vetinari
Contributor

boost/python sits pretty deep in the stack from the POV of the conda-forge dependency graph (though admittedly the graph conflates packages depending on numpy & libboost with those depending on numpy & libboost-python, because all boost components are built in the same feedstock). So from that POV, it becomes quite important to solve boostorg/python#431.

@jakirkham
Contributor

IIUC Boost.Python fixed this with PR: boostorg/python#432

Have we tried that patch?

@h-vetinari
Contributor

Yes, Matti pointed out that patch to me (it wasn't linked in the issue so I didn't see it), and it works fine when backported to 1.84 and even 1.82.

@jakirkham
Contributor

jakirkham commented May 8, 2024

@rgommers could we please update these entries in the table in the OP (replacing the old ones)?

| Package name | Compatible release on PyPI? | Min compatible version | Notes |
|---|---|---|---|
| numcodecs | yes | 0.12.1 | Likely older versions too; has been stable for a while. zarr-developers/numcodecs#521 |
| Zarr | yes | 2.18.0 | zarr-developers/zarr-python#1818 |

@rgommers
Member Author

rgommers commented May 8, 2024

Updated for all recent comments, thanks all!

@jakirkham
Contributor

jakirkham commented May 8, 2024

Thanks Ralf! 🙏

Looks like nearly all libraries have some kind of issue/PR reference (or already a working release)

One that appears to be missed is Keras, so have raised upstream issue: keras-team/keras#19691
