
Please provide universal2 wheels on macOS #18143

Closed
Tracked by #627
ronaldoussoren opened this issue Jan 9, 2021 · 61 comments
Labels
32 - Installation Problems installing or compiling NumPy

Comments

@ronaldoussoren

Feature

The binary wheels for macOS on PyPI are currently for x86_64. Please also provide a universal2 wheel (x86_64 and arm64) for use with the (currently experimental) universal2 Python 3.9.1 installer on Python.org.

@rgommers rgommers added the 32 - Installation Problems installing or compiling NumPy label Jan 9, 2021
@rgommers
Member

rgommers commented Jan 9, 2021

Thanks @ronaldoussoren, we hope to do so when it's feasible (in addition to thin arm64 wheels, which seem preferable long-term). One blocker is that we build all our wheels on CI services, and none of them support macOS ARM64 yet. And none of us have the hardware at the moment.

We also still have some issues, in particular this build issue: gh-17807. And this performance issue: gh-17989.

I'll link pypa/cibuildwheel#473 here as probably the most relevant issue for universal2 wheel building.

@ronaldoussoren
Author

I ran into gh-17807 myself, my crude workaround was to remove the code adding "-faltivec" from setup.py. That worked for me, but is not a proper solution. I'm a very light user of numpy through pandas at best, I'm basically only using pandas to create graphs from a couple of CSV files with some light calculations.

That said, I can't promise quick responses but feel free to let me know if there's something I can do or test.

BTW. What's needed to build NumPy in a way that matches the official wheels? Looking at the azure-pipelines.yml in the repo I'd say gfortran and OpenBLAS are the only non-system dependencies. Is that correct?

@mattip
Member

mattip commented Jan 10, 2021

Accelerate should be avoided for NumPy as it has bugs. NumPy doesn't use gfortran, SciPy does. But there is currently no arm64-darwin OpenBLAS build on https://anaconda.org/multibuild-wheels-staging/openblas-libs/files. I am not sure how we would build that without support for arm64-darwin from a CI system.

@mattip
Member

mattip commented Jan 10, 2021

I opened MacPython/openblas-libs#49 about building OpenBLAS for Apple M1 silicon.

@ronaldoussoren
Author

I am not sure how we would build that without support for arm64-darwin from a CI system.

For building you only need a CI system that provides macOS 11 hosts, or with some effort macOS 10.15 hosts with Xcode 12.2. For testing you actually need M1 systems in the CI system, which will likely be a blocker for you and could take some time (I have no idea how constrained the supply of M1 systems is, but I expect that a CI provider like Azure Pipelines will require a lot of M1 systems).

@ronaldoussoren
Author

Accelerate should be avoided for NumPy as it has bugs.

Have you (the numpy project) filed bugs about this in Apple's tracker? Or are these bugs in numpy's use of Accelerate?

@rgommers
Member

Have you (the numpy project) filed bugs about this in Apple's tracker? Or are these bugs in numpy's use of Accelerate?

Yes, multiple. Also other scientific projects - Apple basically doesn't care and hasn't been serious about maintaining Accelerate in a long time. Not enough value in it for them I guess.

SciPy has more linear algebra functionality than NumPy, and dropped Accelerate earlier than NumPy.

We have two good options for BLAS/LAPACK - OpenBLAS (our default for wheels) and Intel MKL. So we'd rather spend our effort on OpenBLAS rather than work around Apple's poor support.

@rgommers
Member

For testing you actually need M1 systems in the CI system, which will likely be a blocker for you and could take some time

Yes, I'd be very nervous about releasing something that we can't test. Our stance for other platforms (e.g. Linux on ARM64) is that CI services need to be available; in the absence of that we merge fixes but don't release wheels.

That said, I can't promise quick responses but feel free to let me know if there's something I can do or test.

Thanks!

BTW. What's needed to build NumPy in a way that matches the official wheels? Looking at the azure-pipelines.yml in the repo I'd say gfortran and OpenBLAS are the only non-system dependencies. Is that correct?

In addition to what @mattip said, here is the actual wheel build machinery: https://github.com/MacPython/numpy-wheels

@ronaldoussoren
Author

I feel your pain, I'm hearing similar experiences from app developers and most issues I've filed with them have gone completely unanswered :-(

@rgommers
Member

@ronaldoussoren will the universal2 python.org installer pick up thin arm64 wheels if they are available and Python is started in native mode?

@ronaldoussoren
Author

That's more a pip question than a Python one, but yes, pip will use arm64 wheels when running natively on M1 Macs.

@matthew-brett
Contributor

@ronaldoussoren - sorry - just checking because I was surprised. You're saying that if I do:

python -m pip install numpy

on an M1 machine, using the Python.org universal2 Python, and PyPI only has an arm64 wheel for NumPy, it will nevertheless install that wheel, even though it does not satisfy the x86_64 part of universal2?

@ronaldoussoren
Author

That's correct. That's something that surprised me too.

Even worse (IMHO), pip will prefer a native wheel over a universal2 one. Most users won't care, except for folks like myself who wish to redistribute binaries to other systems (in my case mostly using py2app). I've filed a feature request about that, but the pip team is not convinced yet that preferring universal2 wheels for a universal2 build is a good idea. I'll probably create a PR to show the impact this change would have on their code base.

Note that it is also possible to install an x86_64 wheel on an M1 system by running Python in emulated mode.

This will prefer an arm64 wheel (also without the arch invocation):

$ arch -arm64 python3 -m pip install numpy

This will prefer an x86_64 wheel:

$ arch -x86_64 python3 -m pip install numpy

P.S. I've not looked into the python3 included in Apple's compiler tools, that version might not support the arch command in this way.
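The preference described above follows from the ordering of the platform tags pip generates for the running interpreter: on an arm64 build, the thin arm64 tag is listed before the universal2 tag. A simplified sketch of that ordering (hypothetical helper, not pip's actual code):

```python
def mac_platform_tags(arch: str, major: int = 11) -> list:
    """macOS platform tags in pip's preference order (simplified sketch).

    The thin native tag is listed before universal2, which is why pip
    prefers a thin wheel over a universal2 one when both are on PyPI.
    """
    tags = [f"macosx_{major}_0_{arch}"]  # thin, native arch comes first
    if arch in ("arm64", "x86_64"):
        # The fat universal2 tag also matches this interpreter, but later.
        tags.append(f"macosx_{major}_0_universal2")
    return tags

print(mac_platform_tags("arm64"))
# ['macosx_11_0_arm64', 'macosx_11_0_universal2']
```

Running under `arch -x86_64` changes the native arch, which is why the same mechanism then prefers an x86_64 wheel.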

@matthew-brett
Contributor

@ronaldoussoren - ouch! - and thanks for working that through, that's very helpful.

@rgommers
Member

Interesting. I'm not sure that's a bad thing. There are scientific packages that already exceed the PyPI file size limit, so a doubling of wheel size for what (for scientific libraries) is a very niche use case doesn't seem very sustainable.

I didn't have the energy to jump in that packaging discussion, but I suspect scientific libs may prefer thin wheels. We work hard on keeping binary sizes reasonable.

@ronaldoussoren
Author

Yeah, I noticed that some packages are very large. Those can always choose not to provide universal2 wheels at all; that will work for most users.

At the risk of going completely off-topic, I'd like to have more control over what gets installed than pip currently gives. When I'm doing something on my local machine only I might prefer a thin wheel (smaller, faster install), but when I intend to redistribute I might prefer a universal wheel (and possibly even one that's synthesised from two thin ones on PyPI).

@rgommers
Member

More control would be nice indeed, if universal2 stays - perhaps as both a pip command line argument and .rc-file setting.

I'd actually prefer if it didn't stay, because also for redistribution there isn't much reason to do it as universal2. The only good argument I saw was "it's less effort to introduce". There is no other situation where we mix OS and hardware platform support. The old universal PowerPC/Intel thing was also very annoying, and in my experience didn't work for the scientific stack anyway if you invoked it with the non-native arch.

@matthew-brett
Contributor

Just superficially, wouldn't it be simpler for Python.org to provide separate M1 and x86_64 installers, as it does for Windows 32- and 64-bit?

It's hard to imagine many people using the universal2 Python.org Python and always / mostly doing:

arch -x86_64 python3

And, as Ralf says, I bet that will usually break, when you get to install packages.

@ronaldoussoren
Author

Separate installers require users to understand what kind of machine they have. With a universal2 installer, things just work in that regard. The only problem is when projects do not provide wheels with native code for arm64 (either universal wheels or thin arm64 wheels), but that should correct itself in due time (especially once M1 systems are available in cloud CI systems).

I expect that it will be years before we can ignore Intel Macs, even after Apple transitions all their hardware to arm64.

Universal support is more or less required for folks distributing application bundles; having separate application downloads for Intel and ARM is just not what Mac users expect. Universal support is not about being able to run x86_64 on arm64, but about having a single binary that just works everywhere.

@matthew-brett
Contributor

I don't think it's much to ask of a user that they know they have an M1 Mac, honestly. And for now, it would be reasonable to make the x86_64 installer the default, so they have to specifically ask for the M1 build. I presume an x86_64 build will also work on M1 via Rosetta?

What happens for:

arch -x86_64 python3 -m pip install numpy

Does this look for an x86_64 wheel before a universal2 wheel?

@mattip
Member

mattip commented Jan 13, 2021

I don't understand what problem universal2 pip packages solve. If the package is pure Python, pip will download that. If the package has C extension modules, pip should choose the proper build for the hardware.

@matthew-brett
Contributor

I think the problem they solve is where the user sometimes wants to run Python in M1 mode, and sometimes in x86_64 mode.

They can choose their mode with the arch -x86_64 prefix.

universal2 wheels should properly install code for M1 and for x86_64, so python -m pip install numpy will mean that both of these will work after that single pip install.

python -c "import numpy"
arch -x86_64 python -c "import numpy"

The other problem is the one Ronald mentioned - a universal binary for Python itself means it will work on either hardware.

I agree though, that the first use-case doesn't seem all that important, and the second seems a rather minor advantage compared to the cost in terms of packaging confusion.
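A quick, dependency-free way to see which mode the interpreter is currently running in (assuming a standard CPython; on a universal2 build the reported value changes with the launch mode):

```python
import platform
import sys

# Effective architecture of the running interpreter: a universal2
# binary reports 'arm64' when launched natively on an M1 Mac and
# 'x86_64' when launched via `arch -x86_64` under Rosetta.
print(platform.machine())

# Both slices of a universal2 build are 64-bit.
print(sys.maxsize > 2**32)
```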

@rgommers
Member

Universal support is more or less required for folks distributing application bundles; having separate application downloads for Intel and ARM is just not what Mac users expect. Universal support is not about being able to run x86_64 on arm64, but about having a single binary that just works everywhere.

That's specifically a general "Mac end user" problem though, and is unrelated to PyPI and wheels. Having thin wheels plus the right py2app tooling to glue two thin wheels together in a single .pkg/.dmg installer would be much better.

@rgommers
Member

I expect that it will be years before we can ignore intel Macs, even after Apple transitions all their hardware to arm64.

Agreed, at least 6-7 years I'd think.

@rgommers
Member

It looks like cibuildwheel has full support now. That doesn't help without hardware or without fixing the critical bugs we still have for M1 (see my first comment on this issue), but it's a good point of reference.

@matthew-brett
Contributor

I think multibuild has full support too - thanks to @isuruf.

@isuruf
Contributor

isuruf commented Apr 27, 2021

Those tests are disabled on Windows, where long double == double; that is the only other platform NumPy tests run on where long double == double.

@ogrisel
Contributor

ogrisel commented Apr 27, 2021

I have a hard time understanding the meaning of those tests, but here is what I get with NumPy on macOS arm64:

>>> from numpy.f2py.tests.test_array_from_pyobj import Type
>>> Type("LONGDOUBLE").elsize
8.0
>>> Type("LONGDOUBLE").dtype
dtype('float64')
>>> from pprint import pprint
>>> for t in Type("LONGDOUBLE").cast_types():
...     print(t, t.elsize, t.dtype, t.type_num)
... 
<numpy.f2py.tests.test_array_from_pyobj.Type object at 0x148bedfd0> 1.0 int8 1
<numpy.f2py.tests.test_array_from_pyobj.Type object at 0x148bff040> 1.0 uint8 2
<numpy.f2py.tests.test_array_from_pyobj.Type object at 0x148bff070> 2.0 int16 3
<numpy.f2py.tests.test_array_from_pyobj.Type object at 0x148bff0a0> 2.0 uint16 4
<numpy.f2py.tests.test_array_from_pyobj.Type object at 0x148bff0d0> 4.0 int32 5
<numpy.f2py.tests.test_array_from_pyobj.Type object at 0x148bff130> 8.0 int64 7
<numpy.f2py.tests.test_array_from_pyobj.Type object at 0x148bff160> 8.0 uint64 8
<numpy.f2py.tests.test_array_from_pyobj.Type object at 0x148bff1f0> 4.0 float32 11
<numpy.f2py.tests.test_array_from_pyobj.Type object at 0x148bff220> 8.0 float64 12
<numpy.f2py.tests.test_array_from_pyobj.Type object at 0x148bff280> 8.0 float64 13

@isuruf
Contributor

isuruf commented Apr 27, 2021

Right. In linux-aarch64 and linux-x86_64, there's the extra

<numpy.f2py.tests.test_array_from_pyobj.Type object at 0xffff80175fa0> 16.0 float128

I think we should skip this test, as it is skipped on Windows where there's no float128.

@ogrisel
Contributor

ogrisel commented Apr 27, 2021

Ok so the problem is in those lines then:

https://github.com/numpy/numpy/blob/main/numpy/f2py/tests/test_array_from_pyobj.py#L117-L128

but I still do not understand what the test is supposed to verify, so I am not confident about opening such a PR.

@ogrisel
Contributor

ogrisel commented Apr 27, 2021

For reference:

>>> import sys
>>> sys.platform
'darwin'
>>> import platform
>>> platform.platform()
'macOS-11.2.3-arm64-arm-64bit'
>>> platform.architecture()
('64bit', '')
>>> platform.processor()
'arm'
>>> platform.system()
'Darwin'

So adding a condition for (platform.system(), platform.processor()) != ('Darwin', 'arm') should work to skip those types.

I will open a PR.
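The proposed skip condition can be sketched with the stdlib as follows (illustration only; the actual PR would use NumPy's pytest-based skip markers, and the test name here is hypothetical):

```python
import platform
import unittest

# True on Apple silicon, where long double == double (no float128).
ON_MACOS_ARM64 = (platform.system(), platform.processor()) == ("Darwin", "arm")

class TestLongDoubleCasts(unittest.TestCase):
    @unittest.skipIf(ON_MACOS_ARM64, "no float128 on macOS arm64")
    def test_cast_to_longdouble(self):
        # Stand-in for the real f2py cast checks in test_array_from_pyobj.
        self.assertTrue(True)
```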

@ogrisel
Contributor

ogrisel commented Apr 27, 2021

We still need a fix for the np.reciprocal ufunc (#18555) similar to what #18571 did for np.floor.

@charris
Member

charris commented Apr 27, 2021

The 3 test failures are all LONGDOUBLE. Maybe it's because long double == double on arm64.

Is that true? I thought AArch64 and Power9 both supported quad precision floats in hardware. It may depend on the compiler.

The procedure call standard for the ARM 64-bit architecture (AArch64) specifies that long double corresponds to the IEEE 754 quadruple-precision format

@isuruf
Contributor

isuruf commented Apr 27, 2021

Apple silicon ABI is slightly different. See https://developer.apple.com/documentation/xcode/writing-arm64-code-for-apple-platforms#//apple_ref/doc/uid/TP40013702-SW1

The long double type is a double precision IEEE754 binary floating-point type, which makes it identical to the double type. This behavior contrasts to the standard specification, in which a long double is a quad-precision, IEEE754 binary, floating-point type.

@charris
Member

charris commented Apr 27, 2021

Apple silicon ABI is slightly different.

Cheapskates :) Thanks for the information, it is like MSVC as far as long doubles go.

@charris
Member

charris commented Nov 6, 2021

Going to close this, NumPy now has both thin and universal2 wheels, although we cannot test on M1.

@ocroquette

Sorry if this is a trivial question, but is there a way to install the universal wheel explicitly with "pip install"? I would like to have an environment that supports both Intel and ARM, just like Python itself does. I am aware that I can download and install the wheel manually; I am looking specifically for an automated way.

@rgommers
Member

There is no way to do that AFAIK.

@JPHutchins

@charris I do not see universal wheels labeled at PyPI, where can I find the builds?

https://pypi.org/project/numpy/#files

Thanks!
JP

@rgommers
Member

rgommers commented Mar 28, 2024

There aren't any; universal2 is not supported by NumPy, nor by most other projects that build on top of NumPy. Please use delocate-fuse if you really do need it. See #21233 (comment) for the canonical issue.
