Add details on native packaging requirements exposed by mobile platforms #27

Open
wants to merge 31 commits into
base: main
Changes from 13 commits
Commits
eb2419a
Add details on native packaging requirements exposed by mobile platfo…
freakboy3742 Jan 9, 2023
8fef63e
Clarified the role/impact of cross-compilation on non-macOS platforms.
freakboy3742 Jan 10, 2023
d16035f
Grammar cleanup.
freakboy3742 Jan 10, 2023
84dbd5f
Add note about Windows platform support
freakboy3742 Jan 10, 2023
2a40f47
Moved a paragraph about the universal2 to current state.
freakboy3742 Jan 10, 2023
2563270
Clarified how Android deals with dependencies.
freakboy3742 Jan 10, 2023
b9b904c
Added an alternative approach for handling iOS multi-arch.
freakboy3742 Jan 10, 2023
45f748f
Modified comments to use common section structure, and include specif…
freakboy3742 Jan 16, 2023
373bb09
Apply suggestions from code review
freakboy3742 Jan 16, 2023
d8a2ca6
More updates stemming from review.
freakboy3742 Jan 16, 2023
f533395
Expand note about Linux support.
freakboy3742 Jan 17, 2023
8475360
Correct an it's typo.
freakboy3742 Jan 17, 2023
2886f2c
Add content to page on cross compilation
rgommers Feb 27, 2023
7556850
Resolve the last cross-compilation comment, on `pip --platform`
rgommers Mar 10, 2023
cb85652
Merge branch 'main' into mobile-details
rgommers Mar 10, 2023
49806e2
Put back link to "multiple architectures" page from cross compile page
rgommers Mar 10, 2023
ea1fb60
Remove the `cross_platform.md` file
rgommers Mar 10, 2023
d249af6
Fix some formatting and typo issues
rgommers Mar 10, 2023
50d8c26
Revisions to multi-architecture notes following review.
freakboy3742 Mar 20, 2023
a9776e0
Add foldout for pros and cons of `universal2` wheels
rgommers Mar 21, 2023
8d46e06
Add the 'for' arguments for universal2.
freakboy3742 Mar 21, 2023
5d06a56
Clarified 'end user' language; added note about merge problems.
freakboy3742 Mar 22, 2023
3e1fc05
Clarify the state of arm64 on github actions.
freakboy3742 Mar 22, 2023
74705d8
Add reference to pip issue about universal2 wheel installation.
freakboy3742 Mar 22, 2023
f46d2b0
Fixed typo.
freakboy3742 Mar 22, 2023
e1c278f
Removed subjective language.
freakboy3742 Mar 22, 2023
1a926eb
Apply textual/typo suggestions
rgommers Mar 22, 2023
7967383
Rephrase universal2 usage frequency/demand phrasing
rgommers Mar 22, 2023
1fb0ffb
Tone down the statement on "must provide thin wheels"
rgommers Mar 22, 2023
b44a322
Rephrase note on needed robustness improvements in delocate-fuse
rgommers Mar 22, 2023
dd93f1f
Add "first-class support for fusing thin wheels" as a potential solution
rgommers Mar 22, 2023
4 changes: 4 additions & 0 deletions docs/glossary.md
@@ -6,6 +6,8 @@
|---|---|---|
| ABI | Application Binary Interface | See [here](./background/binary_interface.md) |
| API | Application Programming Interface | The sum total of available functions, classes, etc. of a given program |
| AAB | Android Application Bundle | A distributable unit containing an Android application |
| APK | Android Application Package | A "binary" unit for Android, installed on a device |
| ARM | Advanced RISC Machines | Family of RISC architectures, second-most widely used processor family after x86 |
| AVX | Advanced Vector eXtensions | Various extensions to the x86 instruction set (AVX, AVX2, AVX512), evolution after SSE |
| BLAS | Basic Linear Algebra Subprograms | Specification resp. implementation for low-level linear algebra routines |
@@ -29,13 +31,15 @@
| LAPACK | Linear Algebra PACKage | Standard software library for numerical linear algebra |
| ISA | Instruction Set Architecture | Specification of an instruction set for a CPU; e.g. x86-64, arm64, ... |
| JIT | Just-in-time Compilation | Compiling code just before execution; used in CUDA, PyTorch, PyPy, Numba etc. |
| JNI | Java Native Interface | The bridge API allowing access of Java runtime objects from native code (and vice versa) |
| LLVM | - | Cross-platform compiler framework, home of Clang, MLIR, BOLT etc. |
| LTO | Link-Time Optimization | See [here](./background/compilation_concepts.md#link-time-optimization-lto)|
| LTS | Long-Term Support | Version of a given software/library/distribution designated for long-term support |
| musl | - | An alternative implementation of the C standard library |
| MPI | Message Passing Interface | Standard for message-passing in parallel computing |
| MLIR | Multi-Level IR | Higher-level IR within LLVM; used i.a. in machine learning frameworks |
| MSVC | Microsoft Visual C++ | Main compiler on Windows |
| NDK | Native Development Kit | The Android toolchain supporting compilation of binary modules |
| NEP | Numpy Enhancement Proposal | See [here](https://numpy.org/neps/) |
| OpenMP | Open Multi Processing | Multi-platform API for enabling multi-processing in C/C++/Fortran |
| OS | Operating System | E.g. Linux, MacOS, Windows |
2 changes: 2 additions & 0 deletions docs/index.md
@@ -67,6 +67,8 @@ workarounds for.
4. [Metadata handling on PyPI](key-issues/pypi_metadata_handling.md)
5. [Distributing a package containing SIMD code](key-issues/simd_support.md)
6. [Unsuspecting users getting failing from source builds](key-issues/unexpected_fromsource_builds.md)
7. [Platforms with multiple CPU architectures](key-issues/multiple_architectures.md)
8. [Cross-platform installation](key-issues/cross_platform.md)


## Contributing
147 changes: 147 additions & 0 deletions docs/key-issues/cross_platform.md
@@ -0,0 +1,147 @@
# Cross compilation

The historical assumption of compilation is that the platform where the code is
compiled will be the same as the platform where the final code will be executed
(if not literally the same machine, then at least one that is CPU and ABI
compatible at the operating system level). This is a reasonable assumption for
most desktop platforms; however, for some platforms, this isn't the case.

On mobile platforms, an app is compiled on a desktop platform, and transferred
to the mobile device (or a simulator) for testing. The compiler is not executed
on device. Therefore, it must be possible to build a binary artefact for a CPU
architecture and an ABI that is different from the platform that is running the
compiler. The situation is similar for embedded devices.

Cross compilation issues also emerge when dealing with continuous
integration/deployment (CI/CD). CI/CD platforms (such as GitHub Actions)
generally provide the "common" architectures - often only x86-64 - however, a
project may want to produce binaries for other platforms (e.g., ARM support for
Raspberry Pi devices; PowerPC or s390x for mainframe/server devices; or for
mobile platforms). These binaries won't run natively on the host CI/CD system
(without some sort of emulation, for example with QEMU); but code can be
compiled for the target platform.

macOS also experiences this as a result of the Apple Silicon transition. Apple
has provided the tools to make cross compilation from x86-64 to arm64 as easy
as possible, as well as to compile [fat binaries](multiple_architectures.md)
(supporting x86-64 and arm64 at the same time) on x86-64 hardware. In the latter
case, the host platform (macOS on x86-64) will still be one of the outputs
of the compilation process, and the resulting binary will run on the CI/CD
system.

## Current state

Native compiler and build toolchains (e.g., autoconf/automake, CMake, Meson) have long
supported cross-compilation; however, such cross-compilation capabilities for any
given project tend to bitrot and break easily unless they are exercised regularly.

CPython's build system includes some support for cross-compilation, largely
based on leveraging autoconf's support for cross compilation. This support
wasn't well integrated into `distutils` and the compilation of the binary
portions of stdlib. The removal of `distutils` in Python 3.12 represents an
improvement in the overall situation, but there is still a long way to go before
the ecosystem as a whole has fully integrated the consequences of this change.

The way build backend hooks in `pyproject.toml` are specified (see PEP 517)
means cross-platform compilation support has been partially converted into a
concern for individual build systems to manage.

In order to cross-compile a Python package, one needs a compiler toolchain as
well as two Python installs - one for the build system and one for the host
system.[^1] This can make it a little challenging to get started. If a compiler
toolchain is not already provided on the system of interest, it can be built
from source with, e.g., [crosstool-ng](https://crosstool-ng.github.io/) or
obtained from, e.g., [dockcross](https://github.com/dockcross/dockcross).
Or one can use a packaging system that has built-in support for cross-compilation.
[The Yocto Project](https://www.yoctoproject.org/),
[OpenEmbedded](https://www.openembedded.org/wiki/Main_Page) and
[Buildroot](https://buildroot.org/) are projects specifically focused on
cross-compilation for Linux embedded systems. More general-purpose packaging
ecosystems often have toolchains and supporting infrastructure to cross-compile packages for their own needs - see, e.g., info for
[Void Linux](https://github.com/void-linux/void-packages#cross-compiling),
[conda-forge](https://conda-forge.org/),
[Debian](https://wiki.debian.org/CrossCompiling) and
[Nix](https://nixos.org/guides/cross-compilation.html).

[^1]:
The "build", "host" and "target" terminology for identifying which system
is which in a cross-compilation setup is not consistent across build
systems and packaging tools. Always carefully check whether "build" means
the machine on which the compilation is run and "host" the machine on which
the produced binaries will run - or vice versa.

Tools like [crossenv](https://github.com/benfogle/crossenv) can be used to trick
Python into performing cross-platform builds. These tools use path hacks and
overrides of known sources of platform-specific details (like `distutils`) to
provide a cross-compilation environment. However, these solutions tend to be
somewhat fragile as they aren't first-class citizens of the Python ecosystem.

[The BeeWare Project](https://beeware.org) also uses a version of these
techniques. For both the platforms it supports, BeeWare provides a custom
package index that contains pre-compiled binaries ([Android](https://chaquo.com/pypi-7.0/);
[iOS](https://anaconda.org/beeware/repo)). These binaries are produced using a
set of tooling ([Android](https://github.com/chaquo/chaquopy/tree/master/server/pypi);
[iOS](https://github.com/freakboy3742/chaquopy/tree/iOS-support/server/pypi))
that is analogous to the tools used by conda-forge to build binary artefacts.


## Problems

There is currently a gap in _communicating target platform details to the
build system_. While a build system like autoconf or CMake may support
cross-platform compilation, and a project may be able to cross-compile binary
artefacts, invocation of a `pyproject.toml` build hook typically assumes that the
platform running the build will be the platform that ultimately runs the Python
code. As a result, `sys.platform`, or the various attributes of the `platform`
and `sysconfig` modules can't be used as part of the build process.
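
As a minimal illustration (a sketch, not part of any build backend), the following queries the standard library for the platform details a build hook would typically consult. The helper name `describe_running_interpreter` is made up for this example. Every value describes the interpreter that runs the script, i.e. the build machine; none of these calls can report on a different target platform.

```python
import platform
import sys
import sysconfig


def describe_running_interpreter() -> dict[str, str]:
    """Collect platform details as seen by the interpreter running this code."""
    return {
        "sys.platform": sys.platform,
        "platform.machine": platform.machine(),
        "sysconfig.get_platform": sysconfig.get_platform(),
        # EXT_SUFFIX is the filename suffix used for extension modules;
        # fall back to an empty string if the variable is not defined.
        "EXT_SUFFIX": sysconfig.get_config_var("EXT_SUFFIX") or "",
    }


if __name__ == "__main__":
    for name, value in describe_running_interpreter().items():
        print(f"{name}: {value}")
```

Run on an x86-64 CI machine while building for an arm64 device, all of these values would still describe x86-64.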

_Running Python code_ for the host (cross) platform is not possible (modulo
using an emulator), but Python packages have not been designed to avoid it.
For example, `numpy` and `pybind11` ship headers and have
`get_include()` functions in their main namespaces to obtain the path to those
headers. Calling such a function means importing a package that was built for
the host platform, which the build machine cannot do. Packages depending on
those headers have to work around this (often by patching those packages with
hardcoded paths within a cross-compilation setup).
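
One possible shape of such a workaround, sketched under assumptions: instead of importing the host package and calling `get_include()`, a build script could look for the header directory on disk. The function `find_header_dir` and its parameters are hypothetical, and the relative paths to try are supplied by the caller because they depend on the package and its version.

```python
from pathlib import Path
from typing import Iterable, Optional


def find_header_dir(site_packages: Path, candidates: Iterable[str]) -> Optional[Path]:
    """Return the first candidate directory that exists under site_packages.

    Nothing is imported, so this also works when the packages in
    site_packages were built for a platform that cannot run here.
    """
    for relative in candidates:
        path = site_packages / relative
        if path.is_dir():
            return path
    return None
```

A caller would pass the `site-packages` directory of the host Python installation along with the locations it expects headers to live in. This is exactly the kind of hardcoded knowledge that makes the workaround fragile.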

`pip` provides limited support for installing binaries for a different platform
by specifying the `--platform`, `--implementation` and `--abi` flags; however,
these flags only work for the selection of pre-built binary artefacts, and are
therefore constrained to the set of platform and ABI tags published by the
author.
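
For illustration, the sketch below assembles a `pip download` invocation that uses these flags. The helper `pip_download_command` is hypothetical, and the tag values in the usage note are examples rather than recommendations. `--only-binary=:all:` is included because pip requires either that option or `--no-deps` whenever platform constraints are given.

```python
import sys


def pip_download_command(
    requirement: str,
    platform_tag: str,
    python_version: str,
    implementation: str,
    abi: str,
    dest: str,
) -> list[str]:
    """Build (but do not run) a pip command that selects wheels for another platform."""
    return [
        sys.executable, "-m", "pip", "download",
        "--only-binary=:all:",          # never fall back to a source build
        "--platform", platform_tag,
        "--python-version", python_version,
        "--implementation", implementation,
        "--abi", abi,
        "--dest", dest,
        requirement,
    ]
```

For example, `pip_download_command("numpy", "manylinux2014_aarch64", "3.11", "cp", "cp311", "wheels")` yields a command that could be passed to `subprocess.run`. It succeeds only if the project has published a wheel with matching tags.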


## History

TODO


## Relevant resources

- ["Towards standardizing cross compiling"](https://discuss.python.org/t/towards-standardizing-cross-compiling/10357), Ben Fogle (2021)
- ["PEP xxxx - Standardized Config Settings for Cross-Compiling"](https://github.com/benfogle/peps/blob/master/pep-9999.rst), Ben Fogle (2021)
- [scipy#14812 - Tracking issue for cross-compilation needs and issues](https://github.com/scipy/scipy/issues/14812) (2021)

## Potential solutions or mitigations

At the core, what is required is a recognition that the use case of
cross-platform builds is something that the Python ecosystem should support.

In concrete terms, for native modules, this would require at least:

1. Making it possible to retrieve relevant metadata from a Python installation
without having to run Python code.
2. Clear separation of metadata associated with the definition of build and
target platforms, rather than assuming that build and target platform will
always be the same.

In addition, to make cross-compilation easier to use and to move from
build-system-specific configuration files - like a "toolchain file" for CMake
or a "cross file" for Meson - to a standardized version:

3. Extension of the `pyproject.toml` build interface to allow communicating the
desired target platform as part of a binary build; or
4. Formalization of the "platform identification" interface that can be used by
build backends to identify the target platform, so that tools like
`crossenv` can provide a reliable proxied environment for cross-platform
builds.
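
To make points 1 and 2 more concrete, here is a purely hypothetical sketch; no such interface exists today, and every name in it (`PlatformDescription`, `BuildRequest`, the JSON keys) is invented for illustration. It assumes that a Python installation could ship a static file describing itself, so that the target platform can be read as data while the build platform is taken from the running interpreter.

```python
import json
import platform
import sys
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class PlatformDescription:
    """Hypothetical minimal description of a platform."""
    os_name: str   # e.g. "linux", "darwin", "ios"
    machine: str   # e.g. "x86_64", "arm64"

    @classmethod
    def from_running_interpreter(cls) -> "PlatformDescription":
        # Describes the build machine: the interpreter executing this code.
        return cls(os_name=sys.platform, machine=platform.machine())

    @classmethod
    def from_json_file(cls, path: Path) -> "PlatformDescription":
        # Describes another installation without running its Python.
        data = json.loads(path.read_text(encoding="utf-8"))
        return cls(os_name=data["os_name"], machine=data["machine"])


@dataclass(frozen=True)
class BuildRequest:
    """Keeps build and target platforms separate instead of assuming they match."""
    build: PlatformDescription
    target: PlatformDescription

    @property
    def is_cross(self) -> bool:
        return self.build != self.target
```

A real design would need far more detail (ABI flags, extension suffixes, compiler settings). The point of the sketch is only that the target description is loaded as data and that the two platforms are never conflated.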