From eb2419aec3b12a4fa0fcb6134267890238e562da Mon Sep 17 00:00:00 2001 From: Russell Keith-Magee Date: Mon, 9 Jan 2023 14:27:39 +0800 Subject: [PATCH 01/30] Add details on native packaging requirements exposed by mobile platforms. --- docs/index.md | 2 + docs/key-issues/cross_platform.md | 33 ++++++ docs/key-issues/multiple_architectures.md | 129 ++++++++++++++++++++++ mkdocs.yml | 4 +- 4 files changed, 167 insertions(+), 1 deletion(-) create mode 100644 docs/key-issues/cross_platform.md create mode 100644 docs/key-issues/multiple_architectures.md diff --git a/docs/index.md b/docs/index.md index e05dab7..8a6cf86 100644 --- a/docs/index.md +++ b/docs/index.md @@ -67,6 +67,8 @@ workarounds for. 4. [Metadata handling on PyPI](key-issues/pypi_metadata_handling.md) 5. [Distributing a package containing SIMD code](key-issues/simd_support.md) 6. [Unsuspecting users getting failing from source builds](key-issues/unexpected_fromsource_builds.md) +7. [Platforms with multiple CPU architectures](key-issues/multiple_architectures.md) +8. [Cross-platform installation](key-issues/cross_platform.md) ## Contributing diff --git a/docs/key-issues/cross_platform.md b/docs/key-issues/cross_platform.md new file mode 100644 index 0000000..2910ccf --- /dev/null +++ b/docs/key-issues/cross_platform.md @@ -0,0 +1,33 @@ +# Cross-platform installation + +The historical assumption of compilation is that the platform where the code is +compiled will be the same as the platform where the final code will be executed +(if not literally the same machine, then at least one that is CPU and ABI +compatible at the operating system level). This is a reasonable assumption for +most desktop projects; However, for mobile platforms, this isn't the case. + +On mobile platforms, an app is compiled on a desktop platform, and transferred +to the mobile device (or a simulator) for testing. The compiler is not executed +on device. 
Therefore, it must be possible to build a binary artefact for a CPU
architecture and an ABI that is different from the platform that is running the
compiler.

A microcosm of this problem exists on macOS as a result of the Apple Silicon
transition. Most CI systems don't provide native ARM hardware, but most
developers will still want ARM64-compatible build artefacts. Apple has provided
the tools compile [fat binaries](multiple_architectures.md) on x86_64 hardware;
however, in this case, the host platform (macOS on x86_64) will still be one of
the outputs of the compilation process. For mobile platforms, the computer that
compiles the code will not be able to execute the code that has been compiled.

## Potential solutions or mitigations

Compiler and build toolchains (e.g., autoconf/automake) have long supported
cross-compilation; however, these cross-compilation capabilities are easy to
break unless they are exercised regularly.

In the Python space, tools like [crossenv](https://github.com/benfogle/crossenv)
also exist; these tools use a collection of path hacks and overrides of known
sources of platform-specific details (like `distutils`) to provide a
cross-compilation environment. However, these solutions tend to be somewhat
fragile as aren't first-class citizens of the Python ecosystem.
diff --git a/docs/key-issues/multiple_architectures.md b/docs/key-issues/multiple_architectures.md
new file mode 100644
index 0000000..0f43e14
--- /dev/null
+++ b/docs/key-issues/multiple_architectures.md
@@ -0,0 +1,129 @@
+# Platforms with multiple CPU architectures

In addition to any ABI requirements, a binary is compiled for a CPU
architecture. That CPU architecture defines the CPU instructions that can be
issued by the binary.

Historically, it could be assumed that an executable or library would be
compiled for a single CPU architecture.
On the rare occasion that an operating
system was available for multiple CPU architectures, it became the
responsibility of the user to find (or compile) a binary that was compiled for
their host CPU architecture.

However, on occasion, we see an operating system platform where multiple CPU
architectures are supported:

* In the early days of Windows NT, both x86 and DEC Alpha CPUs were supported
* Although Linux started as an x86 project, the Linux kernel is now available on
  a wide range of other CPU architectures, including ARM64, RISC-V, PowerPC,
  s390 and more.
* Apple transitioned Mac hardware from PowerPC to Intel (x86-64) CPUs, providing
  a forwards compatibility path for binaries
* Apple is currently transitioning Mac hardware from Intel (x86-64) to
  Apple Silicon (ARM64) CPUs, again providing a forwards compatibility
  path
* Apple supports ARMv6, ARMv7, ARMv7s, ARM64 and ARM64e on iOS
* Android currently supports ARMv7, ARM64, x86, and x86-64; it has historically
  also supported ARMv5 and MIPS

CPU architecture compatibility is a necessary, but not sufficient criterion for
determining binary compatibility. Even if two binaries are compiled for the same
CPU architecture, that doesn't guarantee [ABI compatibility](abi.md).

In some respects, CPU architecture compatibility could be considered a superset
of [GPU compatibility](gpus.md). When dealing with multiple CPU architectures,
there may be some overlap with the solutions that can be used to support GPUs in
native binaries.

## Platform approaches for dealing with multiple architectures

Three approaches have emerged for handling multiple CPU architectures.

### Multiple binaries

The minimal solution is to distribute multiple binaries. This is the approach
that was used by Windows NT, and is currently supported by Linux.
At time of
distribution, an installer or other downloadable artefact is provided for each
supported platform, and it is up to the user to select and download the correct
artefact.

### Archiving

The approach taken by Android is very similar to the multiple binary approach,
with some affordances and tooling to simplify distribution.

When building an Android project, each target architecture is compiled
independently. If a native binary library is required to compile the Android
application, a version must be provided for each supported CPU architecture. A
directory layout convention exists for providing a binary for each platform,
with the same library name. This yields an independent final binary (APK) for
each CPU architecture. When running locally, a CPU-specific APK will be
uploaded to the simulator or test device.

To simplify the process of distributing the application, at time of publication,
a single Android App Bundle (AAB) is generated from the multiple CPU-specific
APKs. This AAB contains binaries for all platforms that can be uploaded to an
app store.

When an end-user requests the installation of an app, the app store strips out the
binary that is appropriate for the end-user's device.

### Fat binaries

Apple has taken the approach of "fat" binaries. A fat binary is a single
executable or library artefact that contains code for multiple CPU
architectures.

Fat binaries can be compiled in two ways:

1. **Single pass** Apple has modified their compiler tooling with flags that
   allow the user to specify a single compilation command, and instruct the
   compiler to generate multiple output architectures in the output binary
2. **Multiple pass** After compiling a binary for each platform, Apple provides
   a tool named `lipo` to combine multiple single-architecture binaries into a
   single fat binary that contains all platforms.
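The structure that `lipo` produces can be shown in miniature. The sketch below builds and parses the Mach-O universal ("fat") header layout in memory; the magic number and `fat_arch` record layout follow Apple's `<mach-o/fat.h>`, but the header assembled here is synthetic and is not a runnable binary:

```python
import struct

FAT_MAGIC = 0xCAFEBABE  # big-endian magic for a Mach-O universal binary
CPU_TYPE_X86_64 = 0x01000007
CPU_TYPE_ARM64 = 0x0100000C

def parse_fat_header(data):
    """Yield (cputype, offset, size) for each slice described by a fat header."""
    magic, nfat_arch = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        raise ValueError("not a fat binary")
    for i in range(nfat_arch):
        # struct fat_arch: cputype, cpusubtype, offset, size, align
        cputype, _, offset, size, _ = struct.unpack_from(">iiIII", data, 8 + i * 20)
        yield cputype, offset, size

# Build a synthetic two-slice header: an x86_64 slice and an ARM64 slice.
header = struct.pack(">II", FAT_MAGIC, 2)
header += struct.pack(">iiIII", CPU_TYPE_X86_64, 3, 0x1000, 256, 12)
header += struct.pack(">iiIII", CPU_TYPE_ARM64, 0, 0x2000, 256, 12)

for cputype, offset, size in parse_fat_header(header):
    print(hex(cputype), hex(offset), size)
```

At load time, the operating system performs the equivalent of this scan to select the slice that matches the current CPU.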
At runtime, the operating system loads the binary slice for the current CPU
architecture, and the linker loads the appropriate slice from the fat binary of
any dynamic libraries.

On macOS ARM hardware, Apple also provides Rosetta as a support mechanism; if a
user tries to run a binary that doesn't contain an ARM64 slice, but *does*
contain an x86-64 slice, the x86-64 slice will be converted at runtime into an
ARM64 binary. Complications can occur when only *some* of the binary is being
converted (e.g., if the binary being executed is fat, but a dynamic library
isn't).

iOS has an additional complication of requiring support for multiple *ABIs* in
addition to multiple CPU architectures. The ABIs for the iOS simulator and
physical iOS devices are different; however, ARM64 is a supported CPU
architecture for both. As a result, it is not possible to produce a single fat
library that supports both the iOS simulator and iOS devices. Apple provides an
additional structure - the `XCFramework` - as a wrapper format for packaging
libraries that need to span multiple ABIs. When developing an application for
iOS, a developer will need to install binaries for both the simulator and
physical devices.

## Potential solutions or mitigations

Python currently provides `universal2` wheels to support x86_64 and ARM64 in a
single wheel. This is effectively a "fat wheel" format; the `.dylib` files
contained in the wheel are fat binaries containing both x86_64 and ARM64 slices.

However, "Universal2" is a macOS-specific definition that encompasses the scope
of the specific "Apple Silicon" transition ("Universal" wheels also existed
historically for the PowerPC to Intel transition). Even inside the Apple
ecosystem, iOS, tvOS, and watchOS all have different combinations of supported
CPU architectures.
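Wheel filenames already support expressing several compatibility targets in a single artefact name: PEP 425 "compressed tag sets", in which each tag component may contain dot-separated alternatives. A minimal sketch of the expansion rule:

```python
import itertools

def expand_tags(compressed):
    """Expand a PEP 425 compressed tag set into its individual tags."""
    interpreters, abis, platforms = compressed.split("-")
    return [
        "-".join(parts)
        for parts in itertools.product(
            interpreters.split("."), abis.split("."), platforms.split(".")
        )
    ]

print(expand_tags("cp34.cp35.cp36-abi3-manylinux1_x86_64"))
# -> ['cp34-abi3-manylinux1_x86_64', 'cp35-abi3-manylinux1_x86_64',
#     'cp36-abi3-manylinux1_x86_64']
```

An installer accepts a wheel if any one of the expanded tags matches a tag supported by the running interpreter.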
+ +A more general solution for naming multi-architecture binaries, similar to how a +wheel can declare compatibility with multiple CPython versions (e.g., +`cp34.cp35.cp36-abi3-manylinux1_x86_64`) may be called for. In such a scheme, +`cp310-abi3-macosx_10_9_universal2` would be equivalent to +`cp310-abi3-macosx_10_9_x86_64.arm64`. + +To support Android's multi-architecture approach, it may be necessary to extend +installation tools to allow for installing multiple versions of a wheel in one +installation pass. This can be emulated by making multiple independent calls to +to package installer tools; but that results in independent dependency +resolution, etc. diff --git a/mkdocs.yml b/mkdocs.yml index 230e854..3e0ef33 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -20,7 +20,7 @@ theme: - scheme: default primary: blue grey toggle: - icon: material/brightness-7 + icon: material/brightness-7 name: Switch to dark mode nav: @@ -42,6 +42,8 @@ nav: - 'key-issues/pypi_metadata_handling.md' - 'key-issues/simd_support.md' - 'key-issues/unexpected_fromsource_builds.md' + - 'key-issues/multiple_architectures.md' + - 'key-issues/cross_platform.md' - 'other_issues.md' - 'Background': - 'background/binary_interface.md' From 8fef63e1a800c59338696b5932d379ab5cf59738 Mon Sep 17 00:00:00 2001 From: Russell Keith-Magee Date: Tue, 10 Jan 2023 08:30:27 +0800 Subject: [PATCH 02/30] Clarified the role/impact of cross-compilation on non-macOS platforms. --- docs/key-issues/cross_platform.md | 21 ++++++++++++++------- 1 file changed, 14 insertions(+), 7 deletions(-) diff --git a/docs/key-issues/cross_platform.md b/docs/key-issues/cross_platform.md index 2910ccf..75bd458 100644 --- a/docs/key-issues/cross_platform.md +++ b/docs/key-issues/cross_platform.md @@ -12,13 +12,20 @@ on device. Therefore, it must be possible to build a binary artefact for a CPU architecture and a ABI that is different the platform that is running the compiler. 
-A microcosm of this problem exists on macOS as a result of the Apple Silicon
-transition. Most CI systems don't provide native ARM hardware, but most
-developers will still want ARM64-compatible build artefacts. Apple has provided
-the tools compile [fat binaries](multiple_architectures.md) on x86_64 hardware;
-however, in this case, the host platform (macOS on x86_64) will still be one of
-the outputs of the compilation process. For mobile platforms, the computer that
-compiles the code will not be able to execute the code that has been compiled.
+Cross-compilation issues also emerge when dealing with continuous
+integration/deployment (CI/CD). CI/CD platforms (such as GitHub Actions)
+generally provide the "common" architectures - often only x86-64 - however, a
+project may want to produce binaries for other platforms (e.g., ARM support for
+Raspberry Pi devices; PowerPC or s390 for mainframe/server devices; or for
+mobile platforms). These binaries won't run natively on the host CI/CD system
+(without some sort of emulation), but code can be compiled for the target
+platform.
+
+macOS also experiences this as a result of the Apple Silicon transition. Apple
+has provided the tools to compile [fat binaries](multiple_architectures.md) on
+x86_64 hardware; however, in this case, the host platform (macOS on x86_64) will
+still be one of the outputs of the compilation process, and the resulting binary
+will run on the CI/CD system.

 ## Potential solutions or mitigations
From d16035fbbe64d839fd5646446891244a2e0dd491 Mon Sep 17 00:00:00 2001
From: Russell Keith-Magee
Date: Tue, 10 Jan 2023 08:33:04 +0800
Subject: [PATCH 03/30] Grammar cleanup.
---
 docs/key-issues/cross_platform.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/key-issues/cross_platform.md b/docs/key-issues/cross_platform.md
index 75bd458..c7aadf0 100644
--- a/docs/key-issues/cross_platform.md
+++ b/docs/key-issues/cross_platform.md
@@ -37,4 +37,4 @@ In the Python space, tools like [crossenv](https://github.com/benfogle/crossenv)
 also exist; these tools use a collection of path hacks and overrides of known
 sources of platform-specific details (like `distutils`) to provide a
 cross-compilation environment. However, these solutions tend to be somewhat
-fragile as aren't first-class citizens of the Python ecosystem.
+fragile as they aren't first-class citizens of the Python ecosystem.
From 84dbd5fe97bfd0c817c7d381f068b2a54389c009 Mon Sep 17 00:00:00 2001
From: Russell Keith-Magee
Date: Tue, 10 Jan 2023 08:34:55 +0800
Subject: [PATCH 04/30] Add note about Windows platform support
---
 docs/key-issues/multiple_architectures.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/docs/key-issues/multiple_architectures.md b/docs/key-issues/multiple_architectures.md
index 0f43e14..5409c4e 100644
--- a/docs/key-issues/multiple_architectures.md
+++ b/docs/key-issues/multiple_architectures.md
@@ -14,6 +14,8 @@ However, on occasion, we see an operating system platform where multiple CPU
 architectures are supported:

 * In the early days of Windows NT, both x86 and DEC Alpha CPUs were supported
+* Windows 10 supports x86, x86-64, ARMv7 and ARM64; Windows 11 supports x86-64
+  and ARM64.
 * Although Linux started as an x86 project, the Linux kernel is now available on
   a wide range of other CPU architectures, including ARM64, RISC-V, PowerPC,
   s390 and more.
From 2a40f47597256de036caafaab2e2ebc4152b7ed4 Mon Sep 17 00:00:00 2001
From: Russell Keith-Magee
Date: Tue, 10 Jan 2023 08:36:19 +0800
Subject: [PATCH 05/30] Moved a paragraph about universal2 to current state.
--- docs/key-issues/multiple_architectures.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/key-issues/multiple_architectures.md b/docs/key-issues/multiple_architectures.md index 5409c4e..aeb8c3a 100644 --- a/docs/key-issues/multiple_architectures.md +++ b/docs/key-issues/multiple_architectures.md @@ -106,13 +106,13 @@ libraries that need to span multiple ABIs. When developing an application for iOS, a developer will need to install binaries for both the simulator and physical devices. -## Potential solutions or mitigations - Python currently provides `universal2` wheels to support x86_64 and ARM64 in a single wheel. This is effectively a "fat wheel" format; the `.dylib` files contained in the wheel are fat binaries containing both x86_64 and ARM64 slices. -However, "Universal2" is a macOS-specific definition that encompasses the scope +## Potential solutions or mitigations + +"Universal2" is a macOS-specific definition that encompasses the scope of the specific "Apple Silicon" transition ("Universal" wheels also existed historically for the PowerPC to Intel transition). Even inside the Apple ecosystem, iOS, tvOS, and watchOS all have different combinations of supported From 256327027305aa86a2df594cbee19e3dd0189296 Mon Sep 17 00:00:00 2001 From: Russell Keith-Magee Date: Tue, 10 Jan 2023 08:57:30 +0800 Subject: [PATCH 06/30] Clarified how Android deals with dependencies. --- docs/key-issues/multiple_architectures.md | 25 ++++++++++++++++++----- 1 file changed, 20 insertions(+), 5 deletions(-) diff --git a/docs/key-issues/multiple_architectures.md b/docs/key-issues/multiple_architectures.md index aeb8c3a..54ff7e2 100644 --- a/docs/key-issues/multiple_architectures.md +++ b/docs/key-issues/multiple_architectures.md @@ -62,6 +62,15 @@ with the same library name. This yields an independent final binary (APK) for each CPU architecture. When running locally, a CPU-specific APK will be uploaded to the simulator or test device. 
+This approach can be supported with a conventional "single platform wheel"
+approach. A library developer can package a wheel for each Android CPU
+architecture they wish to support; the Android project will install a
+CPU-architecture appropriate wheel when the compiler pass for that architecture
+is performed. The only complication is that the process of installing wheels
+will involve a dependency resolution pass on each supported platform; this
+could potentially lead to a situation where a single application has different
+versions of a Python library on different architectures.
+
 To simplify the process of distributing the application, at time of publication,
 a single Android App Bundle (AAB) is generated from the multiple CPU-specific
 APKs. This AAB contains binaries for all platforms that can be uploaded to an
 app store.
@@ -124,8 +133,14 @@ wheel can declare compatibility with multiple CPython versions (e.g.,
 `cp310-abi3-macosx_10_9_universal2` would be equivalent to
 `cp310-abi3-macosx_10_9_x86_64.arm64`.

-To support Android's multi-architecture approach, it may be necessary to extend
-installation tools to allow for installing multiple versions of a wheel in one
-installation pass. This can be emulated by making multiple independent calls to
-to package installer tools; but that results in independent dependency
-resolution, etc.
+Supporting Android's archiving approach requires no particular modifications to
+the "single architecture" solutions in use today. However, there may be a
+benefit to the developer experience if it is possible to ensure consistency
+in the dependency resolution solutions that are found for each architecture.
+This could come in the form of:
+1. Allowing for the installation of multiple wheel architectures in a single
+   installation pass.
+2. Sharing dependency resolution solutions between installation passes.
+3. Tools to identify when two different install passes have generated different
+   dependency solutions.
+4.
A "multi-architecture" Android wheel.
From b9b904c9723c1e78ee01f059b12621343f907507 Mon Sep 17 00:00:00 2001
From: Russell Keith-Magee
Date: Tue, 10 Jan 2023 09:09:09 +0800
Subject: [PATCH 07/30] Added an alternative approach for handling iOS
 multi-arch.
---
 docs/key-issues/multiple_architectures.md | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/docs/key-issues/multiple_architectures.md b/docs/key-issues/multiple_architectures.md
index 54ff7e2..740db14 100644
--- a/docs/key-issues/multiple_architectures.md
+++ b/docs/key-issues/multiple_architectures.md
@@ -133,6 +133,17 @@ wheel can declare compatibility with multiple CPython versions (e.g.,
 `cp310-abi3-macosx_10_9_universal2` would be equivalent to
 `cp310-abi3-macosx_10_9_x86_64.arm64`.

+Alternatively, this could be solved as an install-time problem. In this
+approach, package repositories would continue to store single-architecture,
+single-ABI artefacts; however, at time of installation, the installation tool
+would allow for the specification of multiple architectures/ABI combinations.
+The installer would download a wheel for each architecture/ABI requested, and as
+a post-processing step, merge the binaries for multiple architectures into a
+single fat binary for each ABI. This would simplify the story from the package
+archive's perspective, but would require significant modifications to installer
+tooling (some of which would require callouts to platform-specific build
+tooling).
+
 Supporting Android's archiving approach requires no particular modifications to
 the "single architecture" solutions in use today. However, there may be a
 benefit to the developer experience if it is possible to ensure consistency
From 45f748f33db400ec3f18f33ca25ff36e72892e87 Mon Sep 17 00:00:00 2001
From: Russell Keith-Magee
Date: Mon, 16 Jan 2023 15:56:21 +0800
Subject: [PATCH 08/30] Modified comments to use common section structure, and
 include specific references to prior art and existing discussions.
---
 docs/key-issues/cross_platform.md         |  70 ++++++-
 docs/key-issues/multiple_architectures.md | 242 +++++++++++++++-------
 2 files changed, 233 insertions(+), 79 deletions(-)

diff --git a/docs/key-issues/cross_platform.md b/docs/key-issues/cross_platform.md
index c7aadf0..4c42c58 100644
--- a/docs/key-issues/cross_platform.md
+++ b/docs/key-issues/cross_platform.md
@@ -4,7 +4,7 @@ The historical assumption of compilation is that the platform where the code is
 compiled will be the same as the platform where the final code will be executed
 (if not literally the same machine, then at least one that is CPU and ABI
 compatible at the operating system level). This is a reasonable assumption for
-most desktop projects; However, for mobile platforms, this isn't the case.
+most desktop platforms; however, for some platforms, this isn't the case.

 On mobile platforms, an app is compiled on a desktop platform, and transferred
 to the mobile device (or a simulator) for testing. The compiler is not executed
@@ -27,14 +27,66 @@ x86_64 hardware; however, in this case, the host platform (macOS on x86_64) will
 still be one of the outputs of the compilation process, and the resulting binary
 will run on the CI/CD system.

+## Current state
+
+Native compiler and build toolchains (e.g., autoconf/automake, CMake) have long
+supported cross-compilation; however, these cross-compilation capabilities are
+easy to break unless they are exercised regularly.
+
+CPython's build system includes some support for cross-compilation. This support
+is largely based on leveraging autoconf's support for cross compilation. This
+support wasn't well integrated into distutils and the compilation of the binary
+portions of stdlib; however, with the deprecation and removal of distutils in
+Python 3.12, this situation has improved.
+
+The specification of PEP517 means cross-platform compilation support has been
+largely converted into a concern for individual build systems to manage.
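As an illustration of what that build-system concern involves, a build backend could accept the target platform as an explicit input instead of inferring it from the host. This is a hypothetical sketch; the `target-platform` key is an assumption, since PEP 517 leaves the semantics of `config_settings` to each backend:

```python
import sysconfig

def resolve_target_platform(config_settings=None):
    """Pick the platform a build should target.

    Falls back to the host platform tag (sysconfig.get_platform()) when the
    front end does not request cross-compilation.
    """
    config_settings = config_settings or {}
    return config_settings.get("target-platform") or sysconfig.get_platform()

# Host build: reports the machine running the build.
print(resolve_target_platform())
# Hypothetical cross build, as a front end might request it:
print(resolve_target_platform({"target-platform": "ios_12_0_arm64_iphoneos"}))
# -> ios_12_0_arm64_iphoneos
```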
+## Problems
+
+There is currently a small gap in communicating target platform details to the
+build system. While a build system like autoconf or CMake may support
+cross-platform compilation, and a project may be able to cross-compile binary
+artefacts, invocation of the PEP517 build interface currently assumes that the
+platform running the build will be the platform that ultimately runs the Python
+code. As a result, `sys.platform`, or the various attributes of the `platform`
+library, can't be used as part of the build process.
+
+`pip` provides limited support for installing binaries for a different platform
+by specifying the `--platform`, `--implementation` and `--abi` flags; however,
+these flags only work for the selection of pre-built binary artefacts.
+
+## History
+
+Tools like [crossenv](https://github.com/benfogle/crossenv) can be used to trick
+Python into performing cross-platform builds. These tools use path hacks and
+overrides of known sources of platform-specific details (like `distutils`) to
+provide a cross-compilation environment. However, these solutions tend to be
+somewhat fragile as they aren't first-class citizens of the Python ecosystem.
+
+[The BeeWare Project](https://beeware.org) also uses a version of these
+techniques. On both platforms, BeeWare provides a custom package index that
+contains pre-compiled binaries ([Android](https://chaquo.com/pypi-7.0/);
+[iOS](https://anaconda.org/beeware/repo)). These binaries are produced using a
+forge-like set of tooling
+([Android](https://github.com/chaquo/chaquopy/tree/master/server/pypi);
+[iOS](https://github.com/freakboy3742/chaquopy/tree/iOS-support/server/pypi)).
+
+## Relevant resources
+
+TODO

 ## Potential solutions or mitigations

-Compiler and build toolchains (e.g., autoconf/automake) have long supported
-cross-compilation; however, these cross-compilation capabilities are easy to
-break unless they are exercised regularly.
+At its core, what is required is a recognition that cross-platform builds are a
+use case that the Python ecosystem supports.
+
+In concrete terms, for native modules, this would require either:
+
+1. Extension of the PEP517 interface to allow communicating the desired target
+   platform as part of a binary build; or

-In the Python space, tools like [crossenv](https://github.com/benfogle/crossenv)
-also exist; these tools use a collection of path hacks and overrides of known
-sources of platform-specific details (like `distutils`) to provide a
-cross-compilation environment. However, these solutions tend to be somewhat
-fragile as they aren't first-class citizens of the Python ecosystem.
+2. Formalization of the "platform identification" interface that can be used by
+   PEP517 build backends to identify the target platform, so that tools like
+   `crossenv` can provide a reliable proxied environment for cross-platform
+   builds.
diff --git a/docs/key-issues/multiple_architectures.md b/docs/key-issues/multiple_architectures.md
index 740db14..889cd7d 100644
--- a/docs/key-issues/multiple_architectures.md
+++ b/docs/key-issues/multiple_architectures.md
@@ -4,14 +4,16 @@ In addition to any ABI requirements, a binary is compiled for a CPU
 architecture. That CPU architecture defines the CPU instructions that can be
 issued by the binary.

+## Current state
+
 Historically, it could be assumed that an executable or library would be
 compiled for a single CPU architecture. On the rare occasion that an operating
 system was available for multiple CPU architectures, it became the
 responsibility of the user to find (or compile) a binary that was compiled for
 their host CPU architecture.
-However, on occasion, we see an operating system platform where multiple CPU -architectures are supported: +However, we now see operating system platforms where multiple CPU architectures +are supported: * In the early days of Windows NT, both x86 and DEC Alpha CPUs were supported * Windows 10 supports x86, x86-64, ARMv7 and ARM64; Windows 11 supports x86-64 @@ -37,47 +39,38 @@ of [GPU compatibility](gpus.md). When dealing with multiple CPU architectures, there may be some overal with the solutions that can be used to support GPUs in native binaries. -## Platform approaches for dealing with multiple architectures - -Three approaches have emerged for handling multiple CPU architectures. +Three approaches have emerged on operating systmes that have a need to manage +multiple CPU architectures: ### Multiple binaries The minimal solution is to distribute multiple binaries. This is the approach -that was used by Windows NT, and is currently supported by Linux. At time of -distribution, an installer or other downloadable artefact is provided for each -supported platform, and it is up to the user to select and download the correct -artefact. +that is by Windows and Linux. At time of distribution, an installer or other +downloadable artefact is provided for each supported platform, and it is up to +the user to select and download the correct artefact. ### Archiving The approach taken by Android is very similar to the multiple binary approach, with some affordances and tooling to simplify distribution. -When building an Android project, each target architecture is compiled -independently. If a native binary library is required to compile the Android -application, a version must be provided for each supported CPU architecture. A -directory layout convention exists for providing a binary for each platform, -with the same library name. This yields an independent final binary (APK) for -each CPU architecture. 
When running locally, a CPU-specific APK will be -uploaded to the simulator or test device. - -This approach can be supported with a conventional "single platform wheel" -approach. A library developer can package a wheel for each Android CPU -architecture they wish to support; the Android project will install a -CPU-architecture appropriate wheel when the compiler pass for that archictecture -is performed. The only complication is that process of installing wheels will -involve a dependency resolution pass on each supported platform; this could -potentially lead to a situation where a single application has different -versions of a Python library on different architectures. - -To simplify the process of distributing the application, at time of publication, -a single Android App Bundle (AAB) is generated from the multiple CPU-specific -APKs. This AAB contains binaries for all platforms that can be uploaded to an -app store. - -When an end-user requests the installation of an app, the app store strips out the -binary that is appropriate for the end-user's device. +By default Android projects use Java/Kotlin, which produces platform independent +code. However, it is possible to use non Java/Kotlin libraries by using JNI and +the Android NDK (Native Development Kit). If a project contains native code, a +separate compilation pass is performed for each architecture. + +If a native binary library is required to compile the Android application, a +version must be provided for each supported CPU architecture. A directory layout +convention exists for providing a binary for each platform, with the same +library name. + +The final binary artefact produced for Android distrobution uses this same +directory convention. A "binary" on Android is an APK (Android Application +Package) bundle; this is effectibely a ZIP file with known metadata and +structure; internally, there are subfolders for each supported CPU architecture. 
+This APK is bundled into AAB (Android Application Bundle) format for upload to +an app store; at time of installation, a CPU-specific APK is generated and +provided to the end-user for installation. ### Fat binaries @@ -105,6 +98,11 @@ ARM64 binary. Complications can occur when only *some* of the binary is being converted (e.g., if the binary being executed is fat, but a dynamic library isn't). +To support the transition to Apple Silicon/M1 (ARM64), Python has introduced a +`universal2` architecture target to support . This is effectively a "fat wheel" +format; the `.dylib` files contained in the wheel are fat binaries containing +both x86_64 and ARM64 slices. + iOS has an additional complication of requiring support for mutiple *ABIs* in addition to multiple CPU archiectures. The ABI for the iOS simulator and physical iOS devices are different; however, ARM64 is a supported CPU @@ -115,43 +113,147 @@ libraries that need to span multiple ABIs. When developing an application for iOS, a developer will need to install binaries for both the simulator and physical devices. -Python currently provides `universal2` wheels to support x86_64 and ARM64 in a -single wheel. This is effectively a "fat wheel" format; the `.dylib` files -contained in the wheel are fat binaries containing both x86_64 and ARM64 slices. +## Problems + +At present, the Python ecosystem almost exclusively uses the "multiple binary" +solution. This serves the needs of Windows and Linux well, as it matches the +way end-users interact with binaries. + +The `universal2` "fat wheel" solution also works well for macOS. The definition +of `universal2` is a hard-coded accomodation for one specific (albiet common) +multi-architecture configuration, and involves a number of specific +accomodations in the Python ecosystem (e.g., a macOS-specific architecture +lookup scheme). 
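The kind of special-casing involved can be sketched in a few lines. The table below is an illustration only; the real resolution logic used by installers lives in the `packaging.tags` module:

```python
# Which single-architecture platform tags a macOS "fat" tag satisfies.
MACOS_FAT_ARCHES = {
    "universal2": ["x86_64", "arm64"],
    "intel": ["i386", "x86_64"],
}

def covered_tags(platform_tag):
    """Expand a macOS fat platform tag into the single-architecture tags it covers."""
    prefix, _, arch = platform_tag.rpartition("_")
    return [f"{prefix}_{a}" for a in MACOS_FAT_ARCHES.get(arch, [arch])]

print(covered_tags("macosx_10_9_universal2"))
# -> ['macosx_10_9_x86_64', 'macosx_10_9_arm64']
```

Every new fat format requires another entry in a table like this one, which is exactly the hard-coded accommodation described above.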
+ +Supporting iOS requires supporting between 2 and 5 architectures (x86_64 and +ARM64 at the minimum), and at least 2 ABIs - the iOS simulator and iOS device +have different (and incompatible) binary ABIs. At runtime, iOS expects to find a +single "fat" binary for any given ABI. iOS effectively requires an analog of +`universal2` covering the 2 ABIs and multiple architectures. However: + +1. The Python ecosystem does not provide an extension mechanism that would allow + platforms to define and utilize multi-architecture build artefacts. + +2. The rate of change of CPU architectures in the iOS ecosystem is more rapid + than that seen on desktop platforms; any potential "universal iOS" target + would need to be updated or versioned regularly. A single named target would + also force developers into supporting older devices that they may not want to + support. + +Supporting Android also requires the support of between 2 and 4 architectures +(depending on the range of development and end-user configurations the app needs +to support). Android's archiving-based approach can be mapped onto the "multiple +binary" approach, as it is possible to build a single archive from multiple +individual binaries. However, some coordination is required when installing +multiple binaries. If an independent install pass (e.g., call to `pip`) is used +for each architecture, the dependency resolution process for each platform will +also be independent; if there are any discrepancies in the specific versions +available for each architecture (or any ordering instabilities in the dependency +resolution algorithm), it is possible to end up with different versions on each +platform. Some coordination between per-architecture passes is therefore +required. + +## History + +[The BeeWare Project](https://beeware.org) provides support for building both +iOS and Android binaries. 
On both platforms, BeeWare provides a custom package +index that contains pre-compiled binaries +([Android](https://chaquo.com/pypi-7.0/); +[iOS](https://anaconda.org/beeware/repo)). These binaries are produced using a +forge-like set of tooling +([Android](https://github.com/chaquo/chaquopy/tree/master/server/pypi); +[iOS](https://github.com/freakboy3742/chaquopy/tree/iOS-support/server/pypi)) +that patches the build systems for the most common Python binary dependencies; +and on iOS, manages the process of merging single-architecture, single ABI +wheels into a fat wheel. + +On iOS, BeeWare-supplied iOS binary packages provide a single "iPhone" wheel. +This wheel includes 2 binary libraries (one for the iPhone device ABI, and one +for the iPhone Simulator ABI); the iPhone simulator binary includes x86_64 and +ARM64 slices. This is effectively the "universal-iphone" approach, encoding a +specific combination of ABIs and architectures. + +BeeWare's support for Android uses [Chaquopy](https://chaquo.com/chaquopy) as a +base. Chaquopy's binary artefact repository stores a single binary wheel for +each platform; it also contains a wrapper around `pip` to manage the +installation of multiple binaries. When a Python project requests the +installation of a package: + +* Pip is run normally for one binary architecture +* The `.dist-info` metadata is used to identify the native packages - both + those directly requested by the user, and those installed as indirect + requirements by pip +* The native packages are separated from the pure-Python packages, and pip is + then run again for each of the remaining architectures; this time, only those + specific native packages are installed, pinned to the same versions that pip + selected for the first architecture. + +[Kivy](https://kivy.org) also provides support for iOS and Android as deployment +platforms. 
However, Kivy doesn't support the use of binary artefacts like wheels +on those platforms; Kivy's support for binary modules is based on the broader Kivy +platform including build support for libraries that may be required. + +## Relevant resources + +To date, there haven't been extensive public discussions about the support of +iOS or Android binary packages. However, there were discussions around the +adoption of universal2 for macOS: + +* [The CPython discussion about universal2 + support](https://discuss.python.org/t/apple-silicon-and-packaging/4516) +* [The addition of universal2 to + CPython](https://github.com/python/cpython/pull/22855) +* [Support in packaging for + universal2](https://github.com/pypa/packaging/pull/319), which declares the + logic around resolving universal2 to specific platforms. ## Potential solutions or mitigations -"Universal2" is a macOS-specific definition that encompasses the scope -of the specific "Apple Silicon" transition ("Universal" wheels also existed -historically for the PowerPC to Intel transition). Even inside the Apple -ecosystem, iOS, tvOS, and watchOS all have different combinations of supported -CPU architectures. - -A more general solution for naming multi-architecture binaries, similar to how a -wheel can declare compatibility with multiple CPython versions (e.g., -`cp34.cp35.cp36-abi3-manylinux1_x86_64`) may be called for. In such a scheme, -`cp310-abi3-macosx_10_9_universal2` would be equivalent to -`cp310-abi3-macosx_10_9_x86_64.arm64`. - -Alternatively, this could be solved as an install-time problem. In this -approach, package repositories would continue to store single-architecture, -single-ABI artefacts; however, at time of installation, the installation tool -would allow for the specification of multiple architectures/ABI combinations. 
-The installer would download a wheel for each architecture/ABI requested, and as -a post-processing step, merge the binaries for multiple architectures into a -single fat binary for each ABI. This would simplify the story from the package -archive's perspective, but would require significant modifications to installer -tooling (some of which would require callouts to platform-specfic build -tooling). - -Supporting Android's archiving approach requires no particular modifications to -the "single architecture" solutions in use today. However, there may be a -benefit to the developer experience if it is possible to ensure consistency -in the dependency resolution solutions that are found for each architecture. -The could come in the form of: -1. Allowing for the installation of multiple wheel architectures in a single - installation pass. -2. Sharing dependency resolution solutions between installation passes. -3. Tools to identify when two different install passes have generated different - dependency solutions. -4. A "multi-architecture" Android wheel. +There are two approaches that could be used to provide a general solution to +this problem, depending on whether the support of multiple architectures is +viewed as a distribution or integration problem. + +### Distribution-based solution + +The first approach is to treat the problem as a package distribution issue. In +this approach, artefacts stored in package repositories include all the ABIs and +CPU architectures needed to meaningfully support a given platform. This is the +approach embodied by the `universal2` packaging solution on macOS, and the iOS +solution used by BeeWare. + +This approach would require agreement on any new "known" multi-ABI/arch tags, as +well as any resolution schemes that may be needed for those tags. + +A more general approach to this problem would be to allow for multi-architecture +and multi-ABI binaries as part of the wheel naming scheme. 
A wheel can already
+declare compatibility with multiple CPython versions (e.g.,
+`cp34.cp35.cp36-abi3-manylinux1_x86_64`); it could be possible for a wheel to
+declare multiple ABI or architecture inclusions. In such a scheme,
+`cp310-abi3-macosx_10_9_universal2` would effectively be equivalent to
+`cp310-abi3-macosx_10_9_x86_64.macosx_10_9_arm64`; an iPhone wheel for the same
+package might be
+`cp310-abi3-iphoneos_12_0_arm64.iphonesimulator_12_0_x86_64.iphonesimulator_12_0_arm64`.
+
+This would allow for more generic logic based on matching name fragments, rather
+than specific "known name" targets.
+
+Regardless of whether "known tags" or a generic naming scheme is used, the
+distribution-based approach requires modifications to the process of building
+packages, and the process of installing packages.
+
+### Integration-based solution
+
+Alternatively, this could be treated as an install-time problem. This is the
+approach taken by BeeWare/Chaquopy on Android.
+
+In this approach, package repositories would continue to store
+single-architecture, single-ABI artefacts. However, at time of installation, the
+installation tool allows for the specification of multiple architectures/ABI
+combinations. The installer then downloads a wheel for each architecture/ABI
+requested, and performs any post-processing required to merge binaries for
+multiple architectures into a single fat binary, or to archive those binary
+artefacts in an appropriate location.
+
+This approach is less invasive from the perspective of package repositories and
+package build tooling, but would require significant modifications to installer
+tooling.
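Compressed tag sets like those above expand to the cross-product of their dot-separated components, per the PEP 425 tag rules. A stdlib-only sketch of that expansion; note that the multi-platform macOS and iPhone tags discussed in this section are hypothetical, so this is illustrative only:

```python
import itertools

def expand_tags(tag_triple):
    """Expand a compressed wheel tag set into individual
    (interpreter, abi, platform) tags, per the PEP 425 rules."""
    interp, abi, plat = tag_triple.split("-")
    # Each of the three components may itself be a dot-separated set;
    # the wheel is compatible with every combination in the cross-product.
    return [
        "-".join(combo)
        for combo in itertools.product(
            interp.split("."), abi.split("."), plat.split(".")
        )
    ]
```

For example, `expand_tags("cp34.cp35.cp36-abi3-manylinux1_x86_64")` yields three tags, one per interpreter version, while the hypothetical `cp310-abi3-iphoneos_12_0_arm64.iphonesimulator_12_0_x86_64.iphonesimulator_12_0_arm64` would yield three tags, one per ABI/architecture combination.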
From 373bb097813fe462f6001ee30716e8343edbe475 Mon Sep 17 00:00:00 2001 From: Russell Keith-Magee Date: Tue, 17 Jan 2023 07:37:51 +0800 Subject: [PATCH 09/30] Apply suggestions from code review Co-authored-by: h-vetinari Co-authored-by: Malcolm Smith --- docs/key-issues/cross_platform.md | 8 ++++---- docs/key-issues/multiple_architectures.md | 8 ++++---- 2 files changed, 8 insertions(+), 8 deletions(-) diff --git a/docs/key-issues/cross_platform.md b/docs/key-issues/cross_platform.md index 4c42c58..3cfe365 100644 --- a/docs/key-issues/cross_platform.md +++ b/docs/key-issues/cross_platform.md @@ -30,8 +30,8 @@ will run on the CI/CD system. ## Current state Native compiler and build toolchains (e.g., autoconf/automake, CMake) have long -supported cross-compilation; however, these cross-compilation capabilities are -easy to break unless they are exercised regularly. +supported cross-compilation; however, such cross-compilation capabilities for any +given project tend to bitrot and break easily unless they are exercised regularly. CPython's build system includes some support for cross-compilation. This support is largely based on leveraging autoconf's support for cross compilation. This @@ -44,8 +44,8 @@ largely converted into a concern for individual build systems to manage. ## Problems -There is currently a small gap in communicating target platform details to the -build system. While a build system like autoconf or Cmake may support +There is currently a gap in communicating target platform details to the +build system. 
While a build system like autoconf or CMake may support
cross-platform compilation, and a project may be able to cross-compile binary
artefacts, invocation of the PEP517 build interface currently assumes that the
platform running the build will be the platform that ultimately runs the Python
diff --git a/docs/key-issues/multiple_architectures.md b/docs/key-issues/multiple_architectures.md
index 889cd7d..00df49e 100644
--- a/docs/key-issues/multiple_architectures.md
+++ b/docs/key-issues/multiple_architectures.md
@@ -36,7 +36,7 @@ CPU architecture, that doesn't guarantee [ABI compatibility](abi.md).

In some respects, CPU architecture compatibility could be considered a superset
of [GPU compatibility](gpus.md). When dealing with multiple CPU architectures,
-there may be some overal with the solutions that can be used to support GPUs in
+there may be some overlap with the solutions that can be used to support GPUs in
native binaries.

Three approaches have emerged on operating systems that have a need to manage
@@ -55,7 +55,7 @@ The approach taken by Android is very similar to the multiple binary approach,
with some affordances and tooling to simplify distribution.

By default Android projects use Java/Kotlin, which produces platform independent
-code. However, it is possible to use non Java/Kotlin libraries by using JNI and
+code. However, it is possible to use non-Java/Kotlin libraries by using JNI and
the Android NDK (Native Development Kit). If a project contains native code, a
separate compilation pass is performed for each architecture.

@@ -66,7 +66,7 @@ library name.

The final binary artefact produced for Android distribution uses this same
directory convention. A "binary" on Android is an APK (Android Application
-Package) bundle; this is effectibely a ZIP file with known metadata and
+Package) bundle; this is effectively a ZIP file with known metadata and
structure; internally, there are subfolders for each supported CPU architecture.
This APK is bundled into AAB (Android Application Bundle) format for upload to
an app store; at time of installation, a CPU-specific APK is generated and
@@ -120,7 +120,7 @@ solution. This serves the needs of Windows and Linux well, as it matches the
way end-users interact with binaries.

The `universal2` "fat wheel" solution also works well for macOS. The definition
-of `universal2` is a hard-coded accomodation for one specific (albiet common)
+of `universal2` is a hard-coded accommodation for one specific (albeit common)
multi-architecture configuration, and involves a number of specific
accommodations in the Python ecosystem (e.g., a macOS-specific architecture
lookup scheme).
From d8a2ca62afb0b77610af2545bef4f65bf7c5c3eb Mon Sep 17 00:00:00 2001
From: Russell Keith-Magee
Date: Tue, 17 Jan 2023 07:39:29 +0800
Subject: [PATCH 10/30] More updates stemming from review.

---
docs/glossary.md | 4 +++
docs/key-issues/cross_platform.md | 27 ++++++++++++++------
docs/key-issues/multiple_architectures.md | 30 ++++++++++++++---------
3 files changed, 41 insertions(+), 20 deletions(-)

diff --git a/docs/glossary.md b/docs/glossary.md
index 538f06a..30ad889 100644
--- a/docs/glossary.md
+++ b/docs/glossary.md
@@ -6,6 +6,8 @@
|---|---|---|
| ABI | Application Binary Interface | See [here](./background/binary_interface.md) |
| API | Application Programming Interface | The sum total of available functions, classes, etc. of a given program |
+| AAB | Android Application Bundle | A distributable unit containing an Android application |
+| APK | Android Application Package | A "binary" unit for Android, installed on a device |
| ARM | Advanced RISC Machines | Family of RISC architectures, second-most widely used processor family after x86 |
| AVX | Advanced Vector eXtensions | Various extensions to the x86 instruction set (AVX, AVX2, AVX512), evolution after SSE |
| BLAS | Basic Linear Algebra Subprograms | Specification resp.
implementation for low-level linear algebra routines | @@ -29,6 +31,7 @@ | LAPACK | Linear Algebra PACKage | Standard software library for numerical linear algebra | | ISA | Instruction Set Architecture | Specification of an instruction set for a CPU; e.g. x86-64, arm64, ... | | JIT | Just-in-time Compilation | Compiling code just before execution; used in CUDA, PyTorch, PyPy, Numba etc. | +| JNI | Java Native Interface | The bridge API allowing access of Java runtime objects from native code (and vice versa) | | LLVM | - | Cross-platform compiler framework, home of Clang, MLIR, BOLT etc. | | LTO | Link-Time Optimization | See [here](./background/compilation_concepts.md#link-time-optimization-lto)| | LTS | Long-Term Support | Version of a given software/library/distribution designated for long-term support | @@ -36,6 +39,7 @@ | MPI | Message Passing Interface | Standard for message-passing in parallel computing | | MLIR | Multi-Level IR | Higher-level IR within LLVM; used i.a. in machine learning frameworks | | MSVC | Microsoft Visual C++ | Main compiler on Windows | +| NDK | Native Development Kit | The Android toolchain supporting compilation of binary modules | | NEP | Numpy Enhancement Proposal | See [here](https://numpy.org/neps/) | | OpenMP | Open Multi Processing | Multi-platform API for enabling multi-processing in C/C++/Fortran | | OS | Operating System | E.g. Linux, MacOS, Windows | diff --git a/docs/key-issues/cross_platform.md b/docs/key-issues/cross_platform.md index 3cfe365..0e58245 100644 --- a/docs/key-issues/cross_platform.md +++ b/docs/key-issues/cross_platform.md @@ -36,8 +36,9 @@ given project tend to bitrot and break easily unless they are exercised regularl CPython's build system includes some support for cross-compilation. This support is largely based on leveraging autoconf's support for cross compilation. 
This support wasn't well integrated into distutils and the compilation of the binary
-portions of stdlib; however, with the deprecation and removal of disutils in
-Python 3.12, this situation has improved.
+portions of stdlib. The removal of distutils in Python 3.12 represents an
+improvement in the overall situation, but there is still a long way to go before
+the ecosystem as a whole has fully integrated the consequences of this change.

The specification of PEP517 means cross-platform compilation support has been
largely converted into a concern for individual build systems to manage.
@@ -54,7 +55,9 @@ library can't be used as part of the build process.

`pip` provides limited support for installing binaries for a different platform
by specifying a `--platform`, `--implementation` and `--abi` flags; however,
-these flags only work for the selection of pre-built binary artefacts.
+these flags only work for the selection of pre-built binary artefacts, and are
+therefore constrained to the set of platform and ABI tags published by the
+author.

## History

@@ -68,9 +71,10 @@ somewhat fragile as they aren't first-class citizens of the Python ecosystem.
techniques. On both platforms, BeeWare provides a custom package index that
contains pre-compiled binaries ([Android](https://chaquo.com/pypi-7.0/);
[iOS](https://anaconda.org/beeware/repo)). These binaries are produced using a
-forge-like set of tooling
+set of tooling
([Android](https://github.com/chaquo/chaquopy/tree/master/server/pypi);
-[iOS](https://github.com/freakboy3742/chaquopy/tree/iOS-support/server/pypi)).
+[iOS](https://github.com/freakboy3742/chaquopy/tree/iOS-support/server/pypi))
+that is analogous to the tools used by conda-forge to build binary artefacts.

## Relevant resources

@@ -78,10 +82,10 @@ TODO

## Potential solutions or mitigations

-At it's core, what is required is a recognition that cross-platform builds as a
-use case that the Python ecosystem supports.
+At it's core, what is required is a recognition that the use case of
+cross-platform builds is something that the Python ecosystem should support.

-In concrete terms, for native modules, this would require either:
+In concrete terms, for native modules, this would require some combination of:

1. Extension of the PEP517 interface to allow communicating the desired target
   platform as part of a binary build; or

2. Formalization of the "platform identification" interface that can be used by
   PEP517 build backends to identify the target platform, so that tools like
   `crossenv` can provide a reliable proxied environment for cross-platform
   builds.
+
+3. Clear separation of metadata associated with the definition of build and
+   target platforms, rather than assuming that build and target platform will
+   always be the same.
+
+4. Extension of the PEP517 interface to report when a build step (e.g., running
+   a code generation tool) cannot be run on the target hardware.
diff --git a/docs/key-issues/multiple_architectures.md b/docs/key-issues/multiple_architectures.md
index 00df49e..e678f85 100644
--- a/docs/key-issues/multiple_architectures.md
+++ b/docs/key-issues/multiple_architectures.md
@@ -1,8 +1,13 @@
# Platforms with multiple CPU architectures

-In addition to any ABI requirements, a binary is compiled for a CPU
-architecture. That CPU architecture defines the CPU instructions that can be
-issued by the binary.
+One important subset of ABI concerns is the CPU architecture for which a binary
+artefact has been built. Attempting to run a binary on hardware that doesn't
+match the CPU architecture (or architecture variant [^1]) for which the binary
+was built will generally lead to crashes, even if the ABI being used is
+otherwise compatible.
+
+[^1]: e.g., the x86-64 architecture has a range of well-known extensions, such as
+ SSE, SSE2, SSE3, AVX, AVX2, AVX512, etc.
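Because loading a binary built for the wrong architecture generally ends in a crash, tooling can sanity-check an artefact before trying to use it. A stdlib-only sketch for ELF shared libraries; the two `e_machine` codes shown are from the ELF specification, and real tools map many more:

```python
import struct

# e_machine codes from the ELF specification (EM_X86_64, EM_AARCH64)
ELF_MACHINES = {0x3E: "x86_64", 0xB7: "aarch64"}

def elf_architecture(data):
    """Return the CPU architecture an ELF binary was compiled for."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # EI_DATA (byte 5) selects the encoding: 1 = little-endian, 2 = big-endian
    endian = "<" if data[5] == 1 else ">"
    # e_machine is a uint16 at byte offset 18 of the ELF header
    (machine,) = struct.unpack_from(endian + "H", data, 18)
    return ELF_MACHINES.get(machine, hex(machine))
```

A loader wrapper could compare this result against `platform.machine()` before handing the file to `ctypes.CDLL`, turning a hard crash into a readable error.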
## Current state

@@ -99,9 +104,9 @@ converted (e.g., if the binary being executed is fat, but a dynamic library
isn't).

To support the transition to Apple Silicon/M1 (ARM64), Python has introduced a
-`universal2` architecture target to support . This is effectively a "fat wheel"
-format; the `.dylib` files contained in the wheel are fat binaries containing
-both x86_64 and ARM64 slices.
+`universal2` architecture target. This is effectively a "fat wheel" format; the
+`.dylib` files contained in the wheel are fat binaries containing both x86_64
+and ARM64 slices.

iOS has an additional complication of requiring support for multiple *ABIs* in
addition to multiple CPU architectures. The ABI for the iOS simulator and
@@ -128,8 +133,8 @@ lookup scheme).

Supporting iOS requires supporting between 2 and 5 architectures (x86_64 and
ARM64 at the minimum), and at least 2 ABIs - the iOS simulator and iOS device
have different (and incompatible) binary ABIs. At runtime, iOS expects to find a
-single "fat" binary for any given ABI. iOS effectively requires an analog of
-`universal2` covering the 2 ABIs and multiple architectures. However:
+single "fat" binary for the ABI that is in use. iOS effectively requires an
+analog of `universal2` covering the 2 ABIs and multiple architectures. However:
@@ -160,12 +165,13 @@ iOS and Android binaries. On both platforms, BeeWare provides a custom package
index that contains pre-compiled binaries
([Android](https://chaquo.com/pypi-7.0/);
[iOS](https://anaconda.org/beeware/repo)).
These binaries are produced using a -forge-like set of tooling +set of tooling ([Android](https://github.com/chaquo/chaquopy/tree/master/server/pypi); [iOS](https://github.com/freakboy3742/chaquopy/tree/iOS-support/server/pypi)) -that patches the build systems for the most common Python binary dependencies; -and on iOS, manages the process of merging single-architecture, single ABI -wheels into a fat wheel. +that is analogous to the tools used by conda-forge to build binary artefacts. +These tools patch the source and build configurations for the most common Python +binary dependencies; on iOS, these tools also manage the process of merging +single-architecture, single ABI wheels into a fat wheel. On iOS, BeeWare-supplied iOS binary packages provide a single "iPhone" wheel. This wheel includes 2 binary libraries (one for the iPhone device ABI, and one From f533395aa391a9b88cef378a1b5d80cffbb2219b Mon Sep 17 00:00:00 2001 From: Russell Keith-Magee Date: Tue, 17 Jan 2023 11:24:53 +0800 Subject: [PATCH 11/30] Expand note about Linux support. Co-authored-by: h-vetinari --- docs/key-issues/multiple_architectures.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/key-issues/multiple_architectures.md b/docs/key-issues/multiple_architectures.md index e678f85..0e4440e 100644 --- a/docs/key-issues/multiple_architectures.md +++ b/docs/key-issues/multiple_architectures.md @@ -23,9 +23,9 @@ are supported: * In the early days of Windows NT, both x86 and DEC Alpha CPUs were supported * Windows 10 supports x86, x86-64, ARMv7 and ARM64; Windows 11 supports x86-64 and ARM64. -* Although Linux started as an x86 project, the Linux kernel is now available a - wide range of other CPU architectures, including ARM64, RISC-V, PowerPC, s390 - and more. 
+* Due to its open source nature, Linux tends to support all CPU architectures for + which someone is interested enough to author & provide support in the kernel, + see [here](https://en.wikipedia.org/wiki/List_of_Linux-supported_computer_architectures). * Apple transitioned Mac hardware from PowerPC to Intel (x86-64) CPUs, providing a forwards compatibility path for binaries * Apple is currently transitioning Mac hardware from Intel (x86-64) to From 8475360fa6241229eb152c1d09f92c6e78790a90 Mon Sep 17 00:00:00 2001 From: Russell Keith-Magee Date: Tue, 17 Jan 2023 11:45:34 +0800 Subject: [PATCH 12/30] Correct an it's typo. --- docs/key-issues/cross_platform.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/key-issues/cross_platform.md b/docs/key-issues/cross_platform.md index 0e58245..338e87e 100644 --- a/docs/key-issues/cross_platform.md +++ b/docs/key-issues/cross_platform.md @@ -82,7 +82,7 @@ TODO ## Potential solutions or mitigations -At it's core, what is required is a recognition that the use case of +At the core, what is required is a recognition that the use case of cross-platform builds is something that the Python ecosystem should support. 
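One concrete shape such support could take is a build backend that honours a target platform passed through the `config_settings` mechanism the `pyproject.toml` build interface already provides. In the sketch below the `target-platform` key is invented for illustration; no current standard assigns it any meaning:

```python
import sysconfig

def resolve_platform_tag(config_settings=None):
    """Pick the platform tag for a wheel build.

    Hypothetical behaviour: honour a `target-platform` config setting if the
    build frontend passed one, otherwise fall back to the platform that is
    running the build (the only option the current interface assumes).
    """
    config_settings = config_settings or {}
    target = config_settings.get("target-platform")
    if target is None:
        target = sysconfig.get_platform()
    # Normalize to wheel-tag form, e.g. "iphoneos-12.0-arm64" -> "iphoneos_12_0_arm64"
    return target.replace("-", "_").replace(".", "_")
```

A backend's `build_wheel(wheel_directory, config_settings=...)` hook could use this to name and populate the wheel, which is roughly what a standardized cross-compilation interface would need to specify precisely.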
In concrete terms, for native modules, this would require some combination of: From 2886f2c343781491f57b34cb7c89949f9435fad3 Mon Sep 17 00:00:00 2001 From: Ralf Gommers Date: Mon, 27 Feb 2023 18:18:19 +0000 Subject: [PATCH 13/30] Add content to page on cross compilation --- docs/key-issues/cross_platform.md | 134 ++++++++++++++++++++---------- 1 file changed, 89 insertions(+), 45 deletions(-) diff --git a/docs/key-issues/cross_platform.md b/docs/key-issues/cross_platform.md index 338e87e..4d3a002 100644 --- a/docs/key-issues/cross_platform.md +++ b/docs/key-issues/cross_platform.md @@ -1,4 +1,4 @@ -# Cross-platform installation +# Cross compilation The historical assumption of compilation is that the platform where the code is compiled will be the same as the platform where the final code will be executed @@ -9,49 +9,99 @@ most desktop platforms; however, for some platforms, this isn't the case. On mobile platforms, an app is compiled on a desktop platform, and transferred to the mobile device (or a simulator) for testing. The compiler is not executed on device. Therefore, it must be possible to build a binary artefact for a CPU -architecture and a ABI that is different the platform that is running the -compiler. +architecture and an ABI that is different from the platform that is running the +compiler. The situation is similar for embedded devices. Cross compilation issues also emerge when dealing with continuous integration/deployment (CI/CD). CI/CD platforms (such as Github Actions) generally provide the "common" architectures - often only x86-64 - however, a project may want to produce binaries for other platforms (e.g., ARM support for -Raspberry Pi devices; PowerPC or s390 for mainframe/server devices; or for +Raspberry Pi devices; PowerPC or s390x for mainframe/server devices; or for mobile platforms). These binaries won't run natively on the host CI/CD system -(without some sort of emulation); but code can be compiled for the target -platform. 
+(without some sort of emulation, for example with QEMU); but code can be
+compiled for the target platform.

macOS also experiences this as a result of the Apple Silicon transition. Apple
-has provided the tools to compile [fat binaries](multiple_architectures.md) on
-x86_64 hardware; however, in this case, the host platform (macOS on x86_64) will
-still be one of the outputs of the compilation process, and the resulting binary
-will run on the CI/CD system.
+has provided the tools to make cross compilation from x86-64 to arm64 as easy
+as possible, as well as to compile [fat binaries](multiple_architectures.md)
+(supporting x86-64 and arm64 at the same time) on x86-64 hardware. In the latter
+case, the host platform (macOS on x86-64) will still be one of the outputs
+of the compilation process, and the resulting binary will run on the CI/CD
+system.

## Current state

-Native compiler and build toolchains (e.g., autoconf/automake, CMake) have long
+Native compiler and build toolchains (e.g., autoconf/automake, CMake, Meson) have long
supported cross-compilation; however, such cross-compilation capabilities for any
given project tend to bitrot and break easily unless they are exercised regularly.

CPython's build system includes some support for cross-compilation. This support
is largely based on leveraging autoconf's support for cross compilation. This
-support wasn't well integrated into distutils and the compilation of the binary
-portions of stdlib. The removal of distutils in Python 3.12 represents an
+support wasn't well integrated into `distutils` and the compilation of the binary
+portions of stdlib. The removal of `distutils` in Python 3.12 represents an
improvement in the overall situation, but there is still a long way to go before
the ecosystem as a whole has fully integrated the consequences of this change.

-The specification of PEP517 means cross-platform compilation support has been
-largely converted into a concern for individual build systems to manage.
+The way build backend hooks in `pyproject.toml` are specified (see PEP 517) +means cross-platform compilation support has been partially converted into a +concern for individual build systems to manage. + +In order to cross-compile a Python package, one needs a compiler toolchain as +well as two Python installs - one for the build system and one for the host +system.[^1] This can make it a little challenging to get started. If a compiler +toolchain is not already provided on the system of interest, it can be built +from source with, e.g., [crosstool-ng](https://crosstool-ng.github.io/) or +obtained from, e.g., [dockcross](https://github.com/dockcross/dockcross). +Or one can use a packaging system that has builtin support for cross-compilation. +[The Yocto Project](https://www.yoctoproject.org/), +[OpenEmbedded](https://www.openembedded.org/wiki/Main_Page) and +[Buildroot](https://buildroot.org/) are projects specifically focused on +cross-compilation for Linux embedded systems. More general-purpose packaging +ecosystems often have toolchains and supporting infrastructure to cross-compile packages for their own needs - see, e.g., info for +[Void Linux](https://github.com/void-linux/void-packages#cross-compiling), +[conda-forge](https://conda-forge.org/), +[Debian](https://wiki.debian.org/CrossCompiling) and +[Nix](https://nixos.org/guides/cross-compilation.html). + +[^1]: + The "build", "host" and "target" terminology for identifying which system + is which in a cross-compilation setup is not consistent across build + systems and packaging tools. Always carefully check whether "build" means + the machine on which the compilation is run and "host" the machine on which + the produced binaries will run - or vice versa. + +Tools like [crossenv](https://github.com/benfogle/crossenv) can be used to trick +Python into performing cross-platform builds. 
These tools use path hacks and
+overrides of known sources of platform-specific details (like `distutils`) to
+provide a cross-compilation environment. However, these solutions tend to be
+somewhat fragile as they aren't first-class citizens of the Python ecosystem.
+
+[The BeeWare Project](https://beeware.org) also uses a version of these
+techniques. For both the platforms it supports, BeeWare provides a custom
+package index that contains pre-compiled binaries ([Android](https://chaquo.com/pypi-7.0/);
+[iOS](https://anaconda.org/beeware/repo)). These binaries are produced using a
+set of tooling ([Android](https://github.com/chaquo/chaquopy/tree/master/server/pypi);
+[iOS](https://github.com/freakboy3742/chaquopy/tree/iOS-support/server/pypi))
+that is analogous to the tools used by conda-forge to build binary artefacts.
+

## Problems

There is currently a gap in _communicating target platform details to the
build system_. While a build system like autoconf or CMake may support
cross-platform compilation, and a project may be able to cross-compile binary
artefacts, invocation of a `pyproject.toml` build hook typically assumes that the
platform running the build will be the platform that ultimately runs the Python
code. As a result, `sys.platform`, or the various attributes of the `platform`
and `sysconfig` modules can't be used as part of the build process.
+
+_Running Python code_ for the host (cross) platform is not possible (modulo
+using an emulator), but Python packages have not been designed with this
+constraint in mind. For example, `numpy` and `pybind11` ship headers and have
+`get_include()` functions in their main namespaces to obtain the path to those
+headers.
That is clearly a problem, which packages dependending on those +headers have to work around (often done by patching those packages with +hardcoded paths within a cross-compilation setup). `pip` provides limited support for installing binaries for a different platform by specifying a `--platform`, `--implementation` and `--abi` flags; however, @@ -59,45 +109,39 @@ these flags only work for the selection of pre-built binary artefacts, and are therefore constrained to the set of platform and ABI tags published by the author. + ## History -Tools like [crossenv](https://github.com/benfogle/crossenv) can be used to trick -Python into performing cross-platform builds. These tools use path hacks and -overrides of known sources of platform-specific details (like `distutils`) to -provide a cross-compilation environment. However, these solutions tend to be -somewhat fragile as they aren't first-class citizens of the Python ecosystem. +TODO -[The BeeWare Project](https://beeware.org) also uses a version of these -techniques. On both platforms, BeeWare provides a custom package index that -contains pre-compiled binaries ([Android](https://chaquo.com/pypi-7.0/); -[iOS](https://anaconda.org/beeware/repo)). These binaries are produced using a -set of tooling -([Android](https://github.com/chaquo/chaquopy/tree/master/server/pypi); -[iOS](https://github.com/freakboy3742/chaquopy/tree/iOS-support/server/pypi)) -that is analogous to the tools used by conda-forge to build binary artefacts. 
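The build-machine introspection at the heart of the problem described above can be seen directly. A minimal illustration (the values in the comments are examples only; they vary by machine):

```python
import platform
import sys
import sysconfig

# All of these report properties of the interpreter that is currently
# running, i.e. the build machine. A build backend that consults them
# while cross-compiling will describe the wrong platform.
print(sys.platform)              # e.g. 'linux' or 'darwin'
print(sysconfig.get_platform())  # e.g. 'linux-x86_64'
print(platform.machine())        # e.g. 'x86_64' or 'arm64'

# The extension module suffix bakes the interpreter version, ABI and
# platform into filenames, so it needs to describe the *target* platform
# when cross-compiling native modules.
print(sysconfig.get_config_var("EXT_SUFFIX"))  # e.g. '.cpython-311-x86_64-linux-gnu.so'
```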
## Relevant resources -TODO +- ["Towards standardizing cross compiling "](https://discuss.python.org/t/towards-standardizing-cross-compiling/10357), Ben Fogle (2021), +- ["PEP xxxx - Standardized Config Settings for Cross-Compiling"](https://github.com/benfogle/peps/blob/master/pep-9999.rst), Ben Fogle (2021), +- [scipy#14812 - Tracking issue for cross-compilation needs and issues](https://github.com/scipy/scipy/issues/14812) (2021), + ## Potential solutions or mitigations At the core, what is required is a recognition that the use case of cross-platform builds is something that the Python ecosystem should support. -In concrete terms, for native modules, this would require some combination of: - -1. Extension of the PEP517 interface to allow communicating the desired target - platform as part of a binary build; or - -2. Formalization of the "platform identification" interface that can used by - PEP517 build backends to identify the target platform, so that tools like - `crossenv` can provide a reliable proxied environment for cross-platform - builds. +In concrete terms, for native modules, this would require at least: -3. Clear separation of metadata associated with the definition of build and +1. Making it possible to retrieve relevant metadata from a Python installation + without having to run Python code. +2. Clear separation of metadata associated with the definition of build and target platforms, rather than assuming that build and target platform will always be the same. -4. Extension of the PEP517 interface to report when a build steps (e.g., running - a code generation tool) cannot be run on the target hardware. +In addition, to make cross-compilation easier to use and move from build system +specific configuration files - like a "toolchain file" for CMake or a "cross +file" for Meson - to a standardized version: + +3. Extension of the `pyproject.toml` build interface to allow communicating the + desired target platform as part of a binary build; or +4. 
Formalization of the "platform identification" interface that can be used by
+   build backends to identify the target platform, so that tools like
+   `crossenv` can provide a reliable proxied environment for cross-platform
+   builds.

From 755685055099f9143be6ae45ad5a56bceb88c814 Mon Sep 17 00:00:00 2001
From: Ralf Gommers
Date: Fri, 10 Mar 2023 20:00:59 +0000
Subject: [PATCH 14/30] Resolve the last cross-compilation comment, on `pip --platform`

---
 docs/key-issues/cross_platform.md | 54 +++++++++++++++++++------------
 1 file changed, 33 insertions(+), 21 deletions(-)

diff --git a/docs/key-issues/cross_platform.md b/docs/key-issues/cross_platform.md
index 4d3a002..e80fb45 100644
--- a/docs/key-issues/cross_platform.md
+++ b/docs/key-issues/cross_platform.md
@@ -24,10 +24,10 @@ compiled for the target platform.

 macOS also experiences this as a result of the Apple Silicon transition. Apple
 has provided the tools to make cross compilation from x86-64 to arm64 as easy
 as possible, as well as to compile [fat binaries](multiple_architectures.md)
-(supporting x86-64 and arm64 at the same time) on x86-64 hardware. In the latter
-case, the host platform (macOS on x86-64) will still be one of the outputs
-of the compilation process, and the resulting binary will run on the CI/CD
-system.
+(supporting x86-64 and arm64 at the same time) on both architectures. In the
+latter case, the host platform will still be one of the outputs of the
+compilation process, and the resulting binary will run on the CI/CD system.
+

 ## Current state

@@ -72,9 +72,10 @@
 Tools like [crossenv](https://github.com/benfogle/crossenv) can be used to trick
 Python into performing cross-platform builds. These tools use path hacks and
-overrides of known sources of platform-specific details (like `distutils`) to
-provide a cross-compilation environment.
However, these solutions tend to be -somewhat fragile as they aren't first-class citizens of the Python ecosystem. +overrides of known sources of platform-specific details (like `sysconfig` and +`distutils`) to provide a cross-compilation environment. However, these +solutions tend to be somewhat fragile as they aren't first-class citizens of +the Python ecosystem. [The BeeWare Project](https://beeware.org) also uses a version of these techniques. For both the platforms it supports, BeeWare provides a custom @@ -88,7 +89,7 @@ that is analogous to the tools used by conda-forge to build binary artefacts. ## Problems There is currently a gap in _communicating target platform details to the -build system_. While a build system like autoconf or CMake may support +build system_. While a build system like Meson or CMake may support cross-platform compilation, and a project may be able to cross-compile binary artefacts, invocation of a `pyproject.toml` build hook typically assumes that the platform running the build will be the platform that ultimately runs the Python @@ -96,19 +97,30 @@ code. As a result, `sys.platform`, or the various attributes of the `platform` and `sysconfig` modules can't be used as part of the build process. _Running Python code_ for the host (cross) platform is not possible (modulo -using an emulator), but Python packages have not designed for this. to be -avoided. For example, `numpy` and `pybind11` ship headers and have -`get_include()` functions in their main namespaces to obtain the path to those -headers. That is clearly a problem, which packages dependending on those -headers have to work around (often done by patching those packages with -hardcoded paths within a cross-compilation setup). 
-
-`pip` provides limited support for installing binaries for a different platform
-by specifying a `--platform`, `--implementation` and `--abi` flags; however,
-these flags only work for the selection of pre-built binary artefacts, and are
-therefore constrained to the set of platform and ABI tags published by the
-author.
-
+using an emulator), but Python packages have not taken this into account and
+provided ways to avoid the need to run the host interpreter. For example,
+`numpy` and `pybind11` ship headers and have `get_include()` functions in their
+main namespaces to obtain the path to those headers. That is clearly a problem,
+which packages depending on those headers have to work around (often done by
+patching those packages with hardcoded paths within a cross-compilation setup).
+
+`pip` provides support for installing wheels for a different platform
+by specifying the `--platform`, `--implementation` and `--abi` flags. However,
+these flags only work for packages with wheels, not sdists. Therefore, for
+cross compilation setups that rely on `pip` rather than another package manager
+to install build dependencies, it is cumbersome in practice to prepare the host
+(non-native) part of the cross build environment - a single missing `-none-any`
+wheel for a dependency that is pure Python necessitates hacks to get it
+installed.[^2]
+
+[^2]:
+    The correct solution - filing issues on each project asking them to upload
+    a `-none-any` wheel next to their sdist - typically has a long lead time.
+    Therefore [Briefcase](https://beeware.org/project/projects/tools/briefcase/), the
+    packaging tool for BeeWare, patches `pip` to allow installing projects from
+    sdists when `--platform` is specified and only error out when the wheel
+    build attempts to invoke a compiler. That way, pure Python packages can be
+    installed directly.
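The `-none-any` naming mentioned above comes from the wheel filename convention, which ends in a python-abi-platform tag triple. A small sketch of how those tags can be read (`wheel_tags` is a hypothetical helper written for this illustration, not part of `pip` or any packaging tool):

```python
def wheel_tags(filename: str) -> tuple[str, str, str]:
    """Return the (python tag, abi tag, platform tag) triple of a wheel.

    Wheel filenames follow name-version(-build)?-python-abi-platform.whl,
    so the last three hyphen-separated fields are the compatibility tags.
    """
    stem = filename.removesuffix(".whl")
    python_tag, abi_tag, platform_tag = stem.split("-")[-3:]
    return python_tag, abi_tag, platform_tag

# A pure-Python wheel carries the 'py3-none-any' triple, so it can be
# installed for any target platform that pip is asked to prepare.
print(wheel_tags("attrs-23.1.0-py3-none-any.whl"))
# -> ('py3', 'none', 'any')

# A binary wheel is pinned to one interpreter, ABI and platform; pip's
# --platform flag can only select such a wheel, never build one from an sdist.
print(wheel_tags("numpy-1.26.0-cp312-cp312-macosx_11_0_arm64.whl"))
# -> ('cp312', 'cp312', 'macosx_11_0_arm64')
```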
## History From 49806e206428e998cb8dc470dc15b5e26294e8cc Mon Sep 17 00:00:00 2001 From: Ralf Gommers Date: Fri, 10 Mar 2023 20:20:12 +0000 Subject: [PATCH 15/30] Put back link to "multiple architectures" page from cross compile page --- docs/key-issues/cross_compilation.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/key-issues/cross_compilation.md b/docs/key-issues/cross_compilation.md index 27b16b6..88ed65e 100644 --- a/docs/key-issues/cross_compilation.md +++ b/docs/key-issues/cross_compilation.md @@ -23,7 +23,7 @@ compiled for the target platform. macOS also experiences this as a result of the Apple Silicon transition. Apple has provided the tools to make cross compilation from x86-64 to arm64 as easy -as possible, as well as to compile fat binaries +as possible, as well as to compile [fat binaries](multiple_architectures.md) (supporting x86-64 and arm64 at the same time) on both architectures. In the latter case, the host platform will still be one of the outputs of the compilation process, and the resulting binary will run on the CI/CD system. From ea1fb609afe2547bc3b4cf98caf6f999acde8312 Mon Sep 17 00:00:00 2001 From: Ralf Gommers Date: Fri, 10 Mar 2023 20:22:18 +0000 Subject: [PATCH 16/30] Remove the `cross_platform.md` file --- docs/key-issues/cross_platform.md | 159 ------------------------------ 1 file changed, 159 deletions(-) delete mode 100644 docs/key-issues/cross_platform.md diff --git a/docs/key-issues/cross_platform.md b/docs/key-issues/cross_platform.md deleted file mode 100644 index e80fb45..0000000 --- a/docs/key-issues/cross_platform.md +++ /dev/null @@ -1,159 +0,0 @@ -# Cross compilation - -The historical assumption of compilation is that the platform where the code is -compiled will be the same as the platform where the final code will be executed -(if not literally the same machine, then at least one that is CPU and ABI -compatible at the operating system level). 
This is a reasonable assumption for -most desktop platforms; however, for some platforms, this isn't the case. - -On mobile platforms, an app is compiled on a desktop platform, and transferred -to the mobile device (or a simulator) for testing. The compiler is not executed -on device. Therefore, it must be possible to build a binary artefact for a CPU -architecture and an ABI that is different from the platform that is running the -compiler. The situation is similar for embedded devices. - -Cross compilation issues also emerge when dealing with continuous -integration/deployment (CI/CD). CI/CD platforms (such as Github Actions) -generally provide the "common" architectures - often only x86-64 - however, a -project may want to produce binaries for other platforms (e.g., ARM support for -Raspberry Pi devices; PowerPC or s390x for mainframe/server devices; or for -mobile platforms). These binaries won't run natively on the host CI/CD system -(without some sort of emulation, for example with QEMU); but code can be -compiled for the target platform. - -macOS also experiences this as a result of the Apple Silicon transition. Apple -has provided the tools to make cross compilation from x86-64 to arm64 as easy -as possible, as well as to compile [fat binaries](multiple_architectures.md) -(supporting x86-64 and arm64 at the same time) on both architectures. In the -latter case, the host platform will still be one of the outputs of the -compilation process, and the resulting binary will run on the CI/CD system. - - -## Current state - -Native compiler and build toolchains (e.g., autoconf/automake, CMake, Meson) have long -supported cross-compilation; however, such cross-compilation capabilities for any -given project tend to bitrot and break easily unless they are exercised regularly. - -CPython's build system includes some support for cross-compilation. This support -is largely based on leveraging autoconf's support for cross compilation. 
This -support wasn't well integrated into `distutils` and the compilation of the binary -portions of stdlib. The removal of `distutils` in Python 3.12 represents an -improvement the overall situation, but there is still a long way to go before -the ecosystem as a whole has fully integrated the consequences of this change. - -The way build backend hooks in `pyproject.toml` are specified (see PEP 517) -means cross-platform compilation support has been partially converted into a -concern for individual build systems to manage. - -In order to cross-compile a Python package, one needs a compiler toolchain as -well as two Python installs - one for the build system and one for the host -system.[^1] This can make it a little challenging to get started. If a compiler -toolchain is not already provided on the system of interest, it can be built -from source with, e.g., [crosstool-ng](https://crosstool-ng.github.io/) or -obtained from, e.g., [dockcross](https://github.com/dockcross/dockcross). -Or one can use a packaging system that has builtin support for cross-compilation. -[The Yocto Project](https://www.yoctoproject.org/), -[OpenEmbedded](https://www.openembedded.org/wiki/Main_Page) and -[Buildroot](https://buildroot.org/) are projects specifically focused on -cross-compilation for Linux embedded systems. More general-purpose packaging -ecosystems often have toolchains and supporting infrastructure to cross-compile packages for their own needs - see, e.g., info for -[Void Linux](https://github.com/void-linux/void-packages#cross-compiling), -[conda-forge](https://conda-forge.org/), -[Debian](https://wiki.debian.org/CrossCompiling) and -[Nix](https://nixos.org/guides/cross-compilation.html). - -[^1]: - The "build", "host" and "target" terminology for identifying which system - is which in a cross-compilation setup is not consistent across build - systems and packaging tools. 
Always carefully check whether "build" means - the machine on which the compilation is run and "host" the machine on which - the produced binaries will run - or vice versa. - -Tools like [crossenv](https://github.com/benfogle/crossenv) can be used to trick -Python into performing cross-platform builds. These tools use path hacks and -overrides of known sources of platform-specific details (like `sysconfig` and -`distutils`) to provide a cross-compilation environment. However, these -solutions tend to be somewhat fragile as they aren't first-class citizens of -the Python ecosystem. - -[The BeeWare Project](https://beeware.org) also uses a version of these -techniques. For both the platforms it supports, BeeWare provides a custom -package index that contains pre-compiled binaries ([Android](https://chaquo.com/pypi-7.0/); -[iOS](https://anaconda.org/beeware/repo)). These binaries are produced using a -set of tooling ([Android](https://github.com/chaquo/chaquopy/tree/master/server/pypi); -[iOS](https://github.com/freakboy3742/chaquopy/tree/iOS-support/server/pypi)) -that is analogous to the tools used by conda-forge to build binary artefacts. - - -## Problems - -There is currently a gap in _communicating target platform details to the -build system_. While a build system like Meson or CMake may support -cross-platform compilation, and a project may be able to cross-compile binary -artefacts, invocation of a `pyproject.toml` build hook typically assumes that the -platform running the build will be the platform that ultimately runs the Python -code. As a result, `sys.platform`, or the various attributes of the `platform` -and `sysconfig` modules can't be used as part of the build process. - -_Running Python code_ for the host (cross) platform is not possible (modulo -using an emulator), but Python packages have not taken this into account and -provided ways to avoid the need to run the host interpreter. 
For example, -`numpy` and `pybind11` ship headers and have `get_include()` functions in their -main namespaces to obtain the path to those headers. That is clearly a problem, -which packages dependending on those headers have to work around (often done by -patching those packages with hardcoded paths within a cross-compilation setup). - -`pip` provides support for installing wheels for a different platform -by specifying a `--platform`, `--implementation` and `--abi` flags. However, -these flags only work for packages with wheels, not sdists. Therefore, for -cross compilation setups that rely on `pip` rather than another package manager -to install build dependencies, it is cumbersome in practice to prepare the host -(non-native) part of the cross build environment - a single missing `-none-any` -wheel for a dependency that is pure Python necessitates hacks to get it -installed.[^2] - -[^2]: - The correct solution - filing issues on each project asking them to upload - a `-none-any` wheel next to their sdist - typically has a long lead time. - Therefore [Briefcase](https://beeware.org/project/projects/tools/briefcase/), the - packaging tool for Beeware, patches `pip` to allow installing projects from - sdists when `--platform` is specified and only error out when the wheel - build attempts to invoke a compiler. That way, pure Python packages can be - installed directly. 
- -## History - -TODO - - -## Relevant resources - -- ["Towards standardizing cross compiling "](https://discuss.python.org/t/towards-standardizing-cross-compiling/10357), Ben Fogle (2021), -- ["PEP xxxx - Standardized Config Settings for Cross-Compiling"](https://github.com/benfogle/peps/blob/master/pep-9999.rst), Ben Fogle (2021), -- [scipy#14812 - Tracking issue for cross-compilation needs and issues](https://github.com/scipy/scipy/issues/14812) (2021), - - -## Potential solutions or mitigations - -At the core, what is required is a recognition that the use case of -cross-platform builds is something that the Python ecosystem should support. - -In concrete terms, for native modules, this would require at least: - -1. Making it possible to retrieve relevant metadata from a Python installation - without having to run Python code. -2. Clear separation of metadata associated with the definition of build and - target platforms, rather than assuming that build and target platform will - always be the same. - -In addition, to make cross-compilation easier to use and move from build system -specific configuration files - like a "toolchain file" for CMake or a "cross -file" for Meson - to a standardized version: - -3. Extension of the `pyproject.toml` build interface to allow communicating the - desired target platform as part of a binary build; or -4. Formalization of the "platform identification" interface that can used by - build backends to identify the target platform, so that tools like - `crossenv` can provide a reliable proxied environment for cross-platform - builds. 
From d249af6004ea90555a72015660a0f8ae821b0dc2 Mon Sep 17 00:00:00 2001 From: Ralf Gommers Date: Fri, 10 Mar 2023 20:33:56 +0000 Subject: [PATCH 17/30] Fix some formatting and typo issues --- docs/key-issues/multiple_architectures.md | 31 ++++++++++++----------- 1 file changed, 16 insertions(+), 15 deletions(-) diff --git a/docs/key-issues/multiple_architectures.md b/docs/key-issues/multiple_architectures.md index 0e4440e..c5765b2 100644 --- a/docs/key-issues/multiple_architectures.md +++ b/docs/key-issues/multiple_architectures.md @@ -2,17 +2,18 @@ One important subset of ABI concerns is the CPU architecture for which a binary artefact has been built. Attempting to run a binary on hardware that doesn't -match the CPU architecture (or architecture variant [^1]) for which the binary +match the CPU architecture (or architecture variant[^1]) for which the binary was built will generally lead to crashes, even if the ABI being used is otherwise compatible. -[^1] e.g., the x86-64 architecture has a range of well-known extensions, such as - SSE, SSE2, SSE3, AVX, AVX2, AVX512, etc. +[^1]: + E.g., the x86-64 architecture has a range of well-known extensions, such as + SSE, SSE2, SSE3, AVX, AVX2, AVX512, etc. ## Current state Historically, it could be assumed that an executable or library would be -compiled for a single CPU archicture. On the rare occasion that an operating +compiled for a single CPU architecture. On the rare occasion that an operating system was available for mulitple CPU architectures, it became the responsibility of the user to find (or compile) a binary that was compiled for their host CPU architecture. @@ -105,11 +106,11 @@ isn't). To support the transition to Apple Silicon/M1 (ARM64), Python has introduced a `universal2` architecture target. 
This is effectively a "fat wheel" format; the
-`.dylib` files contained in the wheel are fat binaries containing both x86_64
+`.dylib` files contained in the wheel are fat binaries containing both x86-64
 and ARM64 slices.

 iOS has an additional complication of requiring support for multiple *ABIs* in
-addition to multiple CPU archiectures. The ABI for the iOS simulator and
+addition to multiple CPU architectures. The ABIs for the iOS simulator and
 physical iOS devices are different; however, ARM64 is a supported CPU
 architecture for both. As a result, it is not possible to produce a single fat
 library that supports both the iOS simulator and iOS devices. Apple provides an
@@ -130,7 +131,7 @@
 multi-architecture configuration, and involves a number of specific
 accomodations in the Python ecosystem (e.g., a macOS-specific architecture
 lookup scheme).

-Supporting iOS requires supporting between 2 and 5 architectures (x86_64 and
+Supporting iOS requires supporting between 2 and 5 architectures (x86-64 and
 ARM64 at the minimum), and at least 2 ABIs - the iOS simulator and iOS device
 have different (and incompatible) binary ABIs. At runtime, iOS expects to find
 a single "fat" binary for the ABI that is in use. iOS effectively requires an
@@ -175,7 +176,7 @@ single-architecture, single ABI wheels into a fat wheel.

 On iOS, BeeWare-supplied iOS binary packages provide a single "iPhone" wheel.
 This wheel includes 2 binary libraries (one for the iPhone device ABI, and one
-for the iPhone Simulator ABI); the iPhone simulator binary includes x86_64 and
+for the iPhone Simulator ABI); the iPhone simulator binary includes x86-64 and
 ARM64 slices. This is effectively the "universal-iphone" approach, encoding a
 specific combination of ABIs and architectures.

@@ -185,10 +186,10 @@ each platform; it also contains a wrapper around `pip` to manage the
 installation of multiple binaries.
When a Python project requests the installation of a package: -* Pip is run normally for one binary architecture +* Pip is run normally for one binary architecture, * The `.dist-info` metadata is used to identify the native packages - both those directly requested by the user, and those installed as indirect - requirements by pip + requirements by pip, * The native packages are separated from the pure-Python packages, and pip is then run again for each of the remaining architectures; this time, only those specific native packages are installed, pinned to the same versions that pip @@ -203,15 +204,15 @@ platform including build support for libraries that may be required. To date, there haven't been extensive public discussions about the support of iOS or Android binary packages. However, there were discussions around the -adoption of universal2 for macOS: +adoption of `universal2` for macOS: -* [The CPython discussion about universal2 +* [The CPython discussion about `universal2` support](https://discuss.python.org/t/apple-silicon-and-packaging/4516) -* [The addition of universal2 to +* [The addition of `universal2` to CPython](https://github.com/python/cpython/pull/22855) * [Support in packaging for - universal2](https://github.com/pypa/packaging/pull/319), which declares the - logic around resolving universal2 to specific platforms. + `universal2`](https://github.com/pypa/packaging/pull/319), which declares the + logic around resolving `universal2` to specific platforms. ## Potential solutions or mitigations From 50d8c26e5a333f893a649eb72de03033e5d002b3 Mon Sep 17 00:00:00 2001 From: Russell Keith-Magee Date: Mon, 20 Mar 2023 12:40:27 +0800 Subject: [PATCH 18/30] Revisions to multi-architecture notes following review. 
---
 docs/key-issues/multiple_architectures.md | 42 +++++++++++------------
 1 file changed, 20 insertions(+), 22 deletions(-)

diff --git a/docs/key-issues/multiple_architectures.md b/docs/key-issues/multiple_architectures.md
index c5765b2..2019d9f 100644
--- a/docs/key-issues/multiple_architectures.md
+++ b/docs/key-issues/multiple_architectures.md
@@ -12,14 +12,7 @@ otherwise compatible.

 ## Current state

-Historically, it could be assumed that an executable or library would be
-compiled for a single CPU architecture. On the rare occasion that an operating
-system was available for mulitple CPU architectures, it became the
-responsibility of the user to find (or compile) a binary that was compiled for
-their host CPU architecture.
-
-However, we now see operating system platforms where multiple CPU architectures
-are supported:
+Most operating systems support multiple CPU architectures:

 * In the early days of Windows NT, both x86 and DEC Alpha CPUs were supported
 * Windows 10 supports x86, x86-64, ARMv7 and ARM64; Windows 11 supports x86-64
@@ -36,15 +29,13 @@ are supported:
 * Android currently supports ARMv7, ARM64, x86, and x86-64; it has historically
   also supported ARMv5 and MIPS

+The general expectation is that an executable or library is compiled for a
+single CPU architecture.
+
 CPU architecture compatibility is a necessary, but not sufficient criterion for
 determining binary compatibility. Even if two binaries are compiled for the
 same CPU architecture, that doesn't guarantee [ABI compatibility](abi.md).

-In some respects, CPU architecture compatibility could be considered a superset
-of [GPU compatibility](gpus.md). When dealing with multiple CPU architectures,
-there may be some overlap with the solutions that can be used to support GPUs in
-native binaries.
-
 Three approaches have emerged on operating systems that have a need to manage
 multiple CPU architectures:
@@ -55,6 +46,10 @@ that is by Windows and Linux.
At time of distribution, an installer or other
downloadable artefact is provided for each supported platform, and it is up to
the user to select and download the correct artefact.

+At present, the Python ecosystem almost exclusively uses the "multiple binary"
+solution. This serves the needs of Windows and Linux well, as it matches the
+way end-users interact with binaries on those platforms.
+
 ### Archiving

 The approach taken by Android is very similar to the multiple binary approach,
@@ -91,7 +86,10 @@ Fat binaries can be compiled in two ways:
    compiler to generate multiple output architectures in the output binary
 2. **Multiple pass** After compiling a binary for each platform, Apple provides
    a call named `lipo` to combine multiple single-architecture binaries into a
-   single fat binary that contains all platforms.
+   single fat binary that contains all platforms. The `delocate-fuse` command
+   provided by the [delocate](https://pypi.org/project/delocate/) Python package
+   can be used to perform this merging on Python wheels (along with other
+   functionality).

 At runtime, the operating system loads the binary slice for the current CPU
 architecture, and the linker loads the appropriate slice from the fat binary of
@@ -121,15 +119,15 @@ physical devices.

 ## Problems

-At present, the Python ecosystem almost exclusively uses the "multiple binary"
-solution. This serves the needs of Windows and Linux well, as it matches the
-way end-users interact with binaries.
+The problems that exist with supporting multiple architectures are limited to
+those platforms that expect distributable artefacts to support multiple
+platforms simultaneously - macOS, iOS and Android.

-The `universal2` "fat wheel" solution also works well for macOS.
The definition -of `universal2` is a hard-coded accomodation for one specific (albeit common) -multi-architecture configuration, and involves a number of specific -accomodations in the Python ecosystem (e.g., a macOS-specific architecture -lookup scheme). +Although the `universal2` "fat wheel" format exists, there is some resistance to +using this format in some circles (in particular in the science/data ecosystem). +If a package publishes independent wheels for x86_64 and M1, there's no +ecosystem-level tooling for consuming those artefacts. However, ad-hoc approaches +using `delocate` or `lipo` can be used. Supporting iOS requires supporting between 2 and 5 architectures (x86-64 and ARM64 at the minimum), and at least 2 ABIs - the iOS simulator and iOS device From a9776e0cc0713a5ec16a50924aa8fb265512d50c Mon Sep 17 00:00:00 2001 From: Ralf Gommers Date: Tue, 21 Mar 2023 12:24:03 +0000 Subject: [PATCH 19/30] Add foldout for pros and cons of `universal2` wheels --- docs/key-issues/multiple_architectures.md | 62 +++++++++++++++++++++++ 1 file changed, 62 insertions(+) diff --git a/docs/key-issues/multiple_architectures.md b/docs/key-issues/multiple_architectures.md index 2019d9f..0df4827 100644 --- a/docs/key-issues/multiple_architectures.md +++ b/docs/key-issues/multiple_architectures.md @@ -107,6 +107,68 @@ To support the transition to Apple Silicon/M1 (ARM64), Python has introduced a `.dylib` files contained in the wheel are fat binaries containing both x86-64 and ARM64 slices. + +??? question "What's the deal with `universal2` wheels?" + + The `universal2` fat wheel format has generated quite a bit of discussion, + and isn't well-supported by either packaging tools (e.g., there is no way + to install a `universal2` wheel from PyPI if thin wheels are also present) + or package authors (most numerical, scientific and ML/AI package authors do + not provide them). There are some arguments for and against supporting the + format or even defaulting to it. 
+ + Arguments for (Russell to write): + + - xxx + + Arguments against: + + - `universal2` wheels are never necessary for end users, they are only an + intermediate stage for workflows and tooling to build macOS apps (`.dmg` + downloadable installers or similar formats, produced by for example + [py2app](https://py2app.readthedocs.io) or + [briefcase](https://beeware.org/project/projects/tools/briefcase/)). + - The tradeoff between download size and disk space usage vs. the upside + for say a .dmg installer is bad - for a typical PyData stack it takes + hundreds of MBs per Python environment more than thin wheels, and users + are likely to have quite a few environments on their system at once. + Meaning that defaulting to `universal2` would use several GBs of disk + space more. + + - Disk space on the base MacBook models is 128 GB, and up to half of that + can be taken up by the OS and system data itself. So a few GBs is + significant. + - Internet plans in many countries are not unlimited; almost doubling the + download size of wheels is a serious cost, and not desirable for any + user - but especially unfriendly to users in countries where network + infrastructure is less developed. + + - In addition, it takes extra space on PyPI (examples: `universal2` wheels + cost an extra 81.5 MB for NumPy 1.21.4 and 175.5 MB for SciPy 1.9.1), and + projects with large wheels often run into total size limits on PyPI. + - It imposes an extra maintenance burden for each project, because separate + CI jobs are needed to build and test `universal2` wheels. Typically + projects make tradeoffs there, because they cannot support every + platform. And `universal2` doesn't meet the bar for usage frequency / + user demand here - it's well below the demand for `musllinux`, + `ppc64le`, PyPy, and other such platforms with still patchy support + (see [Expectations that projects provide ever more wheels](../../meta-topics/user_expectations_wheels.md) + for more on that). 
+ - When a project provides thin wheels (which is a must-do for projects with + native code, because those are the better experience due to smaller + size), you cannot even install a `universal2` wheel with pip from PyPI at + all. Why upload artifacts you cannot install? + - It is straightforward to fuse two thin wheels with `delocate-fuse` (a + tool that comes with [delocate](https://pypi.org/project/delocate/)), + it's a one-liner: `delocate-fuse $x86-64_wheel $arm64_wheel -w .` + - Open source projects rely on freely available CI systems to support + particular hardware architectures. CI support for macOS `arm64` was a + problem at first, but is now available through Cirrus CI. And that + availability is expected to grow over time; GitHub Actions and other + providers will roll out support at some point. This allows building thin + wheels and run tests - which is nicer than building `universal2` wheels + on x86-64 and testing only the x86-64 part of those wheels. + iOS has an additional complication of requiring support for mutiple *ABIs* in addition to multiple CPU architectures. The ABI for the iOS simulator and physical iOS devices are different; however, ARM64 is a supported CPU From 8d46e06b1c459c219e013f2c70e0131b5a5e2c48 Mon Sep 17 00:00:00 2001 From: Russell Keith-Magee Date: Wed, 22 Mar 2023 07:57:41 +0800 Subject: [PATCH 20/30] Add the 'for' arguments for universal2. --- docs/key-issues/multiple_architectures.md | 38 +++++++++++++++++------ 1 file changed, 28 insertions(+), 10 deletions(-) diff --git a/docs/key-issues/multiple_architectures.md b/docs/key-issues/multiple_architectures.md index 0df4827..78092b5 100644 --- a/docs/key-issues/multiple_architectures.md +++ b/docs/key-issues/multiple_architectures.md @@ -117,17 +117,35 @@ and ARM64 slices. not provide them). There are some arguments for and against supporting the format or even defaulting to it. 
- Arguments for (Russell to write): - - - xxx + Arguments for: + + - While users with a technical background are usually aware of the CPU + architecture of their machine, less technical end users are often + unaware of this detail (or of the significance of this detail). The + universal2 wheel format allows users to largely ignore this detail, as + all CPU architectures for the macOS platform are accomodated in a single + binary artefact. + - macOS has developed an ecosystem where end users expect that any macOS + binary will run on any macOS machine, regardless of CPU architecture. + As a result, when building macOS apps (`.dmg` downloadable installers + or similar formats, produced by tools such as + [py2app](https://py2app.readthedocs.io) or + [briefcase](https://beeware.org/project/projects/tools/briefcase/)), + the person building the project must be accommodate all possible CPU + architectures where the code *could* be executed. + - If binary wheels are only available in "thin" format, any issues with + merging those wheels into fat equivalents for distribution purposes are + deferred to the end user. This can be problematic as it may require + expert knowledge about the package being merged (such as optional + modules or header files that may not be present in both thin artefacts). + Universal2 artefacts captures this knowledge by requiring the project + maintainers to resolve any merging issues. Arguments against: - - `universal2` wheels are never necessary for end users, they are only an - intermediate stage for workflows and tooling to build macOS apps (`.dmg` - downloadable installers or similar formats, produced by for example - [py2app](https://py2app.readthedocs.io) or - [briefcase](https://beeware.org/project/projects/tools/briefcase/)). + - `universal2` wheels are never necessary for end users installing into + a locally installed Python environment exclusively for their own use, + which is the default experience most users have with Python. 
- The tradeoff between download size and disk space usage vs. the upside for say a .dmg installer is bad - for a typical PyData stack it takes hundreds of MBs per Python environment more than thin wheels, and users @@ -135,8 +153,8 @@ and ARM64 slices. Meaning that defaulting to `universal2` would use several GBs of disk space more. - - Disk space on the base MacBook models is 128 GB, and up to half of that - can be taken up by the OS and system data itself. So a few GBs is + - Disk space on older MacBook Air models is 128 GB, and up to half of that + can be taken up by the OS and system data itself. So a few GBs can be significant. - Internet plans in many countries are not unlimited; almost doubling the download size of wheels is a serious cost, and not desirable for any From 5d06a56535ff7a95ef90c4a3983bd393e9dbe7af Mon Sep 17 00:00:00 2001 From: Russell Keith-Magee Date: Wed, 22 Mar 2023 08:19:50 +0800 Subject: [PATCH 21/30] Clarified 'end user' language; added note about merge problems. --- docs/key-issues/multiple_architectures.md | 30 ++++++++++++++--------- 1 file changed, 18 insertions(+), 12 deletions(-) diff --git a/docs/key-issues/multiple_architectures.md b/docs/key-issues/multiple_architectures.md index 78092b5..ec5dacf 100644 --- a/docs/key-issues/multiple_architectures.md +++ b/docs/key-issues/multiple_architectures.md @@ -120,11 +120,11 @@ and ARM64 slices. Arguments for: - While users with a technical background are usually aware of the CPU - architecture of their machine, less technical end users are often - unaware of this detail (or of the significance of this detail). The - universal2 wheel format allows users to largely ignore this detail, as - all CPU architectures for the macOS platform are accomodated in a single - binary artefact. + architecture of their machine, less technical users are often unaware of + this detail (or of the significance of this detail). 
The universal2 wheel
+      format allows users to largely ignore this detail, as all CPU
+      architectures for the macOS platform are accommodated in a single binary
+      artefact.
     - macOS has developed an ecosystem where end users expect that any macOS
       binary will run on any macOS machine, regardless of CPU architecture.
       As a result, when building macOS apps (`.dmg` downloadable installers
       or similar formats, produced by tools such as
       [py2app](https://py2app.readthedocs.io) or
       [briefcase](https://beeware.org/project/projects/tools/briefcase/)),
       the person building the project must be accommodate all possible CPU
       architectures where the code *could* be executed.
     - If binary wheels are only available in "thin" format, any issues with
       merging those wheels into fat equivalents for distribution purposes are
-      deferred to the end user. This can be problematic as it may require
-      expert knowledge about the package being merged (such as optional
-      modules or header files that may not be present in both thin artefacts).
-      Universal2 artefacts captures this knowledge by requiring the project
-      maintainers to resolve any merging issues.
+      deferred to the person downloading the wheels (i.e., the app builder).
+      This can be problematic as it may require expert knowledge about the
+      package being merged (such as optional modules or header files that may
+      not be present in both thin artefacts). Universal2 artefacts capture
+      this knowledge by requiring the maintainers of the wheel-producing
+      project to resolve any merging issues.

     Arguments against:

-    - `universal2` wheels are never necessary for end users installing into
-      a locally installed Python environment exclusively for their own use,
+    - `universal2` wheels are never necessary for users installing into a
+      locally installed Python environment exclusively for their own use,
       which is the default experience most users have with Python.
     - The tradeoff between download size and disk space usage vs. the upside
       for say a .dmg installer is bad - for a typical PyData stack it takes
@@ -179,6 +180,11 @@ and ARM64 slices.
- It is straightforward to fuse two thin wheels with `delocate-fuse` (a
      tool that comes with [delocate](https://pypi.org/project/delocate/)),
      it's a one-liner: `delocate-fuse $x86-64_wheel $arm64_wheel -w .`
+      However, it's worth noting that this requires that any headers or python
+      files included in the wheel are consistent across all thin wheels; if this
+      isn't the case, the merged wheel will be incomplete or fragile (see
+      [this issue](https://github.com/matthew-brett/delocate/issues/180) for
+      details).
     - Open source projects rely on freely available CI systems to support
       particular hardware architectures. CI support for macOS `arm64` was a
       problem at first, but is now available through Cirrus CI. And that

From 3e1fc05a27a81b5546e6d7fbb05596de822f4b7e Mon Sep 17 00:00:00 2001
From: Russell Keith-Magee
Date: Wed, 22 Mar 2023 08:32:41 +0800
Subject: [PATCH 22/30] Clarify the state of arm64 on github actions.

---
 docs/key-issues/multiple_architectures.md | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/docs/key-issues/multiple_architectures.md b/docs/key-issues/multiple_architectures.md
index ec5dacf..b2b6d93 100644
--- a/docs/key-issues/multiple_architectures.md
+++ b/docs/key-issues/multiple_architectures.md
@@ -189,9 +189,11 @@ and ARM64 slices.
      particular hardware architectures. CI support for macOS `arm64` was a
      problem at first, but is now available through Cirrus CI. And that
      availability is expected to grow over time; GitHub Actions and other
-     providers will roll out support at some point. This allows building thin
-     wheels and run tests - which is nicer than building `universal2` wheels
-     on x86-64 and testing only the x86-64 part of those wheels.
+     providers [will roll out support at some
+     point](https://github.com/github/roadmap/issues/528). This allows
+     building thin wheels and running tests - which is nicer than building
+     `universal2` wheels on x86-64 and testing only the x86-64 part of those
+     wheels.
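To make the fusing caveat discussed in the bullets above concrete, here is a minimal, hypothetical sketch that models each unpacked thin wheel as a plain directory tree. This is not how `delocate-fuse` is actually implemented: a real tool must also merge the differing Mach-O binaries with `lipo -create`, handle ARM64-only files, and re-tag and repack the wheel archive. The function name `fuse_trees` and the suffix-based heuristic are illustrative assumptions only.

```python
from pathlib import Path
import filecmp
import shutil

MACHO_SUFFIXES = {".so", ".dylib"}

def fuse_trees(x86_dir, arm_dir, out_dir):
    """Merge two unpacked thin wheels into out_dir.

    Returns (needs_lipo, unresolved): Mach-O files a real tool would fuse
    with `lipo -create`, and non-binary files whose contents differ between
    the two architectures and so need project-specific resolution.
    """
    x86_dir, arm_dir, out_dir = Path(x86_dir), Path(arm_dir), Path(out_dir)
    needs_lipo, unresolved = [], []
    for src in sorted(p for p in x86_dir.rglob("*") if p.is_file()):
        rel = src.relative_to(x86_dir)
        other = arm_dir / rel
        dest = out_dir / rel
        dest.parent.mkdir(parents=True, exist_ok=True)
        if not other.is_file() or filecmp.cmp(src, other, shallow=False):
            # Identical in both wheels (or present in only one tree in this
            # simplified model): copy across unchanged.
            shutil.copy2(src, dest)
        elif src.suffix in MACHO_SUFFIXES:
            # Thin binaries differ by construction; `lipo` would fuse them.
            needs_lipo.append(str(rel))
        else:
            # e.g. a generated header with architecture-dependent content:
            # no generic tool can decide how to merge this correctly.
            unresolved.append(str(rel))
    return needs_lipo, unresolved
```

In this toy model, everything in `needs_lipo` is the case `delocate-fuse` handles automatically, while `unresolved` roughly corresponds to the architecture-dependent generated files that the delocate issue linked above is about.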
iOS has an additional complication of requiring support for mutiple *ABIs* in addition to multiple CPU architectures. The ABI for the iOS simulator and From 74705d8d4d22b317e2d4a370f2b534578ea5cb94 Mon Sep 17 00:00:00 2001 From: Russell Keith-Magee Date: Wed, 22 Mar 2023 08:36:21 +0800 Subject: [PATCH 23/30] Add reference to pip issue about universal2 wheel installation. --- docs/key-issues/multiple_architectures.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/docs/key-issues/multiple_architectures.md b/docs/key-issues/multiple_architectures.md index b2b6d93..4d9d9c0 100644 --- a/docs/key-issues/multiple_architectures.md +++ b/docs/key-issues/multiple_architectures.md @@ -176,7 +176,8 @@ and ARM64 slices. - When a project provides thin wheels (which is a must-do for projects with native code, because those are the better experience due to smaller size), you cannot even install a `universal2` wheel with pip from PyPI at - all. Why upload artifacts you cannot install? + all. Why upload artifacts you cannot install? This is due to a [known bug + in pip](https://github.com/pypa/pip/issues/11573). - It is straightforward to fuse two thin wheels with `delocate-fuse` (a tool that comes with [delocate](https://pypi.org/project/delocate/)), it's a one-liner: `delocate-fuse $x86-64_wheel $arm64_wheel -w .` From f46d2b00e42c60c0ed2e19928250c36f2f937ca3 Mon Sep 17 00:00:00 2001 From: Russell Keith-Magee Date: Wed, 22 Mar 2023 08:48:52 +0800 Subject: [PATCH 24/30] Fixed typo. --- docs/key-issues/multiple_architectures.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/key-issues/multiple_architectures.md b/docs/key-issues/multiple_architectures.md index 4d9d9c0..6603f2e 100644 --- a/docs/key-issues/multiple_architectures.md +++ b/docs/key-issues/multiple_architectures.md @@ -131,7 +131,7 @@ and ARM64 slices. 
or similar formats, produced by tools such as
       [py2app](https://py2app.readthedocs.io) or
       [briefcase](https://beeware.org/project/projects/tools/briefcase/)),
-      the person building the project must be accommodate all possible CPU
+      the person building the project must accommodate all possible CPU
       architectures where the code *could* be executed.
     - If binary wheels are only available in "thin" format, any issues with
       merging those wheels into fat equivalents for distribution purposes are

From e1c278ff3f57fed5a2c5253e95bb48ebbbafe812 Mon Sep 17 00:00:00 2001
From: Russell Keith-Magee
Date: Wed, 22 Mar 2023 09:19:08 +0800
Subject: [PATCH 25/30] Removed subjective language.

---
 docs/key-issues/multiple_architectures.md | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/docs/key-issues/multiple_architectures.md b/docs/key-issues/multiple_architectures.md
index 6603f2e..6168c18 100644
--- a/docs/key-issues/multiple_architectures.md
+++ b/docs/key-issues/multiple_architectures.md
@@ -147,12 +147,11 @@ and ARM64 slices.
     - `universal2` wheels are never necessary for users installing into a
       locally installed Python environment exclusively for their own use,
       which is the default experience most users have with Python.
-    - The tradeoff between download size and disk space usage vs. the upside
-      for say a .dmg installer is bad - for a typical PyData stack it takes
-      hundreds of MBs per Python environment more than thin wheels, and users
-      are likely to have quite a few environments on their system at once.
-      Meaning that defaulting to `universal2` would use several GBs of disk
-      space more.
+    - Using `universal2` wheels requires larger downloads and more disk
+      space - for a typical PyData stack it takes hundreds of MBs per Python
+      environment more than thin wheels, and users are likely to have quite a
+      few environments on their system at once. This means that defaulting to
+      `universal2` would use several GBs more disk space.
- Disk space on older MacBook Air models is 128 GB, and up to half of that can be taken up by the OS and system data itself. So a few GBs can be From 1a926eb15a2431202ed55bf35d9ed619a17ad44c Mon Sep 17 00:00:00 2001 From: Ralf Gommers Date: Wed, 22 Mar 2023 10:55:12 +0100 Subject: [PATCH 26/30] Apply textual/typo suggestions Co-authored-by: Eli Schwartz --- docs/key-issues/multiple_architectures.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/key-issues/multiple_architectures.md b/docs/key-issues/multiple_architectures.md index 6168c18..ec9e532 100644 --- a/docs/key-issues/multiple_architectures.md +++ b/docs/key-issues/multiple_architectures.md @@ -36,13 +36,13 @@ CPU architecture compatibility is a necessary, but not sufficient criterion for determining binary compatibility. Even if two binaries are compiled for the same CPU architecture, that doesn't guarantee [ABI compatibility](abi.md). -Three approaches have emerged on operating systmes that have a need to manage +Three approaches have emerged on operating systems that have a need to manage multiple CPU architectures: ### Multiple binaries The minimal solution is to distribute multiple binaries. This is the approach -that is by Windows and Linux. At time of distribution, an installer or other +that is taken by Windows and Linux. At time of distribution, an installer or other downloadable artefact is provided for each supported platform, and it is up to the user to select and download the correct artefact. @@ -96,7 +96,7 @@ architecture, and the linker loads the appropriate slice from the fat binary of any dynamic libraries. On macOS ARM hardware, Apple also provides Rosetta as a support mechanism; if a -user tries to run an binary that doesn't contain an ARM64 slice, but *does* +user tries to run a binary that doesn't contain an ARM64 slice, but *does* contain an x86-64 slice, the x86-64 slice will be converted at runtime into an ARM64 binary. 
Complications can occur when only *some* of the binary is being converted (e.g., if the binary being executed is fat, but a dynamic library @@ -195,7 +195,7 @@ and ARM64 slices. `universal2` wheels on x86-64 and testing only the x86-64 part of those wheels. -iOS has an additional complication of requiring support for mutiple *ABIs* in +iOS has an additional complication of requiring support for multiple *ABIs* in addition to multiple CPU architectures. The ABI for the iOS simulator and physical iOS devices are different; however, ARM64 is a supported CPU architecture for both. As a result, it is not possible to produce a single fat From 79673831cdfe19625775b771c2ad426e9480af96 Mon Sep 17 00:00:00 2001 From: Ralf Gommers Date: Wed, 22 Mar 2023 10:06:09 +0000 Subject: [PATCH 27/30] Rephrase universal2 usage frequency/demand phrasing --- docs/key-issues/multiple_architectures.md | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/docs/key-issues/multiple_architectures.md b/docs/key-issues/multiple_architectures.md index ec9e532..95b3a5c 100644 --- a/docs/key-issues/multiple_architectures.md +++ b/docs/key-issues/multiple_architectures.md @@ -168,9 +168,11 @@ and ARM64 slices. CI jobs are needed to build and test `universal2` wheels. Typically projects make tradeoffs there, because they cannot support every platform. 
And `universal2` doesn't meet the bar for usage frequency / - user demand here - it's well below the demand for `musllinux`, - `ppc64le`, PyPy, and other such platforms with still patchy support - (see [Expectations that projects provide ever more wheels](../../meta-topics/user_expectations_wheels.md) + user demand here - it is only asked for by macOS universal app authors, + and in practice that demand seems to be well below the demand for + wheels for other platforms with still-patchy support like `musllinux`, + `ppc64le`, and PyPy (see + [Expectations that projects provide ever more wheels](../../meta-topics/user_expectations_wheels.md) for more on that). - When a project provides thin wheels (which is a must-do for projects with native code, because those are the better experience due to smaller From 1fb0ffbee7f54c4eb29ad32aa91be996ab3944d3 Mon Sep 17 00:00:00 2001 From: Ralf Gommers Date: Wed, 22 Mar 2023 10:12:49 +0000 Subject: [PATCH 28/30] Tone down the statement on "must provide thin wheels" --- docs/key-issues/multiple_architectures.md | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/docs/key-issues/multiple_architectures.md b/docs/key-issues/multiple_architectures.md index 95b3a5c..a0ce5ba 100644 --- a/docs/key-issues/multiple_architectures.md +++ b/docs/key-issues/multiple_architectures.md @@ -174,11 +174,12 @@ and ARM64 slices. `ppc64le`, and PyPy (see [Expectations that projects provide ever more wheels](../../meta-topics/user_expectations_wheels.md) for more on that). - - When a project provides thin wheels (which is a must-do for projects with - native code, because those are the better experience due to smaller - size), you cannot even install a `universal2` wheel with pip from PyPI at - all. Why upload artifacts you cannot install? This is due to a [known bug - in pip](https://github.com/pypa/pip/issues/11573). 
- When a project provides thin wheels (which should be done when projects
+      have large compiled extensions, because of the better experience for
+      the main use case of wheels - users installing for use on their own
+      machine), you cannot even install a `universal2` wheel with pip from PyPI
+      at all. Why upload artifacts you cannot install? This is due to a [known
+      bug in pip](https://github.com/pypa/pip/issues/11573).
     - It is straightforward to fuse two thin wheels with `delocate-fuse` (a
       tool that comes with [delocate](https://pypi.org/project/delocate/)),
       it's a one-liner: `delocate-fuse $x86-64_wheel $arm64_wheel -w .`

From b44a3220bc9b641398444b6f06ed7514bcc34e8c Mon Sep 17 00:00:00 2001
From: Ralf Gommers
Date: Wed, 22 Mar 2023 10:24:25 +0000
Subject: [PATCH 29/30] Rephrase note on needed robustness improvements in
 delocate-fuse

---
 docs/key-issues/multiple_architectures.md | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/docs/key-issues/multiple_architectures.md b/docs/key-issues/multiple_architectures.md
index a0ce5ba..6452580 100644
--- a/docs/key-issues/multiple_architectures.md
+++ b/docs/key-issues/multiple_architectures.md
@@ -183,11 +183,13 @@ and ARM64 slices.
     - It is straightforward to fuse two thin wheels with `delocate-fuse` (a
       tool that comes with [delocate](https://pypi.org/project/delocate/)),
       it's a one-liner: `delocate-fuse $x86-64_wheel $arm64_wheel -w .`
-      However, it's worth noting that this requires that any headers or python
-      files included in the wheel are consistent across all thin wheels; if this
-      isn't the case, the merged wheel will be incomplete or fragile (see
-      [this issue](https://github.com/matthew-brett/delocate/issues/180) for
-      details).
+      Note though that robustness improvements in `delocate-fuse` for more
+      complex cases (e.g., generated header files with architecture-dependent
+      content) are needed (see
+      [delocate#180](https://github.com/matthew-brett/delocate/issues/180)).
+      Such cases are likely to be equally problematic for direct `universal2`
+      wheel builds (see, e.g.,
+      [numpy#22805](https://github.com/numpy/numpy/pull/22805)).
     - Open source projects rely on freely available CI systems to support
       particular hardware architectures. CI support for macOS `arm64` was a
       problem at first, but is now available through Cirrus CI. And that

From dd93f1fe744534a3e133b4300831e9f4e19b6161 Mon Sep 17 00:00:00 2001
From: Ralf Gommers
Date: Wed, 22 Mar 2023 10:37:41 +0000
Subject: [PATCH 30/30] Add "first-class support for fusing thin wheels" as a
 potential solution

---
 docs/key-issues/multiple_architectures.md | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/docs/key-issues/multiple_architectures.md b/docs/key-issues/multiple_architectures.md
index 6452580..c0814c5 100644
--- a/docs/key-issues/multiple_architectures.md
+++ b/docs/key-issues/multiple_architectures.md
@@ -307,9 +307,17 @@ adoption of `universal2` for macOS:

 ## Potential solutions or mitigations

-There are two approaches that could be used to provide a general solution to
-this problem, depending on whether the support of multiple architectures is
-viewed as a distribution or integration problem.
+For macOS universal app builders, first-class tooling in the Python ecosystem
+to fuse thin wheels is needed. This may be done by, for example, making
+`delocate-fuse` more robust (see
+[delocate#180](https://github.com/matthew-brett/delocate/issues/180))
+and then making `delocate` itself more visible, or merging `delocate` into
+`auditwheel`. This tooling would then be available to support
+universal app builders like `py2app` and `briefcase`.
+
+For the general multiple architecture case, there are two approaches that could
+be used to provide a solution to this problem, depending on whether the support
+of multiple architectures is viewed as a distribution or integration problem.

 ### Distribution-based solution