Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Supporting builds with Python's Limited API #574

Closed
JustAnotherArchivist opened this issue Dec 19, 2022 · 8 comments
Closed

Supporting builds with Python's Limited API #574

JustAnotherArchivist opened this issue Dec 19, 2022 · 8 comments
Labels
wontfix This will not be worked on

Comments

@JustAnotherArchivist
Copy link
Collaborator

Following a bit of discussion in #572, I'm filing a separate issue for a decision on whether or not this is something we want to worry about in the future.

The Limited API is a subset of the Python C API which is guaranteed not to change in future minor versions. This allows building a single 'abi3' wheel that will work on any Python version 3.x, past, present, or future, higher than some minimum version (depending on which APIs are used).

Our current code makes use of an #ifndef Py_LIMITED_API switch in the PyUnicodechar* conversion to take advantage of an optimisation for certain strings, which is not part of the Limited API. This originates from #417. However, ujson currently does not compile with the Limited API; this line in the decoder, introduced in #555, uses a function not part of the Limited API as of 3.11. I haven't tested further or with earlier Python versions. #573 uses a function that has only been added to the Limited API in 3.11, so it wouldn't build on 3.7 through 3.10.

The main advantage of using the Limited API (that I'm aware of) is the aforementioned single compilation for all 3.x versions. As a side effect, an abi3 wheel would also allow users to still easily install an old ujson version on a newer Python in the future without having to compile it themselves.

The disadvantage is that #573 would be impossible and #555 would have to be reworked (which seems tricky, if it's even possible at all). More such cases may be present or arise in the future, and we'd be limiting ourselves in what we could add or significantly increasing the maintenance burden of constantly implementing workarounds for such builds. As with the example above, it will also have performance impacts.


My own opinion is that the advantages are negligible and the downsides are very significant. Building separate wheels for each supported Python version is trivial with the fully automated release process, as @bwoodsend mentioned in #572, and adding support for a new Python version (assuming no backward incompatibilities) merely requires adding the version number in a couple places. If someone wants to run version combinations of ujson and Python that haven't been tested and aren't officially supported, they should have to jump through hoops before they shoot themselves in the foot. In short, while the Limited API is neat, it's too limited in some cases at this time and not remotely worth the effort.

@JustAnotherArchivist
Copy link
Collaborator Author

(Pinging @tekknolagi, author of #417, who may have an opinion on this topic.)

@tekknolagi
Copy link
Contributor

Supporting the limited API or HPy is a good way to ensure support across versions and even across completely different runtimes. My primary motivation was getting ultrajson working on skybison, which supported the limited API and a little bit more.

First-class limited HPy support would give you PyPy, etc, as well.

@tekknolagi
Copy link
Contributor

That's potentially even a little bit misleading, actually, as we also wanted PEP 489 and no global variables to get stuff working on Skybison.

@JustAnotherArchivist
Copy link
Collaborator Author

HPy was also mentioned in #572 and is certainly interesting. It's still in early alpha though, so I'm not convinced it's worth changing much of the code base for it at this time. FWIW, ujson was one of the first projects they ported to HPy as a PoC, but it has diverged a fair bit from us.

@JustAnotherArchivist
Copy link
Collaborator Author

Regarding #555, the Limited API version would probably be PyUnicode_DecodeUTF32, which would have a considerable performance impact since it, as the name suggests, decodes the data first. PyUnicode_FromKindAndData is essentially a memcpy into a new PyUnicode object's buffer or a simple and fast conversion to UCS1/UCS2.

@bwoodsend
Copy link
Collaborator

I'd only be interested in the limited API if it was close to no cost in both the performance sense and the sense of how much refactoring we have to do. Having 64 wheels per release feels silly at first but given that that's how cibuildwheel intended to work, it doesn't actually cost us anything. That said, I see that cibuildwheel not only supports ABI3 but detects it automatically so at least the CI/CD side of the transition should be smooth.

We can review this as time goes by and the stable API hopefully includes more of the functions we need.

@tekknolagi
Copy link
Contributor

I think a lot of the problems with the APIs have not been fixed yet. There should be support for incremental encoding and decoding cheaply, and for building and copying out of strings. APIs like PyUnicode_AsUTF8 are very fussy because they return internal pointers into the object that have lifetimes tied to the string. It really should either allocate or (preferably) copy to a provided buffer.

I will be taking a detour of unknown length out of Python land in the coming months, so I do not have the time or energy to try and propose fixes for you or for CPython upstream, unfortunately.

@JustAnotherArchivist
Copy link
Collaborator Author

I'm closing this as wontfix for now. If the Limited API gets larger and we can easily support it without significant performance impact, HPy gets more mature, or something like Skybison becomes more widely used, we can revisit this, but for the time being, it doesn't make sense to spend the limited maintainer time on something with very limited impact. Until then, the HPy port/fork can be used by such projects.

@JustAnotherArchivist JustAnotherArchivist added the wontfix This will not be worked on label Feb 10, 2023
@JustAnotherArchivist JustAnotherArchivist closed this as not planned Won't fix, can't repro, duplicate, stale Feb 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
wontfix This will not be worked on
Projects
None yet
Development

No branches or pull requests

3 participants