Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

creedsolo.pyx fails to cythonize on 2.0.5 #60

Closed
mgorny opened this issue Mar 31, 2023 · 30 comments
Closed

creedsolo.pyx fails to cythonize on 2.0.5 #60

mgorny opened this issue Mar 31, 2023 · 30 comments
Assignees
Labels

Comments

@mgorny
Copy link
Contributor

mgorny commented Mar 31, 2023

No clue what's happening here. 1.7.0 cythonizes just fine. I've tried it in venv too.

$ cython --version
Cython version 0.29.33
$ python --version
Python 3.11.2
$ python setup.py build --cythonize
Cython is installed, building creedsolo module
[1/1] Cythonizing creedsolo.pyx

Error compiling Cython file:
------------------------------------------------------------
...
^
------------------------------------------------------------

cython:0:0: cython.cimports is not available
Traceback (most recent call last):
  File "/tmp/reedsolomon/setup.py", line 32, in <module>
    extensions = cythonize([ Extension('creedsolo', ['creedsolo.pyx']) ], annotate=True, force=True,  # this may fail hard if Cython is installed but there is no C compiler for current Python version, and we have no way to know. Alternatively, we could supply exclude_failures=True , but then for those who really want the cythonized compiled extension, it would be much harder to debug
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/Cython/Build/Dependencies.py", line 1117, in cythonize
    cythonize_one(*args)
  File "/usr/lib/python3.11/site-packages/Cython/Build/Dependencies.py", line 1240, in cythonize_one
    raise CompileError(None, pyx_file)
Cython.Compiler.Errors.CompileError: creedsolo.pyx
lrq3000 added a commit that referenced this issue Mar 31, 2023
…(see #60) + add pyproject.toml draft (not used for the moment)

Signed-off-by: Stephen L. <lrq3000@gmail.com>
@lrq3000
Copy link
Collaborator

lrq3000 commented Mar 31, 2023

Hello @mgorny , that's normal, it's because I have upgraded to Cython 3.0.0b2, and since it's beta, it's necessary to manually specify this version to get it, a simple `pip install --upgrade "cython>=2" does not work...

I guess you can hardcode pip installing cython==3.0.0b2 in your gentoo build, but the issue is that I will upgrade when cython v3 gets out of beta, so then you'll have to modify your build again etc...

To streamline the process and so that you don't have to worry anymore about this kind of issue, I have added cython as an extra requirement. I just uploaded reedsolo v2.0.10 with these changes. So all you need to do is the following:

pip install --upgrade reedsolo[cythonize] --install-option="--cythonize" --verbose

Note the reedsolo[cythonize]. The [cythonize] is what tells pip that we want to install the extra requirements to cythonize.

Please let me know if this solves the issue :-)

@lrq3000
Copy link
Collaborator

lrq3000 commented Mar 31, 2023

I realize I did not explain why I took this approach: with this approach, I can specify the extra requirements directly in setup.py, so you don't have to worry about what version of cython to install, I manage it directly, and your build will follow automatically as long as you specify [cythonize] when installing reedsolo.

@mgorny
Copy link
Contributor Author

mgorny commented Mar 31, 2023

Unfortunately, that won't work for us. We're always using system packages for the build, so we can't have Cython 2 and Cython 3 installed simultaneously. Then, too many packages are currently broken with Cython 3…

@lrq3000
Copy link
Collaborator

lrq3000 commented Mar 31, 2023

Umph that's an issue. What about build isolation with PEP-518? I can finish the pyproject.toml draft I started earlier, so that build isolation is enabled. Cython 3.0.0b2 is indeed only needed once at build time. I don't know the internals of package building on Gentoo but can you use PEP-518?

@lrq3000
Copy link
Collaborator

lrq3000 commented Apr 1, 2023

I'm still working on making the pep-518 build work, it's surprisingly difficult with C extensions...

Another potential solution: to bundle the .c file, so that cython is not needed at all to compile/install reedsolo. (To see if this is possible: I need to check if this would work for multiple versions of python and is the c file built against all platforms or is it specific to the OS where Cython is run.)

@mgorny
Copy link
Contributor Author

mgorny commented Apr 1, 2023

Isolated builds won't work for us — builds are done offline, so all dependencies need to be preinstalled.

Yes, including the .c file would work. Many projects do that, and I think it's cross-platform.

@lrq3000
Copy link
Collaborator

lrq3000 commented Apr 1, 2023

Thank you for your valuable inputs @mgorny , including the .c file and adding a setup.py flag should be doable, I'm working on it.

I am also considering moving creedsolo to its own package in the future, so that I can build and distribute wheels for various platforms using cibuildwheel, would this be an issue for you (I guess you would have to create another creedsolo gentoo package)?

@lrq3000 lrq3000 closed this as completed in fc2438c Apr 2, 2023
@lrq3000 lrq3000 reopened this Apr 2, 2023
@lrq3000
Copy link
Collaborator

lrq3000 commented Apr 2, 2023

Thank you for your patience @mgorny ! When you've got some time, could you please check if v2.0.13 is good enough for you? It includes creedsolo.c in the sdist package.

And FYI reedsolo and creedsolo can now be built under isolation with pip and build, in case it may be useful for you in the future. Apriori, I won't move creedsolo to its own package, as long as the current solution is not going to get deprecated by pip or setuptools ;-)

@mgorny
Copy link
Contributor Author

mgorny commented Apr 2, 2023

I am also considering moving creedsolo to its own package in the future, so that I can build and distribute wheels for various platforms using cibuildwheel, would this be an issue for you (I guess you would have to create another creedsolo gentoo package)?

Well, it's a little work but we can handle it ;-).

Thank you for your patience @mgorny ! When you've got some time, could you please check if v2.0.13 is good enough for you? It includes creedsolo.c in the sdist package.

I'm afraid it's still not it. FWICS the C extension isn't built without --cythonize (probably because of extensions = None), and with --cythonize it insists on cythonizing it again.

My personal suggestion would be to change the other so that:

  1. the C extension is built by default, from already cythonized sources
  2. --cythonize forces cythonizing again
  3. --no-cext option (or alike) is added to disable building the C extension entirely

…or, well, splitting the C package instead of adding --no-cext as you suggested would also work ;-).

And FYI reedsolo and creedsolo can now be built under isolation with pip and build, in case it may be useful for you in the future. Apriori, I won't move creedsolo to its own package, as long as the current solution is not going to get deprecated by pip or setuptools ;-)

I'm afraid this caused another problem: now tests are installed straight into site-packages, i.e. as /usr/lib/python3.10/site-packages/tests. You need to explicitly exclude tests package in tool.setuptools.packages.find.

@lrq3000
Copy link
Collaborator

lrq3000 commented Apr 2, 2023

Unfortunately the behavior you suggest, to compile the C extension from .c files by default, is what we used to do, but it caused issues for Windows users who have Cython installed but no C compiler... But i can add a flag to build from the .c file without cythonization :-)

About the tests, is it not unavoidable since we include tests in the sdist as you suggested? In any case, it seems that currently there are bugs preventing their exclusion (see here and here), so is it really a deal breaker or can we live with it for now? When I look at my Lib\site-packages\tests folder, I notice there are several others as well, so it seems this can be an acceptable behavior?

lrq3000 added a commit that referenced this issue Apr 2, 2023
…cythonization (see #60)

Signed-off-by: Stephen L. <lrq3000@gmail.com>
@lrq3000
Copy link
Collaborator

lrq3000 commented Apr 2, 2023

I restored the --native-compile option, which can be specified like:

python -sBm build --config-setting="--build-option=--native-compile"

To build the c extension without cythonization. Is this going to work with your build process?

@lrq3000 lrq3000 self-assigned this Apr 2, 2023
@lrq3000 lrq3000 added the bug label Apr 2, 2023
@paravoid
Copy link

paravoid commented Apr 3, 2023

Debian maintainer here. We're at v1.7.0 at the moment, and using Cython 0.29.32 to ship creedsolo alongside reedsolo. We don't have Cython prereleases in Debian (yet) so I haven't tried that. I encountered this issue while looking at changes made upstream.

From the Debian PoV, we could never rely on pregenerated files like the .c file. This .c file may be a text file and not an ELF binary, but it's the result of a compilation process. I'm surprised that this would be acceptable for Gentoo! The rules in Debian is that we should ship the necessary combination of source files and toolchains to get these into a state where they're useful for users. Using precompiled files can also be problematic from the licensing point of view for many projects that are e.g. GPL-licensed, or used by GPL-licensed software. For these reasons, pregenerated files in upstream tarballs are also generally frowned upon -- they are considered "object" files and often stripped from the source files entirely.

@lrq3000
Copy link
Collaborator

lrq3000 commented Apr 3, 2023

Thank you @paravoid for your insights. I understand your concerns. I did not know the reedsolo package was so maintained so widely across so many Linux distributions, I am humbled by the interest shown in this arguably small project.

I am hence divided about what to do about this issue. On one hand, rollbacking to Cython < 3 is not possible without basically cancelling all changes done since v1.7.0, because indeed nearly all changes pertain to the cython implementation, and they require Cython >= 3 especially for math operations, which behave more like Python in Cython 3, as they are not trivially translatable to c-like math operators. Cython 3 also adds a better support for methods and classes, which is also another area where a lot of improvements were done.

On the other hand, I understand it's not possible for Linux builds to use the beta versions of Cython 3, it's not yet available in most environments. I am quite new to the modern python packaging ecosystem, but it seems to me that fixing this issue (the long delay for the propagation of dependencies updates in various environments) is one of the main goals of this new ecosystem, but from what you guys say, I understand these tools are not mature enough yet and not yet in use.

About the suggestion of compilation from the .c file, I guess it's a pragmatic suggestion that is only meant as a temporary workaround until the likely soon-ish stable release of Cython v3. Also, the *.pyx file remains packaged with the sdist, and I can improve the reproducibility of the cythonized *.c file using a lockfile for dependencies, so that in effect we can get something very close to almost a source .c file in practice, although it remains computer generated yes.

Apart from bundling and compiling from the intermediate creedsolo.c file, one alternative I see is to split the project in two as suggested earlier: reedsolo including only the pure python implementation, and creedsolo as a separate package that requires Cython v3. With this solution, this means that until the stable release of Cython v3 and its propagation down to Debian/Gentoo/etc repos, only reedsolo will be buildable for you. But this also means a net reduction in the package functionalities, since it essentially loses creedsolo for not much gains.

Last solution I see is simply for all Linux builds to be paused until Cython v3 is released in the respective distros. Which can take a while...

Maybe you guys also have some insights from how other projects handled similar situations? I guess I'm not the first to build a Python project based on a newer dependency than is available in various Linux distros?

@mgorny
Copy link
Contributor Author

mgorny commented Apr 3, 2023

About the tests, is it not unavoidable since we include tests in the sdist as you suggested?

The contents of sdist and wheel can be controlled independently. You control what gets included in sdist via MANIFEST.in. Package discovery only affects what gets included in wheel and installed to site-packages.

In any case, it seems that currently there are bugs preventing their exclusion (see here and here), so is it really a deal breaker or can we live with it for now?

I think there is some confusion around what "package data" is. The package data option is used to install additional data files inside Python packages, i.e. something like /usr/lib/python*/site-packages/reedsolo/FOO.txt. Since you're only installed top-level modules, package data seems entirely irrelevant and I think you're triggering some accidental behavior that wasn't intentional.

For this package, the correct thing to do would be to remove tool.setuptools.packages.find, tool.setuptools.package-data and tool.setuptools.exclude-package-data sections and let the auto-discovery find reedsolo.py and install it. If the C extension is built, then setup.py adds it to wheel via ext_modules arg.

When I look at my Lib\site-packages\tests folder, I notice there are several others as well, so it seems this can be an acceptable behavior?

I'm afraid they're all bugs, and it's quite probable they've been fixed already. Unfortunately, pip isn't very good at cleaning up. Distribution package managers, on the other hand, are very strict about things like this. In the end, it would mean multiple packages installing tests/__init__.py file and, well, only one copy of the file can be legally installed.

@mgorny
Copy link
Contributor Author

mgorny commented Apr 3, 2023

Maybe you guys also have some insights from how other projects handled similar situations? I guess I'm not the first to build a Python project based on a newer dependency than is available in various Linux distros?

The only package i'm aware of is rapidfuzz, and we're using the pregenerated C files there.

@paravoid
Copy link

paravoid commented Apr 3, 2023

Thank you @paravoid for your insights. I understand your concerns. I did not know the reedsolo package was so maintained so widely across so many Linux distributions, I am humbled by the interest shown in this arguably small project.

No worries! I actually uploaded this to Debian relatively recently, about a month ago :)

I am hence divided about what to do about this issue. On one hand, rollbacking to Cython < 3 is not possible without basically cancelling all changes done since v1.7.0, because indeed nearly all changes pertain to the cython implementation, and they require Cython >= 3 especially for math operations, which behave more like Python in Cython 3, as they are not trivially translatable to c-like math operators. Cython 3 also adds a better support for methods and classes, which is also another area where a lot of improvements were done.

Understandable!

On the other hand, I understand it's not possible for Linux builds to use the beta versions of Cython 3, it's not yet available in most environments. I am quite new to the modern python packaging ecosystem, but it seems to me that fixing this issue (the long delay for the propagation of dependencies updates in various environments) is one of the main goals of this new ecosystem, but from what you guys say, I understand these tools are not mature enough yet and not yet in use.

I think there are two compounding things in effect here: one is that Cython 3.0 is a "pre-release", and even marked as such in PyPI. So, for example, "pip install cython" in a clean venv brings 0.9.34 right now, not 3.x. Therefore it would not make sense for Debian to package 3.x as a pre-release. (There is Debian experimental which is opt-in sometimes used for this purpose, but this hasn't happened yet). So in that sense, I don't think this is a case of propagation delay -- Debian -like PyPI- tracks regular releases, and is currently at 0.29.32. (The 0.9.32->0.9.34 is a case of propagation delay, though!)

The second one is that Debian is in its hard freeze right now (expected to last until the summer or so), so we cannot expect to see major version bumps in Debian until that is over. That will indeed bring a propagation delay, but typically it would also preclude us from packaging newer versions of reedsolo as well, so not a major issue from your perspective. (In this case reedsolo is exempt of this as it did not make the cut for the next release, so this is not a limiting factor.)

Apart from bundling and compiling from the intermediate creedsolo.c file, one alternative I see is to split the project in two as suggested earlier: reedsolo including only the pure python implementation, and creedsolo as a separate package that requires Cython v3. With this solution, this means that until the stable release of Cython v3 and its propagation down to Debian/Gentoo/etc repos, only reedsolo will be buildable for you. But this also means a net reduction in the package functionalities, since it essentially loses creedsolo for not much gains.

That could work but I think it's more trouble than it's worth. As you pointed out, it looks like Cython 3.0 is close to getting released, so it will go away as a consideration soon. It also seems like this is a bit of a more major bump than usual, so not likely to happen regularly to the point where we need to plan for it in the future. We, as distros, can stick with v1.7.0 for the time being, and update reedsolo to v2.0 when that Cython release happens and find its way to distributions. That is my plan so far at least :)

Maybe you guys also have some insights from how other projects handled similar situations? I guess I'm not the first to build a Python project based on a newer dependency than is available in various Linux distros?

I think the unique situation in your case is that you're relying on a pre-release, rather than just a newer dependency. If it was just a minor version behind or something, I'd just update Debian's Cython right before updating reedsolo :)

From a release management perspective, the only way I could see you having done differently to avoid this would have been to fork into two branches (like Cython does with 0.29.x/3.x), say 1.7.x and 2.x, and release 2.x as pre-releases until Cython stabilizes. But what you did is also OK, assuming Cython's stabilization period is going to last weeks or months rather than years.

Hope this helps!

@lrq3000
Copy link
Collaborator

lrq3000 commented Apr 3, 2023

Thank you both for your very helpful feedbacks and suggestions!

Unless you guys are against the idea, I think I will rollback publication on PyPi back to 1.7.0, and publish the new v2.x releases as alpha, as you suggest @paravoid. I think this is the most reasonable choice. I have just explored which projects depend on reedsolo, and I realized it really is a critical module for the ecosystem, so better be safe than sorry.

Cython mention they should publish v3 during 2023, although we don't know which month, but for context, just a week ago progress was at 92% and is now at 94%, so it seems it's going pretty fast :-)

@mgorny Yes I'm not using package-data, but I think this is a similar bug. I did not try to remove all code for exploration. Also I can try to manually specify each file I want to include. I will try all of these approaches, I'll keep you updated!

@mgorny
Copy link
Contributor Author

mgorny commented Apr 3, 2023

Ok, thanks. Let us know if you need help, I suppose build systems are my specialization these days. I just wished they'd pay the bills xP.

@lrq3000
Copy link
Collaborator

lrq3000 commented Apr 3, 2023

@mgorny Thank you very much for your very kind help! :D I was exactly thinking about that when I was browsing all the reedsolo builds for various Linux distros, it's a process that requires a lot of skills and time (because to do it right you need to have done already a lot of various builds for other softwares), but it's most of the time thankless and not paid, despite being essential for the ecosystem to work! It's marvelous such an altruistic system works at all, and it works quite well in practice!

@lrq3000
Copy link
Collaborator

lrq3000 commented Apr 5, 2023

@mgorny

For this package, the correct thing to do would be to remove tool.setuptools.packages.find, tool.setuptools.package-data and tool.setuptools.exclude-package-data sections and let the auto-discovery find reedsolo.py and install it. If the C extension is built, then setup.py adds it to wheel via ext_modules arg.

Unfortunately this did not change anything. The issue is that this module uses a "single-module layout", but setuptools does not support that anymore officially (only src-layout and flat-layout). I think I'll have to transition to one of those, but I don't want to break the API, I'll see how to handle this.

Anyway, I have rollbacked pypi releases (using yanking) back to 1.7.0, and future releases will be versioned as prereleases (2.x.xb1) until the stable release of Cython v3 gets published, so that I can take my time to carefully and properly package the module according modern standards.

I have to pause my endeavor at the moment due to other pressing affairs, but I will come back to it eventually :-)

@lrq3000 lrq3000 closed this as completed in c378c04 Apr 6, 2023
@lrq3000
Copy link
Collaborator

lrq3000 commented Apr 6, 2023

@mgorny Finally I could free up some time to try to finish this issue before having to pause! Please check if the latest release is OK for you too, it should have tests* packaged in the sdist but not in the wheel :-) Thank you very much for your help in solving this!

@lrq3000
Copy link
Collaborator

lrq3000 commented Apr 6, 2023

Ah and note I had to modify the layout to a src-layout type, so maybe this will require some changes in your packaging process, but this was necessary to avoid inclusion of the tests* folder in the wheel!

@mgorny
Copy link
Contributor Author

mgorny commented Apr 6, 2023

Thanks. I don't see the release on PyPI, so I just went ahead and made sdist from the main branch. It seems to work fine, and at least at a cursory glance I didn't see anything wrong with it.

@lrq3000
Copy link
Collaborator

lrq3000 commented Apr 6, 2023

Ok awesome! Thank you for testing and for your help in fixing these issues! :D

Yes it's not yet published on PyPi because this time I want to ensure the whole building process is much more robust and polished, eg with cibuildwheel :-)

@mgorny
Copy link
Contributor Author

mgorny commented Apr 10, 2023

I can confirm that 2.0.31b1 looks good for Gentoo. Thanks again!

@lrq3000
Copy link
Collaborator

lrq3000 commented Apr 11, 2023

@mgorny Awesome, thank you very much for confirming and for your help all along this issue! :D

@paravoid
Copy link

From a release management perspective, the only way I could see you having done differently to avoid this would have been to fork into two branches (like Cython does with 0.29.x/3.x), say 1.7.x and 2.x, and release 2.x as pre-releases until Cython stabilizes. But what you did is also OK, assuming Cython's stabilization period is going to last weeks or months rather than years.

Cython 3.0.0 was released back in July, and is starting to make its way into distributions. So I think these considerations are thankfully all in the past now :) Do you have any plans for a non-beta 2.0 release anytime soon? Thanks!

@lrq3000
Copy link
Collaborator

lrq3000 commented Nov 25, 2023

Hey @paravoid! I am aware that Cython 3 stable was released this summer but I did not know when distributions would have it, so thank you very much for letting me know :-) There are a few minor issues that have been opened meanwhile that i would like to investigate and maybe fix before the stable release, maybe next week.

Thank you very much for your continued friendly help on maintaining this project! Ill post an update here when I'll be ready to release :-)

@paravoid
Copy link

Hi again :)

cython3 (3.0.7 specifically) is now in Debian unstable. I've verified that reedsolo 2.1.1b builds successfully with it, and that the test suite passes. Ready when you are :)

@paravoid
Copy link

cython3 (3.0.7 specifically) is now in Debian unstable. I've verified that reedsolo 2.1.1b builds successfully with it, and that the test suite passes. Ready when you are :)

@lrq3000, ping? I know this issue is closed so I'm worried you haven't seen this :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants