New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
build_js
as a subcommand
#12167
build_js
as a subcommand
#12167
Conversation
In the Python ecosystem most of the times a `wheel` distribution is created from the `sdist` distribution instead of the source tree directly. This process is described by PEP 517. Therefore, the original JS/TS files need to be present in the `sdist` so they can compiled/build/bundled when creating the `wheel`, which means that we have to include the `bokehjs` directory in the `MANIFEST.in`. There are other alternatives to this approach, for example the JS/TS files could be compiles/build/bundled when creating the `sdist` and just copied over to the `wheel`. This would require creating a modified version of the `sdist` command. It is very likely that adjusts in the `MANIFEST.in` file or via `include/exclude_package_data` configuration, to get the proper files included in the distribution archives.
In general it is considered good practice to use the `setuptools` version of the available APIs instead of `distutils`'.
This approach was first proposed and discussed in pypa/setuptools#2591
Previously `bokeh` was relying on the legacy behaviour, but given the proposed changes, it should be safe to upgrade it to use the latest setuptools build backend.
@@ -22,6 +22,7 @@ include _setup_support.py | |||
|
|||
# Package files and data | |||
include bokeh/LICENSE.txt | |||
graft bokehjs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the Python ecosystem most of the times a wheel
distribution is created from the sdist
distribution instead of the source tree directly. This process is described by PEP 517.
Therefore, the original JS/TS files need to be present in the sdist
so they can compiled/built/bundled when creating the wheel
, which means that we have to include the bokehjs
directory in the MANIFEST.in
.
There are other alternatives to this approach, for example the JS/TS files could be compiles/built/bundled when creating the sdist
and just copied over to the wheel
. This would require creating a modified version of the sdist
command.
It is very likely that adjusts in the MANIFEST.in
file or via exclude_package_data
configuration are necessary, to get the proper files included in the distribution archives, but I was not sure on which files should be there or not...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having BokehJS be buildable from an sdist
is explicitly an anti-goal. Or at least, it was in the past, before wheels, when we had to ship sdists. Can we simply not make sdists at all? Is there a configuration to stipulate wheels-only?That would be my preference.
If we have to support making sdists, then I might be OK changing to make building BokehJS from the sdist possible, as long as we also stop publishing sdists completely. The only overriding criteria I have is that the BokehJS we publish and bundle in the package be the sole source of truth for BokehJS. (I 1000% do not want to open a new category of GH issues due users not being able to build BokehJS, or worse, building and getting some subtle difference)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The behaviour of building wheel
from the sdist
instead of the existing source tree is specified in PEP 517, so it is likely that tools such as build
and pip
will follow this specification.
You might be able to skip sdist
and produce the wheel directly using python -m build --wheel
(needs to be tested).
That said, although the build
and pip
might follow the sdist workflow by default, you don't have to publish the sdist
file after the wheel
file is created.
Some OS/distribution maintainers or alternative Python package managers might rely on the sdist being present in PyPI though...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PEP 517 also says
But if the backend indicates that it is missing some requirements for creating an sdist (see below), the frontend will fall back to calling build_wheel in the source directory.
which I took to mean we could safely avoid sdists if we wanted to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However I guess at present conda-forge does start from an sidst:
https://github.com/conda-forge/bokeh-feedstock/blob/main/recipe/meta.yaml#L8
which is unfortunate. @jakirkham is there any accepted conda-forge way to not rely on sdists
there? I would really, really like to stop publishing them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@abravalheri can you elaborate on
There are other alternatives to this approach, for example the JS/TS files could be compiles/built/bundled when creating the sdist and just copied over to the wheel. This would require creating a modified version of the sdist command.
I can sort of see how that might be useful to copy the files over, but I don't see how it prevents pip install .
from also trying to build BokehJS when it happens to be from an sdist. Preventing BokehJS builds from an sdist is literally my only hard requirement, especially if we have to continue publishing them. Currently we have some gorpy logic to determine if we are "packaged" and condition on that, but there must be a better way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will not prevent pip install .
from building BokehJS, because pip install .
will attempt to create an sdist
first anyway.
However, this should prevent pip install bokeh-3.0.0.dev6+31.gb1205c5a3.tar.gz
from building BokehJS, because all final files will already be present in the sdist
.
Please note that I am conjecturing about this possibility (it would require some further investigation) but the sdist
command also seems to have subcommands, and that could be exploited in a similar way... I haven't tested it myself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we are stuck keeping something like this then (except with env vars instead of args)
Lines 168 to 177 in 2a44125
def check_packaged(): | |
ROOT = dirname(realpath(__file__)) | |
packaged = exists(join(ROOT, 'PKG-INFO')) | |
if packaged and (BUILD_JS in sys.argv or INSTALL_JS in sys.argv): | |
print(SDIST_BUILD_WARNING) | |
if BUILD_JS in sys.argv: | |
sys.argv.remove(BUILD_JS) | |
if INSTALL_JS in sys.argv: | |
sys.argv.remove(INSTALL_JS) | |
return packaged |
Allowing BokehJS to build from a sdist in any capacity (i.e. even an untarred one) is just completely intolerable (I refuse to expose myself to the potential support burden risk)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I suppose that if you absolutely need to publish the sdist
because of conda, then you will have to check for the PKG-INFO
file to make sure users are not running from a untarred sdist
distribution.
def run(self): | ||
super().run() | ||
if not self.uninstall: | ||
self.run_command("build_js") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The downside of the approach (adding a new subcommand instead of replacing build_ext
) is that pip install -e .
will not run this subcommand by default.
Overwriting the develop
command to make it run build_js
should handle this scenario.
It is important however to notice that this solution is valid while PEP 660 is not implemented in setuptools. In the near future, setuptools
might add an API so build
subcommands can tell which files to install...
It is very likely that this new API will require subcommand authors to implement methods like:
def get_outputs(self) -> List[str]:
"""Return the list of files created by the build command"""
# and/or
def get_editable_mapping(self) -> Dict[str, str]:
"""Return a directory mapping files existing on the project tree/sdist to files as they would be present in the wheel"""
but I am not sure yet of this design...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the downside to replacing build_ext
? Unless I am mistaken it happens for for install .
and install -e .
(which is what we need) and it seems a simpler way to get to that by changing only one thing in one place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The downside of replacing build_ext
is that a custom command will not implement all the public methods of build_ext
unless it inherits from it.
Therefore if other parts of setuptools
are calling (or decide to start calling in the future), these public methods, the build may break.
from pathlib import Path | ||
|
||
ROOT = Path(__file__).resolve().parent | ||
sys.path.append(str(ROOT)) # avoid depending on `build_meta:__legacy__` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
setuptools.build_meta:__legacy__
inserts the project directory as the first entry in sys.path
(i.e. it replicates the behaviour of python setup.py
), this allows importing local files, but also can generate some problems.
setuptools.build_meta
does not add the local path to sys.path
, it is up to the user to do it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point enough code has been deleted that I would probably just move everything back into setup.py
and delete _setup_support.py
altogether.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably sys.path.append
would still be required to import versioneer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, right, versioneer
completely skipped my mind. That's unfortunate, I really like to avoid things like this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does setuptools have any built-in alternative to versioneer
? I would happily ditch it if I could...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you ever worked with setuptools-scm
? It is not built-in, but quite handy to use git tags as single source of truth for the version.
I use something similar to the following in all my projects:
# pyproject.toml
[build-system]
requires = ["setuptools>62.3.3", "setuptools_scm>=6.2"]
[tool.setuptools_scm]
version_scheme = "no-guess-dev"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd never heard of it until now, I will take a look
from _setup_support import INSTALL_REQUIRES, BuildJSCmd, DevelopWithJs, check_python | ||
|
||
try: | ||
from setuptools.command.build import build # future-proof |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now there is no setuptools.command.build
, but this is likely to change in the near future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did notice that it was missing, I am glad to hear it will hopefully be coming. However, rather than this kind of clutter, we would simply use the working version now unconditionally, and if/when a better option comes along we will change setup.py
to use that along with raising the min version of setuptools
required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please let me know if you want me to remove the try/catch from the PR (I don't know if you plan to use the PR directly or just use it as inspiration).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess probably the latter. Especially since some of my initial optimism has worn off. I think I might back off this attempt, and work incrementally from a different direction. Once the new conda-build
comes out in month or two, I think we can stop using load_setup_py_data
and move all metadata to pyproject.toml
. Separating those concerns will let the setup.py
work proceed more independently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problems, please feel free to @
-me if you need any help.
@@ -66,9 +83,9 @@ | |||
# details needed by setup | |||
install_requires=INSTALL_REQUIRES, | |||
python_requires=">=3.8", | |||
packages=find_packages(include=["bokeh", "bokeh.*"]), | |||
packages=find_namespace_packages(include=["bokeh", "bokeh.*"]), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Subfolders without a __init__.py
are still considered packages by Python itself.
If all subfolders in a package have __init__.py
it is OK to only use find_packages
, however, if that is not the case it is more coherent to use find_namespace_packages
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is the case that all of our subfolders have __init__.py
, except the server/static
ones that have BokehJS inside, and the sampledata folder. Are those what you are referring to? I'll admit I don't really understand the difference between these two options, or why one is preferable over the other, currently. I assumed the non-__init__.py
folders have to be handled with a manifest regardless.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, I recommend everyone to use find_namespace_packages()
, because it corresponds to the behaviour of the Python interpreter post-PEP 420.
The 2 functions are kept separated as an attempt to prevent unaware users to misuse the feature (if they don't specify any keyword argument - which is very common - this can cause some havoc, e.g. the docs
and the tests
folder might end up in the final wheel by mistake).
In the case of bokeh
, the include=
keyword argument is specified, so using find_namespace_packages()
is safe.
Are those what you are referring to?
Yes, server/static
and any eventual subfolder will be considered namespace packages by the Python interpreter.
Projects that use find_packages
despite having folders without __init__.py
are starting to see some educational warning since setuptools
62.3.0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the case of bokeh, the include= keyword argument is specified, so using find_namespace_packages() is safe.
Ah that's the reason I was missing. Thank you for your patient explanations
from distutils.command.build import build | ||
|
||
# run `build_js` first, so files are available as "data files" to `build_py` | ||
build.sub_commands.insert(0, ('build_js', None)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This follows the approach described in pypa/setuptools#2591.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw that issue, and actually tried something like it, but could not get it to work. I guess I appended the JS build step instead of appended, maybe that was the problem, or perhaps I messed some other aspect up.
In any case, for my FYI, build
does not happen during an editable install? Is that why this PR has to overwrite develop
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In any case, for my FYI, build does not happen during an editable install? Is that why this PR has to overwrite develop?
Yes, this is right, the existing implementation of develop
(to be changed soon), does not run build
directly. It just calls build_ext
directly and assume all other files are already in-place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a place the coming changes are outlined? In any case it definitely seems like it might be better to close both these PRs and wait a little longer for some changes on both the setuptools and conda-build side of things to settle out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am currently working on this PR: pypa/setuptools#3265.
It does not include the API for subcommands to specify files for the editable install yet, but I was planning on tackling this at some point in the weekend.
The process I will follow is:
- Merge the change in a few weeks.
- Make it available for beta-testing/community feedback (I will post on the Python discourse on the packaging thread).
- Release under an "experimental flag" so members of the community that don't follow the discourse can provide feedback.
- Finally remove the "experimental flag".
So this might take some time.
from pathlib import Path | ||
from subprocess import PIPE, CompletedProcess, run | ||
|
||
from setuptools import Command |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general it is considered good practice to use the setuptools
version of the available APIs instead of distutils
'.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 I had looked for that but just missed it apparently.
@@ -1,3 +1,7 @@ | |||
[build-system] | |||
requires = ["setuptools"] | |||
build-backend = "setuptools.build_meta" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously bokeh
was relying on the legacy behaviour, but given the proposed changes, it should be safe to fully adhere to PEP 517.
@abravalheri Thank you very much for taking the time to craft this PR! I see it is having the same issue with |
Hi @bryevdv, this PR is a suggestion of changes to be added on top of your existing PR: #12164.
Here I am proposing an alternative approach (instead of replacing
build_ext
, to add a new subcommand).I think this should be a bit safer. The downside is that you also have to overwrite
develop
to use the editable mode.Since, I am not very familiar with bokeh's testing procedure, I have not fully tested the changes proposed here, but the idea is more of giving a few hints of what can be changed in your original PR, so please feel free to simple close this PR and just absorb the parts you feel work for you.
The way I tested the changes here was by running
python -m build
and inspecting the contents of the generatedwheel
file. It is very likely that some adjusts of what should be included or not in the distribution need to be made viaMANIFEST.in
and/orexclude_package_data
, but since I don't know exactly what files should be present, I have not touched it.