Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

build_js as a subcommand #12167

Closed

Conversation

abravalheri
Copy link

Hi @bryevdv, this PR is a suggestion of changes to be added on top of your existing PR: #12164.

Here I am proposing an alternative approach (instead of replacing build_ext, to add a new subcommand).
I think this should be a bit safer. The downside is that you also have to overwrite develop to use the editable mode.

Since, I am not very familiar with bokeh's testing procedure, I have not fully tested the changes proposed here, but the idea is more of giving a few hints of what can be changed in your original PR, so please feel free to simple close this PR and just absorb the parts you feel work for you.

The way I tested the changes here was by running python -m build and inspecting the contents of the generated wheel file. It is very likely that some adjusts of what should be included or not in the distribution need to be made via MANIFEST.in and/or exclude_package_data, but since I don't know exactly what files should be present, I have not touched it.

In the Python ecosystem most of the times a `wheel` distribution is
created from the `sdist` distribution instead of the source tree
directly. This process is described by PEP 517.

Therefore, the original JS/TS files need to be present in the `sdist`
so they can compiled/build/bundled when creating the `wheel`, which
means that we have to include the `bokehjs` directory in the
`MANIFEST.in`.

There are other alternatives to this approach, for example the JS/TS
files could be compiles/build/bundled when creating the `sdist` and just
copied over to the `wheel`. This would require creating a modified
version of the `sdist` command.

It is very likely that adjusts in the `MANIFEST.in` file or
via `include/exclude_package_data` configuration, to get the proper
files included in the distribution archives.
In general it is considered good practice to use the `setuptools`
version of the available APIs instead of `distutils`'.
Previously `bokeh` was relying on the legacy behaviour, but given the
proposed changes, it should be safe to upgrade it to use the latest
setuptools build backend.
@@ -22,6 +22,7 @@ include _setup_support.py

# Package files and data
include bokeh/LICENSE.txt
graft bokehjs
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the Python ecosystem most of the times a wheel distribution is created from the sdist distribution instead of the source tree directly. This process is described by PEP 517.

Therefore, the original JS/TS files need to be present in the sdist so they can compiled/built/bundled when creating the wheel, which means that we have to include the bokehjs directory in the MANIFEST.in.

There are other alternatives to this approach, for example the JS/TS files could be compiles/built/bundled when creating the sdist and just copied over to the wheel. This would require creating a modified version of the sdist command.

It is very likely that adjusts in the MANIFEST.in file or via exclude_package_data configuration are necessary, to get the proper files included in the distribution archives, but I was not sure on which files should be there or not...

Copy link
Member

@bryevdv bryevdv Jun 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having BokehJS be buildable from an sdist is explicitly an anti-goal. Or at least, it was in the past, before wheels, when we had to ship sdists. Can we simply not make sdists at all? Is there a configuration to stipulate wheels-only?That would be my preference.

If we have to support making sdists, then I might be OK changing to make building BokehJS from the sdist possible, as long as we also stop publishing sdists completely. The only overriding criteria I have is that the BokehJS we publish and bundle in the package be the sole source of truth for BokehJS. (I 1000% do not want to open a new category of GH issues due users not being able to build BokehJS, or worse, building and getting some subtle difference)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The behaviour of building wheel from the sdist instead of the existing source tree is specified in PEP 517, so it is likely that tools such as build and pip will follow this specification.

You might be able to skip sdist and produce the wheel directly using python -m build --wheel (needs to be tested).

That said, although the build and pip might follow the sdist workflow by default, you don't have to publish the sdist file after the wheel file is created.

Some OS/distribution maintainers or alternative Python package managers might rely on the sdist being present in PyPI though...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PEP 517 also says

But if the backend indicates that it is missing some requirements for creating an sdist (see below), the frontend will fall back to calling build_wheel in the source directory.

which I took to mean we could safely avoid sdists if we wanted to.

Copy link
Member

@bryevdv bryevdv Jun 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However I guess at present conda-forge does start from an sidst:

https://github.com/conda-forge/bokeh-feedstock/blob/main/recipe/meta.yaml#L8

which is unfortunate. @jakirkham is there any accepted conda-forge way to not rely on sdists there? I would really, really like to stop publishing them.

Copy link
Member

@bryevdv bryevdv Jun 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@abravalheri can you elaborate on

There are other alternatives to this approach, for example the JS/TS files could be compiles/built/bundled when creating the sdist and just copied over to the wheel. This would require creating a modified version of the sdist command.

I can sort of see how that might be useful to copy the files over, but I don't see how it prevents pip install . from also trying to build BokehJS when it happens to be from an sdist. Preventing BokehJS builds from an sdist is literally my only hard requirement, especially if we have to continue publishing them. Currently we have some gorpy logic to determine if we are "packaged" and condition on that, but there must be a better way?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will not prevent pip install . from building BokehJS, because pip install . will attempt to create an sdist first anyway.

However, this should prevent pip install bokeh-3.0.0.dev6+31.gb1205c5a3.tar.gz from building BokehJS, because all final files will already be present in the sdist.

Please note that I am conjecturing about this possibility (it would require some further investigation) but the sdist command also seems to have subcommands, and that could be exploited in a similar way... I haven't tested it myself.

Copy link
Member

@bryevdv bryevdv Jun 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we are stuck keeping something like this then (except with env vars instead of args)

bokeh/_setup_support.py

Lines 168 to 177 in 2a44125

def check_packaged():
ROOT = dirname(realpath(__file__))
packaged = exists(join(ROOT, 'PKG-INFO'))
if packaged and (BUILD_JS in sys.argv or INSTALL_JS in sys.argv):
print(SDIST_BUILD_WARNING)
if BUILD_JS in sys.argv:
sys.argv.remove(BUILD_JS)
if INSTALL_JS in sys.argv:
sys.argv.remove(INSTALL_JS)
return packaged

Allowing BokehJS to build from a sdist in any capacity (i.e. even an untarred one) is just completely intolerable (I refuse to expose myself to the potential support burden risk)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I suppose that if you absolutely need to publish the sdist because of conda, then you will have to check for the PKG-INFO file to make sure users are not running from a untarred sdist distribution.

def run(self):
super().run()
if not self.uninstall:
self.run_command("build_js")
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The downside of the approach (adding a new subcommand instead of replacing build_ext) is that pip install -e . will not run this subcommand by default.

Overwriting the develop command to make it run build_js should handle this scenario.

It is important however to notice that this solution is valid while PEP 660 is not implemented in setuptools. In the near future, setuptools might add an API so build subcommands can tell which files to install...

It is very likely that this new API will require subcommand authors to implement methods like:

def get_outputs(self) -> List[str]:
    """Return the list of files created by the build command"""

# and/or
def get_editable_mapping(self) -> Dict[str, str]:
    """Return a directory mapping files existing on the project tree/sdist to files as they would be present in the wheel"""

but I am not sure yet of this design...

Copy link
Member

@bryevdv bryevdv Jun 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the downside to replacing build_ext? Unless I am mistaken it happens for for install . and install -e . (which is what we need) and it seems a simpler way to get to that by changing only one thing in one place.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The downside of replacing build_ext is that a custom command will not implement all the public methods of build_ext unless it inherits from it.

Therefore if other parts of setuptools are calling (or decide to start calling in the future), these public methods, the build may break.

from pathlib import Path

ROOT = Path(__file__).resolve().parent
sys.path.append(str(ROOT)) # avoid depending on `build_meta:__legacy__`
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

setuptools.build_meta:__legacy__ inserts the project directory as the first entry in sys.path (i.e. it replicates the behaviour of python setup.py), this allows importing local files, but also can generate some problems.

setuptools.build_meta does not add the local path to sys.path, it is up to the user to do it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point enough code has been deleted that I would probably just move everything back into setup.py and delete _setup_support.py altogether.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably sys.path.append would still be required to import versioneer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right, versioneer completely skipped my mind. That's unfortunate, I really like to avoid things like this ☹️

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does setuptools have any built-in alternative to versioneer? I would happily ditch it if I could...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you ever worked with setuptools-scm? It is not built-in, but quite handy to use git tags as single source of truth for the version.

I use something similar to the following in all my projects:

# pyproject.toml
[build-system]
requires = ["setuptools>62.3.3", "setuptools_scm>=6.2"]

[tool.setuptools_scm]
version_scheme = "no-guess-dev"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd never heard of it until now, I will take a look

from _setup_support import INSTALL_REQUIRES, BuildJSCmd, DevelopWithJs, check_python

try:
from setuptools.command.build import build # future-proof
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now there is no setuptools.command.build, but this is likely to change in the near future.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did notice that it was missing, I am glad to hear it will hopefully be coming. However, rather than this kind of clutter, we would simply use the working version now unconditionally, and if/when a better option comes along we will change setup.py to use that along with raising the min version of setuptools required.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please let me know if you want me to remove the try/catch from the PR (I don't know if you plan to use the PR directly or just use it as inspiration).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess probably the latter. Especially since some of my initial optimism has worn off. I think I might back off this attempt, and work incrementally from a different direction. Once the new conda-build comes out in month or two, I think we can stop using load_setup_py_data and move all metadata to pyproject.toml. Separating those concerns will let the setup.py work proceed more independently.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problems, please feel free to @-me if you need any help.

@@ -66,9 +83,9 @@
# details needed by setup
install_requires=INSTALL_REQUIRES,
python_requires=">=3.8",
packages=find_packages(include=["bokeh", "bokeh.*"]),
packages=find_namespace_packages(include=["bokeh", "bokeh.*"]),
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Subfolders without a __init__.py are still considered packages by Python itself.
If all subfolders in a package have __init__.py it is OK to only use find_packages, however, if that is not the case it is more coherent to use find_namespace_packages.

Copy link
Member

@bryevdv bryevdv Jun 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is the case that all of our subfolders have __init__.py, except the server/static ones that have BokehJS inside, and the sampledata folder. Are those what you are referring to? I'll admit I don't really understand the difference between these two options, or why one is preferable over the other, currently. I assumed the non-__init__.py folders have to be handled with a manifest regardless.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I recommend everyone to use find_namespace_packages(), because it corresponds to the behaviour of the Python interpreter post-PEP 420.

The 2 functions are kept separated as an attempt to prevent unaware users to misuse the feature (if they don't specify any keyword argument - which is very common - this can cause some havoc, e.g. the docs and the tests folder might end up in the final wheel by mistake).

In the case of bokeh, the include= keyword argument is specified, so using find_namespace_packages() is safe.

Are those what you are referring to?

Yes, server/static and any eventual subfolder will be considered namespace packages by the Python interpreter.

Projects that use find_packages despite having folders without __init__.py are starting to see some educational warning since setuptools 62.3.0.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the case of bokeh, the include= keyword argument is specified, so using find_namespace_packages() is safe.

Ah that's the reason I was missing. Thank you for your patient explanations

from distutils.command.build import build

# run `build_js` first, so files are available as "data files" to `build_py`
build.sub_commands.insert(0, ('build_js', None))
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This follows the approach described in pypa/setuptools#2591.

Copy link
Member

@bryevdv bryevdv Jun 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw that issue, and actually tried something like it, but could not get it to work. I guess I appended the JS build step instead of appended, maybe that was the problem, or perhaps I messed some other aspect up.

In any case, for my FYI, build does not happen during an editable install? Is that why this PR has to overwrite develop?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In any case, for my FYI, build does not happen during an editable install? Is that why this PR has to overwrite develop?

Yes, this is right, the existing implementation of develop (to be changed soon), does not run build directly. It just calls build_ext directly and assume all other files are already in-place.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a place the coming changes are outlined? In any case it definitely seems like it might be better to close both these PRs and wait a little longer for some changes on both the setuptools and conda-build side of things to settle out.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am currently working on this PR: pypa/setuptools#3265.
It does not include the API for subcommands to specify files for the editable install yet, but I was planning on tackling this at some point in the weekend.

The process I will follow is:

  • Merge the change in a few weeks.
  • Make it available for beta-testing/community feedback (I will post on the Python discourse on the packaging thread).
  • Release under an "experimental flag" so members of the community that don't follow the discourse can provide feedback.
  • Finally remove the "experimental flag".

So this might take some time.

@abravalheri abravalheri changed the title Subcommand build_js as a subcommand Jun 9, 2022
@abravalheri abravalheri mentioned this pull request Jun 9, 2022
4 tasks
from pathlib import Path
from subprocess import PIPE, CompletedProcess, run

from setuptools import Command
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general it is considered good practice to use the setuptools version of the available APIs instead of distutils'.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I had looked for that but just missed it apparently.

@@ -1,3 +1,7 @@
[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously bokeh was relying on the legacy behaviour, but given the proposed changes, it should be safe to fully adhere to PEP 517.

@abravalheri abravalheri marked this pull request as ready for review June 9, 2022 11:28
@bryevdv
Copy link
Member

bryevdv commented Jun 9, 2022

@abravalheri Thank you very much for taking the time to craft this PR! I see it is having the same issue with conda-build and scraping setup.py as my other PR, but that's something for me to figure out. I do have some questions / comments, I will make them inline.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants