Move to application distribution model #2081

Open
agoose77 opened this issue Dec 4, 2023 · 13 comments
Labels
enhancement New feature or request

Comments

@agoose77
Collaborator

agoose77 commented Dec 4, 2023

Context

Right now JB is a somewhat pinned package that users often install in their existing environments.

Proposal

We should move to an application model whereby JB has its own environment, and we aggressively pin the dependencies to known-compatible versions.

Tasks and updates

No response

@agoose77 agoose77 added the enhancement New feature or request label Dec 4, 2023

@jorgensd

I find this a bit problematic (external user here).
I use jupyter-book to make interactive tutorials for software I'm developing.
Moving to two separate environments, one for development of code, and one for documentation, does not seem optimal to me, as one might end up with incompatible environments, due to pinning in jupyter-book.
I get that pinning dependencies makes it easier to maintain jupyter-book, but it also means that it will become harder and harder for external tools to use the software if they share dependencies with jupyter-book and do not pin these themselves.

@agoose77
Collaborator Author

agoose77 commented Dec 19, 2023

Hi @jorgensd, thanks for chiming in!

The idea behind pinning Jupyter-book is not for maintenance; the core components need to support wide version ranges for exactly the problem you describe. Instead, it is to ensure that we reduce the scope for problems that traditional book developers encounter.

The path to this model requires us to move our CLI into a new package. After we've done that, jupyter-book will become a metapackage; you can then install the same core components with your own pinning strategy. In explicit terms, then, it's a goal that we should support both application installations and library installations, through different packages.

As it stands, we generally see documentation authors using sphinx directly.

Regarding separate compute environments, there is no (strong) need to have JB in the same environment as the kernels it uses. Whilst we will always support same-environment kernel discovery, we could do more to support power users who install JB once on their machine and have virtual environments for their code execution. We are thinking that this might involve adding better support for repo2docker as an environment definition too.

@jorgensd

Thanks for the context.
I would just like to say that I would avoid going to sphinx if possible, as jupyter-book has been a fantastic way of enabling things such as
https://jsdokken.com/dolfinx-tutorial/
and
https://rangamanilabucsd.github.io/smart/examples/example1/example1.html
which are more than traditional docs (just an api).

@finsberg
Contributor

@agoose77 Why not just install JB with pipx?

@agoose77
Collaborator Author

@finsberg that would be one recommended mechanism for installing jupyter-book in its own environment. On our end, we need to add better support for locating and using different python environments for execution - that's the new part.

@jorgensd that's how we feel too - the sphinx abstraction is thinner than we'd like. As such we'll be working to make it possible to use Jupyter-book in whichever way you want to!

As you can see, I don't see this breaking anyone's workflow. We will need further discussions on what a new pinning strategy would mean, but that can happen in future.

Note that this is just one maintainer's view; we will also seek consensus from the core team before any big changes.

@minrk
Contributor

minrk commented Dec 19, 2023

Coming from Jupyter: the still-not-great support for running kernels in external envs will become a bigger issue for jupyter-book users than in ~every other Jupyter user context, where the default kernel still dominates, and may warrant a jupyter-book-specific solution for a smooth user experience.

It's usually as simple as:

/path/to/env/bin/ipython-kernel install --prefix /path/to/runtime/env/

but this can miss things like PATH and other env variables that would be set during "true" environment activation, which can in turn be tedious to work out when necessary. Launching a kernel in a conda env with conda run, for example, works great. Not usually hard, but not very clean or discoverable, either.

So it would probably be prudent to consider work on documentation/support for external env kernels as a prerequisite to pushing more users into needing to use them.
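
To make the environment-variable point concrete, here is a minimal sketch (not an existing Jupyter Book feature) of writing a kernelspec by hand so that `PATH` is set explicitly through the kernelspec's `env` field. The function name `write_kernelspec`, the POSIX `bin/` layout, and the assumption that `ipykernel` is already installed in the target environment are all illustrative assumptions.

```python
import json
import os
from pathlib import Path


def write_kernelspec(kernels_dir, name, env_prefix, extra_env=None):
    """Write a kernel.json that launches ipykernel from an external environment.

    Hypothetical helper: assumes a POSIX layout (<prefix>/bin/python) and that
    ipykernel is installed inside `env_prefix`.
    """
    bin_dir = Path(env_prefix) / "bin"

    # Put the environment's bin directory first on PATH, roughly approximating
    # what "true" activation would do for executable lookup.
    env = {"PATH": os.pathsep.join([str(bin_dir), os.environ.get("PATH", "")])}
    env.update(extra_env or {})

    spec = {
        "argv": [
            str(bin_dir / "python"),
            "-m",
            "ipykernel_launcher",
            "-f",
            "{connection_file}",
        ],
        "display_name": f"Python ({name})",
        "language": "python",
        "env": env,
    }

    spec_dir = Path(kernels_dir) / name
    spec_dir.mkdir(parents=True, exist_ok=True)
    spec_file = spec_dir / "kernel.json"
    spec_file.write_text(json.dumps(spec, indent=2))
    return spec_file
```

Calling `write_kernelspec("/path/to/runtime/env/share/jupyter/kernels", "my-project", "/path/to/env")` would place the spec where Jupyter looks for kernels under that prefix. Any variables an activation script would set beyond `PATH` still have to be passed by hand via `extra_env`, which is exactly the tedious part described above.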

@agoose77
Collaborator Author

agoose77 commented Dec 19, 2023

@minrk 100%. In fact, that's what this issue pertains to; we already support the standard kernel discovery as you illustrate. This topic is something that I am personally looking at; we haven't collectively had a great deal of discussion yet. To be explicit, in case we're speaking across one another, I don't think the existing kernel discovery mechanism is good enough to encourage users to work with. It's too "magic", and requires users who oftentimes don't know how kernels actually work under the hood to learn about kernelspecs.

I briefly spoke with @choldgraf about this, who mentioned the repo2docker specification as an example of an environment definition that already exists in the Jupyter ecosystem. Thinking about this at a very general level, it would be interesting to talk about environment provisioning for Jupyter Book whereby the kernels do not even need to be on the host machine. As such, I think the work that can be done here is to build out support for non-in-place environments, such that Jupyter Book can be an isolated application (but does not have to be), and this might include leaning in to the FAIR mindset in our tooling.

Here's an unordered feature list / idea list:

  1. jupyter book build . --runtime=$PWD/.venv (virtualenv)
  2. jupyter book build . (current virtualenv)
  3. jupyter book build . --runtime=$PWD/.mambaenv/ (mamba prefix)
  4. jupyter book build . --runtime=$PWD (repo2docker spec)
  5. jupyter book build . --runtime-image=my-repo-container (container image)
  6. jupyter book build . --runtime-hub=https://... (JupyterHub)
  7. jupyter book build . --runtime-gateway=https://... (Jupyter Kernel Gateway)
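
As a rough illustration of how a single `--runtime=<path>` value from the list above could be told apart, here is a sketch that inspects the directory. The function `classify_runtime` and its return labels are hypothetical; the only facts relied on are that virtual environments contain a `pyvenv.cfg` file, conda/mamba prefixes contain a `conda-meta` directory, and repo2docker reads configuration files from the repository root or from a `binder/` or `.binder/` folder. The file list is a deliberately small subset.

```python
from pathlib import Path

# Assumption: a small subset of the configuration files repo2docker recognises.
REPO2DOCKER_FILES = ("environment.yml", "requirements.txt", "Dockerfile")


def classify_runtime(path):
    """Guess what kind of runtime a --runtime path refers to (hypothetical)."""
    path = Path(path)

    # venv/virtualenv write a pyvenv.cfg at the environment root.
    if (path / "pyvenv.cfg").is_file():
        return "virtualenv"

    # conda and mamba keep package metadata in <prefix>/conda-meta.
    if (path / "conda-meta").is_dir():
        return "conda-prefix"

    # repo2docker looks in the repo root, binder/ or .binder/.
    for base in (path, path / "binder", path / ".binder"):
        if any((base / name).is_file() for name in REPO2DOCKER_FILES):
            return "repo2docker-spec"

    return "unknown"
```

Checking for an existing environment before checking for spec files means a project directory that merely contains a `.venv` subfolder is still treated as a spec directory, while pointing at the `.venv` itself selects the virtualenv.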

In JupyterLab land there are other concerns RE labextensions and the necessary compatibility for a given Python package. Actually, this has given me the necessary kick to touch base with the JLab developers on the issue.

@choldgraf
Member

Personally, I like the idea of piggy-backing on the repo2docker spec somehow, if we can get it to work without requiring a Docker image. E.g., maybe there's a "local" version of repo2docker that supports a subset of the spec (things that can be installed with conda/mamba?) but behaves similarly otherwise. Then users still don't have to know "how to build an environment" as long as they know "how to define an environment with the spec".

@minrk
Contributor

minrk commented Dec 20, 2023

I think in general, it makes sense to decouple the "here are the environment specification(s) that I've found" in repo2docker's REES from "here's how to launch an environment with them" (i.e. repo2docker's buildpacks generating). This may be hard to do in reusable code, but could probably be done at least at the specification level in REES, where we are pretty sparse on details.

It's a nice idea to have reusable discovery that is decoupled from implementation, making it easier for folks to provide more than one implementation, but that's a pretty big project.
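
One way to picture the decoupling described here, purely as a sketch and not as anything repo2docker or REES defines today: discovery returns plain data about the spec files it found, and interchangeable backends turn that data into a plan. The names `EnvSpec`, `discover`, `local_backend` and `container_backend` are hypothetical, and the backends only return commands rather than running them.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List

# Assumption: only two spec files are recognised in this sketch.
KNOWN_FILES = {"requirements.txt": "pip", "environment.yml": "conda"}


@dataclass(frozen=True)
class EnvSpec:
    kind: str   # "pip" or "conda"
    path: Path  # location of the spec file


def discover(repo) -> List[EnvSpec]:
    """Discovery only: report which spec files exist, build nothing."""
    repo = Path(repo)
    return [
        EnvSpec(kind, repo / name)
        for name, kind in KNOWN_FILES.items()
        if (repo / name).is_file()
    ]


def local_backend(specs: List[EnvSpec]) -> List[List[str]]:
    """One implementation: plan installs into the current environment."""
    commands = []
    for spec in specs:
        if spec.kind == "pip":
            commands.append(["python", "-m", "pip", "install", "-r", str(spec.path)])
        elif spec.kind == "conda":
            commands.append(["conda", "env", "update", "--file", str(spec.path)])
    return commands


def container_backend(specs: List[EnvSpec]) -> List[List[str]]:
    """Another implementation: hand the whole directory to repo2docker."""
    roots = sorted({str(spec.path.parent) for spec in specs})
    return [["repo2docker", "--no-run", root] for root in roots]
```

Both backends consume the same `discover()` output, so adding a third implementation would not require touching the discovery step.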

@agoose77
Collaborator Author

I'm also not against the pathway being (in addition to using the current environment)

  • virtual environment path
  • conda/mamba prefix
  • repo2docker generator (build and execute in docker)

The last of these, whilst bulkier, would enforce the notion that the project is buildable in repo2docker, which would be a nice boon.

@choldgraf
Member

choldgraf commented Dec 21, 2023

Just noting that I think we need to better understand the assumptions and capabilities of our users here.

My guess is that the average jupyter book user has no interest in defining virtual environments, conda/mamba environments, or using Docker. I think the maximum we might expect of them would be something like "put a requirements.txt file in this folder and it'll install into a dedicated environment to execute rather than using your default environment".
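
For reference, the "drop in a requirements.txt" flow could look roughly like the sketch below, using only the standard library `venv` module and `pip`. Nothing here exists in Jupyter Book; the directory name `.jb-env` and the functions `env_python`, `install_command` and `provision` are made up for illustration.

```python
import subprocess
import sys
import venv
from pathlib import Path


def env_python(env_dir):
    """Path to the interpreter inside a venv (layout differs on Windows)."""
    env_dir = Path(env_dir)
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def install_command(env_dir, requirements):
    """Command that installs a requirements file into the dedicated env."""
    return [str(env_python(env_dir)), "-m", "pip", "install", "-r", str(requirements)]


def provision(book_dir, env_name=".jb-env"):
    """Create <book_dir>/<env_name> if needed and install requirements.txt.

    Hypothetical: `.jb-env` is an assumed name, not a Jupyter Book convention.
    """
    book_dir = Path(book_dir)
    env_dir = book_dir / env_name
    requirements = book_dir / "requirements.txt"

    if not env_dir.exists():
        venv.EnvBuilder(with_pip=True).create(env_dir)

    if requirements.is_file():
        subprocess.run(install_command(env_dir, requirements), check=True)

    return env_python(env_dir)
```

The user only ever touches `requirements.txt`; the returned interpreter path is what a kernel would then be launched with.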

But we should put that assumption to the test by talking to users. Don't forget that we are likely on the far, far tail end of computational fluency compared with most Jupyter Book users.

@minrk
Contributor

minrk commented Dec 23, 2023

Yeah, that's why I think it's probably in-scope for book to take responsibility for creating and activating the env (in the default case) if it goes the separate-env route, and present a higher-level interface than Jupyter in general.


5 participants