
Add support for renaming kernelspecs on the fly. #1267

Open

wants to merge 9 commits into main
Conversation

ojarjur
Contributor

@ojarjur ojarjur commented May 2, 2023

This change adds support for kernelspec managers to be configured with renaming patterns that they will apply to the kernelspecs they serve.

That, in turn, will be used in a future change to allow multiple kernel spec managers to be used simultaneously without their kernel spec names colliding.

This functionality is provided using a mixin that can be inherited by a subclass of KernelSpecManager in order to add this renaming support.
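
As a rough illustration of the mixin approach (the trait names and method body below are assumptions based on this description, not the PR's actual code):

```python
# A minimal sketch, assuming hypothetical traits `spec_name_prefix` and
# `display_name_suffix`; the PR's real implementation may differ.
from jupyter_client.kernelspec import KernelSpecManager
from traitlets import Unicode
from traitlets.config import Configurable


class RenamingKernelSpecManagerMixin(Configurable):
    """Renames the kernelspecs served by the KernelSpecManager subclass
    it is mixed into (it must precede that class in the MRO)."""

    spec_name_prefix = Unicode("", help="Prefix prepended to each kernelspec name.").tag(config=True)
    display_name_suffix = Unicode("", help="Suffix appended to each display name.").tag(config=True)

    def get_all_specs(self):
        renamed = {}
        for name, spec in super().get_all_specs().items():
            spec["spec"]["display_name"] += self.display_name_suffix
            renamed[self.spec_name_prefix + name] = spec
        return renamed


class RenamingKernelSpecManager(RenamingKernelSpecManagerMixin, KernelSpecManager):
    """A 'canned' subclass pairing the mixin with the local KernelSpecManager."""
```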

Additionally, we provide canned subclasses of both KernelSpecManager and GatewayKernelSpecManager with this new renaming feature built into them.

To use the new renaming feature with a remote Kernel Gateway, the user would add the following snippet to their Jupyter config:

```
c.ServerApp.kernel_spec_manager_class = 'jupyter_server.gateway.managers.GatewayRenamingKernelSpecManager'
```

... meanwhile, the renaming functionality can be used with local kernelspecs by adding the following snippet to the config:

```
c.ServerApp.kernel_spec_manager_class = 'jupyter_server.services.kernelspecs.renaming.RenamingKernelSpecManager'
```
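
For illustration, if those managers expose renaming traits (the trait names below are assumptions drawn from the discussion that follows, not confirmed configuration keys), the config might grow to something like:

```python
# Hypothetical traits; only kernel_spec_manager_class above is confirmed.
c.ServerApp.kernel_spec_manager_class = (
    "jupyter_server.services.kernelspecs.renaming.RenamingKernelSpecManager"
)
c.RenamingKernelSpecManager.spec_name_prefix = "local-"
c.RenamingKernelSpecManager.display_name_suffix = " (local)"
```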

@ojarjur
Contributor Author

ojarjur commented May 2, 2023

@Zsailer @kevin-bates FYI, this is the next step in my work towards implementing a solution to #1187

Once this (and the PRs for handler changes) are in, the only remaining work will be defining a new type of KernelSpecManager that multiplexes between multiple, nested KernelSpecManagers.

I've left this PR in draft state for now until I get time to add new test cases for the renaming functionality, but my manual testing has shown that this seems to work well for both local and remote (via the Kernel Gateway) kernels.

@kevin-bates
Member

Hi @ojarjur - thank you for opening this PR - it's definitely in the direction we want to go with respect to Kernel Servers or Kernel Providers (whatever the name will be).

In those discussions, the idea was that each "provider" would be named, and that name would be used as the kernelspec prefix (and added to the display name, much as you have suffixed it in this PR). At that point we could easily say that the default spec_name_prefix derives from the corresponding provider name, but by then there could be configurations in which the configured spec_name_prefix is not a good value for the "provider" name. Since, today, we can have exactly one "kernel provider" (either local or a single gateway), these names and prefixes do not come into play, and will not until Kernel Providers are introduced. As a result, I'm wondering whether spec_name_prefix and display_name_suffix should be configurable at all. My concern is that this may take us down a path that proves difficult to correct due to backward-compatibility/migration issues.

I'm wondering if we should just go ahead and introduce Kernel Servers/Providers now instead and avoid potential compatibility/deprecation cycles.

I'll try to look at this in more detail but my bandwidth is limited these days and wanted to run those thoughts by you.

@ojarjur
Contributor Author

ojarjur commented May 3, 2023

@kevin-bates thanks for the quick and thoughtful response.

I can see the pragmatism in not making the spec_name_prefix configurable (at least for now). I'll try to figure out the best way to do that (hard-code it? A trait that is not in the config? Something else?).

For display_name_suffix, I was thinking that needed to be configurable for folks to be able to specify translations in other languages. Is there another option for that (e.g. something like a set of strings that get translated)?

I'm wondering if we should just go ahead and introduce Kernel Servers/Providers now instead and avoid potential compatibility/deprecation cycles.

So, I actually have a separate git worktree where I have most of the functionality for multiplexing working (I just have one bug left to fix around websocket connections to local kernels).

What I found when I tried to define a KernelProvider type was that I was just reinventing MultiKernelManager.

KernelProvider was meant to define a set of kernel specs and how to use them (create kernels with them and create websocket connections to them).

MultiKernelManager has a reference to its KernelSpecManager, so the only thing it lacked that a KernelProvider would have is knowledge about how to create websocket connections. However, it does have knowledge about what KernelManager class to use, and the websocket connections are specific to the KernelManager. So I defined an optional attribute on KernelManager to specify the BaseWebSocketConnection class to use, and then wrote a routing implementation of BaseWebSocketConnection that forwards to an instance of the websocket connection class defined by the KernelManager (falling back to the ZMQ-based one if none is provided).

That seems to work well, and lets us avoid defining a new type.
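
In sketch form, that routing connection might look something like this. The `websocket_connection_class` attribute on the kernel manager is the hypothetical hook described above; the rest mirrors jupyter_server's websocket-connection base class, and none of it is the actual prototype code:

```python
from jupyter_server.services.kernels.connection.base import (
    BaseKernelWebsocketConnection,
)
from jupyter_server.services.kernels.connection.channels import (
    ZMQChannelsWebsocketConnection,
)


class RoutingWebsocketConnection(BaseKernelWebsocketConnection):
    """Forwards websocket traffic to the connection class declared by the
    kernel's KernelManager, defaulting to the ZMQ-based implementation."""

    _delegate = None

    @property
    def delegate(self):
        if self._delegate is None:
            # `websocket_connection_class` is the hypothetical attribute on
            # the kernel manager; fall back to the ZMQ connection if absent.
            klass = (
                getattr(self.kernel_manager, "websocket_connection_class", None)
                or ZMQChannelsWebsocketConnection
            )
            self._delegate = klass(
                parent=self.kernel_manager,
                websocket_handler=self.websocket_handler,
                config=self.config,
            )
        return self._delegate

    async def connect(self):
        await self.delegate.connect()

    async def disconnect(self):
        await self.delegate.disconnect()

    def handle_incoming_message(self, incoming_msg):
        self.delegate.handle_incoming_message(incoming_msg)

    def handle_outgoing_message(self, stream, outgoing_msg):
        self.delegate.handle_outgoing_message(stream, outgoing_msg)
```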

Hopefully I'll have that change cleaned up enough to share within the next couple weeks. The reason I sent this PR separately was that I wanted to keep the scope/size of the changes smaller so that they would be more manageable.

@kevin-bates
Member

Hi @ojarjur - thank you for this work - I definitely believe we're of the same mindset. Yes, I would view a KernelProvider as a composition of MultiKernelManager/KernelSpecManager/WebsocketManager. A KernelProvider should also have a name (more of the "symbolic form", like spec_name_prefix) and a display name (that can be localized - although I suspect there are many things that aren't easily localizable today, it's a good thought to have).

Each KernelProvider is collected in a KernelProviders class that acts as a router based on the kernel prefix - so all kernel and kernelspec handlers would blindly call methods on the single KernelProviders instance that essentially routes through a hash map indexed by the kernel-name prefix (just as a MultiKernelManager routes using the kernel_id). This top-level object also ensures names are distinct.
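
An illustrative sketch of that routing shape (all names, and the `/` separator, are hypothetical, purely for discussion):

```python
class KernelProviders:
    """Routes kernelspec requests to named providers by kernel-name prefix,
    analogous to how MultiKernelManager routes by kernel_id."""

    def __init__(self):
        self._providers = {}  # provider name -> provider instance

    def register(self, name, provider):
        # The top-level object ensures provider names are distinct.
        if name in self._providers:
            raise ValueError(f"duplicate provider name: {name}")
        self._providers[name] = provider

    def _route(self, kernel_name):
        # "gpu/python3" -> provider "gpu", spec "python3"; an un-prefixed
        # name falls through to the unnamed (local) provider.
        prefix, _, rest = kernel_name.partition("/")
        if rest and prefix in self._providers:
            return self._providers[prefix], rest
        return self._providers[""], kernel_name

    def get_kernel_spec(self, kernel_name):
        provider, name = self._route(kernel_name)
        return provider.kernel_spec_manager.get_kernel_spec(name)
```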

I believe there would be two kinds of providers - a local provider, of which there can be 0 or 1 instances, and a remote provider, of which there can be 0 to N instances - and at least one provider instance must exist. By default, there is exactly 1 local provider, but if the user were to use the (legacy) --gateway-url (or equivalent configuration), there would be exactly 1 remote provider in order to preserve backward compatibility. For these backward-compatible scenarios, no names/prefixes would appear (although, for the single-gateway compatibility scenario, we may need to think more about that).

I also think that we should allow for disabling a local provider - in cases where 1 or more remote providers are desired and the operator doesn't want any kernels running locally.

About the name

Although I think Kernel Providers (KernelProvider) makes sense, I'm a little concerned that when we have detailed conversations in which kernel provisioners are part of the topic, we'll constantly have to confirm or clarify which of the two we are truly talking about - and this can be easily resolved with distinct naming. Some other names might be Kernel Servers (KernelServer), Kernel Gateways (KernelGateway - a Jupyverse "kernel server" should be a valid target), Kernel Hosts (KernelHost), Kernel Proxies (KernelProxy), and so on.

I personally find Kernel Servers (KernelServer) the most natural in light of the unfortunate conflict between provider and provisioner.

@ojarjur
Contributor Author

ojarjur commented May 4, 2023

@kevin-bates Thanks, I'm glad to see we're thinking along similar lines.

Each KernelProvider is collected in a KernelProviders class that acts as a router based on the kernel prefix

I originally tried that, but realized that I could simplify the code by routing based on the entire kernel name rather than just a prefix.

When I made that switch, I no longer needed the KernelProvider and found I could just use any instance of MultiKernelManager whose corresponding KernelManager class has an appropriate websocket_connection_class attribute set.

That also means we have the option of not renaming any kernelspecs if the kernelspecs provided by the various underlying MultiKernelManager classes don't conflict.
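
For concreteness, the full-name routing amounts to something like the following (names are illustrative, not the prototype's actual code):

```python
def build_routing_table(kernel_managers):
    """Map every (possibly renamed) kernelspec name to the
    MultiKernelManager instance that serves it."""
    table = {}
    for mkm in kernel_managers:
        for name in mkm.kernel_spec_manager.get_all_specs():
            # Renaming is only needed when two managers would collide.
            if name in table:
                raise ValueError(f"conflicting kernelspec name: {name}")
            table[name] = mkm
    return table
```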

@kevin-bates
Member

I have no doubt this could be implemented without introducing a "Kernel Provider" object, but I think that would be a mistake. Having an object that encapsulates these three "sub-objects" (really four if you count dealing with kernelspec resources) provides a means by which, one day, multiple implementations of the KP interface could conceivably be introduced. But, more importantly than extensibility, IMO, this encapsulation provides the ability to communicate more effectively - which is why naming is so important.

In your prototype, how are multiple "remote providers" configured?

A KernelProvider concept represents a "one-stop shop" for all kernels (and specs) managed by a particular provider. Some implementations may choose to do things differently, the only requirement being that the methods exposed via REST are available. Under the covers, however, those "public" methods may redirect to a kernel provider that supports an entirely different protocol on which the current public methods can be built - enabling future evolution that we may not anticipate today, etc.

That also means we have the option of not renaming any kernelspecs if the kernelspecs provided by the various underlying MultiKernelManager classes don't conflict.

I would argue that the only time we do not rename specs is in the backward-compatible scenarios, where there is exactly one "provider" - either local or gateway - and that local kernels are never renamed (i.e., the "name" of the local provider is always the empty string) when a local provider is enabled.

This change also fixes a pre-existing issue with the GatewayMappingKernelManager class where the `default_kernel_name` value was not set until *after* the first request to `/api/kernelspecs`, resulting in the first such call getting the wrong default kernel name (even though subsequent calls would have gotten the correct one).

I confirmed that this issue with the default kernel name was already present in the codebase on the main branch before this commit.
@ojarjur
Contributor Author

ojarjur commented May 5, 2023

FYI, the docs/readthedocs.org:jupyter-server check seems to have been broken by the recent urllib3 2.0 release.

I'm not sure what the right fix for that is (downgrade urllib3? Upgrade OpenSSL? Something else?).

@ojarjur
Contributor Author

ojarjur commented May 5, 2023

Sent #1269 to fix the readthedocs build failure

@ojarjur ojarjur marked this pull request as ready for review May 6, 2023 00:18
@ojarjur
Contributor Author

ojarjur commented May 9, 2023

@kevin-bates Thanks for all of your discussion so far, and sorry that I was slow to respond to some of your questions.

I thought the best way to make the discussion more concrete was to finish getting my proof of concept working and push it out so you could see for yourself exactly how it works.

I've got that working now and published it here.

This is definitely not ready for review yet (so I didn't create a pull request), but it does work with my minimal, manual testing (creating and using both local and remote kernels).

In your prototype, how are multiple "remote providers" configured?

That's the one thing I haven't figured out how to do. Traitlets being class-based (and, in particular, the existing code assuming that there is exactly one GatewayClient) makes this a problem I don't know how to solve yet.

I'm hoping that we can build the solution up incrementally where we support a small number of fixed "layouts" for now (e.g. "local kernels only", "one remote gateway only", "local kernels+one remote gateway"), and extend it to support arbitrary numbers of remote gateway servers once we figure out how to solve the configuration problem.

@kevin-bates
Member

Hi @ojarjur - like you, I feel like I need to apologize for my slow response! 😄 I'm really looking forward to taking a look at the POC, but my focus has to be on other things at the moment.

Regarding the configuration of instances, you're right - the current traits approach really is class-oriented, so I think we'll need to introduce a discovery aspect for instance-based configurables. Since these instances would be tied to an application, and we have an application-oriented directory that houses extension configurations, I suspect we could introduce a similar notion where the initialization of the "multi-kernel server" (the name is quoted because it's purely for discussion purposes) would load configuration files from a well-known location, each identifying the name of the "Kernel Server" and containing information for that instance.

This loading process would ensure there are no conflicts or ambiguities (like different names pointing at the same locations) and would "hydrate" its instances. In addition, we could tie this to the Dynamic Configuration functionality (that I'd like to bring over from EG) and enable the ability to add new Kernel Server instances on the fly without having to restart the server. (I suppose the same mechanism could determine when a given server should be shut down and removed from the active set - preferably once all requests have been completed, etc.)
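
To make the discovery idea concrete, the loading step might look roughly like this. The directory location and file schema here are purely assumptions for discussion, not an agreed design:

```python
import json
from pathlib import Path


def load_kernel_server_configs(config_dir="~/.jupyter/kernel_servers"):
    """Load one JSON file per "Kernel Server" from a well-known directory,
    rejecting duplicate names so routing stays unambiguous."""
    servers = {}
    for path in sorted(Path(config_dir).expanduser().glob("*.json")):
        config = json.loads(path.read_text())
        # e.g. {"name": "gpu", "url": "https://gateway.example.com"}
        name = config["name"]
        if name in servers:
            raise ValueError(f"duplicate kernel server name: {name}")
        servers[name] = config
    return servers
```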
