Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Register pre/post save hooks, call them sequentially #696

Merged
merged 7 commits into from Mar 14, 2022

Conversation

davidbrochart
Copy link
Contributor

Closes #695

@codecov-commenter
Copy link

codecov-commenter commented Feb 8, 2022

Codecov Report

Merging #696 (8bef774) into main (5d962d7) will decrease coverage by 0.19%.
The diff coverage is 48.43%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #696      +/-   ##
==========================================
- Coverage   70.95%   70.76%   -0.20%     
==========================================
  Files          62       62              
  Lines        7515     7555      +40     
  Branches     1190     1199       +9     
==========================================
+ Hits         5332     5346      +14     
- Misses       1826     1851      +25     
- Partials      357      358       +1     
Impacted Files Coverage Δ
jupyter_server/services/contents/manager.py 81.77% <44.06%> (-6.42%) ⬇️
jupyter_server/services/contents/filemanager.py 72.24% <100.00%> (+1.51%) ⬆️
...pyter_server/services/contents/largefilemanager.py 83.83% <100.00%> (ø)
jupyter_server/services/kernels/kernelmanager.py 80.69% <0.00%> (-1.90%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5d962d7...8bef774. Read the comment docs.

Copy link

@rahul26goyal rahul26goyal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dont have much context on the functionality.. added some minor comments.

jupyter_server/services/contents/filemanager.py Outdated Show resolved Hide resolved
jupyter_server/services/contents/filemanager.py Outdated Show resolved Hide resolved
@davidbrochart davidbrochart force-pushed the save_hooks branch 3 times, most recently from f78c480 to 5f0f6d2 Compare February 14, 2022 12:55
@davidbrochart
Copy link
Contributor Author

Should we go ahead and merge it? This will allow to move forward with jupyterlab/jupyterlab#11599.

@davidbrochart
Copy link
Contributor Author

I just saw the comments in the team compass:

Should probably have different ways to add/remove hooks as needed

I'm not sure, as this would allow e.g. a server extension to remove the hooks added by another extension?

Order is probably important

It will probably be impossible to enforce a particular order. If e.g. multiple server extensions register hooks, then it is going to be the order in which the extensions are loaded.
I think we should mention that the order in which hooks are called is not known.

Uniqueness should probably be enforced in case the validator triggers multiple times?

Maybe it should also be possible e.g. or different server extensions to register the same hook?

Maybe expose a new trait exposing the collection of hooks for manipulation.

I thinks this could be part of another PR, as it seems to require more discussion and specification.
The goal of this PR is to keep the same API, and just allow for multiple hooks instead of overriding an unique hook. I don't want to block jupyterlab/jupyterlab#11599 if a more substantial work is needed.

@vidartf
Copy link
Member

vidartf commented Feb 22, 2022

Uniqueness should probably be enforced in case the validator triggers multiple times?

Maybe it should also be possible e.g. or different server extensions to register the same hook?

I'm not sure if we are talking about the same thing. I think the original meaning here was "the same function should not be called twice by the same event". Two different server extensions can install a hook, but if they both register the same function, then the hook should only be called once. For example by making the collection of hooks a set.

It will probably be impossible to enforce a particular order. If e.g. multiple server extensions register hooks, then it is going to be the order in which the extensions are loaded.

If this is true, then maybe the RTC handler's use of this hook needs to be rethought? AFAICT, order will be very significant for that to function as expected.

@davidbrochart
Copy link
Contributor Author

If this is true, then maybe the RTC handler's use of this hook needs to be rethought? AFAICT, order will be very significant for that to function as expected.

But if the RTC handler won't be able to use this hook, then will any server extension? How can we enforce hook calling order?

@davidbrochart
Copy link
Contributor Author

I think it is urgent to move on with jupyterlab/jupyterlab#11599, after what was reported in jupyterlab/jupyterlab#12154, since it will likely solve the issue, but this PR is blocking. I suggest that either:

  • we merge this PR as-is,
  • or we drop this PR while we work on a better solution, and we show a warning each time a file hook overwrites an existing one.

I am not aware of anyone using the file hooks (did some GitHub search, but I agree it's not enough), so I would be in favor of merging this PR.
Thoughts?

@Zsailer
Copy link
Member

Zsailer commented Mar 3, 2022

What is self.post_save_hook after it has been set multiple times? It's not a list—it's the value of the last time it was called, correct? The list is stored under self._post_save_hooks.

Here's an problem I see using an example:

If anyone has configured a post_save_hook, which likely lives in a deployment's custom config file (and likely not shared on Github if it's a company), they will expect the value of post_save_hook to be the value coming from their config file. However, since extensions would set this trait after user config is loaded in the server's start-up sequence (init_configurables happens before load_server_extension), the value of this trait would be the post_save_hook provided by the last extension to set it, not user's configuration.

@davidbrochart
Copy link
Contributor Author

davidbrochart commented Mar 3, 2022

Which is not different from the current situation: an extension configures a post_save_hook and then another one configures a new post_save_hook. The first extension expects the value to be the one it has configured, but it's going to be the one from the second extension. Worse, the post_save_hook the first extension configured will never be called.

@davidbrochart
Copy link
Contributor Author

I think this PR only improves on the current situation, and I'm planning to merge it soon if there is no feedback.

@Zsailer
Copy link
Member

Zsailer commented Mar 8, 2022

Which is not different from the current situation: an extension configures a post_save_hook and then another one configures a new post_save_hook. The first extension expects the value to be the one it has configured, but it's going to be the one from the second extension. Worse, the post_save_hook the first extension configured will never be called.

Yes, but this is a bug. That's my point. We shouldn't perpetuate this issue by capitalizing on it here.

This trait wasn't intended to be owned by extensions. As a user—you would expect that your custom configuration supersedes any config set by an extension. In jupyter_server land, where extensions are more prevalent (i.e. everything is an extension), traits like this become an issue. I think we need to fix/deprecate/etc this trait rather than exploit it here.

I would suggest we create a new trait for JupyterLab to use that's a List and deprecate the old. We can iterate on the new trait (e.g. enforcing order, removing duplicates, etc.) to make it more robust over time.

@davidbrochart
Copy link
Contributor Author

davidbrochart commented Mar 8, 2022

Now I'm confused, what is the audience that this feature targets? It seems to me that it's not usable by anybody, and that it should even be removed from the documentation. In my case, it would have saved me a lot of time.

EDIT: sorry for ranting here 😄

@davidbrochart
Copy link
Contributor Author

I can create a new trait that is a List, but it could be dangerous to expose this list and allow anybody to change it. Ideally, one would just append to it, but it is easy to override it. Also, I don't know if appending is supported when the trait is set through a config file?

@Zsailer
Copy link
Member

Zsailer commented Mar 8, 2022

As a user—you would expect that your custom configuration supersedes any config set by an extension

Sorry for the confusion... "user" is a misleading term to use here.

The intended us is: a person who runs Jupyter Server or JupyterLab can write a custom function and pass it to the contents manager through the config system using the ContentsManager.pre_save_hook in a jupyter config file, e.g.:

# jupyter_server_config.py file 
c.FileContentsManager.pre_save_hook = my_custom_function

That's what the documentation is (attempting) to communicate 😅.

I think a good rule of thumb is, configurable traits (i.e. traits with config=True) are intended to be configured by user config (from config files or CLI), not extensions. If we're going to break this rule, we should carefully consider the consequences.

If JupyterLab overrides this trait after the user config is loaded, pre_save_hook will not return my_custom_function but JupyterLab's hook. IMO, that is a bug in its current form.

If we change this trait to be a List and allow multiple sources, including extensions, to provide hooks to that list—that seems okay to me. We probably want to add some extra logic to this List trait to address the issues that the comments above mention.

@Zsailer
Copy link
Member

Zsailer commented Mar 8, 2022

I can create a new trait that is a List, but it could be dangerous to expose this list and allow anybody to change it. Ideally, one would just append to it, but it is easy to override it. Also, I don't know if appending is supported when the trait is set through a config file?

Then, let's not make this a configurable trait?

Let's add a new API to contents managers to register hooks through the API instead of the config system?

@Zsailer
Copy link
Member

Zsailer commented Mar 8, 2022

I think we're just debating the "trait-ness" or "configurability" of this API. 😆 Let's just add a new, separate API for registering hooks from extensions? It doesn't have to collide with or go through this trait?

@davidbrochart
Copy link
Contributor Author

Yes you're right, a new "non-trait" API sounds good 👍

@davidbrochart davidbrochart force-pushed the save_hooks branch 2 times, most recently from 98ca951 to adfe0d0 Compare March 8, 2022 20:15
@davidbrochart
Copy link
Contributor Author

Not sure what's happening in the CI:

File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/zmq/sugar/context.py", line 190, in term
    return super(Context, self).term()

+++++++++++++++++++++++++++++++++++ Timeout ++++++++++++++++++++++++++++++++++++
tests/services/sessions/test_api.py::test_restart_kernel[jp_server_config0] 

@blink1073
Copy link
Collaborator

Not sure what's happening in the CI

I think this is related to jupyter/jupyter_client#751

@@ -18,7 +18,7 @@ def save(self, model, path=""):
if chunk is not None:
path = path.strip("/")

self.run_pre_save_hook(model=model, path=path)
self.run_pre_save_hooks(model=model, path=path)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In looking at the history of this issue, including jupyterlab/jupyterlab#11599, it led me to PR #643, where this call was within the chunk == 1 block. As a result, that PR winds up calling the pre-save hook on every chunk, although the post-save hook is (still) only called on the last chunk.

I suspect a bug was introduced there but wanted to bring it up here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, thanks for pointing that out Kevin. I think it's just a matter of adding the if chunk == 1 back, right? I can open a new PR for that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this particular hook needs to still be constrained to chunk 1 even though chunks don't appear to be in use (and handled in the else block).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but it's part of #716. How does it sound to merge this PR and then #716?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't the sequence be the other way around? #716 is fixing a bug in the existing (pre-this-PR) code base, then this PR is moving to a multi-valued hook trait. As it stands right now, this PR has a bug - it calls the pre-save hooks on every chunk (when chunking is used). I'd rather see the issue fixed as part of merging #716 and the merge that would need to happen to update this PR be applied, thereby correcting the chunking/pre-save-hooks bug. I thought the end game was to have the multi-valued hooks in place, merging this PR prior to #716 will re-introduce the single-hook trait.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with merging #716 first 👍

@davidbrochart
Copy link
Contributor Author

I added a test for register_pre_save_hook, since that's the one I will use in jupyterlab/jupyterlab#11599.
Post-save hooks seems a bit more complicated to test, since we would need to access the return value (the model) of the save method.

@davidbrochart
Copy link
Contributor Author

Actually this should be the response to the request, I'll try to add a test for post-hooks then.

@davidbrochart
Copy link
Contributor Author

I think this is ready to be merged, if someone could hit the button 😄

@Zsailer
Copy link
Member

Zsailer commented Mar 11, 2022

Great work here @davidbrochart! I'm happy to merge this.

We picked up some merge conflicts here after merging #716—do you mind rebasing here?

I don't think we can release until #717 is merged, because we learned from #713 that #165 introduced some backwards incompatible changes.

I propose that we wait until after the weekend to cut a new release with these additions, since releasing before/over a weekend can be dangerous 😅

@davidbrochart
Copy link
Contributor Author

Sounds good @Zsailer, I'll rebase and we'll release next week hopefully!

@davidbrochart
Copy link
Contributor Author

I just rebased, I think the CI failures are unrelated.

@blink1073
Copy link
Collaborator

Kicking CI to pick up fixes

@blink1073 blink1073 closed this Mar 14, 2022
@blink1073 blink1073 reopened this Mar 14, 2022
@blink1073
Copy link
Collaborator

Once this passes, I'll merge and make a minor release

@blink1073
Copy link
Collaborator

Actually, I kicked #724 as well. Assuming both pass we'll get them both in

@blink1073
Copy link
Collaborator

Ditto with #717

@blink1073
Copy link
Collaborator

Also #725

@blink1073 blink1073 merged commit 32b2f74 into jupyter-server:main Mar 14, 2022
@davidbrochart davidbrochart deleted the save_hooks branch March 14, 2022 10:53
@vidartf
Copy link
Member

vidartf commented Mar 31, 2022

@davidbrochart
Copy link
Contributor Author

Will do 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pre/post save hook registration
7 participants