Add isConnected to ServiceManager, use it in hub-extension #10156
Hi @vkaidalov-rft! Thank you for your contribution! I think this approach makes sense and is a sensible addition. The one misgiving I have is about where the To me, this sounds more like an attribute of the specific application instance at runtime, so I would propose creating an attribute in The downside of this is that you'll need to thread the info through, but semantically it seems like it is a more natural location for this information. What do you think?
7240863 to e68b01a
e68b01a to 7444026
Hello @afshin. Thank you for the review!
Thanks for looking into this. We're definitely not being very smart about server down conditions. A couple of thoughts:
@jasongrout Thanks for a second look at @vkaidalov-rft's PR. Since this will affect the API surface area, more deliberation is good.
I didn't know how to think about this, but your description makes me wonder if what we actually wanted is something like a
Yes, I think that conveys the intent better, and is similar to the terminology we use in the services package for kernel connections.
Hi @vkaidalov-rft could you update the PR and rebase/merge with master? Thanks!
Hi @goanpeca! Sure, I will be glad to finish the PR. Thank you for reminding me. Hi @jasongrout! Thank you a lot for your feedback.
As far as I know, the retry-after header is indeed not taken into account by the current implementation. However, I don't think that JupyterHub can set this header to a useful value in the case of the server maintenance described in the linked issue by @n-a-sz or in the case of the The exponential backoff strategy is already used by the pollers for failed requests. In practice, the number of periodic requests sent to the cluster remained huge even though their frequency was being reduced. The goal of this PR is to stop polling the cluster unless the error
I totally agree with you, let's rename it. As @afshin proposed, it might be named
Hey @vkaidalov-rft This will be a great addition. I would support getting it in. Could you do the rename to
250ac1d to 2775683
Hi @fcollonval. Thank you for your feedback! I've done the rename to
Thanks for reviving your PR @vkaidalov-rft.
Title changed from "Add bandwidthSaveMode to ServiceManager, use it in hub-extension" to "Add isConnected to ServiceManager, use it in hub-extension"
References
Resolves #7756.
Code changes
Currently, periodic HTTP requests are sent by KernelManager, SessionManager and the other specific managers created within the common ServiceManager instance. Each specific manager creates an instance of the Poll class from the lumino library. The instances are configured so that the polling frequency is gradually reduced when requests fail.
The Poll instance used by the FileBrowserModel is created in the filebrowser package and is not connected to the ServiceManager. An instance of the ServiceManager is still available there.
The SaveHandler class of the docmanager package also tries to save the file that it handles if there are any unsaved changes. It does so periodically using the setTimeout function. In case of a server shutdown, the unsaved files cause a lot of logs due to the retries.
As a solution, the bandwidthSaveMode boolean property was added to the ServiceManager class, so that extensions can switch it to tell other extensions whether to pause sending periodic requests. This property can also be used by third-party extensions, e.g. to pause polling the /clusters endpoint in the Dask extension for JupyterLab, which also sends periodic requests.
Given that the "decorrelated jitter" strategy had already been implemented in lumino and was in use, I only added switching the bandwidthSaveMode property in the hub-extension, so that bandwidthSaveMode is turned on when there is a Dialog widget showing a disconnection error message created by the hub-extension in particular. I suppose that the issue is not related to the scenario when JupyterLab is launched locally, so this PR doesn't modify the current behaviour in any other cases. I'm looking forward to your reviews!
User-facing changes
For those who deploy and maintain JupyterLab under JupyterHub, the changes will significantly reduce the number of logs produced by JupyterHub in case of both intended (e.g. server maintenance or the cull-idle-servers service of JupyterHub being enabled) and unintended shutdowns of the individual notebook servers.
Before
Despite the disconnection error, the periodic requests are still sent, creating a lot of 302 and 503 log records. To reproduce this, one can manually shut down the notebook server via the Hub Control Panel.
After
The polling gets paused if there is a disconnection error message created by the hub-extension.
Backwards-incompatible changes
I have not found any backwards-incompatible changes so far.
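For readers who want a concrete picture of the mechanism described under "Code changes", here is a minimal TypeScript sketch. It is not the code from this PR and does not use the real ServiceManager or lumino Poll APIs: ConnectionState, GuardedPoller and nextDelay are hypothetical names introduced only for illustration, and the base and maximum delays are arbitrary. It assumes a shared boolean flag that pollers consult before each request, combined with a decorrelated-jitter backoff on failures.

```typescript
// Hypothetical stand-in for the shared flag discussed in this PR.
// NOT the real ServiceManager API; for illustration only.
class ConnectionState {
  isConnected = true;
}

// Decorrelated jitter: the next delay is drawn uniformly from
// [base, previous * 3] and then capped at max.
function nextDelay(
  previous: number,
  base: number,
  max: number,
  random: () => number = Math.random
): number {
  const upper = Math.max(base, previous * 3);
  return Math.min(max, base + random() * (upper - base));
}

// Hypothetical poller that skips requests while the flag reports
// a disconnection, instead of only slowing down.
class GuardedPoller {
  private _delay: number;
  requestsSent = 0;

  constructor(
    private _state: ConnectionState,
    private _request: () => Promise<void>,
    private _base = 1000,
    private _max = 30000
  ) {
    this._delay = _base;
  }

  // Run one polling tick and return the delay before the next tick.
  async tick(): Promise<number> {
    if (!this._state.isConnected) {
      // Paused: send nothing, just check the flag again later.
      return this._delay;
    }
    this.requestsSent++;
    try {
      await this._request();
      this._delay = this._base; // a success resets the backoff
    } catch {
      this._delay = nextDelay(this._delay, this._base, this._max);
    }
    return this._delay;
  }
}
```

In this sketch, an extension such as the hub-extension would set the flag to false while its disconnection dialog is shown and back to true afterwards. Backoff alone only lowers the request rate, whereas the flag stops requests entirely until the connection is restored.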