Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wait to cleanup kernels after kernel is finished pending #748

Merged
merged 2 commits into from Mar 23, 2022

Conversation

Zsailer
Copy link
Member

@Zsailer Zsailer commented Mar 22, 2022

This PR fixes an issue I'm seeing when working with pending kernels.

There's race condition when shutting a kernel down. The entries in _kernel_connections and _kernel_ports for a pending shutdown get removed before that pending kernel fully finished terminating.

This raises the following exception in the kernels handler as a consequence:

Uncaught exception GET /api/kernels/0b8f1e20-e8c6-43a2-bb96-8f07dc702636/channels?session_id=61f58256-f8d3-48f3-a12e-0ac6c0f4c856 (127.0.0.1)
    HTTPServerRequest(protocol='http', host='127.0.0.1:8888', method='GET', uri='/api/kernels/0b8f1e20-e8c6-43a2-bb96-8f07dc702636/channels?session_id=61f58256-f8d3-48f3-a12e-0ac6c0f4c856', version='HTTP/1.1', remote_ip='127.0.0.1')
    Traceback (most recent call last):
      File "/.../python3.8/site-packages/tornado/web.py", line 1704, in _execute
        result = await result
      File "/.../python3.8/site-packages/jupyter_server/services/kernels/handlers.py", line 409, in get
        await super(ZMQChannelsHandler, self).get(kernel_id=kernel_id)
      File "/.../lib/python3.8/site-packages/jupyter_server/base/zmqhandlers.py", line 341, in get
        await res
      File "/.../python3.8/site-packages/tornado/websocket.py", line 278, in get
        await self.ws_connection.accept_connection(self)
      File "/.../python3.8/site-packages/tornado/websocket.py", line 879, in accept_connection
        await self._accept_connection(handler)
      File "/.../python3.8/site-packages/tornado/websocket.py", line 962, in _accept_connection
        await self._receive_frame_loop()
      File "/.../python3.8/site-packages/tornado/websocket.py", line 1119, in _receive_frame_loop
        self.handler.on_ws_connection_close(self.close_code, self.close_reason)
      File "/.../python3.8/site-packages/tornado/websocket.py", line 576, in on_ws_connection_close
        self.on_connection_close()
      File "/.../python3.8/site-packages/tornado/websocket.py", line 568, in on_connection_close
        self.on_close()
      File "/.../python3.8/site-packages/jupyter_server/services/kernels/handlers.py", line 722, in on_close
        if km._kernel_connections[self.kernel_id] == 0:
    KeyError: '0b8f1e20-e8c6-43a2-bb96-8f07dc702636'

@Zsailer Zsailer added the bug label Mar 22, 2022
@codecov-commenter
Copy link

codecov-commenter commented Mar 22, 2022

Codecov Report

Merging #748 (f827c30) into main (3864399) will not change coverage.
The diff coverage is 100.00%.

❗ Current head f827c30 differs from pull request most recent head 7f30adc. Consider uploading reports for the commit 7f30adc to get more accurate results

@@           Coverage Diff           @@
##             main     #748   +/-   ##
=======================================
  Coverage   70.51%   70.51%           
=======================================
  Files          62       62           
  Lines        7615     7615           
  Branches     1216     1216           
=======================================
  Hits         5370     5370           
  Misses       1877     1877           
  Partials      368      368           
Impacted Files Coverage Δ
jupyter_server/services/kernels/kernelmanager.py 80.69% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3864399...7f30adc. Read the comment docs.

Copy link
Member

@kevin-bates kevin-bates left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Zsailer - nice catch. These changes make sense, just had a question regarding the conditions in which they're invoked.

jupyter_server/services/kernels/kernelmanager.py Outdated Show resolved Hide resolved
jupyter_server/services/kernels/kernelmanager.py Outdated Show resolved Hide resolved
Copy link
Member

@kevin-bates kevin-bates left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - Thanks @Zsailer!

@blink1073 blink1073 merged commit cd1b1b8 into jupyter-server:main Mar 23, 2022
@Zsailer Zsailer deleted the pending-kernel-cleanup branch January 16, 2024 21:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants