Agent server fails to handle asyncio.CancelledError from create_kernels() #1128

Closed
rapsealk opened this issue Mar 6, 2023 · 1 comment · May be fixed by #1129
rapsealk commented Mar 6, 2023

What Operating System(s) are you seeing this problem on?

macOS (Apple Silicon)

Backend.AI version

b2c4a7a

Describe the bug

The agent server fails to handle asyncio.CancelledError in some cases:

2023-03-06 11:07:07.884 ERROR callosum.rpc.channel.Peer [16442] RPC user error
Traceback (most recent call last):
  File "/Users/rapsealk/Desktop/git/backend.ai-dev/dist/export/python/virtualenvs/python-default/3.11.2/lib/python3.11/site-packages/callosum/rpc/channel.py", line 281, in _func_task
    result = await self._func_scheduler.get_fut(server_request_id)
    ^^^^^^^^^^^^^^^^^
  File "/Users/rapsealk/Desktop/git/backend.ai-dev/dist/export/python/virtualenvs/python-default/3.11.2/lib/python3.11/site-packages/callosum/ordering.py", line 224, in get_fut
    return await task
  File "/Users/rapsealk/Desktop/git/backend.ai-dev/src/ai/backend/agent/server.py", line 158, in _inner
    return await meth(
    ^^^^^^^^^^^^^^^^^
  File "/Users/rapsealk/Desktop/git/backend.ai-dev/src/ai/backend/agent/server.py", line 133, in _inner
    return await meth(self, *args, **kwargs)
    ^^^^^^^^^^^^^^^^^
  File "/Users/rapsealk/Desktop/git/backend.ai-dev/src/ai/backend/agent/server.py", line 386, in create_kernels
    raw_results = [
  File "/Users/rapsealk/Desktop/git/backend.ai-dev/src/ai/backend/agent/server.py", line 388, in <listcomp>
    "id": str(result["id"]),
    ^^^^^^^^^^^^^^^^^
TypeError: 'CancelledError' object is not subscriptable

asyncio.CancelledError has been a subclass of BaseException rather than Exception since Python 3.8. As a result, the condition isinstance(item, Exception) does not match it, so a cancelled result is not collected into errors and is later subscripted as if it were a successful result:

errors = [*filter(lambda item: isinstance(item, Exception), results)]
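
To make the mismatch concrete, here is a minimal sketch. It assumes `results` mixes normal return values with exception objects, as `asyncio.gather(..., return_exceptions=True)` would produce; the sample list and the names `errors_narrow` and `errors_wide` are hypothetical and not taken from the agent code.

```python
import asyncio

# Hypothetical stand-in for the collected results: one success,
# one ordinary exception, and one cancellation.
results = [{"id": "k1"}, ValueError("boom"), asyncio.CancelledError()]

# The filter quoted in the issue: CancelledError is not an Exception
# subclass (Python 3.8+), so it slips through.
errors_narrow = [*filter(lambda item: isinstance(item, Exception), results)]

# Checking against BaseException also catches the cancellation.
errors_wide = [*filter(lambda item: isinstance(item, BaseException), results)]

print(len(errors_narrow), len(errors_wide))
```

With the narrow filter, the CancelledError object stays among the presumed successes, which is consistent with the TypeError in the traceback above.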

To Reproduce

No response

Expected Behavior

The agent should raise an exception and should not attempt to subscript the result.

Anything else?

No response

@rapsealk rapsealk added the type:bug Reports about that are not working label Mar 6, 2023
@rapsealk rapsealk added this to the 23.03 milestone Mar 6, 2023
@rapsealk rapsealk self-assigned this Mar 6, 2023
@rapsealk rapsealk modified the milestones: 23.03, 24.03 Apr 4, 2024
@rapsealk rapsealk added the comp:agent Related to Agent component label Apr 4, 2024
@achimnol
Copy link
Member

achimnol commented Apr 17, 2024

That part of code has been refactored in #1771.

match result:
    case BaseException():
        errors.append(result)
    case _:

The 23.09 and 24.03 releases should not have this issue.
