Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Technical CancelledError bubbling up where it should not #8175

Open
1 task done
dmoklaf opened this issue Feb 20, 2024 · 2 comments
Open
1 task done

Technical CancelledError bubbling up where it should not #8175

dmoklaf opened this issue Feb 20, 2024 · 2 comments
Labels

Comments

@dmoklaf
Copy link

dmoklaf commented Feb 20, 2024

Describe the bug

I am using a large-scale crawler based on aiohttp client. Some old web servers break connections randomly when under load (hence this bug is difficult to reproduce). When they do, in rare cases, the aiohttp client emits a CancelledError. Investigating the code, I found where this is coming from:

await self._writer

await self._writer

These methods do not suppress the CancelledError that may be raised internally by the _writer (a task wrapping the write_bytes method). This task emits a CancelledError in very rare cases, when it is cancelled by the aiohttp internal machinery before having been truly started by the asyncio scheduler. In this case, none of the internal CancelledError try...except mechanisms it has are called, and the CancelledError, entirely technical to serve the internal aiohttp logic, bubbles up to the library caller.

To Reproduce

Difficult to reproduce (a large scale crawler is necessary to reach this error)

Expected behavior

The CancelledError, being purely internal, shall not bubble up to the library caller. It shall be intercepted in the 2 methods above.

To do so, a robust CancelledError suppression mechanism, as the one posted in #8174, can be used.

However, I am not familiar with the internals of the aiohttp libray, I am not sure if both methods need to be patched, one only or none (different fix needed).

Logs/tracebacks

None

Python Version

$ python --version
Python 3.11.6

aiohttp Version

$ python -m pip show aiohttp
Name: aiohttp
Version: 3.9.3
Summary: Async http client/server framework (asyncio)
Home-page: https://github.com/aio-libs/aiohttp
Author: 
Author-email: 
License: Apache 2
Location: /opt/homebrew/Caskroom/miniforge/base/envs/work/lib/python3.11/site-packages
Requires: aiosignal, attrs, frozenlist, multidict, yarl
Required-by: aiohttp-jinja2, asyncpraw, asyncprawcore, datasets, httpstan, pystan

multidict Version

$ python -m pip show multidict
Name: multidict
Version: 6.0.5
Summary: multidict implementation
Home-page: https://github.com/aio-libs/multidict
Author: Andrew Svetlov
Author-email: andrew.svetlov@gmail.com
License: Apache 2
Location: /opt/homebrew/Caskroom/miniforge/base/envs/work/lib/python3.11/site-packages
Requires: 
Required-by: aiohttp, yarl

yarl Version

$ python -m pip show yarl
Name: yarl
Version: 1.9.4
Summary: Yet another URL library
Home-page: https://github.com/aio-libs/yarl
Author: Andrew Svetlov
Author-email: andrew.svetlov@gmail.com
License: Apache-2.0
Location: /opt/homebrew/Caskroom/miniforge/base/envs/work/lib/python3.11/site-packages
Requires: idna, multidict
Required-by: aiohttp, asyncprawcore

OS

MacOS Sonoma 14.2.1

Related component

Client

Additional context

No response

Code of Conduct

  • I agree to follow the aio-libs Code of Conduct
@dmoklaf dmoklaf added the bug label Feb 20, 2024
@bdraco
Copy link
Member

bdraco commented Feb 20, 2024

However, I am not familiar with the internals of the aiohttp libray, I am not sure if both methods need to be patched, one only or none (different fix needed).

Its likely both since the cancellation of the writer shouldn't be leaking upward

@dmoklaf
Copy link
Author

dmoklaf commented Feb 21, 2024

It looks like this refactoring would also allow to simplify the code of the write_bytes method underlying the _writer task:

This method catches CancelledError which would be useless with the new mechanism described above. Also, this method seems to catch other errors, but not completely. In some cases, it exposes itself to new exception being raised, e.g. in

await writer.write_eof()

and

await writer.write_eof()

So there seems to be an unconsistent mix of behaviors regarding exceptions handling/raising. But maybe connection management (ie detecting broken connections and bubbling up these I/O errors) make it necessary.

Reading the entire class, it looks like there is a lot of complexity around the _writer task to work around the various cases. Maybe all this _writer code could be simplified by using a clean single callback connected to the task, together with a Future, to handle properly the various task exit cases.

I am definitely not versed enough in this repository to guess what would be the correct change to perform to align all this on a single, simple behavior.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants