Limited queue length causes internal errors with aiobotocore #136

Open
DRMacIver opened this issue Jan 16, 2024 · 1 comment

@DRMacIver

While working on some new code, I forgot to apply the workaround for #130 when creating a very large number of parallel tasks with aiobotocore.

Here's a simplified example triggering it:

import trio
import trio_asyncio
from aiobotocore.session import get_session
from botocore.config import Config as BotoConfig
from trio_asyncio import aio_as_trio

# Replace with some S3 bucket and key. I didn't have a good public one to reference, sorry.
MY_BUCKET = '...'
MY_KEY = '...'
MY_REGION = '...'
# Local path that the downloaded bytes are written to.
MY_TARGET = '...'

async def main():
    session = get_session()
    config = BotoConfig(retries={'max_attempts': 20})
    async with aio_as_trio(session.create_client('s3', region_name=MY_REGION, config=config)) as client:
        async with trio.open_nursery() as nursery:
            # Start 10000 downloads at once, without the workaround for #130.
            for _ in range(10000):
                @nursery.start_soon
                async def download_file_from_s3():
                    response = await aio_as_trio(client.get_object(
                        Bucket=MY_BUCKET,
                        Key=MY_KEY,
                    ))

                    async with await trio.open_file(MY_TARGET, 'wb') as o:
                        body = response['Body']
                        async with aio_as_trio(body):
                            # Stream the body in 1 MB chunks until it is exhausted.
                            while True:
                                chunk = await aio_as_trio(body.read(10 ** 6))
                                if not chunk:
                                    break
                                await o.write(chunk)

if __name__ == '__main__':
    trio_asyncio.run(main)

As well as the initial exception from #130 (which is expected), this produces a number of other internal errors. In particular:

AssertionError:
Exception ignored in: <coroutine object Runner.init at 0x7f143b608760>
Traceback (most recent call last):
  File "/home/ec2-user/.local/share/virtualenvs/my-project/lib/python3.11/site-packages/trio/_core/_run.py", line 1909, in init
  File "/home/ec2-user/.local/share/virtualenvs/my-project/lib/python3.11/site-packages/trio/_core/_run.py", line 958, in __aexit__
  File "/home/ec2-user/.local/share/virtualenvs/my-project/lib/python3.11/site-packages/trio/_core/_run.py", line 1101, in _nested_child_finished
  File "/home/ec2-user/.local/share/virtualenvs/my-project/lib/python3.11/site-packages/trio/_core/_run.py", line 1080, in _add_exc
  File "/home/ec2-user/.local/share/virtualenvs/my-project/lib/python3.11/site-packages/trio/_core/_ki.py", line 181, in wrapper
  File "/home/ec2-user/.local/share/virtualenvs/my-project/lib/python3.11/site-packages/trio/_core/_run.py", line 796, in cancel
  File "/home/ec2-user/.local/share/virtualenvs/my-project/lib/python3.11/site-packages/trio/_core/_run.py", line 453, in recalculate
  File "/home/ec2-user/.local/share/virtualenvs/my-project/lib/python3.11/site-packages/trio/_core/_run.py", line 1439, in _attempt_delivery_of_any_pending_cancel
  File "/home/ec2-user/.local/share/virtualenvs/my-project/lib/python3.11/site-packages/trio/_core/_run.py", line 1421, in _attempt_abort
  File "/home/ec2-user/.local/share/virtualenvs/my-project/lib/python3.11/site-packages/trio/_core/_io_epoll.py", line 306, in abort
  File "/home/ec2-user/.local/share/virtualenvs/my-project/lib/python3.11/site-packages/trio/_core/_io_epoll.py", line 275, in _update_registrations
ValueError: I/O operation on closed epoll object
Exception ignored in: <function Nursery.__del__ at 0x7f143bfadc60>
Traceback (most recent call last):
  File "/home/ec2-user/.local/share/virtualenvs/my-project/lib/python3.11/site-packages/trio/_core/_run.py", line 1266, in __del__
AssertionError:
Exception ignored in: <coroutine object run.<locals>._run_task at 0x7f143a8c2c50>
Traceback (most recent call last):
  File "/home/ec2-user/.local/share/virtualenvs/my-project/lib/python3.11/site-packages/trio_asyncio/_loop.py", line 527, in _run_task
  File "/usr/lib64/python3.11/contextlib.py", line 222, in __aexit__
  File "/home/ec2-user/.local/share/virtualenvs/my-project/lib/python3.11/site-packages/trio_asyncio/_loop.py", line 453, in open_loop
  File "/home/ec2-user/.local/share/virtualenvs/my-project/lib/python3.11/site-packages/trio/_core/_run.py", line 958, in __aexit__
  File "/home/ec2-user/.local/share/virtualenvs/my-project/lib/python3.11/site-packages/trio/_core/_run.py", line 1101, in _nested_child_finished
  File "/home/ec2-user/.local/share/virtualenvs/my-project/lib/python3.11/site-packages/trio/_core/_run.py", line 1080, in _add_exc
  File "/home/ec2-user/.local/share/virtualenvs/my-project/lib/python3.11/site-packages/trio/_core/_ki.py", line 181, in wrapper
  File "/home/ec2-user/.local/share/virtualenvs/my-project/lib/python3.11/site-packages/trio/_core/_run.py", line 796, in cancel
  File "/home/ec2-user/.local/share/virtualenvs/my-project/lib/python3.11/site-packages/trio/_core/_run.py", line 446, in recalculate
  File "/home/ec2-user/.local/share/virtualenvs/my-project/lib/python3.11/site-packages/trio/_core/_run.py", line 825, in cancel_called
  File "/home/ec2-user/.local/share/virtualenvs/my-project/lib/python3.11/site-packages/trio/_core/_generated_run.py", line 79, in current_time
RuntimeError: must be called from async context
Task was destroyed but it is pending!
task: <Task cancelling name='Task-8630' coro=<AioBaseClient._make_api_call() done, defined at /home/ec2-user/.local/share/virtualenvs/my-project/lib/python3.11/site-packages/aiobotocore/client.py:324> wait_for=<Future cancelled> cb=[run_aio_future.<locals>.done_cb() at /home/ec2-user/.local/share/virtualenvs/my-project/lib/python3.11/site-packages/trio_asyncio/_util.py:28]>

Then, after a very large number of errors like these have been reported, we start seeing:

Exception in default exception handler
Traceback (most recent call last):
  File "/usr/lib64/python3.11/asyncio/base_events.py", line 1797, in call_exception_handler
    self.default_exception_handler(context)
  File "/home/ec2-user/.local/share/virtualenvs/my-project/lib/python3.11/site-packages/trio_asyncio/_async.py", line 50, in default_exception_handler
    self._nursery.start_soon(propagate_asyncio_error)
  File "/home/ec2-user/.local/share/virtualenvs/my-project/lib/python3.11/site-packages/trio/_core/_run.py", line 1191, in start_soon
    GLOBAL_RUN_CONTEXT.runner.spawn_impl(async_fn, args, self, name)
    ^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'RunContext' object has no attribute 'runner'

Presumably some underlying root problem is causing all of these failures together, but I'm not sure what it is. The GLOBAL_RUN_CONTEXT.runner error at the end is especially suspicious: the only place I can find that can delete that attribute is here, but as far as I know this code never forks.
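
For reference, the shape of that last AttributeError can be reproduced without trio. The following is a minimal sketch, assuming the run context is a per-thread object in the style of `threading.local` (an assumption not confirmed in this thread). The `RunContext` class and the helper functions here are hypothetical stand-ins, not trio's code. The sketch shows that the same message appears both when the attribute is deleted and when it is read from a thread that never set it, so it may help narrow the search, but it does not identify the root cause.

```python
import threading


class RunContext(threading.local):
    """Hypothetical stand-in for a per-thread run context (not trio's class)."""


GLOBAL_RUN_CONTEXT = RunContext()


def describe_runner():
    """Report what the current thread sees when it reads `runner`."""
    try:
        return f"runner={GLOBAL_RUN_CONTEXT.runner!r}"
    except AttributeError as exc:
        return f"AttributeError: {exc}"


def describe_runner_in_new_thread():
    """Read the attribute from a fresh thread and return what it saw."""
    seen = []
    worker = threading.Thread(target=lambda: seen.append(describe_runner()))
    worker.start()
    worker.join()
    return seen[0]


# Set the attribute on the main thread only.
GLOBAL_RUN_CONTEXT.runner = "main-thread runner"
main_view = describe_runner()

# A different thread has its own, empty view of the same object.
thread_view = describe_runner_in_new_thread()

# Deleting the attribute makes the main thread fail in the same way.
del GLOBAL_RUN_CONTEXT.runner
after_delete_view = describe_runner()

print(main_view)
print(thread_view)
print(after_delete_view)
```

If the assumption holds, a callback that runs outside the thread or run that set the attribute would be one more way to hit this error besides explicit deletion.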

@oremanj changed the title from "Various internal errors when triggering #130 with aiobotocore" to "Limited queue length causes internal errors with aiobotocore" on Feb 8, 2024
@oremanj
Member

oremanj commented Feb 8, 2024

I found this very difficult to track down, so I worked around it by making the default queue length unlimited. I'm leaving the issue open as a pointer to the problems that arise with a limited queue length, though that is no longer a likely configuration.
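
For anyone who still runs with a limited queue length, one general way to reduce the pressure is to bound how many tasks are in flight at once instead of starting all of them together. The following is a minimal sketch of that idea, not necessarily the workaround referred to in #130. It uses plain asyncio with `asyncio.Semaphore` so that it runs with the standard library alone; in trio code, `trio.CapacityLimiter` plays the same role. `fake_download`, `run_bounded`, `MAX_IN_FLIGHT`, and `TOTAL_TASKS` are hypothetical names, and the sleep stands in for the real `get_object` call.

```python
import asyncio

MAX_IN_FLIGHT = 10   # hypothetical cap on concurrent downloads
TOTAL_TASKS = 1000   # hypothetical number of downloads to schedule


async def fake_download(semaphore, stats):
    """Stand-in for one download; only `MAX_IN_FLIGHT` run at a time."""
    async with semaphore:
        stats["in_flight"] += 1
        stats["peak"] = max(stats["peak"], stats["in_flight"])
        await asyncio.sleep(0)  # placeholder for the real network call
        stats["in_flight"] -= 1
        stats["done"] += 1


async def run_bounded(total=TOTAL_TASKS, limit=MAX_IN_FLIGHT):
    """Schedule `total` downloads while letting at most `limit` proceed."""
    semaphore = asyncio.Semaphore(limit)
    stats = {"in_flight": 0, "peak": 0, "done": 0}
    await asyncio.gather(*(fake_download(semaphore, stats) for _ in range(total)))
    return stats


stats = asyncio.run(run_bounded())
print(stats)
```

All tasks are still created up front; the semaphore only limits how many are past the gate and doing work at any moment. Whether this is enough to stay under a given queue length is untested here.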
