AssertionError on running tasks in parallel #7656

Closed
4 tasks done
deepanshu-zluri opened this issue Nov 25, 2022 · 22 comments
Labels
bug Something isn't working status:upstream An upstream issue caused by a bug in one of our dependencies

Comments

@deepanshu-zluri

deepanshu-zluri commented Nov 25, 2022

First check

  • I added a descriptive title to this issue.
  • I used the GitHub search to find a similar issue and didn't find it.
  • I searched the Prefect documentation for this issue.
  • I checked that this issue is related to Prefect and not one of its dependencies.

Bug summary

  1. A main flow that uses the SequentialTaskRunner calls a subflow, which only calls .map on a task to create parallel task runs.
  2. A few of those task runs crash with an AssertionError.

[Screenshot, 2022-11-25 10:57:50 AM]

Reproduction

from prefect import task, flow
from prefect import get_run_logger
import time
from prefect.task_runners import SequentialTaskRunner


@task(name='run_scheduler')
def run_scheduler(event):
    scheduler_output = list(range(1, 300))
    time.sleep(150)  # Sleep to simulate a long-running task
    return scheduler_output


@task(name='run_executor', tags=['spends_executor'])
def run_executor(scheduler_output):
    time.sleep(150)  # Sleep to simulate a long-running task
    executor_output = f"printing just the executor input {scheduler_output}"
    return executor_output


@flow(task_runner=SequentialTaskRunner())
def spends_flow(event):
    logger = get_run_logger()
    logger.info(event)

    # Call the scheduler task
    scheduler_output = run_scheduler(event)

    # If the output is empty, just log it; otherwise call the subflow,
    # which creates the parallel task runs

    if len(scheduler_output) < 1:
        logger.info("no elements in array")
    else:
        spends_executor(scheduler_output)
    logger.info('flow completed')


@flow
def spends_executor(scheduler_output):
    run_executor.map(scheduler_output)


if __name__ == "__main__":
    event = "test"
    spends_flow(event)

Error

Encountered exception during execution:
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/prefect/engine.py", line 1247, in orchestrate_task_run
    result = await run_sync(task.fn, *args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/prefect/utilities/asyncutils.py", line 68, in run_sync_in_worker_thread
    return await anyio.to_thread.run_sync(call, cancellable=True)
  File "/usr/local/lib/python3.9/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/usr/local/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "flows/test_calculate_spends_flows/test_spends_flow.py", line 16, in run_executor
    output = run_deployment(
  File "/usr/local/lib/python3.9/site-packages/prefect/utilities/asyncutils.py", line 197, in coroutine_wrapper
    return run_async_from_worker_thread(async_fn, *args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/prefect/utilities/asyncutils.py", line 148, in run_async_from_worker_thread
    return anyio.from_thread.run(call)
  File "/usr/local/lib/python3.9/site-packages/anyio/from_thread.py", line 49, in run
    return asynclib.run_async_from_thread(func, *args)
  File "/usr/local/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 970, in run_async_from_thread
    return f.result()
  File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 446, in result
    return self.__get_result()
  File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.9/site-packages/prefect/client/utilities.py", line 47, in with_injected_client
    return await fn(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/prefect/deployments.py", line 131, in run_deployment
    flow_run = await client.read_flow_run(flow_run_id)
  File "/usr/local/lib/python3.9/site-packages/prefect/client/orion.py", line 1443, in read_flow_run
    response = await self._client.get(f"/flow_runs/{flow_run_id}")
  File "/usr/local/lib/python3.9/site-packages/httpx/_client.py", line 1757, in get
    return await self.request(
  File "/usr/local/lib/python3.9/site-packages/httpx/_client.py", line 1533, in request
    return await self.send(request, auth=auth, follow_redirects=follow_redirects)
  File "/usr/local/lib/python3.9/site-packages/prefect/client/base.py", line 160, in send
    await super().send(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/httpx/_client.py", line 1620, in send
    response = await self._send_handling_auth(
  File "/usr/local/lib/python3.9/site-packages/httpx/_client.py", line 1648, in _send_handling_auth
    response = await self._send_handling_redirects(
  File "/usr/local/lib/python3.9/site-packages/httpx/_client.py", line 1685, in _send_handling_redirects
    response = await self._send_single_request(request)
  File "/usr/local/lib/python3.9/site-packages/httpx/_client.py", line 1722, in _send_single_request
    response = await transport.handle_async_request(request)
  File "/usr/local/lib/python3.9/site-packages/httpx/_transports/default.py", line 353, in handle_async_request
    resp = await self._pool.handle_async_request(req)
  File "/usr/local/lib/python3.9/site-packages/httpcore/_async/connection_pool.py", line 221, in handle_async_request
    await self._attempt_to_acquire_connection(status)
  File "/usr/local/lib/python3.9/site-packages/httpcore/_async/connection_pool.py", line 160, in _attempt_to_acquire_connection
    status.set_connection(connection)
  File "/usr/local/lib/python3.9/site-packages/httpcore/_async/connection_pool.py", line 22, in set_connection
    assert self.connection is None
AssertionError
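
The last two frames show where the crash originates: the connection pool asserts that a request's status object has no connection yet, just before assigning one. The following is a minimal toy model of that invariant, assuming nothing about httpcore beyond the two lines visible in the traceback. The class name `ToyRequestStatus`, the helper `assign_twice`, and the string "connections" are hypothetical and exist only for illustration.

```python
class ToyRequestStatus:
    """Hypothetical stand-in for the status object seen in the traceback."""

    def __init__(self):
        self.connection = None

    def set_connection(self, connection):
        # Same invariant as the traceback line `assert self.connection is None`.
        assert self.connection is None
        self.connection = connection


def assign_twice():
    """Return True if a second assignment raises AssertionError."""
    status = ToyRequestStatus()
    status.set_connection("conn-1")
    try:
        # The status already holds a connection, so the invariant fails here.
        status.set_connection("conn-2")
    except AssertionError:
        return True
    return False
```

This only illustrates the invariant that failed. It does not show how the pool reached that state in the affected release.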

Versions

agent
Version:             2.6.6
API version:         0.8.3
Python version:      3.9.15
Git commit:          87767cda
Built:               Thu, Nov 3, 2022 1:15 PM
OS/Arch:             linux/x86_64
Profile:             default
Server type:         hosted

self hosted orion server
Version:             2.6.0
API version:         0.8.2
Python version:      3.9.14
Git commit:          96f09a51
Built:               Thu, Oct 13, 2022 3:21 PM
OS/Arch:             linux/x86_64
Profile:             default
Server type:         hosted

Additional context

No response

@deepanshu-zluri deepanshu-zluri added bug Something isn't working status:triage labels Nov 25, 2022
@padbk
Contributor

padbk commented Nov 25, 2022

We are seeing something very similar, but also on ConcurrentTaskRunner flows.

We are running Prefect 2.6.9 with Python 3.9 on both the agent and Orion, deployed from the Helm chart on EKS.

[Image attachment]

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/prefect/engine.py", line 610, in orchestrate_flow_run
    result = await run_sync(flow_call)
  File "/usr/local/lib/python3.9/site-packages/prefect/utilities/asyncutils.py", line 68, in run_sync_in_worker_thread
    return await anyio.to_thread.run_sync(call, cancellable=True)
  File "/usr/local/lib/python3.9/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/usr/local/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "ourflow.py", line 43, in our_flow
    run_task(ourtask, wait_for=another_task)
  File "/usr/local/lib/python3.9/site-packages/prefect/tasks.py", line 360, in __call__
    return enter_task_run_engine(
  File "/usr/local/lib/python3.9/site-packages/prefect/engine.py", line 733, in enter_task_run_engine
    return run_async_from_worker_thread(begin_run)
  File "/usr/local/lib/python3.9/site-packages/prefect/utilities/asyncutils.py", line 148, in run_async_from_worker_thread
    return anyio.from_thread.run(call)
  File "/usr/local/lib/python3.9/site-packages/anyio/from_thread.py", line 49, in run
    return asynclib.run_async_from_thread(func, *args)
  File "/usr/local/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 970, in run_async_from_thread
    return f.result()
  File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 446, in result
    return self.__get_result()
  File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.9/site-packages/prefect/engine.py", line 874, in get_task_call_return_value
    return await future._result()
  File "/usr/local/lib/python3.9/site-packages/prefect/futures.py", line 237, in _result
    return await final_state.result(raise_on_failure=raise_on_failure, fetch=True)
  File "/usr/local/lib/python3.9/site-packages/prefect/states.py", line 86, in _get_state_result
    raise MissingResult(
prefect.exceptions.MissingResult: State data is missing. Typically, this occurs when result persistence is disabled and the state has been retrieved from the API.

@gmega

gmega commented Nov 25, 2022

I'm using SequentialTaskRunner and no async tasks (i.e. not running anything in parallel), and I'm getting the same error as @deepanshu-zluri. I'm also running Prefect 2.6.9, but on Python 3.10. We're using Prefect Cloud.

Crash details:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/prefect/engine.py", line 1405, in report_task_run_crashes
    yield
  File "/usr/local/lib/python3.10/site-packages/prefect/engine.py", line 1108, in begin_task_run
    state = await orchestrate_task_run(
  File "/usr/local/lib/python3.10/site-packages/prefect/engine.py", line 1203, in orchestrate_task_run
    state = await propose_state(
  File "/usr/local/lib/python3.10/site-packages/prefect/engine.py", line 1534, in propose_state
    response = await client.set_task_run_state(
  File "/usr/local/lib/python3.10/site-packages/prefect/client/orion.py", line 1687, in set_task_run_state
    response = await self._client.post(
  File "/usr/local/lib/python3.10/site-packages/httpx/_client.py", line 1848, in post
    return await self.request(
  File "/usr/local/lib/python3.10/site-packages/httpx/_client.py", line 1533, in request
    return await self.send(request, auth=auth, follow_redirects=follow_redirects)
  File "/usr/local/lib/python3.10/site-packages/prefect/client/base.py", line 160, in send
    await super().send(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/httpx/_client.py", line 1620, in send
    response = await self._send_handling_auth(
  File "/usr/local/lib/python3.10/site-packages/httpx/_client.py", line 1648, in _send_handling_auth
    response = await self._send_handling_redirects(
  File "/usr/local/lib/python3.10/site-packages/httpx/_client.py", line 1685, in _send_handling_redirects
    response = await self._send_single_request(request)
  File "/usr/local/lib/python3.10/site-packages/httpx/_client.py", line 1722, in _send_single_request
    response = await transport.handle_async_request(request)
  File "/usr/local/lib/python3.10/site-packages/httpx/_transports/default.py", line 353, in handle_async_request
    resp = await self._pool.handle_async_request(req)
  File "/usr/local/lib/python3.10/site-packages/httpcore/_async/connection_pool.py", line 221, in handle_async_request
    await self._attempt_to_acquire_connection(status)
  File "/usr/local/lib/python3.10/site-packages/httpcore/_async/connection_pool.py", line 160, in _attempt_to_acquire_connection
    status.set_connection(connection)
  File "/usr/local/lib/python3.10/site-packages/httpcore/_async/connection_pool.py", line 22, in set_connection
    assert self.connection is None
AssertionError

@zanieb
Contributor

zanieb commented Nov 25, 2022

Hi! I believe this error is from the most recent httpcore/httpx releases — I'd recommend downgrading those dependencies to the previous version.

@padbk that looks like an unrelated issue.
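
Before downgrading, it may help to see which versions are actually installed in the environment where the flow runs. A minimal sketch using only the standard library (`importlib.metadata`); the helper name `installed_versions` is hypothetical, and the package list simply mirrors the libraries named in the tracebacks above:

```python
from importlib import metadata


def installed_versions(packages=("prefect", "httpx", "httpcore")):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Running this inside the same image or container that executes the flow run, rather than on a local machine, is what makes the output meaningful.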

@zanieb zanieb added status:upstream An upstream issue caused by a bug in one of our dependencies and removed status:triage labels Nov 25, 2022
@zanieb
Contributor

zanieb commented Nov 25, 2022

Looks like they've addressed this upstream in encode/httpcore#627

@sti0

sti0 commented Nov 28, 2022

It seems this was fixed in httpcore v0.16.2. My flows are stable again.

@deepanshu-zluri
Author

deepanshu-zluri commented Nov 28, 2022

@sti0 how did you change the httpcore version? If you are using a particular Prefect image for the agent, can you please share it? I added httpcore==0.16.2 to my requirements, but I'm still getting the same issue.

@bennnym

bennnym commented Nov 28, 2022

I also used that httpcore version and am getting the same issues.
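
Since pinning the version did not change the behaviour for everyone, one thing worth ruling out is that the process running the flow imports a different httpcore than the one pinned. A rough sketch of such a check, assuming 0.16.2 is the first fixed release as reported earlier in this thread; `parse_version` and `httpcore_is_at_least` are hypothetical helpers, and the parser only handles plain dotted numeric versions:

```python
import re
from importlib import metadata

FIXED_HTTPCORE = (0, 16, 2)  # release reported as fixed in this thread


def parse_version(text):
    """Turn '0.16.2' into (0, 16, 2).

    Keeps the leading digits of each dot-separated part and stops at the
    first part that has none.
    """
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def httpcore_is_at_least(minimum=FIXED_HTTPCORE, installed=None):
    """Check the running environment's httpcore against `minimum`.

    `installed` can be passed explicitly for testing; otherwise the version
    is read from the installed package metadata.
    """
    if installed is None:
        try:
            installed = metadata.version("httpcore")
        except metadata.PackageNotFoundError:
            return False
    return parse_version(installed) >= minimum
```

Calling `httpcore_is_at_least()` at the start of a flow run and logging the result would show whether the pin reached the environment where the tasks execute.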

@sti0

sti0 commented Nov 28, 2022

@deepanshu-zluri I'm not using the Prefect base image. I build my own, and since httpcore v0.16.2 the errors have gone away. I install Prefect with pip during the build (based on Python 3.10).
You could perhaps use the EXTRA_PIP_PACKAGES environment variable to force an update on the base image.
https://docs.prefect.io/concepts/infrastructure/?h=extra_pip_packages#installing-extra-dependencies-at-runtime

@sti0

sti0 commented Nov 28, 2022

This would force installation of httpcore v0.16.2 on the base image:

docker run -e EXTRA_PIP_PACKAGES="httpcore==0.16.2" --rm -it prefecthq/prefect:2.6.9-python3.10 pip list

@deepanshu-zluri
Author

@sti0 yes, I did that. It still didn't work.

@bennnym

bennnym commented Nov 28, 2022

I have this as my image

FROM        prefecthq/prefect:2.6.7-python3.9


RUN         pip install prefect-aws==0.1.8 httpcore==0.16.2

It does not work.

@sti0

sti0 commented Nov 28, 2022

@bennnym as you are building your own image, you could try using a Python base image and installing Prefect with pip. That is what I do, and it works in my case. But maybe you run on different infrastructure, in which case this might not work for you.

@bennnym

bennnym commented Nov 28, 2022

Yeah, I build on a custom image. That could work too, though I don't see why. I can try it and report back.

@sti0

sti0 commented Nov 28, 2022

Ah, I see you are both using Python 3.9? I'm on 3.10, so maybe there's some difference?

And @bennnym, you may want to update Prefect to the latest version. Otherwise we're comparing apples with oranges.

My setup is the latest Prefect, 2.6.9, with Python 3.10. Maybe there is other interference on the older versions.

@zanieb zanieb closed this as completed Nov 28, 2022
@deepanshu-zluri
Author

@sti0 is it possible to share the custom Dockerfile you are using (maybe without the libraries specific to your use case)?

@sti0

sti0 commented Dec 2, 2022

Hi @deepanshu-zluri, I can't share my file directly, but there is no real magic behind it. Note that we pre-bundle all our stuff into one image and use it as infrastructure. Our agent and server run in Docker containers using the default Prefect image with Python 3.10. So it depends on how you use it later.

So just a little snippet:

FROM python:3.10-slim-bullseye
# copy all flows and other internal dependencies
# there is also a requirements.txt which defines prefect version etc.
COPY prefect /prefect
RUN pip install --no-warn-script-location /prefect/.

@padbk
Contributor

padbk commented Dec 8, 2022

Looks like our error has been solved by upgrading to 2.7.0

@deepanshu-zluri
Author

@sti0 which default Prefect image are you using for the agent and server?

@sti0

sti0 commented Dec 15, 2022

@deepanshu-zluri prefect:2.7.1-python3.10

@deepanshu-zluri
Author

Did you run into any namespace issues with it?

kubernetes.client.exceptions.ApiException: (403)
Reason: Forbidden
HTTP response headers: HTTPHeaderDict({'Audit-Id': '1d6523ac-c50c-4589-a61b-879d27067bfb', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Content-Type-Options': 'nosniff', 'X-Kubernetes-Pf-Flowschema-Uid': '83345a0d-ffb4-45f8-84bf-b0cee0f55505', 'X-Kubernetes-Pf-Prioritylevel-Uid': 'd25d47ea-b08d-49c0-8b39-d589ce88b7c1', 'Date': 'Thu, 15 Dec 2022 05:49:11 GMT', 'Content-Length': '366'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"namespaces \"kube-system\" is forbidden: User \"system:serviceaccount:prod-prefect-self-hosted-ns:prod-agent-sa\" cannot get resource \"namespaces\" in API group \"\" in the namespace \"kube-system\"","reason":"Forbidden","details":{"name":"kube-system","kind":"namespaces"},"code":403}

I'm getting this as soon as I change my image. If I revert to the default Prefect 2.6.0 image on Python 3.9, this doesn't occur.
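
For anyone reading the 403 above, the response body is a Kubernetes Status object, and its `details` field names the resource that was denied. A small sketch of pulling those fields out with the standard `json` module; `SAMPLE_BODY` is a shortened copy of the body quoted above with the inner quotes escaped, and `summarize_status` is a hypothetical helper:

```python
import json

# Shortened copy of the response body quoted above, with the inner quotes
# escaped so that it is valid JSON.
SAMPLE_BODY = (
    '{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure",'
    '"message":"namespaces \\"kube-system\\" is forbidden",'
    '"reason":"Forbidden",'
    '"details":{"name":"kube-system","kind":"namespaces"},"code":403}'
)


def summarize_status(body):
    """Extract the HTTP code, reason, and denied resource from a Status body."""
    status = json.loads(body)
    details = status.get("details", {})
    return {
        "code": status.get("code"),
        "reason": status.get("reason"),
        "kind": details.get("kind"),
        "name": details.get("name"),
    }
```

Here the summary points at a `get` on the `kube-system` namespace being refused for the agent's service account, which suggests a permissions difference rather than the httpcore assertion discussed earlier in the thread.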

@sti0

sti0 commented Dec 15, 2022

No issues. But we are running plain Docker containers, not k8s.

@deepanshu-zluri
Author

I don't think that should make any difference.
