Dynamic task adding mapped instance #31292
Comments
Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise a PR to address this issue, please do so; there is no need to wait for approval.
You can modify your second DAG to ensure that the mapped instances are created correctly. Here's a suggestion on how to update your code:
It can be related to #25060. We need some information about the failed dag run in order to help you debug the problem:
@hussein-awala
However, this issue has happened multiple times during random runs. It failed on the first retry.
We have observed the same issue! Here are the details of ours:
Apache Airflow version
Deployment
What happened
Full logs: composer_masked_logs.txt
Our DAG has a dynamically mapped task. As in OP's case, the map index appears to be assigned incorrectly. Like OP's DAG, ours runs rather frequently, about 6 times per day, and launches 4-10 dynamically mapped tasks. The strange thing is that the error shows up only occasionally; most of the time the DAG runs smoothly with no errors.
What you think should happen instead
How to reproduce
I am not sure if we have enough information to investigate the root cause.
The issue reported at the top (against 2.3.3) definitely looks identical to #25060. The one below (against 2.4.3) contains too little information to reasonably digest. I would suggest we close this and move the 2.4.3 report to another issue, and clarify there instead.
Thanks! Closing as completed. @d2015196, please open a new issue with a full description of the problem and a reproducible example; otherwise we can't investigate.
@eladkal I will try to reproduce this in my environment and open a new issue.
Until you open a new issue with your logs and circumstances, I am afraid we will not even know whether this is the same issue. Our state of knowledge has not changed since, and without evidence and details of what you are experiencing, it is simply impossible to answer your question.
Apache Airflow version
Other Airflow 2 version (please specify below)
What happened
Airflow version in Cloud Composer: composer-1.19.13-airflow-2.3.3
I have two DAGs for reading Stripe data that are exactly the same except for their scheduling intervals. First, they get a list of all the accounts in the system. Then, they use that list for a dynamic task to get other Stripe data via connected accounts (for this example, balance transactions).
The first DAG runs weekly and has catchup enabled going back to 2022-11-01. This DAG runs without issues. The second DAG is a copy of the first with two changes: it runs every 15 minutes, and catchup goes back only to the start of the current day.
On some runs of the second (15-minute) DAG, the dynamic task adds a mapped instance with an empty map index. The scheduler then crashes, which brings the entire environment down. The log gives this error:
DETAIL: Key (dag_id, task_id, run_id, map_index)=(<DAG name>, <task name>, scheduled__2023-05-12T19:00:00+00:00, 0) already exists.
This error does not occur in an identical DAG that runs weekly.
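For readers unfamiliar with this kind of error: the message above is a unique-key violation, raised when a second row is written with the same (dag_id, task_id, run_id, map_index) combination. The following is only a minimal sketch of that failure mode, using Python's built-in `sqlite3` and a hypothetical, heavily simplified `task_instance` table. It is not Airflow's actual schema or scheduler code, and the DAG and task names are placeholders.

```python
import sqlite3


def insert_task_instance(conn, dag_id, task_id, run_id, map_index):
    """Insert one row into the simplified (hypothetical) task_instance table."""
    conn.execute(
        "INSERT INTO task_instance (dag_id, task_id, run_id, map_index) "
        "VALUES (?, ?, ?, ?)",
        (dag_id, task_id, run_id, map_index),
    )


def demo_duplicate_key():
    """Return the database error text produced by a duplicate composite key."""
    conn = sqlite3.connect(":memory:")
    # Assumption: a four-column composite primary key mirroring the key
    # named in the error message; the real table has many more columns.
    conn.execute(
        """
        CREATE TABLE task_instance (
            dag_id    TEXT    NOT NULL,
            task_id   TEXT    NOT NULL,
            run_id    TEXT    NOT NULL,
            map_index INTEGER NOT NULL,
            PRIMARY KEY (dag_id, task_id, run_id, map_index)
        )
        """
    )
    # Placeholder names; only the run_id and map_index come from the log line.
    key = ("example_dag", "example_task", "scheduled__2023-05-12T19:00:00+00:00", 0)
    insert_task_instance(conn, *key)  # first insert succeeds
    try:
        insert_task_instance(conn, *key)  # same key again is rejected
    except sqlite3.IntegrityError as exc:
        return str(exc)
    finally:
        conn.close()
    return None


if __name__ == "__main__":
    print(demo_duplicate_key())
```

The sketch only shows why a second write for map index 0 is rejected; it does not explain how the scheduler came to attempt that write.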
What you think should happen instead
The task should have one fewer mapped instance than is showing.
How to reproduce
Operating System
macOS
Versions of Apache Airflow Providers
composer-1.19.13-airflow-2.3.3
Deployment
Google Cloud Composer
Deployment details
No response
Anything else
No response
Are you willing to submit PR?
Code of Conduct