Remove double collection of dags in airflow dags reserialize #27030

Merged: 3 commits, Oct 13, 2022

ephraimbuddy
Contributor

The `airflow dags reserialize` code explicitly calls `dagbag.collect_dags` after instantiating `DagBag`.

`DagBag` already calls `collect_dags` on instantiation, so calling it again processes the same DAGs a second time.

Here, we use a variable to achieve the same effect on reserialization without the second collection.
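
To make the cost of the double collection concrete, here is a minimal, self-contained sketch. `ToyDagBag`, `reserialize_double`, `reserialize_single`, and the `files_parsed` counter are hypothetical stand-ins invented for illustration; they are not Airflow code and do not reproduce the actual change in `airflow/utils/db.py`. The sketch only assumes, as Airflow's `DagBag` does, that the constructor takes a `collect_dags` flag and that `collect_dags` accepts `only_if_updated`.

```python
class ToyDagBag:
    """Hypothetical stand-in for a DAG container; NOT Airflow's DagBag."""

    def __init__(self, dag_files, collect_dags=True):
        self.dag_files = list(dag_files)
        self.dags = {}
        self.files_parsed = 0  # how many file parses have been performed
        if collect_dags:
            # Collection happens as a side effect of instantiation.
            self.collect_dags()

    def collect_dags(self, only_if_updated=True):
        for path in self.dag_files:
            if only_if_updated and path in self.dags:
                continue  # already loaded, skip re-parsing
            self.files_parsed += 1
            self.dags[path] = f"dag-from-{path}"


def reserialize_double(dag_files):
    """The pattern described above: collect on init, then collect again."""
    bag = ToyDagBag(dag_files)               # first pass, in the constructor
    bag.collect_dags(only_if_updated=False)  # second pass over the same files
    return bag


def reserialize_single(dag_files):
    """One possible fix: defer collection so each file is parsed once."""
    bag = ToyDagBag(dag_files, collect_dags=False)
    bag.collect_dags(only_if_updated=False)
    return bag


files = ["a.py", "b.py", "c.py"]
double = reserialize_double(files)
single = reserialize_single(files)
print(double.files_parsed, single.files_parsed)  # 6 3
```

Both helpers end up with the same set of DAGs; the only difference is that the first one parses every file twice.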

@ephraimbuddy
Contributor Author

May reduce the delays seen here

Review thread on airflow/utils/db.py (outdated, resolved)
Member

@ashb ashb left a comment

One nit, LGTM otherwise.

Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
@ephraimbuddy changed the title from "Remove explicit call of dagbag.collect_dags in airflow dags reserialize" to "Remove double collection of dags in airflow dags reserialize" on Oct 13, 2022
@ephraimbuddy merged commit 36e2e43 into apache:main on Oct 13, 2022
@ephraimbuddy deleted the improve-reserialize branch on October 13, 2022 18:29
@ephraimbuddy added this to the Airflow 2.4.2 milestone on Oct 18, 2022
@ephraimbuddy added the type:bug-fix (Changelog: Bug Fixes) label on Oct 18, 2022
ephraimbuddy added a commit that referenced this pull request Oct 18, 2022
We explicitly call dagbag.collect_dags after instantiating DagBag in
airflow dags reserialize code.

The method collect_dags is called on instantiation
of the DagBag so calling it again means more processing of the same dags.

Here, we use a variable to achieve the same needed effect on reserialization

Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
(cherry picked from commit 36e2e43)