Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AIP-39: Handle DAG scheduling with timetables #15397

Merged
merged 1 commit into from Jun 29, 2021

Conversation

uranusjr
Copy link
Member

@uranusjr uranusjr commented Apr 16, 2021

This creates a skeleton for the AIP-39 implementation to build on, and refactor the logic handling @once schedules out of the DAG class.

The current implementation contains a lot of duplicated code, and the PR probably shouldn’t be merged until I can pick most of next_dagrun_after_date apart (and ultimately eliminate it entirely and rewrite all the tests against DAG.next_dagrun_info() or the various TimeTable classes instead).

Update: I’ve since did that, so this PR is clean now.

I want to post this WIP since the trivial @once implementation already exposes things missing from the interface outlined in AIP-39, and I had to invent an argument (between) to cover it. I want to know how ya’ll think of it 🙂

(Note: The current TimeTable.next_dagrun_info() interface is missing the session argument from AIP-39 because I don’t need it yet. It will be added when it needs to be.)


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.

@uranusjr uranusjr changed the title Initial work toward AIP-39 AIP-39: Foundational TimeTable interface, OnceTimeTable implementation Apr 16, 2021
@jhtimmins
Copy link
Contributor

@uranusjr Can you explain why the between argument is necessary? just from looking at the code I can't tell

@uranusjr
Copy link
Member Author

betwewn: TimeRestriction is a time interval that the next DAG run must be scheduled between, reflecting start_date and end_date of the DAG and tasks in it. A run cannot be scheduled before the latest start_date (must be postponed until then), nor after the earliest end_date (must be cancelled). This logic is implemented in TimeRestriction.restrict() to adjust the schedule “preferred” by the time table.

I’ve actually removed restrict() and put the logic inside next_dagrun_info() instead (will push shortly) since the restriction logic needs to be slightly different depending on whether the time table is based on a data interval or not.

@uranusjr
Copy link
Member Author

uranusjr commented Apr 20, 2021

Alright, I’ve pushed my implementation to all the possible schedule_interval variants in use, and this should be ready to be merged on its own. The next step would be to gradually refactor things calling into the DAG class to use the time table API, instead of accessing schedule_interval and methods related to it.

I’ve noticed a difficulty during my initial investigation though. There are various places using DAG.previous_schedule(), which calculates the DAG’s previous run schedule (logically; no guarantee the run actually happened). This is difficult to represent with the time table design, since the time table only schedules the next run (which can be used to replace DAG.following_schedule(), in case someone wonders).

The simple solution would be to introduce something like TimeTable.prev_dagrun_info() to do it, but I’m wondering, is this really a good way to calculate the “previous run”? I’d imagine I could build something like a linked list of DAG runs, so I can just query when the last scheduled DAG run was from the database, instead of calculating a logical value that does not actually guarantee to reflect reality…

@uranusjr uranusjr marked this pull request as ready for review April 20, 2021 18:08
@uranusjr uranusjr changed the title AIP-39: Foundational TimeTable interface, OnceTimeTable implementation AIP-39: Use TimeTable interface to implement scheduling inside the DAG class Apr 20, 2021
@ashb ashb requested a review from jhtimmins April 20, 2021 19:41
@ashb
Copy link
Member

ashb commented Apr 20, 2021

Is your PR description (or the bit about "should not be merged") still accurate?

airflow/models/dag.py Outdated Show resolved Hide resolved
airflow/models/dag.py Outdated Show resolved Hide resolved
@uranusjr
Copy link
Member Author

Is your PR description (or the bit about "should not be merged") still accurate?

No it’s not, thanks for catching this. I’ve edited the top message to reflect the current status.

@uranusjr
Copy link
Member Author

Grr, not sure what’s going on the the Static Checks job, it just says “This check failed” for me and nothing else.

The Postgres check failed with

Cannot start service trino: driver failed programming external connectivity on endpoint trino (87de12f0f1304fe6056905791286b37b88245be569e5be81c8f993c29560fd28):
Error starting userland proxy: listen tcp4 0.0.0.0:38080: bind: address already in use

@github-actions
Copy link

The Workflow run is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.

airflow/models/dag.py Outdated Show resolved Hide resolved
airflow/models/dag.py Outdated Show resolved Hide resolved
airflow/models/dag.py Outdated Show resolved Hide resolved
airflow/timetables/base.py Outdated Show resolved Hide resolved
airflow/models/dag.py Outdated Show resolved Hide resolved
airflow/models/dag.py Outdated Show resolved Hide resolved
@ashb ashb added this to In progress in AIP-39 Pluggable schedule_interval via automation May 20, 2021
@ashb ashb moved this from In progress to Review in progress in AIP-39 Pluggable schedule_interval May 20, 2021
@ashb ashb added this to the Airflow 2.2 milestone Jun 17, 2021
@github-actions github-actions bot added the full tests needed We need to run full set of tests for this PR to merge label Jun 17, 2021
@github-actions
Copy link

The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.

Copy link
Member

@kaxil kaxil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👏 Nice work 🥳 🎉

airflow/exceptions.py Show resolved Hide resolved
airflow/models/dag.py Outdated Show resolved Hide resolved
"""Get information about the next DagRun of this dag after ``date_last_automated_dagrun``.

This calculates what time interval the next DagRun should operate on
(its execution date), and when it can be scheduled, , according to the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
(its execution date), and when it can be scheduled, , according to the
(its execution date), and when it can be scheduled, according to the

airflow/models/dag.py Show resolved Hide resolved
@kaxil kaxil added full tests needed We need to run full set of tests for this PR to merge and removed full tests needed We need to run full set of tests for this PR to merge labels Jun 17, 2021
@uranusjr uranusjr force-pushed the aip-39-timetables-once branch 2 times, most recently from e13fc8a to 113dca0 Compare June 23, 2021 23:13
This creates a new subpackage airflow.timetables, and implements
timetable constructs that provides DAG scheduling logic. The timetable
classes are used to refactor schedule inference logic out of the DAG
class, and existing functions related to scheduling are refactored to
use timetables (and deprecated).

Usages of the deprecated DAG functions in Airflow's code base are
modified to either use the timetable, or infer the information by other
means. For example, usages of previous_schedule() (what was a DAG last
scheduled to run before this run?) are refactored to query the database
when the previous scheduled run actually happened, instead of using the
schedule interval (cron or timedelta) in infer the information. This is
because an AIP-39 timetable does not necessarily run on a periodic-ish
schedule, and we cannot reliably infer when the previous run happened.
@uranusjr
Copy link
Member Author

I’m going to pull the trigger on this soon-ish if the checks pass; this one modifies way too many files and is causing merge conflicts left and right.

@uranusjr uranusjr force-pushed the aip-39-timetables-once branch 2 times, most recently from f68e4ec to b6c95cb Compare June 28, 2021 13:58
@uranusjr
Copy link
Member Author

uranusjr commented Jun 28, 2021

Man it’s so difficult to see all checks pass together. I think I’ve seen each of them passed at least once during my multiple attempts though, so this is going in.

@potiuk
Copy link
Member

potiuk commented Jun 28, 2021

Man it’s so difficult to see all checks pass together. I think I’ve seen each of them passed at least once during my multiple attempts though, so this is going in.

Should be now more stable with #16689 and #16682 merged. It's a constant struggle :)

@uranusjr uranusjr force-pushed the aip-39-timetables-once branch 3 times, most recently from 0ebad6d to f5b55ee Compare June 29, 2021 00:22
@uranusjr uranusjr changed the title AIP-39: Use TimeTable interface to implement scheduling inside the DAG class AIP-39: Handle DAG scheduling with timetables Jun 29, 2021
@uranusjr uranusjr merged commit 5034414 into apache:main Jun 29, 2021
AIP-39 Pluggable schedule_interval automation moved this from Reviewer approved to Done Jun 29, 2021
@uranusjr uranusjr deleted the aip-39-timetables-once branch June 29, 2021 05:20
@ashb
Copy link
Member

ashb commented Jun 29, 2021

🎉

@potiuk
Copy link
Member

potiuk commented Jun 29, 2021

Woohoo!

@kaxil
Copy link
Member

kaxil commented Jun 29, 2021

🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
full tests needed We need to run full set of tests for this PR to merge
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

None yet

8 participants