How to run assets atomically according to its freshness policy #20176
Replies: 2 comments
-
Hi @peter-lim-yello -- there are some changes in the works to allow you to carve up your asset graph into independent evaluation units for the purpose of auto-materialization (so that key-prefix scheme would work). However, this is not currently possible using public APIs.

How hard of a requirement is it that these assets get broken up into separate executions? Logically, one big run will do the same thing as lots of smaller runs, so one potential solution here is to go forward with the "one big execution" approach in the short term and break things up in the medium term?

Also, while not related to breaking up your assets into separate executions, the best auto materialize policy setup for regularly materializing things in your case is likely:

    policy = AutoMaterializePolicy(
        rules={
            AutoMaterializePolicy.materialize_on_cron(<schedule>),
            AutoMaterializePolicy.skip_on_not_all_parents_updated_since_cron(<schedule>),
        }
    )

rather than the lazy preset policy. This is something we're in the process of pushing people towards, as it results in a much more predictable / observable system, while accomplishing the basic goal of executing assets on a distributed schedule.
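To make the intent of that rule pair concrete, here is a minimal plain-Python sketch of the decision it describes: request an asset once a cron tick has passed since its last materialization, but hold off until every parent has been updated since that tick. This is a simplified model and not Dagster's implementation. It assumes a daily schedule at a fixed hour instead of a real cron string, and the names `latest_daily_tick` and `should_materialize` are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional


def latest_daily_tick(now: datetime, hour: int) -> datetime:
    """Return the most recent daily tick at `hour`:00 that is not after `now`."""
    tick = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    return tick if tick <= now else tick - timedelta(days=1)


def should_materialize(
    now: datetime,
    hour: int,
    last_materialized: Optional[datetime],
    parent_updates: Iterable[Optional[datetime]],
) -> bool:
    """Simplified model of 'materialize on cron, skip until parents are updated'."""
    tick = latest_daily_tick(now, hour)
    # "Materialize on cron": a tick has passed since the last materialization.
    due = last_materialized is None or last_materialized < tick
    # "Skip on not all parents updated since cron": every parent must have
    # been updated at or after the latest tick.
    parents_ready = all(t is not None and t >= tick for t in parent_updates)
    return due and parents_ready
```

Under this model, a downstream asset on a 09:00 schedule waits for its parents to refresh after 09:00 and then runs once, which is what makes the behavior predictable from the schedule alone.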
-
Presumably the rules should be instances of How would you handle the case of lazy but not on cron, i.e., an asset that is materialized as needed by a downstream asset, which itself is on cron?
-
Hello,
I'm trying to run selected assets based on their freshness policies, i.e., only assets whose cron definition aligns with the current date would be run.
auto_materialize_policy seems like the best way to do this, but it runs all of the relevant assets simultaneously in one job. Since we're dealing with a lot of assets to be run in one day, are we able to separate them based on key prefix?
I'm also trying the schedule/sensor route, but it doesn't look like we can select only the assets due to run today, or a subset of the assets in a job.
Any ideas?
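For illustration, the selection being asked about can be sketched in plain Python: pick the assets that are due today, then bucket them by key prefix so each bucket could be launched as its own run. This is only a sketch under stated assumptions and uses no Dagster API. The `ASSETS` table, its weekday-based schedules, and the function `due_assets_by_prefix` are all hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical asset table: key path -> weekdays (0 = Monday) on which it is due.
ASSETS = {
    ("sales", "orders"): {0, 1, 2, 3, 4},
    ("sales", "refunds"): {0},
    ("marketing", "campaigns"): {0, 3},
}


def due_assets_by_prefix(today: date, assets=ASSETS) -> dict:
    """Group the assets due on `today` by the first component of their key."""
    groups = defaultdict(list)
    for key, weekdays in assets.items():
        if today.weekday() in weekdays:
            groups[key[0]].append("/".join(key))
    # Sort each bucket so the result is deterministic.
    return {prefix: sorted(keys) for prefix, keys in groups.items()}
```

Each returned bucket corresponds to one prefix-scoped selection, so a large daily workload would split into one run per prefix instead of a single combined run.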