Support declarative row-wise filters (`col X = "..."` or `X in (...)`) of input partitions in the `.map` method (filtering per (input, output) pair, not per input alone), which can be driven by per-partition `Statistics` the user defines for the `Artifact`.
These are orthogonal to column-wise selections, which are defined in the `.build` method.
View-loading logic is expanded to apply these row and column filters as efficiently as the backend allows (e.g. BigQuery loads `SELECT <subset>` with a `WHERE` clause; Parquet reads the subset of columns with dask dataframe filtering).
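As a minimal sketch of how such a declarative filter might be lowered to the two backends mentioned above, the following shows one filter spec rendered both as a SQL `WHERE` fragment and as the `(column, op, value)` tuple form that Parquet readers (pyarrow/dask `filters=`) accept. `RowFilter` and its method names are hypothetical illustrations, not an existing API:

```python
from dataclasses import dataclass
from typing import Sequence, Tuple, Union

Value = Union[str, int, float]


@dataclass(frozen=True)
class RowFilter:
    """Declarative row-wise filter: col = value, or col in (values)."""
    column: str
    values: Tuple[Value, ...]  # one value => equality; several => membership

    def to_sql(self) -> str:
        """Render as a WHERE fragment for SQL backends (e.g. BigQuery)."""
        def lit(v: Value) -> str:
            return f"'{v}'" if isinstance(v, str) else str(v)
        if len(self.values) == 1:
            return f"{self.column} = {lit(self.values[0])}"
        return f"{self.column} IN ({', '.join(lit(v) for v in self.values)})"

    def to_dnf(self):
        """Render in the (col, op, value) shape used by pyarrow/dask
        `filters=` arguments for predicate pushdown on Parquet."""
        if len(self.values) == 1:
            return (self.column, "=", self.values[0])
        return (self.column, "in", list(self.values))


f = RowFilter("region", ("us", "eu"))
print(f.to_sql())   # region IN ('us', 'eu')
print(f.to_dnf())   # ('region', 'in', ['us', 'eu'])
```

The same spec drives both paths, so the choice of pushdown strategy stays a backend detail rather than part of the user-facing filter.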
Compared to very granular input partitioning, this:
- has less overhead (fewer upstream partitions to track)
- has less precise invalidation (less granular upstream partitions)
- maintains "small" inputs to the build steps
The # of build tasks is still upper-bounded by the # of output partitions or other concurrency limits.