unroll an array of values for dvc repro #7369

Closed
Rusteam opened this issue Feb 11, 2022 · 3 comments
Labels
A: templating (Related to the templating feature)
awaiting response (we are waiting for your reply, please respond! :))

Comments

@Rusteam

Rusteam commented Feb 11, 2022

When we run foreach ... do over the items of an array in dvc.yaml, we often want to join the results of every do into a single stage.
It would be nice to be able to unroll the array instead of listing each item separately.

Now I do this:

vars:
  - datasets:
      - one
      - two
      - three

stages:
  prep:
    foreach: ${datasets}
    do:
      cmd: python main.py prep ${item}
      outs:
        - data/processed/${item}
  join:
    cmd: python main.py merge ${datasets[0]} ${datasets[1]} ${datasets[2]}
    deps:
      - data/processed/${datasets[0]}
      - data/processed/${datasets[1]}
      - data/processed/${datasets[2]}

This leads to errors whenever we forget to update the join stage after changing the datasets variable.
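
One way to avoid that drift is to derive the join stage from the same list instead of typing each index by hand. The following is only a rough sketch: `build_join_stage` and its `processed_dir` parameter are hypothetical names, not part of DVC, and writing the result back into dvc.yaml is left out.

```python
from typing import Dict, List, Union


def build_join_stage(
    datasets: List[str], processed_dir: str = "data/processed"
) -> Dict[str, Union[str, List[str]]]:
    """Build the join stage from the dataset list (hypothetical helper, not a DVC API)."""
    if not datasets:
        raise ValueError("datasets must not be empty")
    return {
        # One positional argument per dataset, in list order.
        "cmd": "python main.py merge " + " ".join(datasets),
        # One dependency per dataset, matching the outs of the prep stage.
        "deps": [f"{processed_dir}/{name}" for name in datasets],
    }


stage = build_join_stage(["one", "two", "three"])
print(stage["cmd"])
for dep in stage["deps"]:
    print(dep)
```

Because cmd and deps come from one list, adding a fourth dataset changes both at once.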

We could have an option to unroll an array, like this:

join:
    cmd: python main.py merge ${...datasets}
    deps:
      -  data/processed/
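
To make the intended behaviour of the cmd line concrete, here is a minimal sketch of how such an unroll could expand. The `${...datasets}` syntax is only the proposal from this issue and is not supported by DVC; `expand_spread` is a hypothetical function, and joining the items with spaces is an assumption about the desired output.

```python
import re
from typing import Any, Mapping

# Matches the proposed ${...name} form only; a plain ${name} is left untouched.
SPREAD = re.compile(r"\$\{\.\.\.(\w+)\}")


def expand_spread(template: str, variables: Mapping[str, Any]) -> str:
    """Replace every ${...name} with the space-joined items of the list `name`."""

    def replace(match: re.Match) -> str:
        name = match.group(1)
        values = variables[name]  # raises KeyError if the variable is not defined
        if not isinstance(values, (list, tuple)):
            raise TypeError(f"{name!r} is not a list and cannot be unrolled")
        return " ".join(str(value) for value in values)

    return SPREAD.sub(replace, template)


print(expand_spread(
    "python main.py merge ${...datasets}",
    {"datasets": ["one", "two", "three"]},
))
# prints: python main.py merge one two three
```
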
@skshetry
Member

skshetry commented Feb 11, 2022

We don't want to complicate parametrization. It looks like what you are looking for is "Incremental processing or streaming in micro-batches" (#5917).

@skshetry added the "awaiting response" label Feb 11, 2022
@dberenbaum
Contributor

Hi @Rusteam! This issue seems closely related to #6107. It also shows how this is a difficult problem since you expect a different output from "unrolling the array" than in that issue. If that's the case, would you mind commenting on that issue and we will mark this one as a duplicate?

@Rusteam
Author

Rusteam commented Feb 11, 2022

> Hi @Rusteam! This issue seems closely related to #6107. It also shows how this is a difficult problem since you expect a different output from "unrolling the array" than in that issue. If that's the case, would you mind commenting on that issue and we will mark this one as a duplicate?

Sure, let me do that, thanks.

@Rusteam closed this as completed Feb 11, 2022
@daavoo added the "A: templating" label Feb 23, 2022