Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

plots diff: unnecessary requirement for dvc.yaml to be valid #6150

Closed
aguschin opened this issue Jun 10, 2021 · 6 comments
Closed

plots diff: unnecessary requirement for dvc.yaml to be valid #6150

aguschin opened this issue Jun 10, 2021 · 6 comments
Labels
A: plots Related to the plots bug Did we break something? diff/show Related to the diff/show feature p3-nice-to-have It should be done this or next sprint

Comments

@aguschin
Copy link
Contributor

aguschin commented Jun 10, 2021

Bug Report

Description

dvc plots diff --targets metrics/metrics_precision.tsv -- master fails because of invalid dvc.yaml, though AFAIK no dvc.yaml is really required to plot this (the metrics_precision.tsv file is tracked with Git). This is how this plot looks like when produced with valid dvc.yaml
Screenshot 2021-06-10 at 14 45 03

Reproduce

# metrics_precision.yaml
timestamp	step	metrics_precision
1623321612138	0	0.0
1623321617898	1	0.0
1623321622355	2	0.0
1623321628074	3	0.0563768875880167
1623321632661	4	0.33889920337730767
1623321637196	5	0.3251424198127944
1623321641976	6	0.0010819493589534024
1623321648511	7	0.35421949649508316
1623321661087	8	0.001618569316573259
1623321694687	9	0.24080049696178488

# dvc.yaml
stages:
  train:
    deps:
      - ${weights}
    cmd: python train.py
  1. create these files in a folder
  2. git init
  3. git add metrics_precision.tsv
  4. git commit -am 'first changes'
  5. dvc plots diff -v --targets metrics_precision.tsv -- master
$ dvc plots diff --targets metrics/metrics_precision.tsv -- master
ERROR: failed to parse 'stages.train.deps' in 'dvc.yaml': Could not find 'weights'

Expected

Expected to get plots without any error.

Environment information

Output of dvc doctor:

$ dvc doctor
DVC version: 2.3.0 (pip)
---------------------------------
Platform: Python 3.9.5 on macOS-10.16-x86_64-i386-64bit
Supports: azure, http, https
Cache types: reflink, hardlink, symlink
Cache directory: apfs on /dev/disk1s5s1
Caches: local
Remotes: azure
Workspace directory: apfs on /dev/disk1s5s1
Repo: dvc, git

Additional Information (if any):

I've encountered this error when stopped tracking params.yaml with Git. It's not like it's a common situation itself, but it seems strange that we need to look at dvc.yaml here.

$ dvc plots diff -v --targets metrics_precision.tsv -- master
2021-06-10 15:03:01,912 ERROR: failed to parse 'stages.train.deps' in 'dvc.yaml': Could not find 'weights': Could not find 'weights': Could not find 'weights' in {}: 'weights'
------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/dvc/parsing/context.py", line 222, in select
    d = self[index]
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/dvc/parsing/context.py", line 197, in __getitem__
    return self.data[key]
KeyError: 'weights'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/dvc/parsing/context.py", line 349, in select
    node = super().select(normalized)
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/dvc/parsing/context.py", line 224, in select
    raise ValueError(
ValueError: Could not find 'weights' in {}

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/dvc/parsing/__init__.py", line 301, in _resolve
    return context.resolve(
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/dvc/parsing/context.py", line 529, in resolve
    return func(src, unwrap, skip_interpolation_checks)
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/dvc/parsing/interpolate.py", line 113, in wrapper
    return type(data)(map(g, data))
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/funcy/funcs.py", line 37, in <lambda>
    return lambda *a, **kw: func(*(a + args), **dict(kwargs, **kw))
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/dvc/parsing/interpolate.py", line 115, in wrapper
    return f(data, *args)
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/dvc/parsing/context.py", line 550, in resolve_str
    return self.select(expr, unwrap=unwrap)
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/dvc/parsing/context.py", line 351, in select
    raise KeyNotInContext(key) from exc
dvc.parsing.context.KeyNotInContext: Could not find 'weights'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/dvc/command/plots.py", line 37, in run
    plots = self._func(targets=self.args.targets, props=self._props())
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/dvc/command/plots.py", line 79, in _func
    return self.repo.plots.diff(
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/dvc/repo/plots/__init__.py", line 187, in diff
    return diff(self.repo, *args, **kwargs)
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/dvc/repo/plots/diff.py", line 18, in diff
    return repo.plots.show(
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/dvc/repo/plots/__init__.py", line 163, in show
    data = self.collect(targets, revs, recursive)
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/dvc/repo/plots/__init__.py", line 67, in collect
    plots = _collect_plots(self.repo, targets, rev, recursive)
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/dvc/repo/plots/__init__.py", line 269, in _collect_plots
    plots, path_infos = collect(
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/dvc/repo/collect.py", line 87, in collect
    outs: Outputs = _collect_outs(repo, output_filter=output_filter, deps=deps)
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/dvc/repo/collect.py", line 25, in _collect_outs
    for stage in repo.graph  # using `graph` to ensure graph checks run
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/funcy/objects.py", line 28, in __get__
    res = instance.__dict__[self.fget.__name__] = self.fget(instance)
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/dvc/repo/__init__.py", line 417, in graph
    return build_graph(self.stages, self.outs_trie)
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/funcy/objects.py", line 28, in __get__
    res = instance.__dict__[self.fget.__name__] = self.fget(instance)
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/dvc/repo/__init__.py", line 439, in stages
    return self.stage.collect_repo(onerror=error_handler)
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/dvc/repo/stage.py", line 475, in collect_repo
    new_stages = self.load_file(file_path)
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/dvc/repo/stage.py", line 300, in load_file
    return self.load_all(path)
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/dvc/repo/stage.py", line 280, in load_all
    return [stages[key] for key in keys]
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/dvc/repo/stage.py", line 280, in <listcomp>
    return [stages[key] for key in keys]
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/dvc/stage/loader.py", line 123, in __getitem__
    resolved_data = self.resolver.resolve_one(name)
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/dvc/parsing/__init__.py", line 176, in resolve_one
    return definition.resolve()
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/dvc/parsing/__init__.py", line 247, in resolve
    return self.resolve_stage(**kwargs)
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/dvc/parsing/__init__.py", line 289, in resolve_stage
    resolved = {
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/dvc/parsing/__init__.py", line 290, in <dictcomp>
    key: self._resolve(context, value, key, skip_checks)
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/dvc/parsing/__init__.py", line 305, in _resolve
    format_and_raise(
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/dvc/parsing/__init__.py", line 81, in format_and_raise
    _reraise_err(ResolveError, message, from_exc=exc)
  File "/usr/local/Cellar/dvc/2.3.0/libexec/lib/python3.9/site-packages/dvc/parsing/__init__.py", line 89, in _reraise_err
    raise err from from_exc
dvc.parsing.ResolveError: failed to parse 'stages.train.deps' in 'dvc.yaml': Could not find 'weights'
------------------------------------------------------------
2021-06-10 15:03:01,925 DEBUG: Analytics is enabled.
2021-06-10 15:03:02,016 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/var/folders/_l/xm_1fknn1479pfh9kg0ttgjc0000gq/T/tmpgouyccl2']'
2021-06-10 15:03:02,017 DEBUG: Spawned '['daemon', '-q', 'analytics', '/var/folders/_l/xm_1fknn1479pfh9kg0ttgjc0000gq/T/tmpgouyccl2']'
@dberenbaum
Copy link
Contributor

This is a good example to support having the plots configuration separate from dvc.yaml (see #5980 (reply in thread) and #5942). I don't even think it's possible to save a plots config for a plot now unless it's an output of a DVC stage.

@dberenbaum dberenbaum added bug Did we break something? p1-important Important, aka current backlog of things to do diff/show Related to the diff/show feature labels Jun 10, 2021
@daavoo daavoo added the A: plots Related to the plots label Oct 20, 2021
@dberenbaum dberenbaum added p2-medium Medium priority, should be done, but less important and removed p1-important Important, aka current backlog of things to do labels Feb 17, 2022
@dberenbaum
Copy link
Contributor

@pared @daavoo I lowered the priority of this, but it (and the old linked discussions from @Suor) relates to #6944 and https://www.notion.so/iterative/Dvclive-plots-configuration-011e41923e414bc3b2fd6c9a24ca061f, so maybe it can be addressed as part of those efforts?

@daavoo
Copy link
Contributor

daavoo commented Feb 18, 2022

@pared @daavoo I lowered the priority of this, but it (and the old linked discussions from @Suor) relates to #6944 and https://www.notion.so/iterative/Dvclive-plots-configuration-011e41923e414bc3b2fd6c9a24ca061f, so maybe it can be addressed as part of those efforts?

I think it does not relate to #6944 (at least given the scope described there). The dvc_render extraction only affects code happening after the internal dvc.repo.plots.show().

However, this issue happens during the internal collection of plots (_collect_plots) so the extraction can't address it.

@daavoo
Copy link
Contributor

daavoo commented Apr 20, 2022

@dberenbaum I think this is actually one example of what I have mentioned during the last Planning.

Calling dvc metrics / dvc plots commands with targets should not fail because of dvc.yaml parsing.

Those commands can be used and work properly without a dvc.yaml.

@pared
Copy link
Contributor

pared commented Apr 22, 2022

@dberenbaum Ill try to fix that during #7086. There will be some changes in logic handling the targets, I should be able to squeeze some tests for that use case there too.

@dberenbaum
Copy link
Contributor

@pared That sounds good, but I don't want to further expand the scope of that issue/PR more than needed, so I wouldn't push for it in there.

@daavoo daavoo mentioned this issue Aug 22, 2022
2 tasks
@dberenbaum dberenbaum added p3-nice-to-have It should be done this or next sprint and removed p2-medium Medium priority, should be done, but less important labels Mar 16, 2023
@mattseddon mattseddon closed this as not planned Won't fix, can't repro, duplicate, stale Mar 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A: plots Related to the plots bug Did we break something? diff/show Related to the diff/show feature p3-nice-to-have It should be done this or next sprint
Projects
None yet
Development

No branches or pull requests

5 participants