New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
checkout: use dvcignore #7963
checkout: use dvcignore #7963
Conversation
assert not foo_path.exists() | ||
|
||
remove(tmp_dir / ".dvc/cache") | ||
dvc.pull() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I change this to dvc.checkout()
then it fails again with the same error about removing files without confirmation. I tried to fix this but got stuck, I'm not familiar enough with this codebase. I can share what I found while debugging. I'm actually not even sure if the error is expected or not. The issue gave me the impression that both pull
and checkout
should succeed, is that correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried it on my computer it also fails, but I noticed that if I status -c
or fetch
before the chekout
then it will succeed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, so basically it seems that checkout
breaks down when the cache is deleted, which seems fair and maybe not worth trying to change? fetch
updates the cache, I'm guessing status -c
does too. And pull
does a fetch
before checkout
.
Do you know if there are other ways to trigger this bug of checkout
previously not ignoring files properly that don't involve deleting the cache?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
status -c
doesn't pull down the cache but it will pull down the .dir
index. And it is used in generating a data tree object.
Line 772 in 954de0c
obj = self.get_obj(filter_info=filter_info) |
I think the current problem is that, in
dvc-data
the diff
function didn't take dvcignore
into account when the tree object is missing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To solve this completely solve this problem, some modifications to dvc-data
is required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you like to solve it yourself or wait for someone else to do this? But I think it is different problem. We can merge this one as it already solved one part of the question.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know how to solve it myself, someone else should do this.
@@ -583,6 +583,7 @@ def _checkout(self, *args, **kwargs): | |||
from dvc_data.checkout import CheckoutError as _CheckoutError | |||
from dvc_data.checkout import LinkError, PromptError | |||
|
|||
kwargs.setdefault("ignore", self.dvcignore) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only two places are calling this _checkout
method. One in line 583
had already had the ignore
argument, and another one is in line 785
in this file. I think we should either remove both of them and add a default value to ignore
or remove the default one but pass them explicitly in the calls.
Thanks for your PR @alexmojaki. |
One thing remained, please use |
1b0fbb1
to
1325f19
Compare
1325f19
to
ae81313
Compare
Fixes #7374
Note that there are many places the
ignore
argument could be added instead. This is the deepest call, right before delegating todvc_data
. Not sure if that's the right choice.Will add tests soon.
❗ I have followed the Contributing to DVC checklist.
📖 If this PR requires documentation updates, I have created a separate PR (or issue, at least) in dvc.org and linked it here.