get: fails to clone because "no valid credentials provided" #7670

Closed
guyrosin opened this issue May 1, 2022 · 30 comments · Fixed by #7674
Assignees
Labels
awaiting response: we are waiting for your reply, please respond! :)
bug: Did we break something?
git: Related to git and git backends
p1-important: Important, aka current backlog of things to do
regression: Ohh, we broke something :-(

Comments

@guyrosin

guyrosin commented May 1, 2022

Bug Report

Description

When executing dvc get or dvc update, a "failed to clone repo" error appears, which originates from a "dulwich.client.HTTPUnauthorized: No valid credentials provided" error.
This started happening without any clear reason, after several weeks of working with DVC with no problem.

Related issues: jelmer/dulwich#882, python-poetry/poetry#5428

Environment information

Output of dvc doctor:

$ dvc doctor
DVC version: 2.10.2 (pip)
---------------------------------
Platform: Python 3.9.12 on macOS-12.2.1-x86_64-i386-64bit
Supports:
        hdfs (fsspec = 2022.3.0, pyarrow = 7.0.0),
        webhdfs (fsspec = 2022.3.0),
        http (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
        https (aiohttp = 3.8.1, aiohttp-retry = 2.4.6)
Cache types: reflink, hardlink, symlink
Cache directory: apfs on /dev/disk1s1s1
Caches: local
Remotes: https
Workspace directory: apfs on /dev/disk1s1s1
Repo: dvc, git

Additional Information (if any):

$ dvc get https://dagshub.com/aviv/data-repo README.md -v
2022-05-01 16:37:20,420 DEBUG: Creating external repo https://dagshub.com/aviv/data-repo@None
2022-05-01 16:37:20,420 DEBUG: erepo: git clone 'https://dagshub.com/aviv/data-repo' to a temporary dir
2022-05-01 16:37:20,888 DEBUG: Removing '/Users/guyrosin/code/playground/test/.B66n2XDC3RLgwwhXr8AoUr'
2022-05-01 16:37:20,888 ERROR: failed to get 'README.md' from 'https://dagshub.com/aviv/data-repo' - Failed to clone repo 'https://dagshub.com/aviv/data-repo' to '/var/folders/2x/3n66lzcd1gg06nk_sjjq319w0000gn/T/tmpv7y9y9z5dvc-clone'
------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/scmrepo/git/backend/dulwich/__init__.py", line 193, in clone
    repo = clone_from()
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/dulwich/porcelain.py", line 443, in clone
    return client.clone(
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/dulwich/client.py", line 535, in clone
    result = self.fetch(path, target, progress=progress, depth=depth)
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/dulwich/client.py", line 601, in fetch
    result = self.fetch_pack(
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/dulwich/client.py", line 2047, in fetch_pack
    refs, server_capabilities, url = self._discover_references(
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/dulwich/client.py", line 1908, in _discover_references
    resp, read = self._http_request(url, headers, allow_compression=True)
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/dulwich/client.py", line 2189, in _http_request
    raise HTTPUnauthorized(resp.getheader("WWW-Authenticate"), url)
dulwich.client.HTTPUnauthorized: No valid credentials provided

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/dvc/scm.py", line 126, in clone
    git = Git.clone(url, to_path, progress=pbar.update_git, **kwargs)
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/scmrepo/git/__init__.py", line 143, in clone
    backend.clone(url, to_path, **kwargs)
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/scmrepo/git/backend/dulwich/__init__.py", line 196, in clone
    raise CloneError(url, to_path) from exc
scmrepo.exceptions.CloneError: Failed to clone repo 'https://dagshub.com/aviv/data-repo' to '/var/folders/2x/3n66lzcd1gg06nk_sjjq319w0000gn/T/tmpv7y9y9z5dvc-clone'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/dvc/commands/get.py", line 39, in _get_file_from_repo
    Repo.get(
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/dvc/repo/get.py", line 49, in get
    with external_repo(
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/dvc/external_repo.py", line 39, in external_repo
    path = _cached_clone(url, rev, for_write=for_write)
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/dvc/external_repo.py", line 165, in _cached_clone
    clone_path, shallow = _clone_default_branch(url, rev, for_write=for_write)
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/funcy/decorators.py", line 45, in wrapper
    return deco(call, *dargs, **dkwargs)
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/funcy/flow.py", line 274, in wrap_with
    return call()
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/funcy/decorators.py", line 66, in __call__
    return self._func(*self._args, **self._kwargs)
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/dvc/external_repo.py", line 235, in _clone_default_branch
    git = clone(url, clone_path)
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/dvc/scm.py", line 131, in clone
    raise CloneError(str(exc))
dvc.scm.CloneError: Failed to clone repo 'https://dagshub.com/aviv/data-repo' to '/var/folders/2x/3n66lzcd1gg06nk_sjjq319w0000gn/T/tmpv7y9y9z5dvc-clone'
------------------------------------------------------------
@dtrifiro
Contributor

dtrifiro commented May 1, 2022

Hi,
I see that https://dagshub.com/aviv/data-repo gives me a 404. I'm assuming this is because it is a private repo (either that or it has been deleted).

Since you say this started happening recently, was the repo made private (or deleted) recently?

@dtrifiro added the "awaiting response" label May 1, 2022
@guyrosin
Author

guyrosin commented May 2, 2022

Thanks @dtrifiro, this repo has always been private. I've been using it for several weeks already with no problem.
It seems this error started appearing after I updated DVC to v2.10! After downgrading to v2.9.3 everything seems to work.

Here's the whole debugging story FYI:
Initially I used DVC v2.10.3. I tried running dvc get and dvc import with two git URLs: the plain https://dagshub.com/aviv/data-repo, and one with an access token (https://{user}:{token}@dagshub.com/aviv/data-repo). Neither worked.
I then downgraded DVC to v2.10.1 and it looked much better, but still weird...

  • First of all, dvc get and dvc import actually worked!
  • They worked only when using the URL with the user:token prefix! Otherwise I got the same error as before ("No valid credentials provided")
  • After the successful dvc import, if I executed the same command again without any file changes, (almost) the same error appeared: "unexpected error - No valid credentials provided".
  • But if I pushed to dvc and git and then ran dvc import, it worked...
  • Finally, I downgraded DVC to v2.9.3 (the last version I remember worked 100%), and now running dvc import without any file changes results in "'data.dvc' didn't change, skipping", as it should've been. It works even with the standard HTTPS URL (without the user:token prefix), as it should.

@dtrifiro removed the "awaiting response" label May 2, 2022
@dtrifiro
Contributor

dtrifiro commented May 2, 2022

Can confirm this is happening, starting from #7554

@dtrifiro added the "bug" and "git" labels May 2, 2022
@dtrifiro self-assigned this May 2, 2022
@dberenbaum added the "regression" label May 4, 2022
dtrifiro added a commit to dtrifiro/dvc that referenced this issue May 13, 2022
`fetch_all_exp` was being invoked with url set to "origin", which caused
any credentials in the provided url to be ignored

Fixes iterative#7670
dtrifiro added a commit to dtrifiro/dvc that referenced this issue May 13, 2022
`fetch_all_exp` was being invoked with url set to "origin", which caused
any credentials in the provided url to be ignored

Fixes iterative#7670
dtrifiro added a commit to dtrifiro/dvc that referenced this issue Jun 2, 2022
`fetch_all_exp()` in `clone()` was called with `url="origin"`, which
resulted in the operation being performed with the remote URL defined
in the cloned repo's config, which did not include any credentials
initially provided to clone.

Fixes iterative#7670
pmrowla pushed a commit that referenced this issue Jun 7, 2022
`fetch_all_exp()` in `clone()` was called with `url="origin"`, which
resulted in the operation being performed with the remote URL defined
in the cloned repo's config, which did not include any credentials
initially provided to clone.

Fixes #7670
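
To make the commit message above concrete: according to it, the remote URL stored in the cloned repo's config did not carry the credentials that were passed to clone, so a later fetch that refers to the remote by name ("origin") has nothing to authenticate with. The sketch below only illustrates that difference using the Python standard library. `strip_credentials` is a hypothetical helper, not DVC or scmrepo code, and the user and token are placeholders.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_credentials(url: str) -> str:
    """Return url without its user:password part (hypothetical helper)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


# Placeholder credentials, for illustration only.
clone_url = "https://user:token@dagshub.com/aviv/data-repo"
stored_url = strip_credentials(clone_url)

print(urlsplit(clone_url).username)   # user name is present in the URL given to clone
print(urlsplit(stored_url).username)  # None: the credential-free URL has no user name
print(stored_url)
```

A fetch against `stored_url` would have to get credentials from somewhere else (for example a git credential helper), which matches the "No valid credentials provided" error reported in this issue.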
@guyrosin
Author

guyrosin commented Jun 14, 2022

Still crashes with the same error using dvc v2.11.0 :(

❯ dvc get https://dagshub.com/aviv/data-repo README.md -v
2022-06-14 17:15:43,776 DEBUG: Creating external repo https://dagshub.com/aviv/data-repo@None
2022-06-14 17:15:43,776 DEBUG: erepo: git clone 'https://dagshub.com/aviv/data-repo' to a temporary dir
2022-06-14 17:15:44,238 DEBUG: Removing '/Users/guyrosin/code/playground/.eqznXKcQmyfXK5wC87zMvn'
2022-06-14 17:15:44,238 ERROR: failed to get 'README.md' from 'https://dagshub.com/aviv/data-repo' - Failed to clone repo 'https://dagshub.com/aviv/data-repo' to '/var/folders/2x/3n66lzcd1gg06nk_sjjq319w0000gn/T/tmp_2h5e0wzdvc-clone'
------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/scmrepo/git/backend/dulwich/__init__.py", line 196, in clone
    repo = clone_from()
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/dulwich/porcelain.py", line 443, in clone
    return client.clone(
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/dulwich/client.py", line 622, in clone
    result = self.fetch(path, target, progress=progress, depth=depth)
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/dulwich/client.py", line 699, in fetch
    result = self.fetch_pack(
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/dulwich/client.py", line 2075, in fetch_pack
    refs, server_capabilities, url = self._discover_references(
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/dulwich/client.py", line 1934, in _discover_references
    resp, read = self._http_request(url, headers, allow_compression=True)
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/dulwich/client.py", line 2218, in _http_request
    raise HTTPUnauthorized(resp.getheader("WWW-Authenticate"), url)
dulwich.client.HTTPUnauthorized: No valid credentials provided

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/dvc/scm.py", line 145, in clone
    git = Git.clone(url, to_path, progress=pbar.update_git, **kwargs)
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/scmrepo/git/__init__.py", line 143, in clone
    backend.clone(url, to_path, **kwargs)
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/scmrepo/git/backend/dulwich/__init__.py", line 199, in clone
    raise CloneError(url, to_path) from exc
scmrepo.exceptions.CloneError: Failed to clone repo 'https://dagshub.com/aviv/data-repo' to '/var/folders/2x/3n66lzcd1gg06nk_sjjq319w0000gn/T/tmp_2h5e0wzdvc-clone'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/dvc/commands/get.py", line 39, in _get_file_from_repo
    Repo.get(
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/dvc/repo/get.py", line 50, in get
    with external_repo(
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/dvc/external_repo.py", line 39, in external_repo
    path = _cached_clone(url, rev, for_write=for_write)
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/dvc/external_repo.py", line 169, in _cached_clone
    clone_path, shallow = _clone_default_branch(url, rev, for_write=for_write)
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/funcy/decorators.py", line 45, in wrapper
    return deco(call, *dargs, **dkwargs)
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/funcy/flow.py", line 274, in wrap_with
    return call()
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/funcy/decorators.py", line 66, in __call__
    return self._func(*self._args, **self._kwargs)
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/dvc/external_repo.py", line 239, in _clone_default_branch
    git = clone(url, clone_path)
  File "/Users/guyrosin/miniconda3/envs/rdkit-test/lib/python3.9/site-packages/dvc/scm.py", line 150, in clone
    raise CloneError(str(exc))
dvc.scm.CloneError: Failed to clone repo 'https://dagshub.com/aviv/data-repo' to '/var/folders/2x/3n66lzcd1gg06nk_sjjq319w0000gn/T/tmp_2h5e0wzdvc-clone'
------------------------------------------------------------
2022-06-14 17:15:44,258 DEBUG: Analytics is enabled.
2022-06-14 17:15:44,308 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/var/folders/2x/3n66lzcd1gg06nk_sjjq319w0000gn/T/tmpkmarbkqi']'
2022-06-14 17:15:44,310 DEBUG: Spawned '['daemon', '-q', 'analytics', '/var/folders/2x/3n66lzcd1gg06nk_sjjq319w0000gn/T/tmpkmarbkqi']'

Output of dvc doctor:

$ dvc doctor
DVC version: 2.11.0 (pip)
---------------------------------
Platform: Python 3.9.12 on macOS-12.3.1-x86_64-i386-64bit
Supports:
        gdrive (pydrive2 = 1.10.1),
        hdfs (fsspec = 2022.5.0, pyarrow = 8.0.0),
        webhdfs (fsspec = 2022.5.0),
        http (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
        https (aiohttp = 3.8.1, aiohttp-retry = 2.4.6)
Cache types: reflink, hardlink, symlink
Cache directory: apfs on /dev/disk1s1s1
Caches: local
Remotes: https
Workspace directory: apfs on /dev/disk1s1s1
Repo: dvc, git

@pmrowla reopened this Jun 15, 2022
@pmrowla
Contributor

pmrowla commented Jun 15, 2022

@guyrosin does using the explicit username+token in the URL (https://{user}:{token}@dagshub.com/aviv/data-repo) work for you in 2.11.0?

The underlying credentials issue (without providing the access token) is the same as #6586 and is currently being worked on, but in 2.11 I would expect including the token in your repo URL to work.

@guyrosin
Author

@pmrowla Yeah, it works!
Quick followup question: there's no way to specify credentials for dvc update, right? (that's what currently keeps me from updating to v2.11)

@pmrowla
Contributor

pmrowla commented Jun 21, 2022

@guyrosin if you manually edit the .dvc file to include the token in the dependency url field, I think dvc update will use the token. Otherwise you will need to wait for the linked fix to be merged and released.
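
For anyone going the manual route: a token may contain characters that are not valid inside a URL, so it is safer to percent-encode it before pasting the result into the url field. A minimal sketch, assuming the `https://{user}:{token}@host/...` form used earlier in this thread. `with_credentials` is a hypothetical helper, and the user and token values are placeholders.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def with_credentials(url: str, user: str, token: str) -> str:
    """Embed user:token into an HTTP(S) URL, percent-encoding both parts."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    # safe="" forces characters such as "/" and "@" to be encoded as well.
    netloc = f"{quote(user, safe='')}:{quote(token, safe='')}@{host}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


# Placeholder values; substitute your own user name and access token.
print(with_credentials("https://dagshub.com/aviv/data-repo", "user", "tok/en@1"))
```

Keep in mind that a .dvc file is normally committed to git, so a token written into it ends up in the repository history.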

@vaxherra

vaxherra commented Jul 1, 2022

I am still seeing similar authentication failures with dvc 2.12.0, even when explicitly providing the user and token in the repo URL.

dulwich.client.HTTPUnauthorized: No valid credentials provided

The only fix is to revert back to 2.9.13.

Any updates on this issue?

@avivio

avivio commented Oct 19, 2022

This is still happening for me after upgrading from version 2.9.5 to version 2.30.0

Environment information

Output of dvc doctor:

$ dvc doctor
DVC version: 2.30.0 (pip)
---------------------------------
Platform: Python 3.9.12 on macOS-12.6-x86_64-i386-64bit
Subprojects:
	dvc_data = 0.17.1
	dvc_objects = 0.7.0
	dvc_render = 0.0.12
	dvc_task = 0.1.3
	dvclive = 0.11.0
	scmrepo = 0.1.1
Supports:
	http (aiohttp = 3.8.3, aiohttp-retry = 2.8.3),
	https (aiohttp = 3.8.3, aiohttp-retry = 2.8.3)
Cache types: reflink, hardlink, symlink
Cache directory: apfs on /dev/disk1s1s1
Caches: local
Remotes: https
Workspace directory: apfs on /dev/disk1s1s1
Repo: dvc, git

Verbose output of dvc update:

$ dvc update data/dry.dvc -v
2022-10-19 12:17:55,342 DEBUG: Creating external repo https://dagshub.com/mana-bio/data-repo@None
2022-10-19 12:17:55,342 DEBUG: erepo: git clone 'https://dagshub.com/mana-bio/data-repo' to a temporary dir
2022-10-19 12:17:55,902 ERROR: failed update data - Failed to clone repo 'https://dagshub.com/mana-bio/data-repo' to '/var/folders/r2/qc7fk3l172d1wc1l6r1gyzcm0000gn/T/tmpla66ns6fdvc-clone'
------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/aviv/miniconda3/envs/playground/lib/python3.9/site-packages/scmrepo/git/backend/dulwich/__init__.py", line 200, in clone
    repo = clone_from()
  File "/Users/aviv/miniconda3/envs/playground/lib/python3.9/site-packages/dulwich/porcelain.py", line 538, in clone
    return client.clone(
  File "/Users/aviv/miniconda3/envs/playground/lib/python3.9/site-packages/dulwich/client.py", line 760, in clone
    result = self.fetch(path, target, progress=progress, depth=depth)
  File "/Users/aviv/miniconda3/envs/playground/lib/python3.9/site-packages/dulwich/client.py", line 837, in fetch
    result = self.fetch_pack(
  File "/Users/aviv/miniconda3/envs/playground/lib/python3.9/site-packages/dulwich/client.py", line 2075, in fetch_pack
    refs, server_capabilities, url = self._discover_references(
  File "/Users/aviv/miniconda3/envs/playground/lib/python3.9/site-packages/dulwich/client.py", line 1934, in _discover_references
    resp, read = self._http_request(url, headers)
  File "/Users/aviv/miniconda3/envs/playground/lib/python3.9/site-packages/dulwich/client.py", line 2215, in _http_request
    raise HTTPUnauthorized(resp.getheader("WWW-Authenticate"), url)
dulwich.client.HTTPUnauthorized: No valid credentials provided

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/aviv/miniconda3/envs/playground/lib/python3.9/site-packages/dvc/scm.py", line 145, in clone
    git = Git.clone(url, to_path, progress=pbar.update_git, **kwargs)
  File "/Users/aviv/miniconda3/envs/playground/lib/python3.9/site-packages/scmrepo/git/__init__.py", line 143, in clone
    backend.clone(url, to_path, **kwargs)
  File "/Users/aviv/miniconda3/envs/playground/lib/python3.9/site-packages/scmrepo/git/backend/dulwich/__init__.py", line 203, in clone
    raise CloneError(url, to_path) from exc
scmrepo.exceptions.CloneError: Failed to clone repo 'https://dagshub.com/mana-bio/data-repo' to '/var/folders/r2/qc7fk3l172d1wc1l6r1gyzcm0000gn/T/tmpla66ns6fdvc-clone'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/aviv/miniconda3/envs/playground/lib/python3.9/site-packages/dvc/commands/update.py", line 16, in run
    self.repo.update(
  File "/Users/aviv/miniconda3/envs/playground/lib/python3.9/site-packages/dvc/repo/__init__.py", line 49, in wrapper
    return f(repo, *args, **kwargs)
  File "/Users/aviv/miniconda3/envs/playground/lib/python3.9/site-packages/dvc/repo/update.py", line 40, in update
    stage.update(
  File "/Users/aviv/miniconda3/envs/playground/lib/python3.9/site-packages/dvc/stage/__init__.py", line 452, in update
    update_import(
  File "/Users/aviv/miniconda3/envs/playground/lib/python3.9/site-packages/dvc/stage/imports.py", line 23, in update_import
    stage.deps[0].update(rev=rev)
  File "/Users/aviv/miniconda3/envs/playground/lib/python3.9/site-packages/dvc/dependency/repo.py", line 87, in update
    with self._make_repo(locked=False) as repo:
  File "/Users/aviv/miniconda3/envs/playground/lib/python3.9/contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "/Users/aviv/miniconda3/envs/playground/lib/python3.9/site-packages/dvc/external_repo.py", line 39, in external_repo
    path = _cached_clone(url, rev, for_write=for_write)
  File "/Users/aviv/miniconda3/envs/playground/lib/python3.9/site-packages/dvc/external_repo.py", line 169, in _cached_clone
    clone_path, shallow = _clone_default_branch(url, rev, for_write=for_write)
  File "/Users/aviv/miniconda3/envs/playground/lib/python3.9/site-packages/funcy/decorators.py", line 45, in wrapper
    return deco(call, *dargs, **dkwargs)
  File "/Users/aviv/miniconda3/envs/playground/lib/python3.9/site-packages/funcy/flow.py", line 274, in wrap_with
    return call()
  File "/Users/aviv/miniconda3/envs/playground/lib/python3.9/site-packages/funcy/decorators.py", line 66, in __call__
    return self._func(*self._args, **self._kwargs)
  File "/Users/aviv/miniconda3/envs/playground/lib/python3.9/site-packages/dvc/external_repo.py", line 239, in _clone_default_branch
    git = clone(url, clone_path)
  File "/Users/aviv/miniconda3/envs/playground/lib/python3.9/site-packages/dvc/scm.py", line 150, in clone
    raise CloneError(str(exc))
dvc.scm.CloneError: Failed to clone repo 'https://dagshub.com/mana-bio/data-repo' to '/var/folders/r2/qc7fk3l172d1wc1l6r1gyzcm0000gn/T/tmpla66ns6fdvc-clone'
------------------------------------------------------------
2022-10-19 12:17:55,915 DEBUG: Analytics is enabled.
2022-10-19 12:17:55,984 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/var/folders/r2/qc7fk3l172d1wc1l6r1gyzcm0000gn/T/tmphlvsybrd']'
2022-10-19 12:17:55,986 DEBUG: Spawned '['daemon', '-q', 'analytics', '/var/folders/r2/qc7fk3l172d1wc1l6r1gyzcm0000gn/T/tmphlvsybrd']'

@dberenbaum
Contributor

@avivio This looks like a private repo, right? If so, the credentials need to be found either in the URL or in a git credential helper.

Are you able to successfully do git clone https://dagshub.com/mana-bio/data-repo and do you need to enter any prompts to do so?

@dberenbaum
Contributor

Here's what I get on a private dagshub repo:

# Fails without credentials
$ dvc get https://dagshub.com/dberenbaum/lstm_seq2seq README.md
ERROR: failed to get 'README.md' from 'https://dagshub.com/dberenbaum/lstm_seq2seq' - Failed to clone repo 'https://dagshub.com/dberenbaum/lstm_seq2seq' to '/var/folders/24/99_tf1xj3vx8k1k_jkdmnhq00000gn/T/tmpak9j7bi1dvc-clone'

# Cache credentials in git
$ git config credential.helper cache
$ git clone https://dagshub.com/dberenbaum/lstm_seq2seq
Cloning into 'lstm_seq2seq'...
Username for 'https://dagshub.com/': dberenbaum
Password for 'https://dberenbaum@dagshub.com/':
remote: Enumerating objects: 91, done.
remote: Counting objects: 100% (91/91), done.
remote: Compressing objects: 100% (67/67), done.
remote: Total 91 (delta 32), reused 0 (delta 0), pack-reused 0
Receiving objects: 100% (91/91), 9.30 KiB | 3.10 MiB/s, done.
Resolving deltas: 100% (32/32), done.

# This time it succeeds
$ dvc get https://dagshub.com/dberenbaum/lstm_seq2seq README.md

What else is expected here?

@sisp
Contributor

sisp commented Nov 18, 2022

I'm facing the same problem on macOS while there's no problem on Linux (Ubuntu 20.04). dvc get/dvc import/dvc pull fails with the error:

dvc.scm.CloneError: Failed to clone repo ...

I'm using DVC v2.34.2. It would be great if this problem could get fixed with high priority because it prevents our macOS users from using DVC. 🙁

@dberenbaum added the "p1-important" label Nov 18, 2022
@dberenbaum
Contributor

@sisp Thanks for reporting. It's hard to tell in which scenarios this breaks, but it's clear at this point that it's causing enough issues that we need to change the credential handling, and it's been made a high priority.

In the meantime, can you try this workaround?

On Mac I had to ssh-add -AK ~/.ssh/id_rsa to enable the agent and keychain: ronf/asyncssh#522

Originally posted by @shcheklein in #7702 (comment)

@sisp
Contributor

sisp commented Nov 18, 2022

@dberenbaum Thanks for hinting at the possible workaround. I've forwarded your comment to @dekromp who is using a Mac and got this error. We'll report back. Good to know this is a priority.

@dberenbaum
Contributor

It seems the current implementation may fail to retrieve credentials from the macOS keychain. Another possible workaround, mentioned above, is to run git config credential.helper cache so that git caches the credentials temporarily for reuse. This should work for non-ssh scenarios.
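
One way to check what a credential helper would actually hand out is to ask git directly with `git credential fill`, which reads a `key=value` description on stdin and prints the matching credential. The sketch below wraps that call from Python. `ask_git_for_credentials` and `parse_credential_output` are hypothetical helper names, and whether DVC consults the same helpers in the same way is not something this sketch verifies.

```python
import os
import subprocess


def parse_credential_output(text: str) -> dict:
    """Parse the key=value lines printed by `git credential fill`."""
    result = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip blank or malformed lines
            result[key] = value
    return result


def ask_git_for_credentials(protocol: str, host: str) -> dict:
    """Ask git's configured credential helpers for a stored credential."""
    # Disable terminal prompts so a missing credential fails instead of blocking.
    env = dict(os.environ, GIT_TERMINAL_PROMPT="0")
    try:
        proc = subprocess.run(
            ["git", "credential", "fill"],
            input=f"protocol={protocol}\nhost={host}\n\n",
            capture_output=True,
            text=True,
            env=env,
            check=False,
        )
    except FileNotFoundError:
        return {}  # git is not installed or not on PATH
    if proc.returncode != 0:
        return {}  # no helper could supply a credential
    return parse_credential_output(proc.stdout)
```

Calling `ask_git_for_credentials("https", "dagshub.com")` returns a dict that contains `username` and `password` keys when a helper has a stored credential for that host, and an empty dict otherwise. Avoid printing the password when sharing the result in an issue.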

@dekromp

dekromp commented Nov 21, 2022

Hi @dberenbaum,
thanks for the workaround. Unfortunately it did not work for me. By the way, the provided workaround requires some additional steps mentioned here if the ssh key is not protected by a passphrase: jirsbek/SSH-keys-in-macOS-Sierra-keychain#15 (comment)

Output of dvc doctor:

DVC version: 2.34.2 (pip)
---------------------------------
Platform: Python 3.8.12 on macOS-12.6.1-x86_64-i386-64bit
Subprojects:
	dvc_data = 0.25.3
	dvc_objects = 0.12.2
	dvc_render = 0.0.12
	dvc_task = 0.1.5
	dvclive = 1.0.1
	scmrepo = 0.1.3
Supports:
	http (aiohttp = 3.8.3, aiohttp-retry = 2.8.3),
	https (aiohttp = 3.8.3, aiohttp-retry = 2.8.3)

@pmrowla removed their assignment Nov 28, 2022
@dtrifiro
Contributor

Hi @dekromp and @sisp, sorry for taking so long to get back to you but I was out sick.

What credential helper are you using on Mac? I can't seem to reproduce this using git credential-store or git credential-osxkeychain. How did you configure the helper? Does cloning the repo using the git CLI work (git clone https://<...>)?

@Heegreis

Heegreis commented Nov 29, 2022

I'm on Windows 11 and also have this problem.
dvc doctor output:

DVC version: 2.34.0 (exe)
---------------------------------
Platform: Python 3.10.8 on Windows-10-10.0.22621-SP0
Subprojects:

Supports:
        azure (adlfs = 2022.10.0, knack = 0.10.0, azure-identity = 1.12.0),
        gdrive (pydrive2 = 1.14.0),
        gs (gcsfs = 2022.11.0),
        hdfs (fsspec = 2022.11.0, pyarrow = 10.0.0),
        http (aiohttp = 3.8.3, aiohttp-retry = 2.8.3),
        https (aiohttp = 3.8.3, aiohttp-retry = 2.8.3),
        oss (ossfs = 2021.8.0),
        s3 (s3fs = 2022.11.0, boto3 = 1.24.59),
        ssh (sshfs = 2022.6.0),
        webdav (webdav4 = 0.9.8),
        webdavs (webdav4 = 0.9.8),
        webhdfs (fsspec = 2022.11.0)
Cache types: <https://error.dvc.org/no-dvc-cache>
Caches: local
Remotes: s3
Workspace directory: NTFS on D:\
Repo: dvc, git

Trying to dvc import data from my private repo:

$ git clone http://xxx.xxx.xxx.xxx/my-private-repo # success
$ cd my-private-repo
$ git config credential.helper manager-core
$ dvc import ... # import data from my-private-repo with a previous commit
Importing 'row_data (http://xxx.xxx.xxx.xxx/my-private-repo.git)' -> 'data'
ERROR: failed to import 'row_data' from 'http://xxx.xxx.xxx.xxx/my-private-repo.git'. - Failed to clone repo 'http://xxx.xxx.xxx.xxx/my-private-repo.git' to 'C:\Users\user\AppData\Local\Temp\tmpdlwm0mg3dvc-clone'

Retrying with git config credential.helper store:

$ git clone http://xxx.xxx.xxx.xxx/my-private-repo # success
$ cd my-private-repo
$ git config credential.helper store
$ dvc import ... # import data from my-private-repo with a previous commit
Importing 'row_data (http://xxx.xxx.xxx.xxx/my-private-repo.git)' -> 'data'
ERROR: failed to import 'row_data' from 'http://xxx.xxx.xxx.xxx/my-private-repo.git'. - Failed to clone repo 'http://xxx.xxx.xxx.xxx/my-private-repo.git' to 'C:\Users\user\AppData\Local\Temp\tmpr9sdc9r9dvc-clone'
Verbose output of `dvc import`:
...
2022-11-29 10:04:54,419 ERROR: failed to import 'row_data' from 'http://xx.xxx.xxx.xxx/my-private-repo.git'. - Failed to clone repo 'http://xx.xxx.xxx.xxx/my-private-repo.git' to 'C:\Users\user\AppData\Local\Temp\tmp2014udxmdvc-clone'
------------------------------------------------------------
Traceback (most recent call last):
  File "scmrepo\git\backend\dulwich\__init__.py", line 200, in clone
  File "dulwich\porcelain.py", line 551, in clone
  File "dulwich\client.py", line 760, in clone
  File "dulwich\client.py", line 837, in fetch
  File "dulwich\client.py", line 2076, in fetch_pack
  File "dulwich\client.py", line 1934, in _discover_references
  File "dulwich\client.py", line 2216, in _http_request
dulwich.client.HTTPUnauthorized: No valid credentials provided

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "dvc\scm.py", line 145, in clone
  File "scmrepo\git\__init__.py", line 143, in clone
  File "scmrepo\git\backend\dulwich\__init__.py", line 203, in clone
scmrepo.exceptions.CloneError: Failed to clone repo 'http://xxx.xxx.xxx.xxx/my-private-repo.git' to 'C:\Users\user\AppData\Local\Temp\tmp2014udxmdvc-clone'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "dvc\commands\imp.py", line 15, in run
  File "dvc\repo\imp.py", line 6, in imp
  File "dvc\repo\__init__.py", line 48, in wrapper
  File "dvc\repo\scm_context.py", line 156, in run
  File "dvc\repo\imp_url.py", line 98, in imp_url
  File "funcy\decorators.py", line 45, in wrapper
  File "dvc\stage\decorators.py", line 43, in rwlocked
  File "funcy\decorators.py", line 66, in __call__
  File "dvc\stage\__init__.py", line 543, in run
  File "funcy\decorators.py", line 45, in wrapper
  File "dvc\stage\decorators.py", line 43, in rwlocked
  File "funcy\decorators.py", line 66, in __call__
  File "dvc\stage\__init__.py", line 571, in _sync_import
  File "dvc\stage\imports.py", line 63, in sync_import
  File "dvc\dependency\repo.py", line 70, in download
  File "dvc\dependency\repo.py", line 99, in get_used_objs
  File "dvc\dependency\repo.py", line 113, in _get_used_and_obj
  File "contextlib.py", line 135, in __enter__
  File "dvc\external_repo.py", line 39, in external_repo
  File "dvc\external_repo.py", line 169, in _cached_clone
  File "funcy\decorators.py", line 45, in wrapper
  File "funcy\flow.py", line 274, in wrap_with
  File "funcy\decorators.py", line 66, in __call__
  File "dvc\external_repo.py", line 239, in _clone_default_branch
  File "dvc\scm.py", line 150, in clone
dvc.scm.CloneError: Failed to clone repo 'http://xxx.xxx.xxx.xxx/my-private-repo.git' to 'C:\Users\user\AppData\Local\Temp\tmp2014udxmdvc-clone'
------------------------------------------------------------
2022-11-29 10:04:54,423 DEBUG: Analytics is enabled.
2022-11-29 10:04:54,423 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', 'C:\\Users\\user\\AppData\\Local\\Temp\\tmpeaox6mf0']'
2022-11-29 10:04:54,427 DEBUG: Spawned '['daemon', '-q', 'analytics', 'C:\\Users\\user\\AppData\\Local\\Temp\\tmpeaox6mf0']'

@dtrifiro
Contributor

dtrifiro commented Nov 29, 2022

Hi @Heegreis, what helper is configured in the global git config? Does git pull work in the cloned repo after setting the credential helper with git config credential.helper ...?
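When answering a question like this, it can help to list every credential-related entry rather than only the global one, since `git config --list` merges the system, global, and local files. Below is a minimal Python sketch, assuming the output of `git config --list` has already been captured as text. The function names `credential_entries` and `helpers` and the sample listing are illustrative only and are not part of DVC, scmrepo, or dulwich.

```python
def credential_entries(config_listing):
    """Return (key, value) pairs for every credential.* entry, in listing order."""
    entries = []
    for line in config_listing.splitlines():
        # `git config --list` prints one `key=value` pair per line.
        key, sep, value = line.partition("=")
        if sep and key.lower().startswith("credential."):
            entries.append((key, value))
    return entries


def helpers(config_listing):
    """Return the values of plain credential.helper entries, in listing order."""
    return [
        value
        for key, value in credential_entries(config_listing)
        if key.lower() == "credential.helper"
    ]


# Hypothetical listing, shaped like the configs posted in this thread.
sample = """\
user.email=someone@example.com
credential.helper=osxkeychain
credential.http://git.example.com.provider=generic
credential.helper=store
"""

print(helpers(sample))  # ['osxkeychain', 'store']
```

Git itself tries each configured helper in turn. Whether the library DVC uses for cloning does the same is not established in this thread, so more than one `credential.helper` entry is worth mentioning when reporting a setup.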

@Heegreis

Heegreis commented Nov 30, 2022

Hi @dtrifiro, I succeeded after trying the following.
My original global git config

$ git config --global --list
user.email=xxx@gmail.com # Actually this is not the account of my private repository
core.editor="C:\Users\user\AppData\Local\Programs\Microsoft VS Code\bin\code" --wait
credential.http://xxx.xxx.xxx.xxx.provider=generic # This IP is the GitLab where my private repository is located

Setting global git credential.helper to manager-core doesn't work

$ git config --global credential.helper manager-core
$ git config --global --list
user.email=xxx@gmail.com
core.editor="C:\Users\user\AppData\Local\Programs\Microsoft VS Code\bin\code" --wait
credential.http://xxx.xxx.xxx.xxx.provider=generic
credential.helper=manager-core
$ dvc import ...
Importing 'row_data (http://xxx.xxx.xxx.xxx/my-private-repo.git)' -> 'data'
ERROR: failed to import 'row_data' from 'http://xxx.xxx.xxx.xxx/my-private-repo.git'. - Failed to clone repo 'http://xxx.xxx.xxx.xxx/my-private-repo.git' to 'C:\Users\user\AppData\Local\Temp\tmp939z789ndvc-clone'

Setting global git credential.helper to store works!

$ git config --global credential.helper store
$ git config --global --list
user.email=xxx@gmail.com
core.editor="C:\Users\user\AppData\Local\Programs\Microsoft VS Code\bin\code" --wait
credential.http://xxx.xxx.xxx.xxx.provider=generic
credential.helper=store
$ dvc import ... # It works

In addition, git pull works with all three of the settings above.

@dekromp

dekromp commented Nov 30, 2022

Hi @dekromp and @sisp, sorry for taking so long to get back to you but I was out sick.

What credential helper are you using on mac? I can't seem to reproduce using git credential-store and git credential-osxkeychain. How did you configure the helper? Does cloning the repo using the git cli work (git clone https://<...>)?

Thanks @dtrifiro for the proposal. If I understood correctly, the credential helpers are meant for cloning with HTTP URLs. Unfortunately, we are required to clone via SSH. As for your question about whether git clone works: yes, it does.

@dtrifiro
Contributor

dtrifiro commented Nov 30, 2022

@dekromp I see. Since this discussion has mostly been about http git remotes and git credential helpers, would you mind creating a new issue with details about your setup so that I can close this one? Thanks.

@dberenbaum
Contributor

From our testing, http auth seems to be working. Let's open a separate issue for ssh.

@avivio If you continue to have problems, please follow up with your auth setup 🙏 .

@avivio

avivio commented Dec 15, 2022

@dberenbaum Unfortunately, none of these workarounds work for me :-(
At least not in the way I expect: after some of these changes, I start getting a prompt to enter my username and password on every update.
I can git pull from both the repo I'm updating from and the repo I'm updating to, and in the repo I'm updating from I can also run any DVC operation I want.
The only thing that doesn't work is updating and importing DVC data from my "data repo" into my "code repo".

My original config when I started

credential.helper=cache

With this config, every time I ran dvc update ... I was prompted for my username and password.

after setting global git credential.helper to manager-core

credential.helper=manager-core

I was still prompted for my username and password.

after setting global git credential.helper to store

credential.helper=store

This led to the previous error message:

failed update data - Failed to clone repo 'https://dagshub.com/mana-bio/data-repo' to '/var/folders/r2/qc7fk3l172d1wc1l6r1gyzcm0000gn/T/tmpm1y1lm0advc-clone'

I can get back to the username and password prompt by going to the repo I'm importing and updating from and setting the global git credential.helper to manager-core.

Output of dvc doctor:

DVC version: 2.37.0 (pip)
---------------------------------
Platform: Python 3.9.12 on macOS-13.0.1-x86_64-i386-64bit
Subprojects:
        dvc_data = 0.28.4
        dvc_objects = 0.14.0
        dvc_render = 0.0.15
        dvc_task = 0.1.6
        dvclive = 1.1.2
        scmrepo = 0.1.4
Supports:
        http (aiohttp = 3.8.3, aiohttp-retry = 2.8.3),
        https (aiohttp = 3.8.3, aiohttp-retry = 2.8.3)
Cache types: reflink, hardlink, symlink
Cache directory: apfs on /dev/disk1s1s1
Caches: local
Remotes: https
Workspace directory: apfs on /dev/disk1s1s1
Repo: dvc, git

@dberenbaum
Contributor

I'm not able to reproduce this, @avivio. Could you post the output of git config -l and maybe even the contents of the .dvc file you are trying to update (editing out any sensitive info)?

@dberenbaum dberenbaum added the awaiting response we are waiting for your reply, please respond! :) label Dec 16, 2022
@avivio

avivio commented Dec 19, 2022

Hey @dberenbaum
adding here git config -l:

credential.helper=osxkeychain
user.name=xxx
user.email=yyy@zzz.com
pull.rebase=false
diff.tool=vscode
difftool.vscode.cmd=code --wait --diff $LOCAL $REMOTE
merge.tool=vscode
mergetool.vscode.cmd=code --wait $MERGED
credential.helper=store
core.repositoryformatversion=0
core.filemode=true
core.bare=false
core.logallrefupdates=true
core.ignorecase=true
core.precomposeunicode=true
remote.origin.url=https://github.com/Mana-bio/playground.git
remote.origin.fetch=+refs/heads/*:refs/remotes/origin/*
branch.main.remote=origin
branch.main.merge=refs/heads/main
... 
credential.helper=store
...

an example of one of the .dvc files I'm trying to update:

md5: xxx
frozen: true
deps:
- path: data/benchling
  repo:
    url: https://dagshub.com/mana-bio/data-repo
    rev_lock: yyy
outs:
- md5: zzzz
  size: 11396927
  nfiles: 9
  path: benchling

@dtrifiro
Contributor

Is the last value in the above configs credential.helper=store? If so, the reason for the failure might be a bad saved credential.

Can you check the following paths for any saved credentials and check whether they're valid?

~/.git-credentials
~/.config/git/credentials
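
The check suggested above could be sketched in Python as follows. git-credential-store keeps one URL per line in the form `scheme://user:password@host`, so the sketch lists the entries saved for a given host with the password masked. The names `stored_entries` and `scan_default_files` and the host `git.example.com` are illustrative assumptions, not part of git or DVC.

```python
from pathlib import Path
from urllib.parse import urlsplit

# The two locations mentioned above.
CANDIDATE_FILES = [
    Path.home() / ".git-credentials",
    Path.home() / ".config" / "git" / "credentials",
]


def stored_entries(text, host):
    """Return masked credential URLs in `text` whose host matches `host`."""
    found = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            parts = urlsplit(line)
            hostname = parts.hostname
        except ValueError:
            continue  # skip lines that are not parseable URLs
        if hostname == host.lower():
            found.append(f"{parts.scheme}://{parts.username}:***@{hostname}")
    return found


def scan_default_files(host):
    """Map each existing candidate file to its masked entries for `host`."""
    return {
        str(path): stored_entries(path.read_text(), host)
        for path in CANDIDATE_FILES
        if path.is_file()
    }


# Hypothetical file contents, used here instead of reading real credentials.
sample = (
    "https://alice:s3cret@git.example.com\n"
    "https://bob:token@other.example.org\n"
)
print(stored_entries(sample, "git.example.com"))
# ['https://alice:***@git.example.com']
```

This only shows which entries exist. Whether a saved password or token is still valid has to be checked against the hosting service, for example by comparing it with a credential that is known to work.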
