[bug] conan tries to extract broken downloads #8578

Closed
blackliner opened this issue Mar 1, 2021 · 5 comments · Fixed by #8664

@blackliner
Contributor

blackliner commented Mar 1, 2021

When you hit CTRL + C while conan downloads the prebuilt archives of a package, the subsequent execution of conan install tries to extract those half-downloaded archives from the download_cache. This ONLY happens when you use a download_cache.

2 possible ideas:

  • add a hash to check if the downloaded archives are not corrupt
  • restart the download when the extraction fails
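
A minimal sketch of how the two ideas could combine, assuming a hypothetical `download` callable and plain `.tgz` archives. The names `sha256_of` and `fetch_and_extract` are illustrative only and are not part of Conan:

```python
import hashlib
import os
import tarfile
import zlib


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_and_extract(download, cached_path, dest, expected_sha256=None, retries=1):
    """`download` is a hypothetical callable that writes the archive to a path."""
    # Use the safer extraction filter where this Python version provides it.
    extract_kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    for attempt in range(retries + 1):
        if not os.path.exists(cached_path):
            download(cached_path)
        try:
            # Idea 1: verify a hash when one is known.
            if expected_sha256 and sha256_of(cached_path) != expected_sha256:
                raise ValueError("checksum mismatch for %s" % cached_path)
            # Idea 2: treat a failed extraction as a corrupt cache entry.
            with tarfile.open(cached_path, "r:gz") as tar:
                tar.extractall(dest, **extract_kwargs)
            return
        except (ValueError, EOFError, zlib.error, tarfile.TarError, OSError):
            # Drop the broken file so the next attempt downloads it again.
            os.remove(cached_path)
            if attempt == retries:
                raise
```

With `retries=1`, a truncated archive left behind by an interrupted run is deleted and fetched once more before the error is reported to the user.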

Environment Details (include every applicable attribute)

  • Operating System+version: Ubuntu 18.04 WSL2
  • Compiler+version: gcc 7.4
  • Conan version: 1.33.1
  • Python version: 3.6.9

relevant conan settings: download_cache = /tmp/conan_download_cache

Steps to reproduce (Include if Applicable)

It's easy to replicate with a "big" package, maybe something like opencv, that is available as a prebuilt binary on the remote of your choice. If not, use your own conan repo, then build and upload your package.

  1. conan install opencv/4.5.0@ and hit CTRL + C when conan says: Downloading 834c114e736468c70cf2b5d0781c6c8db5787764b30d718ac40d45e51ec34c0d: 23%|##2 | 2.64M/11.7M
  2. rerun conan install opencv/4.5.0@ and see:
Downloading binary packages in 8 parallel threads
opencv/4.5.0: Retrieving package 05ae15556d4c5439324a478fdaea5d0109fda60e from remote 'luminar' 
Decompressing conan_package.tgz:  61%|######    | 3.15M/5.18M [00:00<00:00, 33.1MB/s]
opencv/4.5.0: ERROR: Exception while getting package: 05ae15556d4c5439324a478fdaea5d0109fda60e
opencv/4.5.0: ERROR: Exception: <class 'conans.errors.ConanException'> Error while downloading/extracting files to /home/fberchtold/.conan/data/opencv/4.5.0/_/_/package/05ae15556d4c5439324a478fdaea5d0109fda60e
Compressed file ended before the end-of-stream marker was reached
Folder removed
ERROR: Error while downloading/extracting files to /home/fberchtold/.conan/data/opencv/4.5.0/_/_/package/05ae15556d4c5439324a478fdaea5d0109fda60e
Compressed file ended before the end-of-stream marker was reached
Folder removed
@solvingj
Contributor

solvingj commented Mar 1, 2021

I would definitely consider this a bug.

@memsharded
Member

Hi @blackliner

Yes, this is a bit surprising. The implementation already contains a checksum:

        with self._lock(h):
            cached_path = os.path.join(self._cache_folder, h)
            if not os.path.exists(cached_path):
                self._file_downloader.download(url=url, file_path=cached_path, md5=md5,
                                               sha1=sha1, sha256=sha256, **kwargs)
            else:
                # specific check for corrupted cached files, will raise, but do nothing more
                # user can report it or "rm -rf cache_folder/path/to/file"
                try:
                    check_checksum(cached_path, md5, sha1, sha256)
                except ConanException as e:
                    raise ConanException("%s\nCached downloaded file corrupted: %s"
                                         % (str(e), cached_path))

The check_checksum() call should at least give a nicer error message, and not wait until extraction time. I am not sure why this doesn't raise; I need to check.

@blackliner
Contributor Author

blackliner commented Mar 17, 2021

@memsharded any updates on this issue? Is it scheduled to be fixed?

I added some print statements, and it looks like the method is called without any md5, sha1 or sha256:

Running checksum with:
{'md5': None, 'sha1': None, 'sha256': None}
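
To illustrate why such a call cannot catch a truncated file, here is a hypothetical stand-in for a checksum helper (not Conan's actual `check_checksum`) that only verifies the digests it is given. With all three set to `None` it compares nothing and returns without raising:

```python
import hashlib


def check_checksums(path, md5=None, sha1=None, sha256=None):
    """Hypothetical stand-in: verify only the digests that were supplied."""
    expected = {"md5": md5, "sha1": sha1, "sha256": sha256}
    verified = 0
    for algorithm, value in expected.items():
        if value is None:
            continue  # nothing to compare against for this algorithm
        with open(path, "rb") as f:
            actual = hashlib.new(algorithm, f.read()).hexdigest()
        if actual != value:
            raise ValueError("%s mismatch for %s" % (algorithm, path))
        verified += 1
    return verified  # 0 means no verification took place
```

Under this assumption, a half-downloaded archive passes the "check" whenever no digest is available, and the failure only surfaces later at extraction time.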

There is a note in that same file (conans/client/downloaders/cached_file_downloader.py) at the _get_hash method:

For Api V2, the cached downloads always have recipe and package REVISIONS in the URL,
making them immutable, and perfect for cached downloads of artifacts. For V2 checksum
will always be None.
For ApiV1, the checksum is obtained from the server via "get_snapshot()" methods, but
the URL in the apiV1 contains the signature=xxx for signed urls, but that can change,
so better strip it from the URL before the hash
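
A rough sketch of the idea in that note, assuming the cache key is a hash over the URL with the `signature` query parameter removed. The function name `cache_key` and the choice of SHA-256 are assumptions for illustration, not the actual `_get_hash` implementation:

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def cache_key(url, checksum=None):
    """Hypothetical cache key: stable across changing signed-URL signatures."""
    parts = urlsplit(url)
    # Drop the volatile signature so the same artifact maps to the same key.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "signature"]
    stable_url = urlunsplit(parts._replace(query=urlencode(query)))
    # When no checksum is known, the URL alone identifies the artifact.
    material = stable_url if checksum is None else stable_url + checksum
    return hashlib.sha256(material.encode("utf-8")).hexdigest()
```

Note that a key derived this way identifies which artifact is cached, but says nothing about whether the cached bytes are complete.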

@memsharded
Member

Hi @blackliner

Yes, I have realized too that when using revisions, the checksum is simply not there for Conan cached artifacts, as it relies on the revision. I am working on a fix using the "dirty" functionality we are using in other places.
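
A minimal sketch of how a "dirty" marker can protect a cached download, assuming a hypothetical `download` callable and a `.dirty` suffix next to the cached file. This only illustrates the general pattern and is not the code from the actual fix:

```python
import os


def download_with_dirty_marker(download, cached_path):
    """`download` is a hypothetical callable that writes the file to `cached_path`."""
    dirty = cached_path + ".dirty"  # assumed marker name, for illustration only
    if os.path.exists(dirty) and os.path.exists(cached_path):
        # A previous run was interrupted mid-download: discard the partial file.
        os.remove(cached_path)
    if not os.path.exists(cached_path):
        with open(dirty, "w"):
            pass  # set the marker before the first byte is written
        download(cached_path)
        os.remove(dirty)  # only reached when the download returns normally
    return cached_path
```

Because the marker is removed only after a successful download, an interruption such as CTRL + C leaves it in place, and the next run knows not to trust the cached file even when no checksum is available.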

@memsharded
Member

Fixed in #8664, will be released in 1.35
