Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

downloading file using requests.get caused a bug #1323

Open
6 tasks
dongrixinyu opened this issue May 5, 2022 · 2 comments
Open
6 tasks

downloading file using requests.get caused a bug #1323

dongrixinyu opened this issue May 5, 2022 · 2 comments
Assignees
Labels
invalid ⛔ Not-an-issue or upstream (not-our-issue) question/docs ‽ Documentation clarification candidate

Comments

@dongrixinyu
Copy link

  • I have marked all applicable categories:
    • exception-raising bug
    • visual output bug
  • I have visited the source website, and in particular
    read the known issues
  • I have searched through the issue tracker for duplicates
  • I have mentioned version numbers, operating system and
    environment, where applicable:

here is the version:

4.62.3 3.7.5 (default, Jul  5 2021, 10:38:47) 
[GCC 7.5.0] linux

the bug I encountered:

  • I used the example code to download file using requests.
import pdb
import requests, os
from tqdm import tqdm

# this link is available.
eg_link = "https://github.com/dongrixinyu/jiojio/releases/download/v1.1.4/default_pos_model.zip"
response = requests.get(eg_link, stream=True)
with tqdm.wrapattr(open('default_pos_model.zip', "wb"), "write",
                   miniters=1, desc=eg_link.split('/')[-1],
                   total=int(response.headers.get('content-length', 0))) as fout:
    for chunk in response.iter_content(chunk_size=4096):
        fout.write(chunk)
#  while the process executes to this line, check the filesize of default_pos_model.zip by `ls -lSt`, you can see the file size is 92508160
pdb.set_trace()
#  while the process finished, check the filesize of default_pos_model.zip by `ls -lSt`, you can see the file size is 92510542
  • the file size of default_pos_model.zip changed because the chunk_size 4096 can not be exactly divided. The file descriptor did not close until the process finished.
  • So, I assume this bug is caused by tqdm or requests.
@casperdcl
Copy link
Sponsor Member

could you fout.close() just before your pdb.set_trace()?

@casperdcl casperdcl added invalid ⛔ Not-an-issue or upstream (not-our-issue) question/docs ‽ Documentation clarification candidate labels Jun 18, 2022
@casperdcl casperdcl self-assigned this Jun 18, 2022
@KennyChenBasis
Copy link

I just ran into the same bug by following the documentation code. Naively adding fout.close() doesn't work because exceptions can be raised inside the with statement, leaving the file open. I'd have to use a try-finally or a context manager for the file.

I think the documentation is deceptive since it encourages writing to files with tqdm by doing tqdm.wrapattr(open(..., but it's actually not safe.

In my particular case, I was appending data to a file, and retrying the request, tqdm, open, etc. on exceptions... This led to the file having incorrect contents.

This issue is more-or-less the same as #1247.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
invalid ⛔ Not-an-issue or upstream (not-our-issue) question/docs ‽ Documentation clarification candidate
Projects
None yet
Development

No branches or pull requests

3 participants