Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't issue redundant stdin detection warning when is in place. #1303

Merged
merged 4 commits into from Mar 7, 2022

Conversation

isidentical
Copy link
Contributor

@isidentical isidentical commented Feb 25, 2022

This patch fixes a race happening between 2 IO bound operations:

  • read(2) with sys.stdin
  • select(2) with sys.stdin

image

select(2) is a function that takes files and tells when they are ready to be processed (read/written etc.). It is used for concurrently dealing with multiple sockets (e.g you have multiple open connections, and with select(2) you can check which one of those connections have some data to be processed). In HTTPie however, we used it to detect whether there is any incoming data to the sys.stdin, so that when there is no data we could issue a warning.

The problem is that, the initial version of this code unknowlingly created a race between the select.select call:
https://github.com/httpie/httpie/blob/30cd862fc0e173698fc17487c4b96d8f64b701ea/httpie/uploads.py#L98-L110

and the file.read() call:
https://github.com/httpie/httpie/blob/30cd862fc0e173698fc17487c4b96d8f64b701ea/httpie/uploads.py#L124-L132

Both of these functions are blocking, and we ran them concurrently to get away from that issue. But since Python has a GIL, when you run multiple threads together, depending on the CPU-related activity and some other details it switches between them on certain intervals. (green arrows indicate the interpreter given the execution to the Main Thread, red indicate it is given to Observer Thread)

image

So to put it simply, we used to run the read() and the select() at the same time and whichever gets choosen to be executed at that time would win. If it was select(), we'd get the correct result (about whether there is any data or not) and if it was read() we'd always get the wrong result (no incomiong data). This caused to intermittent / random failure. From what I understood this problem happens randomly (and in a OS-dependant way, because of the underlying thread scheduler), and there is no fair way of testing it (except manually).

The race has been solved by moving the select(2) out of the thread (as a guard) and using threading.Events as indicators about whether we've seen any data or not.

@isidentical isidentical force-pushed the stdin-chunked-on-macos branch 2 times, most recently from 4881049 to f930649 Compare February 28, 2022 16:35
@codecov-commenter
Copy link

codecov-commenter commented Feb 28, 2022

Codecov Report

Merging #1303 (75ae325) into master (4d7d6b6) will decrease coverage by 0.75%.
The diff coverage is 95.12%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1303      +/-   ##
==========================================
- Coverage   97.28%   96.53%   -0.76%     
==========================================
  Files          67       85      +18     
  Lines        4235     6027    +1792     
==========================================
+ Hits         4120     5818    +1698     
- Misses        115      209      +94     
Impacted Files Coverage Δ
tests/test_binary.py 100.00% <ø> (ø)
httpie/compat.py 31.11% <27.90%> (-68.89%) ⬇️
tests/conftest.py 77.14% <58.33%> (-9.82%) ⬇️
httpie/output/lexers/http.py 59.09% <59.09%> (ø)
tests/test_ssl.py 91.01% <66.66%> (-3.93%) ⬇️
httpie/manager/__main__.py 82.35% <82.35%> (ø)
httpie/output/lexers/metadata.py 82.35% <82.35%> (ø)
httpie/models.py 94.23% <90.00%> (-3.14%) ⬇️
tests/test_uploads.py 97.70% <92.59%> (-2.30%) ⬇️
httpie/manager/core.py 92.85% <92.85%> (ø)
... and 71 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ad613f2...75ae325. Read the comment docs.

@isidentical isidentical force-pushed the stdin-chunked-on-macos branch 2 times, most recently from c4f14d5 to 4c32bac Compare February 28, 2022 21:49
@isidentical isidentical marked this pull request as ready for review February 28, 2022 22:17
httpie/uploads.py Outdated Show resolved Hide resolved
Copy link
Member

@jkbrzt jkbrzt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good stuff. Let’s just add a mention to CHANGELOG

@isidentical isidentical merged commit 98688b2 into httpie:master Mar 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants