Possible race condition leading to the main loop dying? #374
Comments
@oakkitten could you try this patch: #377 and see if it stops the main thread from dying? I have been unable to come up with a good way to test/validate this fixes the issue. It's not the best solution, but short of rewriting |
I'm not familiar with the codebase, but from what I've seen while trying to catch the issue— Perhaps, instead of calling Also, you are doing if fds != map.keys():
fds = map.keys()
Also, there's the issue with the modification of
>>> d = {x: x for x in range(5)}
>>> for x in d.keys():
...     print(x)
...     if x == 3:
...         del d[x]
...
0
1
2
3
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: dictionary changed size during iteration
I suppose even doing |
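For reference, a minimal sketch of the usual single-threaded workaround for the RuntimeError shown in the session above: iterate over a snapshot of the keys instead of the live dictionary. The helper name `remove_matching` is hypothetical and is not part of waitress or wasyncore, and a snapshot alone does not address other threads closing sockets.

```python
def remove_matching(d, should_remove):
    """Delete entries from d while looping over it (hypothetical helper)."""
    # list(d) copies the keys up front, so deleting from d inside the
    # loop does not change the size of the object being iterated.
    for key in list(d):
        if should_remove(key):
            del d[key]
    return d


d = {x: x for x in range(5)}
remove_matching(d, lambda key: key == 3)
print(d)  # {0: 0, 1: 1, 2: 2, 4: 4}
```

The loop visits every key that existed when the snapshot was taken, so the caller still has to cope with entries that disappear for other reasons in the meantime.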
Good point on the We never iterate the map and delete from it though in the code change I made. I am trying to avoid changing too much of asyncore as that was wholesale lifted from Py 3.5. If you were to use This only really happens if the remote closes the connection before having read the full response, which is likely why it is not often seen in the wild, as most reverse proxies will have already read the full response. The problem with attempting to add yet another list, is that there is no single object, the Trying to set a flag inside the thread and trying to make sure that can be read throughout the It's going to require more intrusive changes, which makes sense as |
Using Eventually a call to I am going to see if I can write a test that causes the race. You mentioned you are seeing this in tests, are you running waitress in a thread or a separate process? |
You don't iterate the map and delete from it in the code change you made, but I think you might be doing that in the already existing code?
So the question is, is I am testing an addon for a Qt app that uses waitress internally. It launches it in a thread. This is a patch I made to fix the issue in a dumb way, which should be ok for tests. If you are willing, you can try grabbing the commit before that and running the tests (see tox.ini for instructions; the tests are slow and require a lot of dependencies so it won't be fun). The problem is rare, but you can force it by inserting some kind of a delay before the |
I updated #377 (comment) with the new changes. Give this a shot please. |
It removes any races, only the main thread can close the socket... |
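To illustrate the idea that only the main thread closes sockets, here is a rough sketch and not the code from #377. It assumes worker threads hand close requests to the loop thread through a queue; the class `LoopOwnedSockets` and its methods are hypothetical names invented for this example.

```python
import queue
import select
import socket
import threading


class LoopOwnedSockets:
    """Hypothetical helper: only the loop thread ever closes sockets."""

    def __init__(self):
        self.sockets = {}                   # fd -> socket, loop thread only
        self.pending_close = queue.Queue()  # thread-safe hand-off from workers

    def add(self, sock):
        self.sockets[sock.fileno()] = sock

    def request_close(self, fd):
        # Safe to call from any thread: nothing is closed here.
        self.pending_close.put(fd)

    def poll_once(self, timeout=0.0):
        # Runs in the loop thread. Queued closes are applied first, so the
        # descriptor list passed to select() never holds a closed fd.
        while True:
            try:
                fd = self.pending_close.get_nowait()
            except queue.Empty:
                break
            sock = self.sockets.pop(fd, None)
            if sock is not None:
                sock.close()
        fds = list(self.sockets)
        if not fds:
            return []
        readable, _, _ = select.select(fds, [], [], timeout)
        return readable


a, b = socket.socketpair()
loop = LoopOwnedSockets()
loop.add(a)
loop.add(b)
fd_a = a.fileno()

# A worker thread asks for the close instead of closing the socket itself.
worker = threading.Thread(target=loop.request_close, args=(fd_a,))
worker.start()
worker.join()

loop.poll_once()           # the loop thread performs the actual close
closed_in_loop = a.fileno() == -1
b.close()
```

A real server would also need a way to wake the loop out of a blocking select() so that a queued close is handled promptly; that part is left out of this sketch.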
I pulled 4f6789b, and 4800 tests and 108 minutes later there was no crash. Thanks! 🎉 |
waitress = 2.1.2 fixes the error: OSError: [Errno 9] Bad file descriptor. See: Pylons/waitress#374. This error causes the server to "freeze" on robot tests, and the tests never finish.
I spot that the description for CVE-2022-31015 mentions that this affects "versions 2.1.0 and 2.1.1". However, a quick glance at the code suggests that this might be because it affects the |
No. It’s not a bug in wasyncore but rather waitress began trying to invoke wasyncore methods like close() from other threads that caused the issue. |
Getcha. However, the fix essentially requires the vendored version, no? Otherwise the |
It enables a performance optimization where waitress can write to the socket safely from a thread. |
@lamby the vendored version is always used, even on Python versions that have |
That makes sense. However, someone using a very very old version of waitress (prior to the module being vendored in, that is...) would be vulnerable to this issue? |
Again no because the bug was only due to a change in how we USED asyncore in the specified versions of waitress. Waitress used it differently and safely before then. And it does again after the cve fix. We documented the affected versions correctly in the cve. |
Thanks, really appreciate it. :) |
I am facing this as well:
import waitress
from flask import Flask
app = Flask(__name__)
wsgi = waitress.create_server(app, host='0.0.0.0', port='1234')
wsgi.run()
While the main loop is running, I am calling
Traceback (most recent call last):
  File "/myPath/./main.py", line 1368, in main
    self.wsgi.run()
  File "/myPath/lib/python3.11/site-packages/waitress/server.py", line 322, in run
    self.asyncore.loop(
  File "/myPath/lib/python3.11/site-packages/waitress/wasyncore.py", line 245, in loop
    poll_fun(timeout, map)
  File "/myPath/lib/python3.11/site-packages/waitress/wasyncore.py", line 172, in poll
    r, w, e = select.select(r, w, e, timeout)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: [Errno 9] Bad file descriptor
Is there a recommended way of preventing this or handling it in a more clean way? Thanks!
|
I just might be wrong, because if this is indeed a race condition it should be breaking more things. Anyway, I got this exception (line numbers might be wrong due to debug statements):
This error was extremely rare, but since I was getting it while running tests I could just run a lot of them until one failed, which I did, and I think the problem is as follows.
First, thread Thread-1, which the app I'm testing launches and which runs the waitress server, assembles the descriptor lists for select: waitress/src/waitress/wasyncore.py, lines 154 to 166 in 603d2c1
Then, thread waitress-0 deletes one of the channels, in my case it was <waitress.channel.HTTPChannel 127.0.0.1:54044 at 0x7f10ec052400>, and immediately closes the socket: waitress/src/waitress/wasyncore.py, lines 460 to 470 in 603d2c1
Stack of waitress-0 at the moment:
Then, thread Thread-1 tries to check whether the file descriptor of the socket closed above is writable, which leads to the exception above: waitress/src/waitress/wasyncore.py, lines 171 to 177 in 603d2c1
Python 3.8.10, waitress 2.1.1, Ubuntu 20.04 LTS focal @ WSL2
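The sequence described above can be imitated in a single thread, as a rough sketch rather than a reproduction of the waitress code paths: collect an integer file descriptor, close the socket, then pass the stale descriptor to select(). The function name `select_after_close` is made up for this example. On Linux the call is expected to fail with errno.EBADF, matching the "Bad file descriptor" error; other platforms may report a different error.

```python
import errno
import select
import socket


def select_after_close():
    """Return the errno select() raises for a descriptor closed too early."""
    a, b = socket.socketpair()
    try:
        fds = [a.fileno()]   # step 1: the loop thread collects descriptors
        a.close()            # step 2: another thread closes the socket
        try:
            # step 3: the loop thread polls the now-stale descriptor
            select.select([], fds, [], 0)
        except OSError as exc:
            return exc.errno
        return None
    finally:
        a.close()            # closing twice is harmless
        b.close()


result = select_after_close()
print(result == errno.EBADF)
```

In the real race the window between steps 1 and 3 is tiny, which fits the report that the error was extremely rare.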