Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bugfix: Don't close socket in the WSGI thread, delegate it back to the main thread! #377

Merged
merged 1 commit into from May 25, 2022

Conversation

digitalresistor
Copy link
Member

There's a potential issue whereby a thread attempts to send data, but the socket is closed on the other end, and the thread shuts the socket down.

This may however happen while the file descriptors are being put together before being passed to select() which will then fail as that file descriptor is no longer valid.

Instead we loop until we have called select() successfully, while checking to see if the file descriptors has changed since we started the poll.

If the map of file descriptors has not changed, and yet we get an errno that is a bad file descriptor, we will just raise it and kill the main loop.

Closes #374

@ale-rt
Copy link

ale-rt commented May 4, 2022

I was also hit by #374.
Running this branch the problem appears to be solved.
I did not had any other drawback.

Copy link

@ale-rt ale-rt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I added some really minor remarks

src/waitress/wasyncore.py Outdated Show resolved Hide resolved
src/waitress/wasyncore.py Outdated Show resolved Hide resolved
@digitalresistor
Copy link
Member Author

It unfortunately does have drawbacks in that there are still race conditions in the code. I've got a better fix, I just haven't had time to test/validate/implement it.

I'm sorry you were hit by this too :-(

@ale-rt
Copy link

ale-rt commented May 5, 2022

I'm sorry you were hit by this too :-(

Are you kidding 🤣?! Thanks for the great work you are doing!

This solves a race condition that may exist when attempting to loop over
the open sockets and then calling select() and accidentally have called
close() on the socket in an app thread.
@digitalresistor digitalresistor force-pushed the bugfix/select-closed-socket-race branch from 729170f to c7a3d7e Compare May 25, 2022 02:27
@digitalresistor
Copy link
Member Author

This change has now been updated to the following:

  1. The main thread can call close() on a socket, but an app thread can't
  2. App thread will pull the trigger (because no data was flushed) waking up the main thread to call close()

This solves the race condition and allows select() to function as before.

@digitalresistor digitalresistor marked this pull request as ready for review May 25, 2022 02:35
@mmerickel mmerickel merged commit 4f6789b into master May 25, 2022
@mmerickel mmerickel deleted the bugfix/select-closed-socket-race branch May 25, 2022 03:07
@digitalresistor digitalresistor changed the title Bugfix: Retry if a thread closes a socket before we select() on it Bugfix: Don't close socket in the WSGI thread, delegate it back to the main thread! May 30, 2022
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request Jan 28, 2024
2.1.2
-----

Bugfix
~~~~~~

- When expose_tracebacks is enabled waitress would fail to properly encode
  unicode thereby causing another error during error handling. See
  Pylons/waitress#378

- Header length checking had a calculation that was done incorrectly when the
  data was received across multple socket reads. This calculation has been
  corrected, and no longer will Waitress send back a 413 Request Entity Too
  Large. See Pylons/waitress#376

Security Bugfix
~~~~~~~~~~~~~~~

- in 2.1.0 a new feature was introduced that allowed the WSGI thread to start
  sending data to the socket. However this introduced a race condition whereby
  a socket may be closed in the sending thread while the main thread is about
  to call select() therey causing the entire application to be taken down.
  Waitress will no longer close the socket in the WSGI thread, instead waking
  up the main thread to cleanup. See Pylons/waitress#377

2.1.1
-----

Security Bugfix
~~~~~~~~~~~~~~~

- Waitress now validates that chunked encoding extensions are valid, and don't
  contain invalid characters that are not allowed. They are still skipped/not
  processed, but if they contain invalid data we no longer continue in and
  return a 400 Bad Request. This stops potential HTTP desync/HTTP request
  smuggling. Thanks to Zhang Zeyu for reporting this issue. See
  GHSA-4f7p-27jc-3c36

- Waitress now validates that the chunk length is only valid hex digits when
  parsing chunked encoding, and values such as ``0x01`` and ``+01`` are no
  longer supported. This stops potential HTTP desync/HTTP request smuggling.
  Thanks to Zhang Zeyu for reporting this issue. See
  GHSA-4f7p-27jc-3c36

- Waitress now validates that the Content-Length sent by a remote contains only
  digits in accordance with RFC7230 and will return a 400 Bad Request when the
  Content-Length header contains invalid data, such as ``+10`` which would
  previously get parsed as ``10`` and accepted. This stops potential HTTP
  desync/HTTP request smuggling Thanks to Zhang Zeyu for reporting this issue. See
  GHSA-4f7p-27jc-3c36

2.1.0
-----

Python Version Support
~~~~~~~~~~~~~~~~~~~~~~

- Python 3.6 is no longer supported by Waitress

- Python 3.10 is fully supported by Waitress

Bugfix
~~~~~~

- ``wsgi.file_wrapper`` now sets the ``seekable``, ``seek``, and ``tell``
  attributes from the underlying file if the underlying file is seekable. This
  allows WSGI middleware to implement things like range requests for example

  See Pylons/waitress#359 and
  Pylons/waitress#363

- In Python 3 ``OSError`` is no longer subscriptable, this caused failures on
  Windows attempting to loop to find an socket that would work for use in the
  trigger.

  See Pylons/waitress#361

- Fixed an issue whereby ``BytesIO`` objects were not properly closed, and
  thereby would not get cleaned up until garbage collection would get around to
  it.

  This led to potential for random memory spikes/memory issues, see
  Pylons/waitress#358 and
  Pylons/waitress#357 .

  With thanks to Florian Schulze for testing/vaidating this fix!

Features
~~~~~~~~

- When the WSGI app starts sending data to the output buffer, we now attempt to
  send data directly to the socket. This avoids needing to wake up the main
  thread to start sending data. Allowing faster transmission of the first byte.
  See Pylons/waitress#364

  With thanks to Michael Merickel for being a great rubber ducky!

- Add REQUEST_URI to the WSGI environment.

  REQUEST_URI is similar to ``request_uri`` in nginx. It is a string that
  contains the request path before separating the query string and
  decoding ``%``-escaped characters.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Possible race condition leading to the main loop dying?
3 participants