Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix WSGI middleware not to explode quadratically in the case of a larger body #1329

Merged
merged 6 commits into from Feb 16, 2022

Conversation

vytas7
Copy link
Contributor

@vytas7 vytas7 commented Jan 16, 2022

Fixes #708

NB: the proposed patch changes the type signature of a public utility function (build_environ()), however the method in question doesn't seem to be documented as part of any externally available interface. If that's unacceptable, I could add a compatibility check/wrapper for it.

@vytas7
Copy link
Contributor Author

vytas7 commented Jan 16, 2022

I've just realized this might get superseded by #1303, so judge accordingly.

@tomchristie
Copy link
Member

Great - thanks!

Given that this clearly resolves an open bug, I'd suggest that we get this in and fixed. I can't see any downside to that.
It's possible that it'll be superseded by #1303 - but taking a call on that one is more complex than this.

The test suite currently has a listing failure. I expect it's unrelated to this pull request(?) but it'll needs resolving first, either way.

@vytas7
Copy link
Contributor Author

vytas7 commented Feb 10, 2022

Hi @tomchristie , and thanks for looking at this.
I think these were typing issues and other glitches possibly introduced by this pull request, but I just didn't dig too deep since I didn't know what the plan was. If there is a chance to get this merged, I can try to rectify these.

@tomchristie
Copy link
Member

Rather than using io.BytesIO could you use list appends instead, eg...

Change from the existing:

         body = message.get("body", b"")
         more_body = message.get("more_body", False)
         while more_body:
             body_message: HTTPRequestEvent = await receive()  # type: ignore[assignment]
             body += body_message.get("body", b"")
             more_body = body_message.get("more_body", False)

to this...

        body_list = [message.get("body", b"")]
        more_body = message.get("more_body", False)
        while more_body:
             body_message: HTTPRequestEvent = await receive()  # type: ignore[assignment]
             body_list.append(body_message.get("body", b""))
             more_body = body_message.get("more_body", False)
        body = b"".join(body_list)

That'd probably result in a pull request with a smaller change footprint, right?

@vytas7
Copy link
Contributor Author

vytas7 commented Feb 10, 2022

I've actually considered this; but I think it would in general result in higher overhead.

Writing to a BytesIO only very marginally loses against appending, and may even come close or surpass other approaches for many large chunks of data. Furthermore, it is more memory-efficient since a BytesIO instance only needs to keep an internal memory buffer that it can reallocate on demand; appending to a list would need twice the memory right after concatenation.

@tomchristie
Copy link
Member

Thanks for you time on this @vytas7.
I think we need a more comprehensive fix, so let's close this given #1303.

(Either we'll rely on a2wsgi or we need a fix that doesn't load all the body into memory at once.)

@tomchristie tomchristie reopened this Feb 11, 2022
@tomchristie
Copy link
Member

Re-opening given #1303 (comment)

Currently blocked on the CI type checking failures.

@vytas7
Copy link
Contributor Author

vytas7 commented Feb 11, 2022

Thanks Tom; if this is still of interest, I'll rectify 'em typing problems later in the evening.

@tomchristie
Copy link
Member

Seems like a good plan to me.
It's a fairly low impact PR, and it'll close an outstanding bug, so... 👍

(Thanks for your time on this)

@vytas7
Copy link
Contributor Author

vytas7 commented Feb 11, 2022

@tomchristie done ✔️ (The CI now passes with flying colours.)

@tomchristie tomchristie mentioned this pull request Feb 16, 2022
@tomchristie tomchristie merged commit a694ee4 into encode:master Feb 16, 2022
@tomchristie
Copy link
Member

Thanks @vytas7!

Kludex pushed a commit to sephioh/uvicorn that referenced this pull request Oct 29, 2022
…ger body (encode#1329)

* Fix WSGI middleware not to explode quadratically in the case of a larger body.

* Clean up more_body and body stream handling, get rid of while-else.

* Update body type in `build_environ`'s tests (although they pass regardless...)

* Address typing issues.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

WSGI interface: reading request body is O(n^2) wrt payload size
3 participants