Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set CONTENT_LENGTH for chunked requests #2287

Merged
merged 1 commit into from May 31, 2020

Conversation

eugeneius
Copy link
Contributor

@eugeneius eugeneius commented May 29, 2020

Description

Fixes #1839.

Chunked requests don't contain a Content-Length header, but Puma buffers the entire request body upfront, which means it can determine the length before dispatching to the application.

The Rack spec doesn't mandate the presence of the CONTENT_LENGTH header, but it does refer to it as a "CGI key" and draws a distinction between it and the HTTP Content-Length header:

https://github.com/rack/rack/blob/v2.2.2/SPEC.rdoc

The environment must not contain the keys HTTP_CONTENT_TYPE or HTTP_CONTENT_LENGTH (use the versions without HTTP_). The CGI keys (named without a period) must have String values.

RFC 3875, which defines the CGI protocol including CONTENT_LENGTH, says:

https://tools.ietf.org/html/rfc3875#section-4.1.2

The server MUST set this meta-variable if and only if the request is accompanied by a message-body entity. The CONTENT_LENGTH value must reflect the length of the message-body after the server has removed any transfer-codings or content-codings.

"Removing a transfer-coding" is precisely what Puma is doing when it parses a chunked request.

RFC 7230, the most recent specification of HTTP 1.1, includes a pseudo-code algorithm for decoding chunked requests that roughly matches the behaviour implemented here:

https://tools.ietf.org/html/rfc7230#section-4.1.3

I say "roughly" because we don't update the Transfer-Encoding header or parse trailers.

Your checklist for this pull request

  • I have reviewed the guidelines for contributing to this repository.
  • I have added an entry to History.md if this PR fixes a bug or adds a feature. If it doesn't need an entry to HISTORY.md, I have added [changelog skip] the pull request title.
  • I have added appropriate tests if this PR fixes a bug or adds a feature.
  • My pull request is 100 lines added/removed or less so that it can be easily reviewed.
  • If this PR doesn't need tests (docs change), I added [ci skip] to the title of the PR.
  • If this closes any issues, I have added "Closes #issue" to the PR description or my commit messages.
  • I have updated the documentation accordingly.
  • All new and existing tests passed, including Rubocop.

Chunked requests don't contain a Content-Length header, but Puma buffers
the entire request body upfront, which means it can determine the length
before dispatching to the application.

The Rack spec doesn't mandate the presence of the CONTENT_LENGTH header,
but it does refer to it as a "CGI key" and draws a distinction between
it and the HTTP Content-Length header:

https://github.com/rack/rack/blob/v2.2.2/SPEC.rdoc

> The environment must not contain the keys HTTP_CONTENT_TYPE or
> HTTP_CONTENT_LENGTH (use the versions without HTTP_). The CGI keys
> (named without a period) must have String values.

RFC 3875, which defines the CGI protocol including CONTENT_LENGTH, says:

https://tools.ietf.org/html/rfc3875#section-4.1.2

> The server MUST set this meta-variable if and only if the request is
> accompanied by a message-body entity. The CONTENT_LENGTH value must
> reflect the length of the message-body after the server has removed
> any transfer-codings or content-codings.

"Removing a transfer-coding" is precisely what Puma is doing when it
parses a chunked request.

RFC 7230, the most recent specification of HTTP 1.1, includes a pseudo-
code algorithm for decoding chunked requests that roughly matches the
behaviour implemented here:

https://tools.ietf.org/html/rfc7230#section-4.1.3
Copy link
Member

@evanphx evanphx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tracking and advertising the size of the chunked body seems like a good change. Seems unlikely anyone was depending on CONTENT_LENGTH being absent when chunked encoding was used.

@nateberkopec nateberkopec merged commit f9ddd58 into puma:master May 31, 2020
@TheRusskiy
Copy link

Hey, is there any chance this could be backported to 4.x version? Seems like a pretty serious omission and 5.x is full of breaking changes.
Thank you.

@nateberkopec
Copy link
Member

@TheRusskiy We do not have a previous-major-version maintenance policy. I've created a new 4-3-stable branch, if you'd like to make a pull request against it and backport.

@TheRusskiy
Copy link

@nateberkopec backporting here: #2496

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unable to retrieve body from "Transfer-Encoding: chunked" request
4 participants