Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix chunked ending check #1607

Merged
merged 2 commits into from Aug 15, 2018
Merged

Conversation

harmdewit
Copy link
Contributor

@harmdewit harmdewit commented Jul 5, 2018

As stated in #1480 according to the spec a valid chunked body is formatted like this:

4.1.  Chunked Transfer Coding

   The chunked transfer coding wraps the payload body in order to
   transfer it as a series of chunks, each with its own size indicator,
   followed by an OPTIONAL trailer containing header fields.  Chunked
   enables content streams of unknown size to be transferred as a
   sequence of length-delimited buffers, which enables the sender to
   retain connection persistence and the recipient to know when it has
   received the entire message.

     chunked-body   = *chunk
                      last-chunk
                      trailer-part
                      CRLF

     chunk          = chunk-size [ chunk-ext ] CRLF
                      chunk-data CRLF
     chunk-size     = 1*HEXDIG
     last-chunk     = 1*("0") [ chunk-ext ] CRLF

     chunk-data     = 1*OCTET ; a sequence of chunk-size octets

   The chunk-size field is a string of hex digits indicating the size of
   the chunk-data in octets.  The chunked transfer coding is complete
   when a chunk with a chunk-size of zero is received, possibly followed
   by a trailer, and finally terminated by an empty line.

   A recipient MUST be able to parse and decode the chunked transfer
   coding.

An example would be something like this:

GET / HTTP/1.1\r\nConnection: Keep-Alive\r\nTransfer-Encoding: chunked\r\n\r\n1\r\nh\r\n4\r\nello\r\n0\r\n\r\n

Notice the 2 CLRF's at the ending of the request. One is of the last last-chunk and the other one is the trailing CRLF of the chunked-body. Currently it Puma only accounts for a single CLRF and incorrectly sees the second CLRF as the start of a following request, which is malformed.

This PR checks for a second CLRF and skips it if present.

Fixes #1456
Fixes #1480

As a sidenode, does Puma parse trailer-parts?

* master:
  test_pumactl.rb - remove skip on test_control_url
  binder.rb - move logger.log calls after adding listeners
  appveyor.yml - fix casing, OpenSSL 1.1.0h, convert to ps
  v3.12.0
  Update url for binder parsing test for JRuby
@Djurredejong
Copy link

We're running into same problem. This branch seems to fix it for us. Is this PR being considerd for master, @schneems ?

@schneems
Copy link
Contributor

Great PR! Thanks!

What's a good way to test this with another tool? Does curl allow me to send chunked requests? What would a good terminal command be for me to test this out?

@harmdewit
Copy link
Contributor Author

harmdewit commented Jul 25, 2018

Curl will automatically send chunked requests by setting the header (link):

For example

curl -H "Transfer-Encoding: chunked" -d "payload" localhost --silent --trace-ascii -

Outputs:

== Info: Rebuilt URL to: localhost/
== Info:   Trying ::1...
== Info: TCP_NODELAY set
== Info: Connection failed
== Info: connect to ::1 port 80 failed: Connection refused
== Info:   Trying 127.0.0.1...
== Info: TCP_NODELAY set
== Info: Connected to localhost (127.0.0.1) port 80 (#0)
=> Send header, 154 bytes (0x9a)
0000: POST / HTTP/1.1
0011: Host: localhost
0022: User-Agent: curl/7.54.0
003b: Accept: */*
0048: Transfer-Encoding: chunked
0064: Content-Type: application/x-www-form-urlencoded
0095: 
0097: 7
=> Send data, 14 bytes (0xe)
0000: payload
0009: 0
000c: 
== Info: upload completely sent off: 14 out of 7 bytes
<= Recv header, 17 bytes (0x11)
0000: HTTP/1.1 200 OK
...

Debugging this request in a receiving puma server in Puma::Client#setup_body:

puts body.inspect => "7\r\npayload\r\n0\r\n\r\n"

Unfortunately I couldn't find any info on how to set the chunk size in order to send multiple chunks. Requests larger than 16380 bytes seem to automatically get chunked, however debugging these large requests in Puma shows wrong results. I suspect these large requests are written to a Tempfile to limit memory usage or something.

@harmdewit
Copy link
Contributor Author

@schneems Found a way to do it with Net::HTTP:

require 'uri'
require 'net/http'
require 'tempfile'

class Chunked
  def initialize(data, chunk_size)
    @size = chunk_size
    if data.respond_to? :read
      @file = data
    end
  end

  def read(_, offset)
    if @file
      @file.read(@size, offset)
    end
  end
  def eof!
    @file.eof!
  end
  def eof?
    @file.eof?
  end
end

uri = URI::parse('http://example.test')
connection = Net::HTTP.new(uri.host, uri.port)
connection.set_debug_output($stdout)

file = Tempfile.new
file.write("Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam,
quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo
consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse
cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non
proident, sunt in culpa qui officia deserunt mollit anim id est laborum.")
file.rewind

chunked = Chunked.new(file, 100)

request = Net::HTTP::Post.new uri.request_uri, {'Transfer-Encoding' => 'chunked', 'content-type' => 'text/plain'}
request.body_stream = chunked
connection.start do |http|
  http.request(request)
end
file.close
file.unlink

Outputs (response isn't relevant):

opening connection to example.test:80...
opened
<- "POST / HTTP/1.1\r\nTransfer-Encoding: chunked\r\nContent-Type: text/plain\r\nAccept-Encoding: gzip;q=1.0,deflate;q=0.6,identity;q=0.3\r\nAccept: */*\r\nUser-Agent: Ruby\r\nHost: example.test\r\n\r\n"
<- "64\r\n"
<- "Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod\ntempor incididunt ut labore"
<- "\r\n"
<- "64\r\n"
<- " et dolore magna aliqua. Ut enim ad minim veniam,\nquis nostrud exercitation ullamco laboris nisi ut "
<- "\r\n"
<- "64\r\n"
<- "aliquip ex ea commodo\nconsequat. Duis aute irure dolor in reprehenderit in voluptate velit esse\ncill"
<- "\r\n"
<- "64\r\n"
<- "um dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non\nproident, sunt in culpa qu"
<- "\r\n"
<- "2e\r\n"
<- "i officia deserunt mollit anim id est laborum."
<- "\r\n"
<- "0\r\n\r\n"
-> "HTTP/1.1 503 Service Unavailable\r\n"
-> "Content-Type: text/html; charset=utf8\r\n"
-> "X-Pow-Template: application_not_found\r\n"
-> "Date: Mon, 30 Jul 2018 11:56:22 GMT\r\n"
-> "Connection: keep-alive\r\n"
-> "Transfer-Encoding: chunked\r\n"
-> "\r\n"
-> "a99\r\n"
reading 2713 bytes...
-> "<!doctype html>\n<html>\n<head>\n  <meta charset=\"utf-8\">\n  <title>Application not found</title>\n  <style>\n    body {\n      margin: 0;\n      padding: 0;\n      background: #e0e0d8;\n      line-height: 18px;\n    }\n    div.page {\n      margin: 72px auto;\n      margin: 36px auto;\n      background: #fff;\n      border-radius: 18px;\n      -webkit-box-shadow: 0px 2px 7px #999;\n      -moz-box-shadow: 0px 2px 7px #999;\n      padding: 36px 90px;\n      width: 480px;\n      position: relative;\n    }\n    .big div.page {\n      width: 720px;\n    }\n    h1, h2, p, li {\n      font-family: Helvetica, sans-serif;\n      font-size: 13px;\n    }\n    h1 {\n      line-height: 45px;\n      font-size: 36px;\n      margin: 0;\n    }\n    h1:before {\n      font-size: 66px;\n      line-height: 42px;\n      position: absolute;\n      right: 576px;\n    }\n    .big h1:before {\n      right: 819px;\n    }\n    h1.ok {\n      color: #060;\n    }\n    h1.ok:before {\n      content: \"\xE2\x9C\x93\";\n      color: #090;\n    }\n    h1.err {\n      color: #600;\n    }\n    h1.err:before {\n      content: \"\xE2\x9C\x97\";\n      color: #900;\n    }\n    h2 {\n      line-height: 27px;\n      font-size: 18px;\n      font-weight: normal;\n      margin: 0;\n    }\n    a, pre span {\n      color: #776;\n    }\n    h2, p, pre {\n      color: #222;\n    }\n    pre {\n      white-space: pre-wrap;\n      font-size: 13px;\n    }\n    pre, code {\n      font-family: Menlo, Monaco, monospace;\n    }\n    p code {\n      font-size: 12px;\n    }\n    pre.breakout {\n      border-top: 1px solid #ddd;\n      border-bottom: 1px solid #ddd;\n      background: #fafcf4;\n      margin-left: -90px;\n      margin-right: -90px;\n      padding: 8px 0 8px 90px;\n    }\n    pre.small_text {\n      font-size: 10px;\n    }\n    pre.small_text strong {\n      font-size: 13px;\n    }\n    ul {\n      padding: 0;\n    }\n    li {\n      list-style-type: none;\n    }\n  </style>\n</head>\n<body class=\"\">\n  <div class=\"page\">\n    \n  <h1 class=\"err\">Application not found</h1>\n  <h2>Symlink your app to <code>~/.pow/example</code> first.</h2>\n  <section>\n    <p>When you access <code>http://example.test/</code>, Pow looks for a Rack application at <code>~/.pow/example</code>. To run your app at this domain:</p>\n    <pre><span>$</span> cd ~/.pow\n<span>$</span> ln -s /path/to/myapp example\n<span>$</span> open http://example.test/</pre>\n  </section>\n\n    <ul>\n      <li><a href=\"http://pow.cx/manual\">Pow User&rsquo;s Manual</a></li>\n      <li><a href=\"https://github.com/basecamp/pow/wiki/Troubleshooting\">Troubleshooting</a></li>\n      <li><a href=\"https://github.com/basecamp/pow/wiki/FAQ\">Frequently Asked Questions</a></li>\n      <li><a href=\"https://github.com/basecamp/pow/issues\">Issue Tracker</a></li>\n    </ul>\n  </div>\n</body>\n</html>\n\n"
read 2713 bytes
reading 2 bytes...
-> "\r\n"
read 2 bytes
-> "0\r\n"
-> "\r\n"
Conn keep-alive
=> #<Net::HTTPServiceUnavailable 503 Service Unavailable readbody=true>

@Cory-Christensen
Copy link

This will also fix #1492
I was about to create a PR with the same fix after doing some digging today!

@schneems
Copy link
Contributor

Awesome! thanks for the working repro case. I confirm this fails without your patch and passes with it.

skylight (1.6.0) lib/skylight/probes/middleware.rb:10:in `call'
rack-mini-profiler (0.10.7) lib/mini_profiler/profiler.rb:282:in `call'
skylight (1.6.0) lib/skylight/probes/middleware.rb:10:in `call'
/Users/rschneeman/.gem/ruby/2.5.1/bundler/gems/raven-ruby-9649fa5a50e5/lib/raven/integrations/rack.rb:51:in `call'
skylight (1.6.0) lib/skylight/probes/middleware.rb:10:in `call'
railties (5.2.0) lib/rails/engine.rb:524:in `call'
/Users/rschneeman/.gem/ruby/2.5.1/bundler/gems/puma-7f71af4b45ba/lib/puma/configuration.rb:225:in `call'
/Users/rschneeman/.gem/ruby/2.5.1/bundler/gems/puma-7f71af4b45ba/lib/puma/server.rb:658:in `handle_request'
/Users/rschneeman/.gem/ruby/2.5.1/bundler/gems/puma-7f71af4b45ba/lib/puma/server.rb:472:in `process_client'
/Users/rschneeman/.gem/ruby/2.5.1/bundler/gems/puma-7f71af4b45ba/lib/puma/server.rb:332:in `block in run'
/Users/rschneeman/.gem/ruby/2.5.1/bundler/gems/puma-7f71af4b45ba/lib/puma/thread_pool.rb:133:in `block in spawn_thread'
2018-08-15 12:51:24 -0500: HTTP parse error, malformed request (): #<Puma::HttpParserError: Invalid HTTP format, parsing fails.>

Thanks a ton!

@schneems schneems merged commit 395337d into puma:master Aug 15, 2018
@harmdewit
Copy link
Contributor Author

Hi @schneems can you say when the next version will be released?

@schneems
Copy link
Contributor

I don't honestly know. I can try to queue up a relase. In the mean time can you use the github source in your gemfile?

@MSP-Greg
Copy link
Member

MSP-Greg commented Nov 30, 2018

@schneems

Reminder - appveyor creates & saves a pre-compiled gem file. As I recall, the test files are saved in the gem, along with a windows specific rake file. It's installed and the tests run from it.

It's built for ruby 2.2. thru 2.6.. Given the date, I'm sure trunk won't have any breaking API or ABI changes before 2.6 release. Be the first on your block to release a 2.6 compatible pre-compiled gem!

nateberkopec pushed a commit that referenced this pull request Aug 3, 2019
…request (#1812)

* Fix a bug that the last CRLF of chunked body may be used in the next request

The last CRLF of chunked body is checked by #1607. But it's
incomplete. If a client sends the last CRLF (or just LF) after Puma
processes "0\r\n" line, the last CRLF (or just LF) isn't dropped in
the "0\r\n" process:

https://github.com/puma/puma/blob/675344e8609509b0d767ae7680436b3b382d8394/lib/puma/client.rb#L183-L192

    if line.end_with?("\r\n")
      len = line.strip.to_i(16)
      if len == 0
        @body.rewind
        rest = io.read
        # rest is "" with no the last CRLF case and
        # "\r" with no last LF case.
        # rest.start_with?("\r\n") returns false for
        # Both of these cases.
        rest = rest[2..-1] if rest.start_with?("\r\n")
        @buffer = rest.empty? ? nil : rest
        set_ready
        return true
      end

The unprocessed last CRLF (or LF) is used as the first data in the
next request. Because Puma::Client#reset sets `@parsed_bytes`
to 0.

https://github.com/puma/puma/blob/675344e8609509b0d767ae7680436b3b382d8394/lib/puma/client.rb#L100-L109

    def reset(fast_check=true)
      @parsed_bytes = 0

It means that data in `@buffer` (it's "\r" in no the last LF case) and
unread data in input socket (it's "\r\n" in no the last CRLF case and
"\n" in no the last LF case) are used used as the first data in the
next request.

This change fixes these cases by the followings:

  * Ensures reading the last CRLF by setting `@partial_part_left` when
    CRLF isn't read in processing "0\r\n" line.

  * Introduces a `@in_last_chunk` new state to detect whether the last
    CRLF is waiting or not. It's reset in Puma::Client#reset.

* Remove unnecessary returns

#1812 (comment) is the
location where this rule is made.

* Add missing last CRLF for chunked request in tests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants