Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Always send lowlevel_error response to client #2812

Closed
wants to merge 4 commits into from

Conversation

baelter
Copy link
Contributor

@baelter baelter commented Jan 27, 2022

Description

Add back f8acac1 and closes #2808

Problematic lines where f8acac1#diff-b58a5aaef75ecf535b6fc4546c848b0f576f9cf27e37b27c59015251097d5736R113-R114 and 117
The former lead to incorrect control flow in cases.

Your checklist for this pull request

  • I have reviewed the guidelines for contributing to this repository.
  • I have added (or updated) appropriate tests if this PR fixes a bug or adds a feature.
  • My pull request is 100 lines added/removed or less so that it can be easily reviewed.
  • If this PR doesn't need tests (docs change), I added [ci skip] to the title of the PR.
  • If this closes any issues, I have added "Closes #issue" to the PR description or my commit messages.
  • I have updated the documentation accordingly.
  • All new and existing tests passed, including Rubocop.

@dentarg
Copy link
Member

dentarg commented Jan 27, 2022

I think we are going to need a regression test 😅 , any ideas?

@dentarg
Copy link
Member

dentarg commented Jan 27, 2022

With this config.ru

class BrokenApp
  def initialize(app)
    @app = app
  end

  def call(env)
    [200, nil, []]
  end
end

use BrokenApp

map '/' do
  run lambda { |env| [200, {"Content-Type" => "text/plain"}, ["Hello World"]] }
end

and with Puma 5.6.1

$ curl -s -v http://127.0.0.1:9292
*   Trying 127.0.0.1:9292...
* Connected to 127.0.0.1 (127.0.0.1) port 9292 (#0)
> GET / HTTP/1.1
> Host: 127.0.0.1:9292
> User-Agent: curl/7.76.1
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 500 Internal Server Error
* no chunk, no close, no size. Assume close to signal end
<
* Closing connection 0

with this branch

$ cat Gemfile_PR
# frozen_string_literal: true

source "https://rubygems.org"

gem "puma", github: "https://github.com/puma/puma/pull/2812"
$ curl -s -v http://127.0.0.1:9292
*   Trying 127.0.0.1:9292...
* Connected to 127.0.0.1 (127.0.0.1) port 9292 (#0)
> GET / HTTP/1.1
> Host: 127.0.0.1:9292
> User-Agent: curl/7.76.1
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< HTTP/1.1 500 Internal Server Error
< Content-Length: 616
<
Puma caught this error: undefined method `each' for nil:NilClass (NoMethodError)
/Users/dentarg/.gem/ruby/3.0.2/bundler/gems/puma-df6512cfbf71/lib/puma/request.rb:441:in `str_headers'
/Users/dentarg/.gem/ruby/3.0.2/bundler/gems/puma-df6512cfbf71/lib/puma/request.rb:129:in `write_response'
/Users/dentarg/.gem/ruby/3.0.2/bundler/gems/puma-df6512cfbf71/lib/puma/request.rb:97:in `handle_request'
/Users/dentarg/.gem/ruby/3.0.2/bundler/gems/puma-df6512cfbf71/lib/puma/server.rb:441:in `process_client'
* Connection #0 to host 127.0.0.1 left intact
/Users/dentarg/.gem/ruby/3.0.2/bundler/gems/puma-df6512cfbf71/lib/puma/thread_pool.rb:147:in `block in spawn_thread'

I'm thinking about this, looks like two responses?

< HTTP/1.1 200 OK
< HTTP/1.1 500 Internal Server Error

@MSP-Greg
Copy link
Member

Maybe we need a better example? Somewhere soon after where Request#handle_request calls @app.call, maybe something like the following could be added?

if headers.is_a? Hash
  raise 'headers from app cannot be frozen' if headers.frozen?
else
  raise "Improper headers (#{headers.class}) from app"
end

@nateberkopec
Copy link
Member

nateberkopec commented Jan 28, 2022

Diff:

diff --git a/lib/puma/request.rb b/lib/puma/request.rb
index 8950cf28..6cde60de 100644
--- a/lib/puma/request.rb
+++ b/lib/puma/request.rb
@@ -110,11 +110,8 @@ module Puma
       env = client.env
       io  = client.io
 
-      return false if closed_socket?(io)
-      lines.clear
-
       head = env[REQUEST_METHOD] == HEAD
-      after_reply = env[RACK_AFTER_REPLY] || []
+      after_reply = env[RACK_AFTER_REPLY] ||= []
 
       begin
         res_info = {}

@nateberkopec
Copy link
Member

Barrier for merge here is a test that fails on #2731 but passes on this branch.

@MSP-Greg
Copy link
Member

@baelter

I had a look at this, and made some changes. See the last commit in my fork's branch:

https://github.com/MSP-Greg/puma/tree/pr2812-f1

A couple of style simplifications, and some begin/rescue/ensure rearranging, which, of course, makes the diff a mess.

Anyway, the issue @dentarg mentioned above is fixed with the lines.reset in write_response's ensure block, and I made a slight change to the test assert_match regex. Without the added lines.reset, the test fails.

I have not run the names.pn specs mentioned elsewhere.

HTH, Greg

@nateberkopec nateberkopec added the waiting-for-changes Waiting on changes from the requestor label Jan 30, 2022
@baelter
Copy link
Contributor Author

baelter commented Jan 31, 2022

@MSP-Greg Thanks! Your changes pass the name.pn specs. Cherry picked your commit into this PR.
Not sure how to resolve the conflicts against master though, maybe you are more equipped to do that?
Also no clue on how a regression tests would look like, any help there is also greatly appreciated.

@MSP-Greg
Copy link
Member

@baelter

Your changes pass the name.pn specs

Thanks for checking. I fixed the conflict. Re a test, if the issue is the one @dentarg mentioned above, showing the response start as:

HTTP/1.1 200 OK
HTTP/1.1 500 Internal Server Error
Content-Length: 616

the issue was fixed in the fixup patch. Previously, the regex in the test was:

%r{HTTP/1.0 500 Internal Server Error\r\nContent-Length: 10\r\n\r\nerror page}

That regex matches the response that dentarg's config.ru generated, but that is an invalid response. The fixup anchored the regex with \A, so it would not pass without the fixup commit. Does that make sense?

Anyway, if that is the cause of the name.pn spec failures, then this is good to go.

@nateberkopec
Copy link
Member

if that is the cause of the name.pn spec failures

@MSP-Greg I'm not sure it is the cause. Your test change passes on v5.6.0.

@MSP-Greg
Copy link
Member

MSP-Greg commented Jan 31, 2022

@nateberkopec You're right, sorry. I think this PR was clearing lines and if an error happened after the clear, lines wasn't reset for the next response. The fixup took care of that, but as you've found, we still don't have a test for what was wrong with #2731.

@baelter

Do you recall what specs were failing?

@joeldrapper
Copy link

joeldrapper commented Feb 3, 2022

I haven't been able to get a test to fail locally, but 5.6.0 seems to be leaking ActiveSupport::CurrentAttributes between requests on Rails 6. I can reproduce it in a staging environment but not development.

@baelter
Copy link
Contributor Author

baelter commented Feb 4, 2022

These specs sometimes fail:

    Signup and edit profile :: Delete a link :: And I click the edit button in the links box
    Signup and edit profile :: Delete an alternate name :: And I click the edit button in the variants box
    Signup and edit profile :: Change pronunciation of :: And I click the edit button in the pronunciation box

https://github.com/fishpercolator/name.pn/blob/main/features/signup_and_edit_profile.feature

These specs make a handful of requests each, other than that I don't see anything special. Maybe @pedantic-git can give some more details on what those spec do?

@pedantic-git
Copy link

Hi @baelter - those are just following links with Capybara. Turbolinks is enabled on the app so there is some work involved in the process of "following a link" but there's nothing special about those particular lines. In my limited testing experience with this issue I found it was more likely to be related to the (number of? specific?) tests that had been run ahead of the ones that failed. Maybe puma reaches some kind of limit or race condition?

@MSP-Greg
Copy link
Member

MSP-Greg commented Feb 4, 2022

I've got an idea about what's causing the problem, I hope to check it this weekend. As with most code, there are parts that might be considered 'brittle', and this has shown that we need more tests.

There has been mention of 'cookies'/session data. I assume that lines up with the logs of the failed tests?

@pedantic-git
Copy link

Whenever I saw tests fail in my own runs, the symptom was always an "unauthenticated" version of the page coming back when the previous request had been authenticated. I couldn't work out what was going wrong beyond that.

@joeldrapper
Copy link

@pedantic-git do you use CurrentAttributes in your app to track authentication variables? I don't think it's cookies, I think it's something to do with threads.

@pedantic-git
Copy link

@joeldrapper Afaik there's no CurrentAttributes in there - it's just the standard cookie session store used with Devise/Warden.

@rainerborene
Copy link

I haven't been able to get a test to fail locally, but 5.6.0 seems to be leaking ActiveSupport::CurrentAttributes between requests on Rails 6. I can reproduce it in a staging environment but not development.

@joeldrapper It leaks on Rails 7 as well. I've deployed 5.6.0 to production environment and it basically fucked up all the data of our customers because we're using ActiveSupport::CurrentAttributes. This release should be removed from Rubygems to say the least.

@joeldrapper
Copy link

joeldrapper commented Feb 10, 2022

@rainerborene I'm sorry to hear that! The consequences of CurrentAttributes not being contained to the request execution are really serious. It's literally designed to be relied on for authentication attributes.

My tests showed ActiveSupport.reset_all did work but somehow wasn't being called. For now, I've deployed a before_action that calls reset_all as an extra precaution, as well as updating to 5.6.1. I still can't get a local test to fail.

I don't know how, but this control flow change seems to cause the Rails app executor callbacks not to fire in some circumstances. It seems like a puma thread can get stuck in this state for a while. I haven't been able to reproduce the issue locally even using the same docker image that reproduced in staging.

@nateberkopec
Copy link
Member

This release should be removed from Rubygems to say the least.

If we can figure out if we're actually causing an issue with CurrentAttributes, then we can do that, but I can't yank (which definitely breaks people's apps) for something that may or may not be broken.

@dentarg
Copy link
Member

dentarg commented Feb 11, 2022

@nateberkopec I think 5.6.0 should be yanked. We know it is broken (somehow). Someone using 5.6.0 might not notice something is wrong and I think that is worse than having "bundle install" not work and having to fix that by upgrading to 5.6.1.

@nateberkopec
Copy link
Member

The security issue here turned in to GHSA-rmj8-8hhh-gv5h

@nateberkopec
Copy link
Member

nateberkopec commented Feb 11, 2022

We will not be yanking 5.6.0, as this issue existed in all versions of Puma and Rails. Users should upgrade Rails or Puma.

@baelter
Copy link
Contributor Author

baelter commented Feb 21, 2022

@nateberkopec Since you found and fixed the underlying issue, do you still need more proof in this PR?

@nateberkopec
Copy link
Member

Yes, we need more proof - we never committed a test for that fix 😅

@baelter baelter closed this Dec 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature waiting-for-changes Waiting on changes from the requestor
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Authentication with cookies unreliable on 5.6.0
7 participants