Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Authentication with cookies unreliable on 5.6.0 #2808

Closed
pedantic-git opened this issue Jan 26, 2022 · 21 comments
Closed

Authentication with cookies unreliable on 5.6.0 #2808

pedantic-git opened this issue Jan 26, 2022 · 21 comments

Comments

@pedantic-git
Copy link

Describe the bug

[Sorry this is so vague - I don't know what the exact bug is but it seemed important enough to report.]

I had a bunch of automated test failures after upgrading to 5.6.0. Specifically this set of tests - name.pn is open source and Dockerized so you could theoretically run them yourself to see the same thing.

The issue seems to be that the Rails session cookie (I use Devise for authentication) isn't being recognised in some situations and the test is failing because the user is unauthenticated.

I confirmed that downgrading Puma to 5.5.2 returns the tests to 100% reliability, but I honestly don't know where to start on why this would be an issue. Let me know if there's any more debugging I can do.

Puma config:

Puma 5.6.0

# Puma can serve each request in a thread from an internal thread pool.
# The `threads` method setting takes two numbers: a minimum and maximum.
# Any libraries that use thread pools should be configured to match
# the maximum value specified for Puma. Default is set to 5 threads for minimum
# and maximum; this matches the default thread size of Active Record.
#
max_threads_count = ENV.fetch("RAILS_MAX_THREADS") { 5 }
min_threads_count = ENV.fetch("RAILS_MIN_THREADS") { max_threads_count }
threads min_threads_count, max_threads_count

# Specifies the `worker_timeout` threshold that Puma will use to wait before
# terminating a worker in development environments.
#
worker_timeout 3600 if ENV.fetch("RAILS_ENV", "development") == "development"

# Specifies the `port` that Puma will listen on to receive requests; default is 3000.
#
port ENV.fetch("PORT") { 3000 }

# Specifies the `environment` that Puma will run in.
#
environment ENV.fetch("RAILS_ENV") { "development" }

# Specifies the `pidfile` that Puma will use.
pidfile ENV.fetch("PIDFILE") { "tmp/pids/server.pid" }

# Specifies the number of `workers` to boot in clustered mode.
# Workers are forked web server processes. If using threads and workers together
# the concurrency of the application would be max `threads` * `workers`.
# Workers do not work on JRuby or Windows (both of which do not support
# processes).
#
# workers ENV.fetch("WEB_CONCURRENCY") { 2 }

# Use the `preload_app!` method when specifying a `workers` number.
# This directive tells Puma to first boot the application and load code
# before forking the application. This takes advantage of Copy On Write
# process behavior so workers use less memory.
#
# preload_app!

# Allow puma to be restarted by `rails restart` command.
plugin :tmp_restart
bundle exec rails server -b 0.0.0.0 -p 3000

To Reproduce

git clone https://github.com/fishpercolator/name.pn.git
cd name.pn
git checkout af3864d
docker-compose build
docker-compose run web bundle
docker-compose run web yarn
docker-compose run web spinach features/signup_and_edit_profile.feature

(Spinach is similar to Cucumber, and spins up its own copy of the app inside a Puma to run the tests against)

It doesn't fail 100% of the time and it seems to only do it when a bunch of tests are run together - they usually pass when run individually. When they fail, screenshots are output into tmp/capybara.

Sorry this is hardly an easy set of reproduction steps and it sounds like I'm expecting you to understand my app. If I understood the issue better than I do I'd file a better bug report than this!

Expected behavior

The tests pass, as they do when I downgrade Puma to 5.5.2.

Desktop (please complete the following information):

  • OS: Linux (Arch, latest)
  • Puma Version 5.6.0
@ignatiusreza
Copy link

We experienced similar issue.. upgrading to puma 5.6.0 breaks a lot of our cypress e2e tests reliably because the user session used by the test runner gets invalidated..

@oliverguenther
Copy link

oliverguenther commented Jan 26, 2022

This might be related even though we're using puma as a Capybara server and rack_session_access (which provides access to a cookie setter for testing). An isolated PR (opf/openproject#10098) bumping just Puma is reporting deadlocks in Rack::Lock:

2022-01-26T10:04:39.7910994Z       Failure/Error: @mutex.lock
2022-01-26T10:04:39.7911085Z 
2022-01-26T10:04:39.7911217Z       ThreadError:
2022-01-26T10:04:39.7911385Z         deadlock; recursive locking
2022-01-26T10:04:39.7911650Z       # /usr/local/bundle/gems/rack-2.2.3/lib/rack/lock.rb:14:in `lock'
2022-01-26T10:04:39.7911938Z       # /usr/local/bundle/gems/rack-2.2.3/lib/rack/lock.rb:14:in `call'
2022-01-26T10:04:39.7912230Z       # /usr/local/bundle/gems/rack-2.2.3/lib/rack/sendfile.rb:110:in `call'
2022-01-26T10:04:39.7912561Z       # /usr/local/bundle/gems/secure_headers-6.3.3/lib/secure_headers/middleware.rb:11:in `call'
2022-01-26T10:04:39.7912873Z       # /usr/local/bundle/gems/railties-6.1.4.4/lib/rails/engine.rb:539:in `call'
2022-01-26T10:04:39.7913188Z       # /usr/local/bundle/gems/rack-2.2.3/lib/rack/urlmap.rb:74:in `block in call'
2022-01-26T10:04:39.7913485Z       # /usr/local/bundle/gems/rack-2.2.3/lib/rack/urlmap.rb:58:in `each'
2022-01-26T10:04:39.7913775Z       # /usr/local/bundle/gems/rack-2.2.3/lib/rack/urlmap.rb:58:in `call'
2022-01-26T10:04:39.7914098Z       # /usr/local/bundle/gems/capybara-3.36.0/lib/capybara/server/middleware.rb:60:in `call'
2022-01-26T10:04:39.7914424Z       # /usr/local/bundle/gems/puma-5.6.0/lib/puma/configuration.rb:252:in `call'
2022-01-26T10:04:39.7914748Z       # /usr/local/bundle/gems/puma-5.6.0/lib/puma/request.rb:72:in `block in handle_request'
2022-01-26T10:04:39.7915082Z       # /usr/local/bundle/gems/puma-5.6.0/lib/puma/thread_pool.rb:340:in `with_force_shutdown'
2022-01-26T10:04:39.7915394Z       # /usr/local/bundle/gems/puma-5.6.0/lib/puma/request.rb:71:in `handle_request'
2022-01-26T10:04:39.7915713Z       # /usr/local/bundle/gems/puma-5.6.0/lib/puma/server.rb:441:in `process_client'
2022-01-26T10:04:39.7916045Z       # /usr/local/bundle/gems/puma-5.6.0/lib/puma/thread_pool.rb:147:in `block in spawn_thread'
2022-01-26T10:04:39.7916284Z       # ------------------
2022-01-26T10:04:39.7916472Z       # --- Caused by: ---
2022-01-26T10:04:39.7916648Z       # Capybara::ExpectationNotMet:
2022-01-26T10:04:39.7918944Z       #   expected to find text "Update rack session" in "Puma caught this error: deadlock; recursive locking (ThreadError)\n/usr/local/bundle/gems/rack-2.2.3/lib/rack/lock.rb:14:in `lock'\n/usr/local/bundle/gems/rack-2.2.3/lib/rack/lock.rb:14:in `call'\n/usr/local/bundle/gems/actionpack-6.1.4.4/lib/action_dispatch/middleware/static.rb:24:in `call'\n/usr/local/bundle/gems/rack-2.2.3/lib/rack/sendfile.rb:110:in `call'\n/usr/local/bundle/gems/actionpack-6.1.4.4/lib/action_dispatch/middleware/host_authorization.rb:113:in `call'\n/usr/local/bundle/gems/secure_headers-6.3.3/lib/secure_headers/middleware.rb:11:in `call'\n/usr/local/bundle/gems/railties-6.1.4.4/lib/rails/engine.rb:539:in `call'\n/usr/local/bundle/gems/rack-2.2.3/lib/rack/urlmap.rb:74:in `block in call'\n/usr/local/bundle/gems/rack-2.2.3/lib/rack/urlmap.rb:58:in `each'\n/usr/local/bundle/gems/rack-2.2.3/lib/rack/urlmap.rb:58:in `call'\n/usr/local/bundle/gems/capybara-3.36.0/lib/capybara/server/middleware.rb:60:in `call'\n/usr/local/bundle/gems/puma-5.6.0/lib/puma/configuration.rb:252:in `call'\n/usr/local/bundle/gems/puma-5.6.0/lib/puma/request.rb:72:in `block in handle_request'\n/usr/local/bundle/gems/puma-5.6.0/lib/puma/thread_pool.rb:340:in `with_force_shutdown'\n/usr/local/bundle/gems/puma-5.6.0/lib/puma/request.rb:71:in `handle_request'\n/usr/local/bundle/gems/puma-5.6.0/lib/puma/server.rb:441:in `process_client'\n/usr/local/bundle/gems/puma-5.6.0/lib/puma/thread_pool.rb:147:in `block in spawn_thread'"

@rainerborene
Copy link

rainerborene commented Jan 26, 2022

Similar issue here. But not related with cookies at all. We are getting different accounts every five requests or so, specifically at this point of the code:

def current_user
  Current.user ||= User.active.find_by api_key: params[:api_key]
end

@nateberkopec
Copy link
Member

Shot in the dark here but if you have a specific test that's failing (@pedantic-git @ignatiusreza) can you try the following two git checkouts?

gem 'puma', github: 'puma', ref: 'cd940b7fd9dda7288326e774133'

and

gem 'puma', github: 'puma', ref: 'f8acac1f0702fea1a4f88d68a40bb2f53650b14c'

@nateberkopec
Copy link
Member

If you've got the time, bisecting your failing test for the commits between 5.5.2 and 5.6.0 would help immensely.

@pedantic-git
Copy link
Author

Very happy to! One moment please.

@pedantic-git
Copy link
Author

@nateberkopec

cd940b7: Passes every time
f8acac1: Fails some of the time

So it looks like it's between those two commits. I couldn't instantly find them in commit log but let me know if there are more intermediate commits you'd like me to try.

@pedantic-git
Copy link
Author

Ah - they became links! How nice! I can see that narrows it down to one commit already.

@nateberkopec
Copy link
Member

Yeah, there's no commits between those two so that means it's probably f8acac1f0702fea1a4f88d68a40bb2f53650b14c. Next step would be for someone to make a PR with a roll-back of that commit to master, try that branch, and then see if your tests pass every time.

@nateberkopec
Copy link
Member

I'm out for the next 6 hours or so, @MSP-Greg or @dentarg might wake up soon - if either of them get to this before me, if we have a clear commit where reverting it fixes the issue, let's just do that and I can release 5.6.1 later today.

@MSP-Greg
Copy link
Member

Good morning. I'm up and have been looking at f8acac1 / #2731. Need.More.Coffee...

@oliverguenther
Copy link

oliverguenther commented Jan 26, 2022

Yeah, there's no commits between those two so that means it's probably f8acac1f0702fea1a4f88d68a40bb2f53650b14c. Next step would be for someone to make a PR with a roll-back of that commit to master, try that branch, and then see if your tests pass every time.

I rolled back f8acac1 from 61ebbbe (5.6.0 commit) locally and this reliably fixes our capybara/puma deadlocks.

@dentarg
Copy link
Member

dentarg commented Jan 26, 2022

@MSP-Greg @nateberkopec Let's do the revert and a new release and then look at what's wrong, I opened #2809 reverting the offending commit

@ignatiusreza
Copy link

I can confirm that running the same cypress spec locally failed 2 out of 3 times with the 5.6.0 release, and are passing 3 out of 3 with the reverted commit 👍

@baelter
Copy link
Contributor

baelter commented Jan 26, 2022

@pedantic-git Trying to run your specs but getting

➜  name.pn git:(af3864d) docker-compose run web spinach features/signup_and_edit_profile.feature
Creating namepn_web_run ... done
#<Thread:0x000055a9ac11eee0 /usr/src/app/vendor/bundle/gems/semantic_logger-4.9.0/lib/semantic_logger/appender/async.rb:77 run> terminated with exception (report_on_exception is true):
/usr/src/app/vendor/bundle/gems/semantic_logger-4.9.0/lib/semantic_logger/appender/file.rb:172:in `initialize': Permission denied @ rb_sysopen - /usr/src/app/log/test.log (Errno::EACCES)

@pedantic-git
Copy link
Author

Oh yeah I always forget that not everyone's user id is 1001! You can either set the UID variable when you run docker-compose build or just chmod a+w that log directory so any user can write into it.

@baelter
Copy link
Contributor

baelter commented Jan 26, 2022

@pedantic-git
Copy link
Author

Oops! You'll need to run docker-compose run web rails db:setup to create the database. I missed that step when trying it at home because it found one in a shared volume.

nateberkopec pushed a commit that referenced this issue Jan 26, 2022
@nateberkopec
Copy link
Member

5.6.1 is out with a rollback of that commit. I'll leave this open until we figure out what exactly went wrong, the impact, and add a test.

@nateberkopec
Copy link
Member

Huge thanks to @ignatiusreza and @pedantic-git for the test failures here. I had a good guess as to where the problem might be but your test fails made this a much faster fix.

@nateberkopec
Copy link
Member

This issue was fixed with 5.6.2/4.3.10.

JuanitoFatas pushed a commit to JuanitoFatas/puma that referenced this issue Sep 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants