Sidekiq UI responds with internal server error when session cookie is invalid #4671
Comments
Hi there, this issue is still happening in a very weird and inconsistent way. I experienced it when trying to retry some dead jobs, but couldn't debug it because it was in production and because some attempts were successful, not sure why. Maybe it has something to do with the puma worker that is answering the request, not sure... really weird. The gem versions that I'm using are:
I will try to reproduce it in development and see if I can find more ideas of what is going on. Any idea or suggestion is welcome. |
You’ll need to give us an app that reproduces the problem. |
Sure thing! I will post here with the details when I find the cases |
I too am still seeing this issue:
|
Thanks for the fix @szechyjs ! I still wonder how/what influences the CSRF token refresh on the web interface of Sidekiq? It seems like I now get Any tips on how to investigate this? or how to make sure I get the CSRF token "loaded" correctly? EDIT: @cavi21's bewilderment matches mine as well :)
EDIT2: very strange, but I'm getting 403s nearly always on our staging environment, but some times it works. On my dev environment I cannot reproduce it at all, and never get a 403 :-/ |
@cavi21 I think your intuition was right. I debugged a bit on our staging environment, and I think it's somehow related to the puma processes. If I run only one process, it's working fine, but the more puma processes we have it seems the likelihood of failing the CSRF check increases... Perhaps the rack session isn't shared across all puma processes? We're not using cookies for the sessions, but rather storing them in redis ... I imagine it works for most people who rely on cookies for storing the sessions? but then fails in our case... |
Hi @gingerlime, we also have the session on redis, so maybe there is something around that configuration. I narrowed it down, as you did, to the way puma workers handle the request, but did not have the time to keep digging. If you find something or want me to try anything on my config just let me know! |
An update... did a bit more digging, and found a workaround, although I'm still not sure if it's the best one, nor why precisely it's not working. TL;DR: using redis-rack seems to work around the issue here, even with multiple puma workers. Only tested in development so far though (but with multiple puma processes). What I found so far:
Given that the session data is actually stored in a cookie anyway, I'm really confused how it relates to different puma workers, but somehow it seems to... no clue why. Hope someone else can shed some light or has some ideas what specifically to test? |
Another solution/workaround seems to be setting the sidekiq web session secret key ... no clue why it works without it in some cases or with a single puma worker process, but fails otherwise though. |
Ok, I think I'm getting to the bottom of this, but still have a few questions :) It looks like setting the secret key will prevent it from being randomly set here https://github.com/mperham/sidekiq/blob/3b5ae30c4e5e9e760268243ab5c14664a2f8d236/lib/sidekiq/web.rb#L165-L169 ... This explains why:
I'm not familiar with the sidekiq codebase though, so wasn't sure why I found a rather old note in the changelog about it. So I guess another solution is to manually add this middleware? even though mine is a rails app... Should sidekiq warn about this? setting a random secret is a nice failsafe fallback, and works in some cases, but obviously not in a multi-process environment... |
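A minimal sketch of the failure mode described above, assuming a signed cookie of the form payload--digest and one random fallback secret per worker process. The helper names sign and verify are hypothetical and only illustrate the idea; this is not Sidekiq's or Rack's actual code.

```ruby
require "openssl"
require "securerandom"

# Sign a payload in the "payload--hexdigest" shape used by signed cookies.
def sign(payload, secret)
  digest = OpenSSL::HMAC.hexdigest("SHA256", secret, payload)
  "#{payload}--#{digest}"
end

# Return the payload when the digest matches the secret, otherwise nil.
# (Plain == is used for brevity; real code should compare in constant time.)
def verify(cookie, secret)
  payload, digest = cookie.split("--", 2)
  return nil if payload.nil? || digest.nil?

  expected = OpenSSL::HMAC.hexdigest("SHA256", secret, payload)
  expected == digest ? payload : nil
end

# Two simulated puma workers, each falling back to its own random secret.
worker_a_secret = SecureRandom.hex(32)
worker_b_secret = SecureRandom.hex(32)

cookie = sign("csrf=abc123", worker_a_secret)

same_worker  = verify(cookie, worker_a_secret) # payload is recovered
other_worker = verify(cookie, worker_b_secret) # nil, so the session looks empty

puts "same worker:  #{same_worker.inspect}"
puts "other worker: #{other_worker.inspect}"
```

With a single worker every request hits the same secret, which would explain why the problem only shows up as more processes are added.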
Here's a quick summary of my knowledge. Sidekiq needs a valid, working Rack session to implement CSRF protection.
There's no way for the Sidekiq UI to know it is running in a multi-process environment. This is why I'm pretty silent on session issues. The situation is quite complex, there's no good solution, every app has a different mix of gems, session configuration and Rack middleware. The possible setups are so varied that I can't debug and diagnose everyone's app. So I stay silent and let people debug their own issue. |
If you have a Rails app, make sure you are mounting the web UI "inside" the Rails routes, like shown here, so it can reuse the Rails application session: |
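For reference, mounting the Web UI inside the Rails routes usually looks like the fragment below. The /sidekiq path is an arbitrary choice for this sketch, and the file only makes sense inside a Rails app that has the sidekiq gem installed.

```ruby
# config/routes.rb
require "sidekiq/web"

Rails.application.routes.draw do
  # Mounted here, the Web UI sits behind the Rails middleware stack,
  # including whatever session middleware the application configures.
  mount Sidekiq::Web => "/sidekiq"
end
```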
Thanks for taking time to explain things with so much detail. I really appreciate it. I think I understand the complexity here better now, and your approach makes sense. First off, yes, we mount Sidekiq web in our Rails routes. However, we don't use the Rails' CookieStore, which I imagine is the default one. Neither are we using Devise, which might also load the middleware, although I couldn't find anything specific yet. So I imagine for most people who are on the Rails "happy path", this isn't going to be an issue out of the box. I still wonder if there's a way to avoid these awkward scenarios in some way? to help some of those odd setups. A few thoughts:
def deny(env)
  # the digest allows us to "identify" the secret without exposing it
  # so if there are multiple random secrets, the forbidden message + log would make it clear
  digest = OpenSSL::Digest.new("sha1").hexdigest(env["rack.session.options"][:secret])[0...8] rescue nil
  logger(env).warn "attack prevented by #{self.class} #{digest}"
  [403, {"Content-Type" => "text/plain"}, ["Forbidden #{digest}"]]
end
|
Remove all of the hacks and support infrastructure around Rack sessions. Rails provides this by default so we don't need it for 90% of users. The other 10% should know and provide a Rack session. This is a big change and has the potential to break many installs. It will be part of the 7.0 major version bump and require a lengthy beta period to ensure we document as many edge cases and solutions as possible. See also #4671, #4728 and many others.
My biggest issue in fixing this problem is the large number of edge cases. I'd like to tear apart the current build_session method and make it as simple as possible but I don't have 50-100 different apps to test different scenarios to add warnings and log output (as you show) to handle the major edge cases. I'm thinking this will be the big refactoring for Sidekiq 7.0 and we'll have a public beta period where people will need to open issues to get the Web UI configured properly. If you want to try this today, you can read through #4791, run the new
|
I'd love to help testing the better_sessions branch. I had a very quick glimpse and it seems like a good direction to go to. I had a couple of other ideas/questions though, just want to throw it out for now:
|
Hi @gingerlime, sorry for the late reply! Happy to know that you figured out what the issue was and that Mike shared his thoughts about it! Also happy to help testing on the About your question:
we're using authlogic for authentication. |
I like the direction you're thinking. Today Sidekiq doesn't control the session. The secret key is part of the session config.
These are really interesting ideas. I could roll my own CSRF handling using an encrypted cookie directly, independent of any Rack session. We'd still need to deprecate all of the Sidekiq::Web session stuff but the user would not need to supply any session. I like this. |
Thanks @mperham!
Sorry, I guess I wasn't clear. Still thinking about the existing codebase. Currently if Sidekiq web doesn't have a secret available / the middleware isn't used, then it picks a random secret and sets it. With multi-process / multi-node environment, this means that each process will have its own secret, and then we get the problem that manifested here. If instead, the secret is randomly picked, but then stored in redis, then at least all processes / nodes will share the same secret. However, this is still a patch... And I like the direction you're going with simplifying the whole CSRF / session management, and making Sidekiq::Web potentially much more self-contained... So I think it's ok to leave this out for now. Just wanted to clarify.
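A rough sketch of the "pick randomly, then store in redis" idea, assuming a store object that behaves like the redis-rb client for set(key, value, nx: true) and get(key). The key name, the helper shared_session_secret and the FakeStore stand-in are all hypothetical; nothing like this exists in Sidekiq.

```ruby
require "securerandom"

# Hypothetical key name, chosen for this sketch only.
SECRET_KEY = "sidekiq-web:session-secret"

# Every process proposes a random secret, but NX only writes when the key
# is absent, so the first writer wins and all processes read the same value.
def shared_session_secret(store)
  candidate = SecureRandom.hex(64)
  store.set(SECRET_KEY, candidate, nx: true)
  store.get(SECRET_KEY)
end

# In-memory stand-in for Redis so the sketch runs without a server.
class FakeStore
  def initialize
    @data = {}
  end

  def set(key, value, nx: false)
    return false if nx && @data.key?(key)

    @data[key] = value
    true
  end

  def get(key)
    @data[key]
  end
end

store = FakeStore.new
first_process  = shared_session_secret(store)
second_process = shared_session_secret(store)
puts first_process == second_process
```

As noted above this would still be a patch: the secret lives as long as the Redis key does, and rotating it would need its own mechanism.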
I think what I had in mind is more or less the OWASP's double-submit cookie. You might not even strictly need an encrypted cookie. You can use a signed one, or even a plaintext one is secure enough (with the caveat of it potentially breaking if you deal with sub-domains which users control). There are other CSRF protection alternatives using headers etc. Somehow this feels the closest to the current implementation, yet I think you can make it self-contained and not dependent on the parent app. I'd be happy to help in the process if you're interested. Perhaps best to continue this discussion on #4791 ? |
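To make the double-submit idea concrete, here is a minimal sketch assuming a plaintext token that is sent both as a cookie and as a hidden form field. The helper names are hypothetical and this is not how Sidekiq implements CSRF protection; it only shows the comparison at the heart of the pattern.

```ruby
require "securerandom"

# Constant-time comparison so the position of the first mismatch
# is not leaked through timing.
def secure_compare(a, b)
  return false unless a.bytesize == b.bytesize

  diff = 0
  a.bytes.zip(b.bytes) { |x, y| diff |= x ^ y }
  diff.zero?
end

# Issue a fresh token; the caller sets it as a cookie and embeds the
# same value in every form it renders.
def issue_csrf_token
  SecureRandom.urlsafe_base64(32)
end

# A state-changing request passes only when both copies are present and equal.
def valid_double_submit?(cookie_token, form_token)
  return false if cookie_token.nil? || form_token.nil?
  return false if cookie_token.empty?

  secure_compare(cookie_token, form_token)
end

token = issue_csrf_token
puts valid_double_submit?(token, token.dup)       # matching copies
puts valid_double_submit?(token, issue_csrf_token) # forged form value
```

No server-side session or shared secret is involved in this variant, which is why it would not depend on the parent app or on which worker answers the request.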
Hi all, I have a new PR ready for testing which aims to improve the Web UI experience. Check out #4804 and try it out with: gem 'sidekiq', github: 'mperham/sidekiq', branch: 'better_sessions' I'm trying to solve all of the hacks people are using and forbidden errors you're seeing. |
* Simplify Web UI sessions Remove all of the hacks and support infrastructure around Rack sessions. Rails provides this by default so we don't need it for 95% of users. The other 5% need to provide a Rack session. This is a big change and has the potential to break installs so it deserves at least a minor version bump. See also #4671, #4728 and many others.
Ruby version: ruby 2.7.0p0
Sidekiq / Pro version(s): 6.1.1, 5.1.1
Rails version: 6.0.3.2
We are using Rails 6.0.3.2 with Sidekiq Pro. When the rack.session cookie is corrupted and the digest verification in Rack::Session::Cookie fails, it returns an empty hash as a session data. This leads to sidekiq web trying to get the :csrf key in the empty hash and getting nil, which is then passed to Base64.strict_encode64 sending the unpack1 message to it.
Steps to reproduce:
rack.session cookie digest (after -- delimiter)
POST request with form-data
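A small sketch of the kind of guard that would avoid the error, assuming the token is read from the session under :csrf and Base64-decoded before comparison. The helper session_csrf_token is hypothetical and is not the fix that was applied to Sidekiq.

```ruby
require "base64"

# Return the decoded token, or nil when the session has none or the
# stored value is not valid Base64. The caller can then treat nil as a
# failed CSRF check instead of raising.
def session_csrf_token(session)
  raw = session[:csrf]
  return nil if raw.nil? # empty session, e.g. after a failed digest check

  Base64.strict_decode64(raw)
rescue ArgumentError
  nil # value present but not valid strict Base64
end

puts session_csrf_token({}).inspect
puts session_csrf_token({csrf: Base64.strict_encode64("abc")}).inspect
```

Returning nil lets the request be rejected as forbidden rather than surfacing as an internal server error.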