Requests fail during phased restarts, hot restarts, and shutdown #2337

Closed
cjlarose opened this issue Aug 13, 2020 · 18 comments

@cjlarose
Member

Describe the bug

During a phased restart, some clients do not get successful responses from the server.

Requests fail in a variety of ways. The ones I've observed and that I'm able to reproduce reliably are these:

  1. The server closes the connection without first writing a response onto the socket: curl: (52) Empty reply from server
  2. curl: (56) Recv failure: Connection reset by peer
  3. The exception Unable to add work while shutting down is raised, invoking the lowlevel_error_handler (by default, this returns a response with response code 500)

Puma config:

port 3000
workers 2

Launch command:

PUMA_MASTER_PIDFILE=/usr/src/app/puma.pid
bundle exec puma --pidfile "${PUMA_MASTER_PIDFILE}"
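
For context, a phased restart is requested by signaling the master process; a minimal Ruby sketch (the pidfile path is taken from the config above, everything else is illustrative):

# Sketch only: ask the Puma master recorded in the pidfile for a phased restart.
pid = Integer(File.read("/usr/src/app/puma.pid").strip)
Process.kill("SIGUSR1", pid) # SIGUSR1 triggers a phased restart in cluster mode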

To Reproduce

I've created a repo that reliably reproduces the errors: https://github.com/cjlarose/puma-phased-restart-errors

Expected behavior

All requests are passed to the application and the application's responses are all delivered to the requesting clients. The restart documentation for puma suggests that phased restarts are meant to have zero downtime.

Desktop (please complete the following information):

  • OS: Linux
  • Puma Version: 4.3.5
@MSP-Greg
Member

@cjlarose

Thanks for testing Puma.

During a phased restart, some clients do not get successful responses from the server.

Phased restarts are not meant to be done repeatedly in a short timespan. There are one or more time settings that affect how long it takes to perform a phased restart.

Do you have unsuccessful responses when doing a single phased restart?

@cjlarose
Member Author

For the repo that reproduces the problem, I just did a bunch of phased restarts repeatedly to increase the likelihood of failure; I understand these are not normal operating conditions.

But yes, I originally became aware of this problem because it happens in two different applications of mine where we perform phased restarts once every hour or so. Specifically, we encounter the Unable to add work while shutting down error in production. From a brief reading of the code, it seems like that error can happen when a request finishes buffering in the Reactor after the corresponding thread pool has already started its shutdown procedure.
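
For readers unfamiliar with that code path, here is a simplified sketch of the guard that raises this error (not Puma's actual source; the class name and structure are illustrative):

# Simplified illustration: once the pool has entered shutdown, any late work
# handed off (for example by the Reactor) raises the error quoted above.
class ThreadPoolSketch
  def initialize
    @mutex    = Mutex.new
    @todo     = []
    @shutdown = false
  end

  def <<(work)
    @mutex.synchronize do
      raise "Unable to add work while shutting down" if @shutdown
      @todo << work
    end
  end

  def shutdown!
    @mutex.synchronize { @shutdown = true }
  end
end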

@MSP-Greg
Member

we perform phased restarts once every hour

Ok, that's reasonable. Using MRI or JRuby?

@cjlarose
Member Author

MRI 2.6.1 and 2.6.5

@nateberkopec
Member

Scenario 3 definitely seems like something we could easily prevent.

@cjlarose
Member Author

A colleague and I are looking into fixing the Unable to add work while shutting down problem (Scenario 3). We've got a plan we're working on and hope to open a PR at some point.

@cjlarose
Member Author

cjlarose commented Sep 16, 2020

Just wanted to post an update on this: my company is spending resources on this since it's important to us that this is fixed. We have tried several approaches that have failed in various ways, but we think we're close to getting a solution that we're happy with. We'll open a PR soon with a new integration test and a fix for all 3 of the problems in the issue description.

Also, I should mention that the root issue is a little broader than the title of this issue implies. We've demonstrated that requests can fail during normal "hot" restarts on a non-cluster puma deployment (single server process), as long as you're using queue_requests (which is on by default); it's not really related to phased restarts directly.

@nateberkopec
Member

Thanks so much @cjlarose! I appreciate Appfolio's continued investment in the ecosystem!

@volodymyr-mykhailyk

volodymyr-mykhailyk commented Sep 17, 2020

I have been investigating this problem for the last few days. In our case, it is issue #2343, but I'll share my findings here since they may help your work.

I am seeing issues very similar to the ones mentioned in the first comment. From the Nginx side, they appear as:

  1. connect() failed (111: Connection refused) while connecting to upstream
  2. upstream timed out (110: Operation timed out) while connecting to upstream - happens when the socket has too many queued requests and puma does not pick them up within the proxy_connect timeout
  3. failed (104: Connection reset by peer) while reading response header from upstream - happens when the socket has too many queued requests and puma shuts down
  4. upstream prematurely closed connection while reading response header from upstream - happens when drain_on_shutdown is disabled and queue_requests is enabled; puma reads the request but never sends a response during shutdown
  5. Unable to add work while shutting down - a concurrency issue inside puma

I can easily reproduce these errors with 1-second requests overwhelming multiple puma pods running in non-cluster mode while one of the pods goes through a normal k8s restart.

I can confirm that Unable to add work while shutting down is resolved by disabling queue_requests. Looking through the code, there appears to be a race in which queued requests are pushed to the ThreadPool after it has entered shutdown mode.

I also experimented with drain_on_shutdown. This setting makes puma not process anything in the socket. Have you tried disabling it to see if restart issues are still reproducible?
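
For anyone who wants to experiment with those two settings, a minimal config/puma.rb sketch (assuming both are set through the standard DSL, as this thread implies; whether disabling them is appropriate depends on your deployment):

# config/puma.rb - experimental settings discussed above; values are examples only.
queue_requests false     # hand connections straight to worker threads, bypassing the Reactor
drain_on_shutdown false  # stop accepting connections already queued on the socket once shutdown starts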

@cjlarose
Member Author

@volodymyr-mykhailyk Thanks for this info! This will help in our efforts to fix the problem.

For some of the errors you describe, I can say confidently that we're able to reproduce them (Unable to add work while shutting down and Connection reset by peer especially), but some of them are a bit unexpected, especially Connection refused while connecting to upstream. Could you describe your setup with a little more detail? How many pods are you running in your tests and how are you performing a "normal k8s restart"? Are you, for example, just using a rolling restart for the service with kubectl rollout restart deployment <name>?

I also experimented with drain_on_shutdown. This setting makes puma not process anything in the socket. Have you tried disabling it to see if restart issues are still reproducible?

I haven't experimented much with this setting. My understanding is that it's off by default, but, if enabled, it'll make puma process all connections on its socket before shutting down. If you're running puma in k8s, it seems like the right thing to do in order to make sure that puma processes connections that were sent to the pod before the pod dies.

My experiments have mostly been of two types:

  1. Cluster mode server, sending a bunch of phased restarts, while sending requests
  2. Single mode server, sending a bunch of SIGUSR1 (hot restart requests), while sending requests

In both cases, drain_on_shutdown isn't particularly relevant to the problems we're seeing. In cluster mode, drain_on_shutdown isn't desirable since we want workers to die even if there are still connections queued on the socket (another worker will pick the work up). In single mode during a hot restart, we similarly want the new server (after restart) to handle anything queued on the socket.
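
For reference, the "sending requests" side of those experiments can be as simple as a loop like the sketch below (illustrative only; the URL, iteration count, and error handling are assumptions, not the harness from the linked repo):

# Rough client-side sketch: hammer the server while restarts are issued and
# count the requests that fail in any of the ways described in this issue.
require "net/http"

uri = URI("http://localhost:3000/")
failures = 0

500.times do
  begin
    res = Net::HTTP.get_response(uri)
    failures += 1 unless res.is_a?(Net::HTTPSuccess)
  rescue Errno::ECONNRESET, Errno::ECONNREFUSED, EOFError => e
    failures += 1
    warn "request failed: #{e.class}"
  end
end

puts "failed requests: #{failures}"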

@cjlarose
Member Author

cjlarose commented Sep 17, 2020

Just to provide more insight into what we've discovered so far: one source of a lot of problems is how the Puma::Server shutdown sequence works. The Server instance sets @queue_requests to false before shutting down the Reactor (and before shutting down the ThreadPool):

puma/lib/puma/server.rb, lines 320 to 324 at 7d00e1d:

if queue_requests
  @queue_requests = false
  @reactor.clear!
  @reactor.shutdown
end

The intent here is that after the Reactor shuts down, any connections still being handled by the ThreadPool no longer have the option of sending the request back to the Reactor (the connection must either be closed, assuming we've already written a response to the socket, or the ThreadPool must wait until the client has finished writing its request, then write a response, then finally close the connection). This works fine most of the time.

The problem is that there are races in the implementation of the ThreadPool where it's possible for a thread to see @queue_requests == true before the Server shutdown has started, then later try to add the request to the Reactor, not knowing that it's already shutting down. One example is here:

puma/lib/puma/server.rb, lines 401 to 404 at 7d00e1d:

return unless @queue_requests
close_socket = false
client.set_timeout @persistent_timeout
@reactor.add client

This can cause connection reset errors or empty replies from the server.

We have a fix for this (just wrapping the critical sections in a Mutex acquisition), but we want to make sure we don't adversely affect puma's concurrency by introducing lock contention.
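
Roughly, the approach we're testing looks like the sketch below (illustrative only, not the actual patch; the QueueGate name and structure are invented for this example). The key point is that the shutdown transition and the Reactor hand-off hold the same lock, so a worker thread can no longer observe @queue_requests == true after shutdown has begun.

# Illustrative sketch, not the actual patch: serialize the shutdown transition
# and the Reactor hand-off behind one mutex.
class QueueGate
  def initialize(reactor)
    @reactor        = reactor
    @lock           = Mutex.new
    @queue_requests = true
  end

  # Called from the Server shutdown sequence.
  def stop_queueing
    @lock.synchronize do
      @queue_requests = false
      @reactor.clear!
      @reactor.shutdown
    end
  end

  # Called from a worker thread that wants to park a partially-read client.
  # Returns false if the Reactor is no longer accepting clients, so the caller
  # must finish the request (or close the socket) itself.
  def hand_off(client)
    @lock.synchronize do
      return false unless @queue_requests
      @reactor.add client
      true
    end
  end
end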

Even after fixing that issue, though, we still see some connection reset errors for requests that enter the ThreadPool after the shutdown sequence has been initiated: the ThreadPool writes the final response to the client socket, then closes the socket, but the client gets an ECONNRESET error. We're actively looking into this problem.

Lastly, I should mention that while the error Unable to add work while shutting down does happen in Puma 4.3.5, I haven't seen it happen on puma master at commit 0ab0eec. It's possible that problem was already fixed sometime after 4.3.5. Edit: #2122 fixed this.

@volodymyr-mykhailyk

@cjlarose Actually, drain_on_shutdown is enabled by default. While this is useful for processing all requests in the queue, it has bad consequences under heavy load: there can be so many queued requests that puma is unable to process all of them within the shutdown timeout.

Could you describe your setup with a little more detail? How many pods are you running in your tests and how are you performing a "normal k8s restart"? Are you, for example, just using a rolling restart for the service with kubectl rollout restart deployment <name>?

The setup is very straightforward: nginx-ingress in front of puma running in single mode with 4+ pods, using rolling restarts/deploys. The Connection refused while connecting to upstream error is most likely caused by a lack of synchronization between upstream updates in the ingress and the k8s pod list: the pod is terminated while Nginx still has it in its list of upstreams. So it is not related to the problem you are working on.

I'm looking forward to your updates and will be happy to test them out.

@wjordan
Contributor

wjordan commented Sep 23, 2020

@cjlarose I'm sorry I didn't come upon this discussion until now; it might have saved a bit of time.

@cjlarose
Member Author

@wjordan Thanks for posting these notes and thanks for your work on Puma!

The race condition described in #2337 (comment) is fixed in #2271; however, based on #2271 (comment), the Reactor refactoring in #2279 may be the preferred fix for this issue.

This issue has since been resolved by #2377

#2271 also contains an additional fix for another race-condition edge case that can occur on force shutdown; if you have a short force_shutdown_after configured, that fix might help with some remaining connection errors. In any case, it wouldn't hurt to try if you can reproduce the issue consistently.

This is good to know! I haven't done a lot of testing myself with the force_shutdown_after option. Of course, if force_shutdown_after is enabled, we expect some kinds of errors to be visible from the client's perspective in some cases. But I can experiment with a short force_shutdown_after and see what impact it has.
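
For anyone else reproducing this, a minimal config/puma.rb line for the option being discussed (the 2-second value is arbitrary, chosen only to make the force-shutdown path easy to exercise):

# config/puma.rb - abandon in-flight requests this many seconds after shutdown begins.
force_shutdown_after 2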

@cjlarose cjlarose changed the title Requests fail during phased restarts Requests fail during phased restarts, hot restarts, and shutdown Sep 23, 2020
@volodymyr-mykhailyk

Just tested puma 5.0 in our setup. I can confirm that the Unable to add work while shutting down error is not reproducible anymore. Overall, I see far fewer errors related to shutdown (unprocessed requests left in sockets).

Big thanks to all Puma contributors for your hard work.

@volodymyr-mykhailyk

volodymyr-mykhailyk commented Sep 25, 2020

It's worth mentioning one interesting finding I've seen while testing queue_requests in a setup with Nginx in front of multiple Puma instances.

The graphs below show response times when there are more requests than puma can handle (4 pods with 4 threads each, requests that take 1 second, 18 requests per second). The spikes at the beginning are the effect of a rolling deploy.
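(For context, that works out to roughly 4 pods × 4 threads ≈ 16 one-second requests completing per second, so 18 requests per second is a small but sustained overload.)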

Queue requests enabled: (graph: queue-true-drain-true)

Queue requests disabled: (graph: queue-false-drain-true)

I think disabling queue_requests is still the way to go if there is Nginx in front and you have longer requests. It produces more predictable results, prevents overload of any particular instance (taking on more requests than puma can handle within the graceful shutdown period), and allows the load to be balanced evenly across puma instances.

@cjlarose
Member Author

cjlarose commented Oct 7, 2020

Just a quick update: #2279 seems to have improved things generally with regard to connection handling during phased restarts, hot restarts, and shutdowns.

I opened #2417 to add an integration test that'll hopefully prevent regressions related to these kinds of failures.

I'm not ready to declare this issue closed quite yet (I still want to do a little more testing), but things are looking really good.

@cjlarose
Member Author

#2423 is merged. It adds integration tests to both single-mode and cluster-mode puma to help prevent the kinds of concurrency errors that were causing connections to be dropped during hot restarts, phased restarts, and shutdowns.

From my testing, I think #2279 fixed some of the last problems. That fix isn't in a release of puma yet, but I'd expect it to be soon.

I think we can close this issue for now. If another, more specific issue crops up, we can open a new issue.
