
Possible issue with Passenger holding on to a connection indefinitely with no outside HTTP connection holding it #2460

Open
davidhatten opened this issue Dec 13, 2022 · 1 comment


Please follow the steps below if you have found an issue with Passenger, or would like to request a feature.

Post to Stack Overflow instead if you are not sure, or if you have a question about your configuration.


Issue report

Fill in as much as possible so that we can understand, find and fix the problem.

Are you sure this is a bug in Passenger?
Please read our troubleshooting and debugging guides first:
https://www.phusionpassenger.com/library/admin/troubleshooting/
https://www.phusionpassenger.com/library/admin/

Please try with the newest version of Passenger to avoid issues that have already been fixed

Question 1: What is the problem?

  • What is the expected behavior?
    -- active_client connections should not persist indefinitely
  • What is the actual behavior?
    -- With low memory: Linux kills Passenger processes, sometimes pointing to utils.rb#110.
    -- With high memory: active_client connections persist for over a day. Processes hit the maximum memory they can. Pools hit max saturation. Passenger begins queueing requests internally and the app host becomes unresponsive to HTTP traffic.
  • How can we reproduce it? Please try to provide a sample application (or Virtual Machine) demonstrating the issue. Otherwise, if we can't reproduce it, we might have to ask you a number of followup questions or run certain commands to try and figure out the problem.
    -- I can't consistently reproduce it yet; it only happens in prod, and with low memory I haven't been able to catch any of the processes that end up being killed by the OS. I do have show=requests output from a time when the error was in full swing. I believe the entry below is the offending connection, though again, I don't have anything beyond that to support a better story:
"active_client_count" : 393,
		"active_clients" :
		{
			"1-114832" :
			{
				"connected_at" :
				{
					"local" : "Sun Dec  4 06:49:32 2022",
					"relative" : "41m 19s ago",
					"relative_timestamp" : -2479.3378040790558,
					"timestamp" : 1670154572.1215811
				},
				"connection_state" : "ACTIVE",
				"current_request" :
				{
					"app_response_http_state" : "PARSING_HEADERS",
					"app_sink_state" :
					{
						"callback_in_progress" : false,
						"initialized" : true,
						"io_watcher_active" : false
					},
					"app_source_state" :
					{
						"callback_in_progress" : false,
						"initialized" : true,
						"io_watcher_active" : true
					},
					"content_length" : 3191,
					"flags" :
					{
						"dechunk_response" : true,
						"https" : false,
						"request_body_buffering" : false
					},
					"host" : "127.0.0.1:8000",
					"http_major" : 1,
					"http_minor" : 1,
					"http_state" : "PARSING_BODY",
					"last_data_receive_time" :
					{
						"local" : "Sun Dec  4 06:51:32 2022",
						"relative" : "39m 19s ago",
						"relative_timestamp" : -2359.2372481822968,
						"timestamp" : 1670154692.22212
					},
					"last_data_send_time" : null,
					"method" : "POST",
					"next_request_early_read_error" : "The client connection is closed before the request is done processing (errno=-1020)",
					"path" : [REDACTED],
					"refcount" : 1,
					"request_body_already_read" : 3191,
					"request_body_fully_read" : true,
					"request_body_type" : "CONTENT_LENGTH",
					"response_begun" : false,
					"session" :
					{
						"gupid" : "1a8b925-HItyJLRJxd",
						"pid" : 3516
					},
					"session_checkout_try" : 1,
					"started_at" :
					{
						"local" : "Sun Dec  4 06:49:32 2022",
						"relative" : "41m 19s ago",
						"relative_timestamp" : -2479.3378040790558,
						"timestamp" : 1670154572.121572
					},
					"state" : "WAITING_FOR_APP_OUTPUT",
					"sticky_session" : false,
					"want_keep_alive" : false
				},
				"lingering_request_count" : 0,
				"name" : "1-114832",
				"number" : 114832,
				"output_channel_state" :
				{
					"bytes_buffered" :
					{
						"bytes" : 0,
						"human_readable" : "0 bytes"
					},
					"callback_in_progress" : false,
					"mode" : "IN_MEMORY_MODE",
					"nbuffers" : 0,
					"reader_state" : "RS_INACTIVE"
				},
				"refcount" : 2,
				"requests_begun" : 1
			},

Question 2: Passenger version and integration mode:
Your answer:

  • 6.0.15 standalone open source

Question 3: OS or Linux distro, platform (including version):

  • For example: Debian 8, x86_64 or OS X 10.10 Yosemite, x86_64

Your answer:

  • Ubuntu 20 LTS

Question 4: Passenger installation method:

Your answer:
[x] RubyGems + Gemfile
[ ] RubyGems, no Gemfile
[ ] Phusion APT repo
[ ] Phusion YUM repo
[ ] OS X Homebrew
[ ] source tarball
[ ] Other, please specify:

Question 5: Your app's programming language (including any version managers) and framework (including versions):

  • For example: Ruby 2.3.0, RVM, Rails 5.0.0; Node.js 4.0.0 with Express 4.13.4

Your answer:

  • Rails 6.1.6.1
  • Ruby 3.0.4p208
  • No UI

Question 6: Are you using a PaaS and/or containerization? If so which one?

  • For example: Heroku, Amazon Container Services, Docker 1.9 with an image based on passenger-docker

Your answer:
The app is hosted in a Docker container.

Question 7: Anything else about your setup that we should know?

Your answer:
Currently the application is in the "low" memory state, causing OOM issues.

The reason I think Passenger has a lot to do with it is the utils.rb#110 reference: it points to a line in Passenger that does some thread management for exception handling.

Boosting the app to have more memory induced the holding behavior. Here are graphs of the behavior:
Queue depth and pool usage: [graph attached in the original issue]

Memory: [graph attached in the original issue]

I will continue to make changes on my end, and if I discover what the app is doing that causes Passenger to do this, I'll be sure to let you know.

Thanks!

We strive for quality and appreciate you taking the time to submit a report! Please note that if you want guaranteed response times and priority issue support we encourage you to join our enterprise customer base. They also provide us with the means to continue our high level of open source support!


davidhatten commented Dec 20, 2022

OK, this can be closed or merged or whatever works on your side to manage it. I have discovered the overall shape of the issue, and while Passenger played a role, I'm not sure there is actually a bug in Passenger.

Situation: the controller was spinning on Redis key creation for a lock (sleep 0.05 until Redis.setnx succeeded). One issue was that the key creation and the TTL setting were on separate lines, which could cause a deadlock. However, this deadlock occurred even when there was no problem setting the TTL. Eventually the sleeping Passenger thread would run out of memory and the OS in the container would kill it. The TTL on the key was 30 seconds, so the web app thread could be waiting for up to 30s in 0.05s increments.
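For context, here is a minimal sketch of that kind of pattern, assuming the redis-rb gem; the key name, TTL, and retry limit are illustrative, not the actual app code:

require "redis"

redis = Redis.new
lock_key = "some_lock"  # illustrative name
lock_ttl = 30           # seconds, per the description above

# Problematic shape: the key creation and the TTL are two separate commands,
# and the request thread spins until the key can be created.
sleep 0.05 until redis.setnx(lock_key, "1")
redis.expire(lock_key, lock_ttl)  # if this never runs, the lock never expires

# Safer shape: set the key and its TTL atomically and bound the retries,
# so a request can't spin on the lock indefinitely.
acquired = false
100.times do
  acquired = redis.set(lock_key, "1", nx: true, ex: lock_ttl)
  break if acquired
  sleep 0.05
end
raise "could not acquire lock" unless acquired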

It is worth noting that this issue became visible with Rails 6 + Passenger 6. If it existed before, it wasn't attracting OS attention and the processes weren't getting killed for OOM issues.

The underlying design we were using has been rectified and this issue cannot happen anymore.

Add this to the pile of "Passenger doesn't like sleep-ing web app processes", which is fair, because sleep-ing in a web app process is pretty galaxy-brain anyway.

Thank you!
