
sidekiqswarm: Add support for preloading Rails app #4646

Closed
donaldong opened this issue Jul 20, 2020 · 10 comments

@donaldong

Ruby version: 2.5.5
Sidekiq / Pro / Enterprise version(s): Sidekiq 5.1.3 / Pro 4.0.3 / Ent 1.7.2

Context

In my project, with a fresh irb process

require 'bundler/setup'
Bundler.require('default', 'development')

consumes 272 MB of memory.

Repro

Start a sidekiq swarm with 30 processes. Together the processes consume 10 GB of memory (> 30 × 272 MB).

Investigation

I looked into /proc/<pid>/smaps and aggregated the memory segments:

{"Shared_Clean:"=>15876, "Shared_Dirty:"=>42376, "Private_Clean:"=>0, "Private_Dirty:"=>279384, "Shared_Hugetlb:"=>0, "Private_Hugetlb:"=>0}

About 9 GB (30 × 272 MB) of the segments are Private_Dirty.

I patched sidekiqswarm so it calls boot_system before forking the child processes. With the patch:

{"Shared_Clean:"=>6312, "Shared_Dirty:"=>299552, "Private_Clean:"=>0, "Private_Dirty:"=>21836, "Shared_Hugetlb:"=>0, "Private_Hugetlb:"=>0}

The private memory segments take only ~0.62 GB, and overall memory usage dropped to ~1.5 GB.
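The preload-then-fork idea can be sketched as follows (a hypothetical illustration, not Sidekiq's actual internals; `swarm` and `APP` are made-up names). Expensive setup happens once in the parent; children share those pages copy-on-write:

```ruby
# Hypothetical preload-then-fork sketch. The expensive work (e.g. booting
# Rails) runs once in the parent; each forked child shares those memory
# pages copy-on-write and only dirties the pages it actually writes to.
def swarm(count, &work)
  pids = Array.new(count) { fork(&work) }
  pids.each { |pid| Process.wait(pid) }
end

# Stand-in for the preloaded app: allocated once, readable by all children.
APP = Array.new(1000) { |i| "record-#{i}" }

swarm(2) do
  # Children read APP without copying it; a write would trigger CoW.
  exit!(APP.size == 1000 ? 0 : 1)
end
```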

Thoughts on this approach?

@mperham
Collaborator

mperham commented Jul 20, 2020

I'm unclear what you are suggesting.

  1. Are you suggesting including the environment group in the Bundler.require?
  2. Are you suggesting booting Rails before forking?

1 shouldn't be too hard; you can already do it today with SIDEKIQ_PRELOAD=default,development. 2 is a huge change and fraught with peril; that peril is why every daemon except Sidekiq includes after_fork callbacks.

@donaldong
Author

donaldong commented Jul 20, 2020

Are you suggesting booting Rails before forking?

Yes, that's what I'm suggesting. In my setup at least, the gems loaded by Bundler.require don't end up in shared memory: each forked process winds up with its own copy of pretty much every gem.

I'm not sure how to evaluate the risks of this approach -- each process gets its own copy-on-write view of the parent's objects, so what might go wrong if Rails were booted before forking?

@mperham
Collaborator

mperham commented Jul 20, 2020

The issue is that any locks, threads, or sockets created in Rails initializers will be shared by all child processes. That's the purpose of after_fork blocks: to clean up or re-initialize anything each child needs. Unfortunately it's also a major source of support issues: people frequently don't read the docs and assume a Sidekiq bug is breaking their app. I could make it opt-in, but that's why I've never implemented it.
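The shared-state hazard can be shown with a short sketch (plain Ruby, not Sidekiq code): a thread started before the fork, as a Rails initializer might do, does not survive into the child, which is exactly the kind of state an after_fork hook exists to restart.

```ruby
# A thread created before fork (e.g. by a Rails initializer) exists
# only in the parent; the child must re-create it after forking.
worker = Thread.new { sleep }

pid = fork do
  # In the child, only the thread that called fork survives, so
  # Thread.list contains just the main thread.
  exit!(Thread.list.size)
end
Process.wait(pid)
child_thread_count = $?.exitstatus # 1: the worker thread is gone
worker.kill
```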

@donaldong
Author

donaldong commented Jul 20, 2020

I could make it opt-in

That'd be great! Thank you for explaining, I see how this can be tricky now. For apps that do not require re-initializing, this can be a big win.

Maybe Sidekiq could provide an after_fork hook too?

@mperham mperham changed the title sidekiqswarm: Rails environment is not loaded into shared memory sidekiqswarm: Add support for preloading Rails app Jul 20, 2020
@mperham
Collaborator

mperham commented Jul 20, 2020

No guarantees but I'll investigate.

@mperham
Collaborator

mperham commented Jul 20, 2020

I'm implementing this right now. Here's the before and after memory data I see on OSX.

Before

❯ ps aux | grep sidekiq
mikeperham       38022   0.0  1.0  4436192  81776 s000  S+    2:54PM   0:01.77 sidekiq 6.1.1 myapp [0 of 2 busy]      
mikeperham       38021   0.0  1.0  4441336  82664 s000  S+    2:54PM   0:02.00 sidekiq 6.1.1 myapp [0 of 2 busy] leader   
mikeperham       38020   0.0  1.0  4435668  81688 s000  S+    2:54PM   0:01.77 sidekiq 6.1.1 myapp [0 of 2 busy]      
mikeperham       38019   0.0  1.0  4435668  81600 s000  S+    2:54PM   0:01.77 sidekiq 6.1.1 myapp [0 of 2 busy]      
mikeperham       38013   0.0  0.5  4383092  40528 s000  S+    2:54PM   0:00.94 sidekiqswarm, managing 4 processes      

After

❯ ps aux | grep sidekiq
mikeperham       14192   0.0  0.5  4432088  42012 s000  S+   10:24AM   0:00.29 sidekiq 6.1.1 myapp [0 of 2 busy]      
mikeperham       14191   0.0  0.5  4435636  43440 s000  S+   10:24AM   0:00.31 sidekiq 6.1.1 myapp [0 of 2 busy] leader      
mikeperham       14190   0.0  0.5  4431564  41980 s000  S+   10:24AM   0:00.29 sidekiq 6.1.1 myapp [0 of 2 busy]      
mikeperham       14189   0.0  0.5  4431564  41940 s000  S+   10:24AM   0:00.29 sidekiq 6.1.1 myapp [0 of 2 busy]      
mikeperham       14183   0.0  1.0  4413596  80344 s000  S+   10:24AM   0:02.06 sidekiqswarm, managing 4 processes

Notice that each child drops from ~80 MB to ~40 MB, while the single parent swarm process grows from ~40 MB to ~80 MB. This is on a trivial default Rails app; that's a massive savings.

As for an after_fork callback, here's my proposed API:

Sidekiq.configure_server do |config|
  config.on(:fork) do
    # ActiveRecord::Base.clear_active_connections! or whatever
  end
end

@donaldong
Author

Awesome! This is great!

@mperham mperham closed this as completed Jul 21, 2020
@mperham
Collaborator

mperham commented Jul 21, 2020

mperham added a commit that referenced this issue Jul 21, 2020
@justin808

With a memory pattern like this on a Heroku P-L, is there any advantage to disabling preloading with SIDEKIQ_PRELOAD= (set to empty)?

In other words, is there any positive or negative speed tradeoff with reduced memory usage?

[screenshot: Heroku memory usage chart]

@mperham
Collaborator

mperham commented Nov 1, 2021

@justin808 Impossible for me to know. There are always CPU time and memory space tradeoffs in software. That's literally what a cache is.
