
Improve architecture for horizontal scaling #515

Open
Tracked by #586
spender0 opened this issue Nov 3, 2023 · 5 comments
Labels
kind/enhancement Improvements or new features

Comments


spender0 commented Nov 3, 2023

Hello!

  • Vote on this issue by adding a 👍 reaction
  • If you want to implement this feature, comment to let us know (we'll work with you on design, scheduling, etc.)

Issue details

Hello Pulumi team! I've been using Pulumi for years and recently started using the Pulumi Kubernetes Operator.
With 40+ stacks based on the same TypeScript npm project, all managed by a single operator installation, I've run into some design problems in the operator.

When the operator runs npm install and the Pulumi program for several stacks at once, it consumes a lot of CPU and memory. But that only happens after git changes; most of the time there are no changes and the operator pod is doing nothing. I still have to set CPU and memory requests high enough for that peak to avoid an OOMKill, so the pod's resources sit underutilized and it is burning money most of the time.

[Screenshot: 2023-11-03 at 12:50 PM]
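For illustration (the numbers below are invented, not taken from the issue), this is roughly what that looks like on the operator Deployment: the requests have to be sized for the npm install / pulumi up spike even though the pod is idle between git changes.

```yaml
# Illustrative only: requests sized for the peak of concurrent `npm install` +
# `pulumi up`, paid for around the clock even though the pod is usually idle.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pulumi-kubernetes-operator
spec:
  replicas: 1                    # a single replica does all the reconcile work
  template:
    spec:
      containers:
        - name: operator
          resources:
            requests:
              cpu: "2"           # needed for the update spike, wasted when idle
              memory: 4Gi        # too low and the pod is OOMKilled mid-update
            limits:
              memory: 4Gi
```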

The problem is partly related to #368: when I set the resources too low, the operator got OOMKilled during infra provisioning and the stack state file was left locked by the interrupted (concurrent) update.

In addition to the resource problem, it is not possible to scale the operator deployment horizontally to speed up syncing a large number of stacks: because of the Kubernetes lease lock, only one pod can work on stacks at any given moment.

As a solution, I would decouple the "npm install" and "pulumi up" functionality from the operator pod into worker pods: the operator assigns a worker pod to an individual stack to provision it, and once the stack is done the worker dies to save costs. The operator pod would then be only a controller for Stack resources and worker pods. This would make the Pulumi Operator scalable enough for large platforms with hundreds or thousands of stacks.
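To make the proposal concrete, here is a rough, hypothetical sketch of what such a per-stack worker could look like as a plain Kubernetes Job. This is not an existing operator feature; the image, command, repository URL, and secret name are all illustrative placeholders.

```yaml
# Hypothetical per-stack worker: created by the operator when a Stack needs an
# update, runs `npm install` + `pulumi up` once, then exits and is cleaned up,
# so peak CPU/memory is only requested for the duration of the update.
apiVersion: batch/v1
kind: Job
metadata:
  name: stack-worker-my-stack          # one Job per stack
spec:
  backoffLimit: 1
  ttlSecondsAfterFinished: 300         # remove the finished worker pod
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: pulumi-worker
          image: pulumi/pulumi:latest  # any image with pulumi and node available
          command: ["/bin/sh", "-c"]
          args:
            - |
              git clone "$GIT_URL" /workspace
              cd /workspace
              npm install
              pulumi up --yes --stack "$STACK_NAME"
          env:
            - name: STACK_NAME
              value: org/project/my-stack                  # placeholder
            - name: GIT_URL
              value: https://github.com/example/infra.git  # placeholder
            - name: PULUMI_ACCESS_TOKEN
              valueFrom:
                secretKeyRef:
                  name: pulumi-api-secret                  # placeholder secret
                  key: accessToken
          resources:
            requests:                  # requested only while the Job runs
              cpu: "1"
              memory: 2Gi
            limits:
              memory: 2Gi
```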

I would be glad to provide additional information, just let me know.

Affected area/feature

spender0 added the kind/enhancement and needs-triage labels on Nov 3, 2023
@mikhailshilkov
Member

cc @rquitales @EronWright for your awareness.

mikhailshilkov removed the needs-triage label on Nov 3, 2023
Member

lukehoban commented Feb 7, 2024

Note that this is similar to (or potentially ultimately the same as) what's discussed in #78 (run the deployments as Jobs).

#434 is another, even more extreme, option for separating the deployments from the operator compute (running them in Pulumi Deployments instead of directly inside the cluster).

@danielloader

How is concurrency limited when handling stacks all changing at the same time? If OP has 40+ stacks and they're all being refreshed/updated at the same time, would some simple concurrency controls smooth the spike out over a longer time?

@spender0
Author

> How is concurrency limited when handling stacks all changing at the same time? If OP has 40+ stacks and they're all being refreshed/updated at the same time, would some simple concurrency controls smooth the spike out over a longer time?

I set the MAX_CONCURRENT_RECONCILES variable in the operator pod to 4. If I set a higher value, e.g. 10, the operator consumes far more resources and gets OOM-killed unless I dedicate even more memory to the pod. That leads to burning money, since most of the time the pod is doing nothing because there are no changes in the stacks.

If I leave MAX_CONCURRENT_RECONCILES=4, updates are too slow when all stacks receive a change.
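For reference, a minimal sketch of where that knob lives (the Deployment and container names are assumed, not taken from the operator's actual manifests): it is an environment variable on the single operator pod, so raising it concentrates more parallel updates, and therefore more peak memory, on that one pod rather than spreading the work across the cluster.

```yaml
# Sketch: MAX_CONCURRENT_RECONCILES on the operator pod. A value of 4 keeps
# memory in bounds but is slow when all 40+ stacks change; a value like 10 is
# faster but needs far larger requests to avoid an OOMKill.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pulumi-kubernetes-operator
spec:
  template:
    spec:
      containers:
        - name: operator
          env:
            - name: MAX_CONCURRENT_RECONCILES
              value: "4"
```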

@danielloader

Good to know for someone new to the operator. I was just thinking out loud about the concurrency, but it makes sense that the update becomes too slow as well. Given those requirements, it does feel like pushing those sessions out to Job pods, so they can be distributed on demand across the wider cluster, is the right call.

4 participants