quick nits noticed live
Signed-off-by: clux <sszynrae@gmail.com>
clux committed Apr 24, 2024
1 parent 8628d19 commit 3a83184
Showing 2 changed files with 5 additions and 3 deletions.
docs/controllers/availability.md (1 addition & 1 deletion)
@@ -44,7 +44,7 @@ Scaling a controller beyond one replica for HA is different than for a regular l

A controller is effectively a consumer of Kubernetes watch events, and these are themselves unsynchronised event streams whose watchers are unaware of each other. Adding another pod - without some form of external locking - will result in duplicated work.

-To avoid this, most controllers lean into the eventual consistency model and run with a single replica, accepting higher tail latencies due to reschedules. However, once the performance demands are strong enough, these pod reschedules will dominate the tail of your latency metrics, making scaling necessary.
+To avoid this, most controllers lean into the eventual consistency model and run with a single replica, accepting higher tail latencies due to reschedules. However, if the performance demands are strong enough, these pod reschedules will dominate the tail of your latency metrics, making a multi-replica setup more attractive.

!!! warning "Scaling Replicas"

docs/controllers/scaling.md (4 additions & 2 deletions)
@@ -7,7 +7,9 @@ This chapter is about strategies for scaling controllers and the tradeoffs these
- Why is the reconciler lagging? Are there too many resources being reconciled?
- What happens when your controller starts managing resource sets so large that it starts significantly impacting your CPU or memory use?

-Scaling an efficient Rust application that spends most of its time waiting for network changes might not seem like a complicated affair, and indeed, you can scale a controller in many ways and achieve good outcomes. But in terms of costs, not all solutions are created equal; are you improving your algorithm, or are you throwing more expensive machines at the problem to cover up inefficiencies?
+Scaling an efficient Rust application that spends most of its time waiting for network changes might not seem like a complicated affair, and indeed, you can scale a controller in many ways and achieve good outcomes. But in terms of costs, not all solutions are created equal:
+
+> Can you improve your algorithm, or should you throw more expensive machines at the problem?
## Scaling Strategies

@@ -66,7 +68,7 @@ Explicitly labelled shards is less common, but is a powerful option. It is used

A mutating admission policy can help automatically assign/label partitions cluster-wide based on constraints and rebalancing needs.
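
A minimal sketch of what a shard-scoped replica could look like, using kube's `Controller` with a label selector on its watch; the `shard=a` label, the use of `Pod`, and the requeue timings are illustrative assumptions:

```rust
use std::{sync::Arc, time::Duration};
use futures::StreamExt;
use k8s_openapi::api::core::v1::Pod;
use kube::{
    runtime::{controller::Action, watcher, Controller},
    Api, Client,
};

#[derive(thiserror::Error, Debug)]
#[error("reconcile failed")]
struct Error;

async fn reconcile(_obj: Arc<Pod>, _ctx: Arc<()>) -> Result<Action, Error> {
    // per-object work for this shard goes here
    Ok(Action::requeue(Duration::from_secs(300)))
}

fn error_policy(_obj: Arc<Pod>, _err: &Error, _ctx: Arc<()>) -> Action {
    Action::requeue(Duration::from_secs(5))
}

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let client = Client::try_default().await?;
    let pods: Api<Pod> = Api::default_namespaced(client);
    // Only watch objects labelled for this shard; a second replica
    // started with e.g. `shard=b` sees a disjoint set of objects.
    let wc = watcher::Config::default().labels("shard=a");
    Controller::new(pods, wc)
        .run(reconcile, error_policy, Arc::new(()))
        .for_each(|res| async move {
            if let Err(e) = res {
                eprintln!("reconcile error: {e}");
            }
        })
        .await;
    Ok(())
}
```

Each replica then only pays the watch and cache cost of its own shard.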

-In cases where HA is required, a leases can be used gate access to a particular shard. See [[availability#Leader Election]]
+In cases where HA is required, leases can be used to gate access to particular shards. See [[availability#Leader Election]]

--8<-- "includes/abbreviations.md"
--8<-- "includes/links.md"
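
Where that HA gating is needed, a minimal sketch of claiming a shard lease via the coordination API follows; the `try_acquire_shard_lease` name, the lease naming scheme, and the 15s duration are illustrative assumptions, and a real deployment would also need renewal and expiry takeover (or a dedicated leader-election library):

```rust
use k8s_openapi::api::coordination::v1::{Lease, LeaseSpec};
use k8s_openapi::apimachinery::pkg::apis::meta::v1::MicroTime;
use k8s_openapi::chrono::Utc;
use kube::{
    api::{Api, PostParams},
    core::ObjectMeta,
    Client,
};

/// Try to claim the lease guarding one shard; true means we hold it.
/// Illustrative sketch: no renewal loop, no takeover of expired leases.
async fn try_acquire_shard_lease(client: Client, shard: &str, me: &str) -> anyhow::Result<bool> {
    let leases: Api<Lease> = Api::default_namespaced(client);
    let name = format!("shard-{shard}");
    let lease = Lease {
        metadata: ObjectMeta {
            name: Some(name.clone()),
            ..Default::default()
        },
        spec: Some(LeaseSpec {
            holder_identity: Some(me.to_string()),
            lease_duration_seconds: Some(15), // assumed duration
            acquire_time: Some(MicroTime(Utc::now())),
            ..Default::default()
        }),
    };
    match leases.create(&PostParams::default(), &lease).await {
        Ok(_) => Ok(true), // we created the lease; the shard is ours
        Err(kube::Error::Api(ae)) if ae.code == 409 => {
            // Lease already exists; we only hold the shard if we are the holder
            let current = leases.get(&name).await?;
            let holder = current.spec.and_then(|s| s.holder_identity);
            Ok(holder.as_deref() == Some(me))
        }
        Err(e) => Err(e.into()),
    }
}
```

A replica that fails to acquire its lease can back off and retry rather than start a competing watcher for that shard.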
