Timed hashes for load balancers #5762

PoneyClairDeLune · 2023-08-18T04:14:03Z

Is it possible to add a directive like lb_time_hash?

With policies utilizing sticky hashes in place, the same IP addresses and ports may get mapped to a single upstream indefinitely. Introducing an additional hash that refreshes with a given time interval helps avoid single upstreams getting constant use from a single IP address.

The idea goes as follows:

lb_time_hash is disabled by default, and only accepts a time value higher than 10 seconds (or other more appropriate values). Values above 0s and below 10s get treated as if 10s is set, and any non-positive time value gets interpreted as lb_time_hash being disabled.
When lb_time_hash gets enabled, it only gets applied to sticky hash-based policies, namely ip_hash, client_ip_hash, uri_hash, query, header and cookie.
When calculating sticky hashes, the additional hash provided by lb_time_hash is introduced along with other inputs. The additional hash only updates to a different randomized value in relation to the launch time of Caddy itself, not system time.

Here is an example.

A Caddy server with three upstreams (A, B, C) uses the IP hash policy with lb_time_hash set to 60s.
After boot up with initial time hash randomized, Caddy will route traffic from 10.0.0.1 to upstream A.
After the first 60 seconds, the time hash gets updated with a random value, causing Caddy to route traffic from said IP to upstream C instead.
...
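To make the example concrete, here is a minimal sketch of how a time-dependent salt could be mixed into an IP hash. It is not Caddy's actual selection code: it assumes FNV-1a from the standard library and a plain modulo over the upstream list, and `pickUpstream` is a hypothetical name.

```go
package main

import (
	"encoding/binary"
	"hash/fnv"
)

// pickUpstream hashes the salt together with the client key and maps the
// result onto one of the upstreams. With a fixed salt the mapping is stable;
// once the salt changes, the same client may land on a different upstream.
func pickUpstream(clientKey string, salt uint32, upstreams []string) string {
	if len(upstreams) == 0 {
		return ""
	}
	h := fnv.New32a()
	var saltBytes [4]byte
	binary.BigEndian.PutUint32(saltBytes[:], salt)
	h.Write(saltBytes[:]) // the time-dependent input
	h.Write([]byte(clientKey))
	return upstreams[h.Sum32()%uint32(len(upstreams))]
}
```

Calling `pickUpstream("10.0.0.1", salt, []string{"A", "B", "C"})` returns the same upstream for as long as `salt` is unchanged. Which upstream a given salt selects depends on the hash, so the A-then-C sequence in the example is illustrative only.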


francislavoie · 2023-08-20T03:02:55Z

Interesting idea.

I'm not sure how it would work algorithmically though. I guess we'd do something like:

time = currentTimeSecs - (currentTimeSecs % configSecs)

The idea is to subtract the modulo so that the value stays the same until the next configSecs window has elapsed. Does this make sense?

I'm considering whether it should be configured as part of the lb policy or as a separate option. We have fallback now for many policies, so having options for policies has precedent now.

mholt · 2023-08-20T06:23:04Z

Thanks for opening an issue.

> Introducing an additional hash that refreshes with a given time interval helps avoid single upstreams getting constant use from a single IP address.

What is the problem with this though?

The whole point of IP-hash-based LB is to keep clients with the same backend as long as it exists. If that's not what you want, then use a different policy.

I guess I fail to see the motivation for this feature.

If it does get implemented, it'd probably be best as a separate plugin for now to see if it would become popular enough to be included in the core.

PoneyClairDeLune · 2023-08-22T14:15:18Z

> The whole point of IP-hash-based LB is to keep clients with the same backend as long as it exists. If that's not what you want, then use a different policy.

The hash-based load balancer isn't just for mapping an IP to the same backend, but rather reproducible and (somewhat) predictable load balancing. The introduction of time in these hash-based load balancers becomes more like a mild measure preventing a single server from being overloaded by requests, while still having all the benefits hashes provide.

Using a different policy is not feasible, as only round-robin, weighted round-robin, least load and random could be chosen, and none of them satisfies the need of reproducibly mapping a client to an upstream by certain criteria.

PoneyClairDeLune · 2023-08-22T14:22:16Z

> time = currentTimeSecs - (currentTimeSecs % configSecs)
> The idea is we subtract the modulo so that it stays the same time until the next configSecs window has elapsed. Does this make sense?

I'm actually thinking of something like

time = rand.Uint32();

With the value getting refreshed to a new one after the set interval. This may be the cheaper way, as no additional modulo operations are introduced in every hash attempt.

mholt · 2023-08-22T16:05:11Z

> The hash-based load balancer isn't just for mapping an IP to the same backend, but rather reproducible and (somewhat) predictable load balancing. The introduction of time in these hash-based load balancers becomes more like a mild measure preventing a single server from being overloaded by requests, while still having all the benefits hashes provide.

I still don't really understand this; is there any reading material about this you could point me to? I'm not sure what the benefits of sharding by timestamp are, since a client that is overwhelming server A at first will just start overwhelming server B a little later.

It sounds like a job for a rate limiter.

mholt · 2024-05-10T19:33:49Z

Re-reading this very carefully I think I understand a little better what is being asked. Basically IP+timeblock as part of the LB hash. But I still don't fully understand the motivation:

> helps avoid single upstreams getting constant use from a single IP address.

(I guess I am not sure why this is a problem.)

Still, I'd be curious if someone implements this as a load balancing policy plugin, and sees real-world benefits with it, then we can more easily reconsider 😃

mholt added the "feature ⚙️" label (New feature or request) on Aug 18, 2023

mholt added the "needs info 📭" (Requires more information) and "plugin 🔌" (A feature outside this repo) labels on May 10, 2024

mholt closed this as not planned on May 10, 2024
