proxy: Load Balance using multiple A records per single hostname #1545
Hi @furkanmustafa, thank you for following the issue template. I do not think this is a bug, though. This is how the Go resolver works: https://golang.org/pkg/net/#Dial
If the first IP address succeeds, it will be used. Maybe this would be a better fit upstream as a Go bug?
I think this is a feature request rather than a bug report. Dial's behavior works as it's supposed to, but a load balancer should count all of those IPs as upstream servers. If you think this suggested behavior is appropriate for Caddy, I can try to make a PR. If not, the next proper solution would be to put a service proxy between Caddy and the containers so they look like a single target to Caddy, but doing the load-balancing work again at an additional layer seems a bit redundant.
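The behavior being requested can be sketched in Go: resolve the hostname into all of its A records, then rotate requests across the full list. The `pool` type and the hard-coded addresses below are illustrative only; a real implementation would feed `net.LookupHost` results into the pool and re-resolve periodically.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// pool round-robins across the IPs returned for a single hostname.
// In a real proxy the ips slice would come from net.LookupHost(host);
// it is passed in directly here so the selection logic stands alone.
type pool struct {
	ips []string
	n   uint64
}

// next returns the next IP in round-robin order, safe for concurrent use.
func (p *pool) next() string {
	i := atomic.AddUint64(&p.n, 1)
	return p.ips[(i-1)%uint64(len(p.ips))]
}

func main() {
	// Hypothetical addresses standing in for multiple A records.
	p := &pool{ips: []string{"10.0.0.1", "10.0.0.2", "10.0.0.3"}}
	for i := 0; i < 4; i++ {
		fmt.Println(p.next()) // cycles .1, .2, .3, then wraps to .1
	}
}
```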
(Sorry, I mistook it for a bug report because it followed the bug template.) I have heard that using DNS for load balancing is not a good idea (TTLs and such); why not just use Caddy's built-in load balancing features that do it properly?
That's probably a mistake on my part :) In this use case, the upstream targets change rapidly: things scale up and down, or containers are quickly replaced with new ones (with new IPs), and service discovery is available via the internal DNS server. Using DNS this way, we would avoid generating web server configuration from a template file and restarting/reloading on each change. Of course, the DNS cache TTL on the load balancer side would need to be extremely low (<10s) for this to work smoothly. Everything would be dead simple to work with in this setup, though maybe not the most elegant way to do it.

For comparison, nginx does load balance across multiple A records as desired, but it fully respects the DNS TTL value, so rapid changes in the upstream service are not reflected in a reasonable amount of time. That could also be considered a design failure in the system's internal DNS server (Rancher), which reports TTL values of 600 seconds; that doesn't fit its purpose. If Rancher reported a TTL of zero, or at least something down to a few seconds, a proper DNS cache implementation on the load balancer side would work perfectly.

Another, cleaner approach would be for Caddy to provide an API to add/remove upstreams for a virtual host on the fly. That would require some integration, as opposed to the DNS solution, but it would allow complete automation and would be elegant enough. This is probably out of this issue's scope and can be moved if it's considered a better solution. Just a random idea for how I'd wish the API to work. Config:
It'd return 502 until a host is added to the upstream via the API. API interface:
This is something that would be useful for my use case too. The same feature is available, for example, in Envoy, which calls it Strict DNS service discovery mode: it considers every A record a separate upstream and periodically, asynchronously refreshes the list of IPs.
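For reference, a minimal sketch of what that mode looks like in an Envoy cluster definition; the cluster name, hostname, and port are placeholders, and the refresh interval is just an example:

```yaml
clusters:
- name: backend                 # hypothetical cluster name
  type: STRICT_DNS              # resolve all A records; each IP becomes an endpoint
  dns_refresh_rate: 5s          # re-resolve asynchronously on this interval
  lb_policy: ROUND_ROBIN
  load_assignment:
    cluster_name: backend
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: backend.internal   # hypothetical hostname
              port_value: 8080
```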
Just to add another common usage scenario: if you have headless Services on Kubernetes, their A records return multiple IPs, one for each pod managed by that Service. This feature would be useful for load balancing those pods directly with Caddy. This scenario already (kinda) works using SRV records, but only if you have named ports on the Service. The problem with SRV records is that the DNS resolver is basically the one doing the load balancing. (#2100 (comment))
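For anyone unfamiliar, a headless Service looks like this (names and ports are placeholders). Setting `clusterIP: None` makes the cluster DNS return one A record per ready pod, and the named port is what makes the SRV-record workaround possible:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web            # hypothetical Service name
spec:
  clusterIP: None      # headless: DNS returns one A record per ready pod
  selector:
    app: web
  ports:
  - name: http         # named port, required for SRV lookups
    port: 8080
```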
@JCMais Good news on that front: we have already built an ingress controller that works with Caddy 2, which should do just what you need.
@mholt is Caddy really production-ready? What advantages does it have over the nginx ingress? I used Caddy before because the configuration was simple (perfect for a dev environment).
Yes, it is. Our ingress controller automatically provisions certificates for all your hosts and is extensible with modules in more ways than nginx.
@mholt is this the official ingress repo? https://github.com/wehco/caddy-ingress-controller
@ghostsquad No, that's a third-party one. We haven't released ours yet, but stay tuned. It's really awesome 😎
@mholt is there an issue or PR I can subscribe to?
If you happen to be at Velocity Conference tomorrow, we're demoing it at our booth. :) Otherwise, just follow caddyserver on Twitter for now. 👍
I, unfortunately, won't be at the conference, but I will definitely follow on Twitter. Thanks!
Was this ever implemented? Using a single DNS record that points to multiple backend server IP addresses is a nice and easy way to reduce duplication and configuration.
@thenewguy See #1545 (comment); Caddy v2 will have this. The code for the ingress controller is not yet public, but it should be soon, I think.
@francislavoie Awesome!
The Caddy ingress controller for Kubernetes is here: https://github.com/caddyserver/ingress Just to gauge interest on this issue: it seems like everyone looking for this feature wanted it for use with Kubernetes? If so, I think we can close this issue.
Kubernetes is not the only reason/system where people use DNS-based service discovery.
Understood, but I'm specifically looking to gauge how many people actually need this who aren't already covered by the k8s ingress controller. We don't want to waste our time implementing a feature nobody will use.
As Caddy 2 now has several dynamic ways of configuring proxy backends, I'll close this until we find a use case that specifically needs load balancing using DNS A records. (Feel free to continue the discussion if you have a legit use case.)
👋 Hi, I stumbled upon this issue while researching whether Caddy could be used as a reverse proxy to an AWS ALB. AWS load balancers have an instance per AZ/subnet (typically 2–6) whose IPv4 addresses are all returned when resolving the A record for the load balancer's DNS name. So, to effectively reverse proxy an AWS ALB, all of the A records need to be resolved and load balanced by the reverse proxy. Is there an alternative strategy for accomplishing this in Caddy v2?
This should be pretty easy to implement using
I came across this while researching service discovery using DNS A records with Caddy. Our use case would be this: I'm still in the early stages of programming, but I'll take a look into this.
Currently working on this as part of the refactoring of Caddy's reverse proxy, which will allow dynamic upstreams via new "upstream source" modules (or something like that; naming is hard). I'm actually doing this for SRV lookups, but A/AAAA load balancing works pretty similarly.
Implementation can be seen in #4470.
Would anyone like to give my PR, #4470, a try? It only supports JSON config for now, but you can just use
This is now merged; see #4470.
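For anyone landing here later, a minimal JSON sketch of what the merged dynamic-upstreams feature looks like in use, assuming the A/AAAA source exposes `name`, `port`, and `refresh` options as in the PR; the hostname, port, and refresh interval below are placeholders:

```json
{
  "apps": {
    "http": {
      "servers": {
        "srv0": {
          "listen": [":80"],
          "routes": [{
            "handle": [{
              "handler": "reverse_proxy",
              "dynamic_upstreams": {
                "source": "a",
                "name": "service.internal",
                "port": "8080",
                "refresh": "10s"
              }
            }]
          }]
        }
      }
    }
  }
}
```

With a short `refresh` interval, each request's candidate upstream list tracks the current set of A records instead of a statically configured list.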
1. What version of Caddy are you running (`caddy -version`)?

2. What are you trying to do?

Trying to proxy and load-balance to an upstream service, but not by listing all IPs/hostnames, because the system I use (Rancher) provides multiple A records for service DNS requests.

Example DNS reply using `dig`:

Caddy resolves the host OK, but sends all requests to the IP address in the first A record.

3. What is your entire Caddyfile?

4. How did you run Caddy (give the full command and describe the execution environment)?

Using Docker/Rancher, with the Dockerfile here.

6. What did you expect to see?

7. What did you see instead (give full error messages and/or log)?

8. How can someone who is starting from scratch reproduce the bug as minimally as possible?