Explicit explanation of site block address matching functionality in Caddyfiles #235

Open
sevmonster opened this issue May 14, 2022 · 3 comments
Labels
documentation Improvements or additions to documentation

Comments

@sevmonster

sevmonster commented May 14, 2022

In learning Caddyfile syntax, I attempted this site block at the top of the Caddyfile...

https:// {
    tls mycert.pem mykey.pem {
        [...]
    }
}

...and saw that it "worked": my other site blocks appeared to inherit the certificate and key. This led me to believe that:

  1. any directive I were to add would either take precedence, since I had defined it first in the Caddyfile and it was being executed first; or
  2. the block was acting as a default/fallback to other more specific blocks, as I had seen in my searches that the length of matchers influenced their order, so it would make sense that that functionality would carry over to site blocks.

But it appears neither of those situations is actually the case, and when I added header directives to this block, I spent many frustrating minutes trying to decipher why it wouldn't apply those headers to any site.

It turns out the certificates were being used because the site block loaded them into the certificate cache, not because the block was being selected or used as a fallback. It just so happened that the certificate I brought with me to Caddy already had SANs for every domain name I was using, so all the other site blocks, which had no TLS settings of their own, pulled it in because they could use it. Is that right?

As far as I can see, every route from an adapted Caddyfile will have terminal: true, and routes without matchers (e.g. https://) will be inserted last, well after the other site blocks that execute and terminate the chain before it. So the only way the above example site block would ever match is if no other block did.
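As an illustration of the behavior described above, here is a minimal Caddyfile sketch. The `example.com` address and the `X-Frame-Options` header are assumptions chosen for the example; the certificate file names are the ones from the block at the top of this issue.

```caddyfile
# Catch-all HTTPS block: its certificate is loaded into the cache,
# but it only handles requests that no other site block matches.
https:// {
    tls mycert.pem mykey.pem
    header X-Frame-Options DENY
}

# This block matches requests for example.com first and terminates,
# so the header directive above is never applied to them.
example.com {
    respond "Hello from example.com"
}
```

With a config like this, a request for example.com would be expected to come back without the `X-Frame-Options` header, while the certificate loaded by the `https://` block can still be served for example.com if its SANs cover that name.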

I think it would be beneficial in the documentation to explain somewhere/make more explicit where necessary:

  • how site blocks are ordered during adaptation, like how directives are ordered;
  • that adapted site blocks are not cascading and terminate after the first match;
  • that site blocks will automatically look in the cache for a certificate that matches their address; and
  • that the idea of having defaults for certain values, e.g. headers and certificates, is not best approached via site blocks.

I have since manually edited the adapted output of my Caddyfile to get the kind of functionality I want: a matcherless route that did not terminate, to have a set of default headers applied to all the following routes. Maybe this could be an improvement to look into for Caddyfiles, where the closest working method is currently to create a snippet and import it into every site block.

...And if there is a way to do this that I haven't discovered yet, I'd love to hear about it.

@francislavoie
Member

Yeah, it's a hard balance.

The issue is that the Caddyfile algorithm for handling site blocks is... honestly, spaghetti. All kinds of little conditions to handle edge cases. So trying to verbalize that spaghetti into English and document it is a whole undertaking and I'm not sure there'd really be much value in doing that -- might be a waste of time that we could spend elsewhere, and realistically, how many people will actually read the in-depth English explanation when they could have instead just read the code?

You're right though, there is room for a higher-level explanation without getting into the nitty gritty; right now we just kinda gloss over "these are the things that are valid as site addresses" in https://caddyserver.com/docs/caddyfile/concepts#addresses but we could explain they're sorted and are mutually exclusive (another way of saying "doesn't cascade"). In general, they're sorted according to the length of the host matcher, so that roughly "most specific one is first", essentially (with a bunch of special casing for wildcard certs and whatnot).
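A rough sketch of what that ordering means in practice, assuming hypothetical hostnames and ignoring the special cases mentioned above:

```caddyfile
# Written first in the file, but less specific.
*.example.com {
    respond "wildcard site"
}

# Written second, but more specific, so it is sorted ahead of the wildcard.
app.example.com {
    respond "app site"
}
```

Going by the "most specific one is first" rule, a request for app.example.com should be handled by the second block even though the wildcard block appears earlier in the file, and any other subdomain should fall through to the wildcard block. Only one of the two blocks handles a given request.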

> that site blocks will automatically look in the cache for a certificate that matches their address

FYI you can turn off this behaviour with auto_https ignore_loaded_certs: https://caddyserver.com/docs/caddyfile/options#auto-https
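For reference, that option goes in the global options block, which has to be the first thing in the Caddyfile. A minimal sketch:

```caddyfile
{
    # Do not skip certificate management for hostnames that are
    # already covered by a manually loaded certificate.
    auto_https ignore_loaded_certs
}
```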

You're right that the default behaviour could probably be more discoverable. Not sure how though. It's not really relevant to site blocks so much. Probably best in the tls directive's docs.

> that the idea of having defaults for certain values, e.g. headers and certificates, is not best approached via site blocks.

What you're trying is kind of an edge case tbh; we do mention elsewhere "if you want full control then use JSON" e.g. https://caddyserver.com/docs/getting-started#json-vs-caddyfile

I'm not sure how we'd talk about that specific idea. It's pretty abstract.

> to have a set of default headers applied to all the following routes. Maybe this could be an improvement to look into for Caddyfiles, where the closest working method is currently to create a snippet and import it into every site block.

Yeah, snippets are the recommended way. I realize it causes duplication, but we don't want to make Caddyfile sites act like CSS (cascades) cause that really adds a lot of baggage. It sounds simple, but... it can get real messy real fast.

I see so often that people need to import something in like, all but one of their sites. If they tried to use this kind of feature, they'd then ask "okay but how do I exclude this from this one site?" I don't want to have to answer that question or implement something to deal with it. Best if it's kept simple. Just use snippets. It's explicit, you can look at any given site block in your config and you don't need to think about implicit stuff inherited from elsewhere.
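A minimal sketch of the snippet approach, including the "all but one" case. The snippet name `default_headers`, the hostnames, and the specific headers are hypothetical and only there for illustration:

```caddyfile
# Define the shared settings once.
(default_headers) {
    header {
        X-Content-Type-Options nosniff
        Referrer-Policy no-referrer
    }
}

a.example.com {
    import default_headers
    respond "site a"
}

b.example.com {
    import default_headers
    respond "site b"
}

# Excluding a site is just a matter of not importing the snippet.
c.example.com {
    respond "site c"
}
```

Every site block states what it uses, so nothing has to be traced back to an implicit parent.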

Hopefully that answers your questions.

@sevmonster
Author

sevmonster commented May 14, 2022

> I'm not sure there'd really be much value in doing that -- might be a waste of time that we could spend elsewhere, and realistically, how many people will actually read the in-depth English explanation when they could have instead just read the code?

While I agree with everything you're saying, a baseline understanding about why something won't work the way someone expects is better than none at all. I am fortunate enough to be able to get a grasp on things without much trouble, but someone coming from a highly nested and structured Nginx environment like I did could potentially run into the same situation as me and not be able to work their way out of it. For all intents and purposes I came into the project in "nginx mode" and it took some rethinking and retooling to make better Caddyfiles rather than doing a 1:1 port of my Nginx configs. In my research I found more than one person that opted to stick to Nginx because they couldn't figure it out. Here's one.

Maybe a FAQ for onboarding Nginx users? With some of the common differences between the two. I suppose it doesn't have to just be Nginx and there could be sections for other reverse proxies too, but I am not versed enough in any others to provide any feedback. (If there is something like that I missed it!)

> [...] but we could explain they're sorted and are mutually exclusive (another way of saying "doesn't cascade"). In general, they're sorted according to the length of the host matcher, so that roughly "most specific one is first", essentially (with a bunch of special casing for wildcard certs and whatnot).

Exactly this. In hindsight, I cannot think of a situation where the order of the adapted site blocks would matter, unless the user was trying to do something unorthodox like I was, or had a lot of wildcards that somehow interacted strangely. So I think it would be safe enough to just mention this at a high level and link to the JSON docs for anyone who wants to define the order manually. I also think it isn't worth trying to explain the edge cases, like you said. Anyone interested in such depth would probably be better off using the JSON config for its versatility.

> FYI you can turn off this behaviour with auto_https ignore_loaded_certs

I actually wanted that functionality; my objective here is to minimize repetition and, with it, maintenance.

> You're right that the default behaviour could probably be more discoverable. Not sure how though. It's not really relevant to site blocks so much. Probably best in the tls directive's docs.

I think maybe a separate page explaining the cert cache mechanism, how it interacts with the tls directive (i.e. any cert files you specify are loaded into the cache), and how automatic HTTPS will only attempt to generate certs it does not already have in the cache, would be a good way to go about it. Links to that page could be included on the tls directive page and automatic HTTPS page. You're correct that it's not really related to site blocks; it was more that my initial misunderstanding led me to assume site blocks worked in a way they do not. Had I read something about how the cert cache worked on the tls directive's page, I would not have come to such an assumption about site blocks in the first place.

> [...] I'm not sure how we'd talk about that specific idea. It's pretty abstract.

It would be easy enough to just mention that, by default, Caddyfiles do not cascade or inherit settings, or however else you wish to describe it. Putting that next to the site block description in the concepts page would be a good place, I think. And a link to the JSON documentation about terminal would be helpful for those that wish to use such a feature. Now that I better understand the system and how to achieve my goals I am totally fine with using JSON config.

> Best if it's kept simple. Just use snippets. It's explicit, you can look at any given site block in your config and you don't need to think about implicit stuff inherited from elsewhere.

I understand and agree completely. Leaving the Caddyfile simpler is probably the saner way to go.

@francislavoie
Member

> In my research I found more than one person that opted to stick to Nginx because they couldn't figure it out. Here's one.

Actually, in that case, they said they're using Nginx Proxy Manager which is a third-party project which provides a web UI for configuring Nginx. They didn't keep using Nginx configs specifically. I'm not concerned with that particular user, frankly.

Anyways, thanks for the comments, I'll try to address some of these points soon 👍

francislavoie added the documentation label May 14, 2022