Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

httpcaddyfile: Allow hostnames & logger name overrides for log directive #5643

Merged
merged 3 commits into from
Aug 2, 2023

Conversation

francislavoie
Copy link
Member

@francislavoie francislavoie commented Jul 21, 2023

I've been thinking about this for a while. We've had a few cases where users ask to be able to configure logging per subdomain when using a wildcard site block (i.e. this pattern https://caddyserver.com/docs/caddyfile/patterns#wildcard-certificates).

Normally, the log directive auto-detects the hostname to use from the site block's names. So if you define a log inside *.example.com, it will map *.example.com to log0 (or whatever auto-increment ID you get for that logger name if not 0).

It's always been possible to define more than one log directive inside of a site block, we didn't prevent it; but it wasn't very useful to do, because then you'd just be writing access logs to two different places which is usually not what you want.

So making use of the above, we can add a new hostnames option to the log directive that overrides the auto-detection for that logger and specifically only logs for those hostnames. This makes it easy to setup one log file per subdomain, while still using a wildcard cert.

Here's a full Caddyfile example (except for tls and fallback handle that you'd typically want as per the pattern linked above):

*.example.com {
    log {
        hostnames foo.example.com
        output file /var/log/foo.example.com.log
    }
    @foo host foo.example.com  
    handle @foo {
        respond "foo"
    }
    
    log {
        hostnames bar.example.com
        output file /var/log/bar.example.com.log
    }
    @bar host bar.example.com
    handle @bar {
        respond "bar"
    }
}

I'd recommend using snippets to make this shorter because it would make it possible to avoid repeating the domain for both the logger and host matcher. I'll probably write up an example for the docs that does that.

Edit: I changed it from hostname to hostnames because I realized "why not allow a logger to log more than one subdomain". It aligns better with the fact that site blocks are a list of hosts and not just a single one, so both approaches take a list. Slightly more flexible, but I expect the vast majority of users will be just configuring one hostname, if they want to override at all. The adapter test covers both.

@francislavoie francislavoie requested a review from mholt July 21, 2023 01:38
@francislavoie francislavoie added the feature ⚙️ New feature or request label Jul 21, 2023
@francislavoie francislavoie force-pushed the log-hostname-override branch 3 times, most recently from 27e3838 to 7c2cf6f Compare July 21, 2023 01:57
@francislavoie francislavoie changed the title httpcaddyfile: Allow hostname override for log directive httpcaddyfile: Allow hostnames override for log directive Jul 21, 2023
@AskAlice
Copy link

relevant thread surrounding the rationale for this feature
https://caddy.community/t/wildcard-cert-with-logs-separated-by-hostname/20552/2

@AskAlice
Copy link

AskAlice commented Jul 21, 2023

Say there are two log directives

*.example.com {
	log {
		output file /var/log/caddy/access.log
	}
    log {
        hostnames foo.example.com
        output file /var/log/caddy/foo.example.com.log
    }
    @foo host foo.example.com  
    handle @foo {
        respond "foo"
    }
    @foo host bar.example.com  
    handle @bar {
        respond "bar"
    }
}

do the logs for 'foo' end up in both the access.log and the foo.example.com.log?

to me it always made more sense to try and nest the log directive inside the handle function, that way any request that ends up inside the handle function instead gets logged in a separate way, overriding the logging strategy defined in the parent.

That way logs for undefined/unexpected hostnames end up in the access.log, and the defined hostnames each get a logfile.

my intuitive first guess on how this could work, before realizing it threw an error was

*.example.com {
	log {
		output file /var/log/caddy/access.log
	}
    @foo host foo.example.com  
    handle @foo {
        respond "foo"
	    log {
	        output file /var/log/caddy/foo.example.com.log
	    }
    }
    @foo host bar.example.com  
    handle @bar {
        respond "bar"
	    log {
	        output file /var/log/caddy/bar.example.com.log
	    }
    }
}

requests to foo.example.com end up in foo.example.log
requests to bar.example.com end up in bar.example.log
requests to zxvnzxkvlzsjf.example.com end up in access.log

A practical reason for a setup like this would be to deliberately search for out of place http requests targeted at hosts that don't or no longer exist and block the IPs associated with such abusive requests.

So, even if the log directive shouldn't be inside a handle, (like the format in your PR) it makes sense that if logs are specified for a hostname, then caddy should only log that information in the log directive that specifies the hostname.

@francislavoie
Copy link
Member Author

francislavoie commented Jul 21, 2023

Say there are two log directives, do the logs for 'foo' end up in both the access.log and the foo.example.com.log?

Good question. It won't, no, only the most specific logger; i.e. if there's a logger for foo.example.com it will take precedence over a logger for *.example.com.

Since your site block's address is *.example.com, loggers configured inside that site will have that hostname by default, unless you use the new hostnames option to override that with something more specific.

to me it always made more sense to try and nest the log directive inside the handle function

Yeah I wish we could do that. It's not possible though.

The problem is that log is not a "HTTP handler" directive, it's just special directive that configures things at the HTTP server level, so it can't be used within HTTP routes, and must be top-level in the site block.

For example, tls is another non-handler directive, because it configures how TLS handshakes should work, and TLS is a layer above HTTP.

If you run caddy adapt --config /path/to/Caddyfile --pretty you'll get an idea for what config is produced by log; you'll notice it doesn't add to routes but instead adds a logger in the top-level of the config, and it adds logger_names in the relevant server.

In Caddy v3 or whatever, we might redesign how logging is configured. But that's still far away.

@AskAlice
Copy link

AskAlice commented Jul 23, 2023

SGTM thanks for the explanation.

I think an alternate solution to this problem that could ease complexity would be to enforce wildcard TLS for a given domain name at the server level, but given how the docs guide people to use this format with wildcard domains, this implementation makes sense.

@francislavoie
Copy link
Member Author

Yeah there's some ideas about that in #5447

@francislavoie
Copy link
Member Author

After playing around with the question in https://caddy.community/t/multiple-access-logs-in-site-block-only-last-defined-works/20599, I added a second commit to this branch adding another feature; overriding the logger name with the log directive.

In practice, the only time this should be useful I think is to have 2 separate outputs for a single access log source. But also it gives slightly more control over JSON output if that matters at all for users (99.9% of the time it shouldn't).

It was mostly trivial, but I had to adjust a few things to cover some edgecases.

@francislavoie francislavoie changed the title httpcaddyfile: Allow hostnames override for log directive httpcaddyfile: Allow hostnames & logger name override for log directive Jul 25, 2023
@francislavoie francislavoie changed the title httpcaddyfile: Allow hostnames & logger name override for log directive httpcaddyfile: Allow hostnames & logger name overrides for log directive Jul 25, 2023
@mholt
Copy link
Member

mholt commented Jul 25, 2023

Thanks for working on this.

Do you want me to get this in 2.7 (might need some more time to review) or can it wait until after 2.7?

@francislavoie
Copy link
Member Author

It could wait, but also it's not that complex so if we could squeeze it in that'd be cool too.

@steffenbusch
Copy link
Contributor

Reference: https://caddy.community/t/multiple-access-logs-in-site-block-only-last-defined-works/20599/7

I have used the feature to override the logger name from this PR and it works fine. But when the debug directive is added to the global options, this exception is raised on caddy run:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x38 pc=0xd5adec]

goroutine 1 [running]:
github.com/caddyserver/caddy/v2/caddyconfig/httpcaddyfile.ServerType.Setup.func1({{0x4000686948, 0x3}, {0x25cf560, 0x0, 0x0}, 0x0})
        github.com/caddyserver/caddy/v2@v2.6.4/caddyconfig/httpcaddyfile/httptype.go:245 +0xac
github.com/caddyserver/caddy/v2/caddyconfig/httpcaddyfile.ServerType.Setup({}, {0x4000682880, 0x2, 0x40005f17d0?}, 0x40005f1740)
        github.com/caddyserver/caddy/v2@v2.6.4/caddyconfig/httpcaddyfile/httptype.go:275 +0x1b04
github.com/caddyserver/caddy/v2/caddyconfig/caddyfile.Adapter.Adapt({{0x194a340?, 0x25cf560?}}, {0x40000ca400, 0xfd, 0x200}, 0x40005fac60?)
        github.com/caddyserver/caddy/v2@v2.6.4/caddyconfig/caddyfile/adapter.go:50 +0xf0
github.com/caddyserver/caddy/v2/cmd.loadConfigWithLogger(0x400024b730, {0xffffef1938ed, 0x1b}, {0x0, 0x0})
        github.com/caddyserver/caddy/v2@v2.6.4/cmd/main.go:161 +0x5d8
github.com/caddyserver/caddy/v2/cmd.LoadConfig({0xffffef1938ed, 0x1b}, {0x0, 0x0})
        github.com/caddyserver/caddy/v2@v2.6.4/cmd/main.go:92 +0x40
github.com/caddyserver/caddy/v2/cmd.cmdRun({0x0?})
        github.com/caddyserver/caddy/v2@v2.6.4/cmd/commandfuncs.go:205 +0x538
github.com/caddyserver/caddy/v2/cmd.WrapCommandFuncForCobra.func1(0x4000222600?, {0x14aa487?, 0x2?, 0x2?})
        github.com/caddyserver/caddy/v2@v2.6.4/cmd/cobra.go:126 +0x30
github.com/spf13/cobra.(*Command).execute(0x4000222600, {0x40005d86e0, 0x2, 0x2})
        github.com/spf13/cobra@v1.7.0/command.go:940 +0x5c4
github.com/spf13/cobra.(*Command).ExecuteC(0x25720c0)
        github.com/spf13/cobra@v1.7.0/command.go:1068 +0x340
github.com/spf13/cobra.(*Command).Execute(...)
        github.com/spf13/cobra@v1.7.0/command.go:992
github.com/caddyserver/caddy/v2/cmd.Main()
        github.com/caddyserver/caddy/v2@v2.6.4/cmd/main.go:64 +0x68
main.main()
        caddy/main.go:14 +0x1c

The Caddyfile used here was:

{
        debug

        log access-formatted {
                include http.log.access.foo
                output file access-localhost.log
                format console
        }

        log access-json {
                include http.log.access.foo
                output file access-localhost.json
                format json
        }
}

localhost:8881 {
        log foo
}

Go version is go version go1.20.6 linux/arm64

caddy was built with xcaddy build --with github.com/caddyserver/caddy/v2=github.com/caddyserver/caddy/v2@log-hostname-override

caddy version
v2.6.4 => github.com/caddyserver/caddy/v2@v2.7.0-beta.2.0.20230725182109-017d63aad4cf h1:8kzZAx1LlWUwq1Ymj3290m4WKImVKiC2W3SdFSSW10o=

Copy link
Member

@mholt mholt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the approach. Thanks Francis!

@mholt mholt added this to the v2.7.0 milestone Jul 31, 2023
@francislavoie
Copy link
Member Author

Should be fixed now @steffenbusch if you'd like to try again and confirm.

@mholt
Copy link
Member

mholt commented Aug 2, 2023

Great! Thank you @francislavoie. @steffenbusch we might release Caddy 2.7 tomorrow, especially if you can confirm it works for you 😃

@steffenbusch
Copy link
Contributor

Thank you very much @francislavoie and @mholt.
I can confirm that the panic: runtime error is gone and the Caddyfile configuration works as intended.

@francislavoie francislavoie merged commit 5c51c1d into master Aug 2, 2023
20 checks passed
@francislavoie francislavoie deleted the log-hostname-override branch August 2, 2023 07:13
@mholt
Copy link
Member

mholt commented Aug 2, 2023

Awesome, thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature ⚙️ New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants