Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fileserver: Define request vars for tracking content served #5556

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

charles-dyfis-net
Copy link
Contributor

@charles-dyfis-net charles-dyfis-net commented May 23, 2023

Adds two variables for requests being handled as static files:

  • handlers.file_server.path - filesystem path as used by caddy to open file being served; may be an index file, a compressed proxy, or otherwise something other than the original content.
  • handlers.file_server.info - fs.FileInfo structure with basename and stat data.

Per discussion in NixOS/nixpkgs#222354 (with thanks to @emilazy and @mholt)


Also, adopts a patch from @francislavoie to avoid setting Last-Modified when mtime is 1 second past epoch (as is the case for files in the Nix store).

Copy link
Member

@francislavoie francislavoie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no objections to this on its own, this seems fine.

But I do think we could probably just special case modtime=1 by comparing before & after http.ServeContent whether the Last-Modified header was modified with beforeLastModified != w.Header().Get('Last-Modified') && modtime == 1 then delete the header. This would directly solve it for Nix I think in the base case. As Matt said before, I don't think anyone other than Nix can reasonably expect to have files with modtime=1 so that should be safe.

Comment on lines 40 to 44
// fs.FileInfo -- note that this contains only a basename, not a full path
const StaticFileInfoKey = "caddyhttp.fileserver.info"

// string -- actual path used by caddy
const StaticFilePathKey = "caddyhttp.fileserver.path"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should go at the bottom of the file. Also, it should be http.file_server.*, see the reverse proxy which sets up similar variables. Actually for the proxy we're just setting it in the replacer directly instead of as a var. I'm not sure which way we should go, both have valid arguments and are equally as usable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Attempting to follow convention more closely in 0609137.

@francislavoie francislavoie added the feature ⚙️ New feature or request label May 23, 2023
@francislavoie francislavoie added this to the v2.7.0 milestone May 23, 2023
@francislavoie francislavoie changed the title Define request variables tracking content served by staticfiles fileserver: Define request vars for tracking content served May 23, 2023
@emilazy
Copy link

emilazy commented May 23, 2023

The mtime=1 case is already handled for the ETag (but unfortunately not Last-Modified due to the Go HTTP library): #5550

But that replaces incorrect caching with no caching at all; on the Nix side we want to set good ETags to get caching back, which we can do with domain-specific knowledge (paths in the Nix store are a good proxy for their content due to their immutability and the presence of the hash of the complete build environment); we plan to use a Caddy plugin to handle this, but we need access to the file path and stat info in order to determine whether the file is in the Nix store (there can be symlinks into the store from outside, etc.) and set the ETag appropriately.

We're open to whatever way of accomplishing this works best for Caddy and has the lowest maintenance and performance overhead; this is just what we came up with in the nixpkgs PR.

@francislavoie
Copy link
Member

francislavoie commented May 23, 2023

My suggested diff:

diff --git a/modules/caddyhttp/fileserver/staticfiles.go b/modules/caddyhttp/fileserver/staticfiles.go
index 2b5cc3de3..48bf5c797 100644
--- a/modules/caddyhttp/fileserver/staticfiles.go
+++ b/modules/caddyhttp/fileserver/staticfiles.go
@@ -462,11 +462,23 @@ func (fsrv *FileServer) ServeHTTP(w http.ResponseWriter, r *http.Request, next c
 		w = statusOverrideResponseWriter{ResponseWriter: w, code: statusCodeOverride}
 	}
 
+	// Nix uses modtime=1 for files in its store, and it's useless to set
+	// the Last-Modified header in that case, so we need to check if the
+	// header was set by anything else first (i.e. user config or another
+	// module) so that we can later check if http.ServeContent set the header.
+	modTime := info.ModTime()
+	beforeLastModified := w.Header().Get("Last-Modified")
+
 	// let the standard library do what it does best; note, however,
 	// that errors generated by ServeContent are written immediately
 	// to the response, so we cannot handle them (but errors there
 	// are rare)
-	http.ServeContent(w, r, info.Name(), info.ModTime(), file.(io.ReadSeeker))
+	http.ServeContent(w, r, info.Name(), modTime, file.(io.ReadSeeker))
+
+	// Delete the Last-Modified header if it was changed and modtime=1 (i.e. Nix store)
+	if beforeLastModified != w.Header().Get("Last-Modified") && modTime.Unix() == 1 {
+		w.Header().Del("Last-Modified")
+	}
 
 	return nil
 }

Re Etag, makes sense. So 👍 to me to adding to vars or replacer to allow your plugin to have the context it needs. My diff above is a suggestion to make it behave better in the case the plugin isn't used.

@charles-dyfis-net
Copy link
Contributor Author

nod -- I was thinking of doing some work to make the plugin usage automatic, but even then that would only work for people running Caddy on NixOS, configured via Nix, not people running Caddy with configuration that isn't Nix-managed (as might typically be the case with folks running Nix on MacOS or an arbitrary non-NixOS Linux distro). Certainly, if y'all are comfortable with having codepaths that only Nix users are likely to need, it's better for our mutual userbase to have a less-surprising out-of-the-box experience.

charles-dyfis-net pushed a commit to charles-dyfis-net/caddy that referenced this pull request May 23, 2023
Patch by Francis Lavoie, taken from github comment at caddyserver#5556 (comment); committed by Charles Duffy
charles-dyfis-net and others added 2 commits May 23, 2023 18:20
Adds two variables for requests being handled as static files:

- `handlers.file_server.path` - filesystem path as used by caddy to open file being served; may be an index file, a compressed proxy, or otherwise something other than the original content.
- `handlers.file_server.info` - fs.FileInfo structure with basename and stat data.

Per discussion in NixOS/nixpkgs#222354 (with thanks to @emilazy and @mholt)
Patch by Francis Lavoie, taken from github comment at caddyserver#5556 (comment); committed by Charles Duffy
@charles-dyfis-net charles-dyfis-net marked this pull request as ready for review May 24, 2023 15:18
@charles-dyfis-net
Copy link
Contributor Author

Hmm -- unless I'm missing something, there's a big-picture problem here.

If we're using request context or vars set by staticfiles, that means we have to run after staticfiles.

To set an ETag honored by the go standard-library ServeHTTP implementation, we have to run before staticfiles.

@charles-dyfis-net charles-dyfis-net marked this pull request as draft May 24, 2023 22:28
@emilazy
Copy link

emilazy commented May 24, 2023

(You're missing that you can adjust the response after calling next.ServeHTTP in a middleware handler; see the handler docs. But there are some other considerations we should think about too; currently in the process of writing a comment on the nixpkgs PR.)

Edit: Actually this might not be true because the headers will already be written out after that? But I think that's what the Caddy header deferral stuff that I don't fully understand is for.

@francislavoie
Copy link
Member

That's correct, you can adjust the response headers on the way out by doing it after calling next. That's how the header directive's defer mode works.

But for it to work correctly, you'll need to wrap the http.ResponseWriter with your own which will intercept writing the status code to apply your header changes just before that.

See these parts to get an idea how it works:

w = &responseWriterWrapper{
and
func (rww *responseWriterWrapper) WriteHeader(status int) {

@emilazy
Copy link

emilazy commented May 24, 2023

Ah, thank you; I missed the detail that you can pass down a different ResponseWriter to the next handler. Is there any meaningful performance impact to that?

@francislavoie
Copy link
Member

Barely. Nanoseconds at worst.

Copy link
Member

@mholt mholt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working with us to make something viable. We're making progress, I think we're almost there! :)

@@ -71,10 +71,15 @@ func init() {
// modified the last component of the path (the filename).
//
// This handler sets the Etag and Last-Modified headers for static files.
// Files with an mtime of 1 second past epoch do not get an Etag.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Files with an mtime of 1 second past epoch do not get an Etag.
// Etag is omitted for files with a Unix epoch mod time of 0 or 1.

Comment on lines +79 to +82
// Variables are set on request context to store the path of the file being
// served (if serving a precompressed file, its name will be here), and a
// fs.FileInfo struct with stat data.
//
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be useful to describe what the variables are actually named -- and since this is a terminal handler (it writes the response and does not call a next middleware), the variable can only be used in response wrappers (i.e. returning back "up" the middleware chain). Not sure how to word that exactly but I can offer a suggestion if you'd like :D


// Delete the Last-Modified header if it was changed and modtime=1 (i.e. Nix store)
if beforeLastModified != w.Header().Get("Last-Modified") && modTime.Unix() == 1 {
w.Header().Del("Last-Modified")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work? When http.ServeContent() returns, the response is already written.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, good point, I'm not sure. I didn't try it.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is something we could handle with the same ResponseWriterWrapper trick, right? So maybe the mtime=1 special casing doesn't need to be upstream at all.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you just move these lines up to before ServeContent is called?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No because ServeContent is what is setting Last-Modified. That's what we're trying to intercept in this case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right, duh.

Yeah, a ResponseWriter wrapper is probably best then.

@mholt
Copy link
Member

mholt commented Jun 15, 2023

We'll probably tag 2.7 beta 2 in a few moments -- if this isn't ready before the release candidate (or stable, whichever comes first), might just bump this back to after 2.7.

@mholt mholt modified the milestones: v2.7.0, v2.7.1 Jun 20, 2023
@mholt mholt modified the milestones: v2.7.1, v2.7.2 Aug 3, 2023
@francislavoie francislavoie modified the milestones: v2.7.2, v2.7.3 Aug 3, 2023
@mholt mholt modified the milestones: v2.7.3, v2.7.4 Aug 3, 2023
@mholt mholt added this to the 2.9.0 milestone Aug 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature ⚙️ New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants