Cache Busting #1979

Open
KerimG opened this issue Feb 9, 2020 · 34 comments · May be fixed by #3018

Comments

@KerimG

KerimG commented Feb 9, 2020

Is there any way to make mkdocs build the site with some hash string built into the names of the CSS and JS files to do some cache busting?

@waylan
Member

waylan commented Feb 9, 2020

No, not that I'm aware of. MkDocs only copies CSS and JS files without modification. Plugins don't even get access to those files.

@KerimG
Author

KerimG commented Feb 10, 2020

What's the current common way of making sure the website and script assets aren't loaded from the cache?
I guess some hacky headers could do the trick but many host their docs on managed webspaces.

@waylan
Member

waylan commented Feb 10, 2020

MkDocs is a static site generator. Static pages don't need to be concerned with such things. If you need this sort of thing, then perhaps MkDocs is not the right tool for you.

@KerimG
Author

KerimG commented Feb 10, 2020

That is a bit of a silly non-answer. If your STATIC* website uses CSS or JS, being able to bust the cache is a useful thing.

A custom theme might have a small error that needs fixing, or the JS might contain an API key (e.g. for analytics) that needs updating. Waiting for the cache time to run out might not be feasible, and neither is sending HTTP headers to direct the browser to not cache anything. Managed web spaces, which are a prime hosting solution for static websites, might not even allow modified headers.

@tomchristie
Member

tomchristie commented Feb 10, 2020

MkDocs doesn't support this, no.

Yes, in some cases you might be behind a cache, but they won't be setting far-future expiries.

GitHub Pages is a good example that's got a fairly heavy caching policy; even then, the worst case is that your page will be up to date in <60 seconds from deployment.

@oprypin
Contributor

oprypin commented Apr 24, 2021

Why was this closed? Clearly, many people think that this is a needed feature. And it is not solved. If the maintainers don't understand the use case, they should continue asking, not just outright tell people what they should or shouldn't be doing.


I have observed this myself many times: in typical configurations of webservers, the browser will end up requesting the HTML anew every time, but CSS linked from the page will be persisted.

And certainly, if I rework, say, the namings of all classes in both the CSS and HTML of my site, I need to ensure that the CSS will be reloaded, not just the HTML. Otherwise, with deep enough changes, my site will end up basically unstyled.
And adding some hash to the CSS's source location is the industry-standard way of achieving this. It makes no difference if the site is static or not, unless it's "static" in the sense of "will never change".

You can try going to https://oprypin.github.io/crsfml/ which is a MkDocs site hosted on GitHub Pages.
What you'll observe is that your Web browser will re-download the page itself every time, but it will persist the CSS linked from it. Even F5 won't reload it.

[Screenshots: cold load (requests filtered to just CSS) and after pressing F5 (all requests)]

In this example we have 5 CSS files.

  • Two of them with hashes, because mkdocs-material theme itself thoughtfully includes those.
    • So it is indeed fine for the browser to forever cache those files, as files with those particular names will never change.
  • The font presumably will never change, so that's fine.
  • And the latter two are normal CSS added through MkDocs, and they have no hashes.
    • If I change anything in those files, users of my site can open the home page, and even press Reload, and they will still be seeing the old CSS with new content. That is a problem.

And MkDocs is exactly the tool that should be helpful in this regard. Rather than asking creators to make sure to rename their file to "style2.css", "style3.css" and so on every time, it could very easily load the URL "style.css?d41d8cd9" instead of "style.css", and everything would just be better.
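
A minimal sketch of the "style.css?d41d8cd9" idea, assuming the suffix is a truncated hash of the file's content. The function names, the choice of MD5, and the 8-character length are illustrative assumptions, not anything MkDocs provides. (As it happens, `d41d8cd9` is the first eight hex digits of the MD5 of an empty file.)

```python
import hashlib
from pathlib import Path


def busted_url(url: str, content: bytes, length: int = 8) -> str:
    """Append a short content hash to `url` as a query string (hypothetical helper)."""
    digest = hashlib.md5(content).hexdigest()[:length]
    # Use "&" if the URL already carries a query string, otherwise start one.
    separator = "&" if "?" in url else "?"
    return f"{url}{separator}{digest}"


def busted_url_for_file(url: str, path: Path) -> str:
    """Same as above, but read the content from a file on disk."""
    return busted_url(url, path.read_bytes())


print(busted_url("style.css", b""))  # hash of an empty file
```

Because the suffix depends only on the content, the URL stays the same across builds until the file itself changes.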

@KerimG
Author

KerimG commented May 2, 2021

@oprypin thanks for the effort. I would like to implement this feature but I am not familiar enough with the codebase. We've been using mkdocs as a documentation site for our clients but are considering moving to Gatsby or Next.

@oprypin
Contributor

oprypin commented May 2, 2021

Oh, implementing it is not the problem. Here you go: https://github.com/oprypin/mkdocs/compare/bust
The problem is that this issue is denied.

@waylan
Member

waylan commented May 2, 2021

I went back and reread this. It is clear to me that initially this appeared to be related to server settings/configuration, which is not something that is within MkDocs' ability to address (and why the issue was closed). But now it seems that the discussion has turned to browser cache. Although, wouldn't HTTP headers be the way to address this, which again, is a server config thing and out-of-scope for MkDocs? So what do you suggest MkDocs does to address the issue?

@oprypin
Contributor

oprypin commented May 3, 2021

The original issue description suggests this action directly:

make mkdocs build the site with some hash string built into the names of the CSS and JS files

I had also expanded on that in my posts.


There is no way to fully control caching through HTTP headers. If a server sets a header saying "browser, please don't even bother checking this file within the next day", then there's no way to undo that by sending a different header next time, because there won't be a next time.

And, well, you can be sure that there are some servers that set that and don't let you change it. GitHub Pages in particular sets it to 10 minutes, but that's just a good bit of luck.

Then you can say "just configure your server to never cache". But that's not a solution, because... caching is nice.

The industry standard practice is to attach a hash to the requested filename, to ensure the browser has not encountered this name before and will request the file anew. At the same time, caching can be configured at "full strength".
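
The reasoning above can be sketched with a toy model of a browser cache. This is a deliberate simplification (real browsers also revalidate, evict, and so on), and the class and its method names are hypothetical. It only shows why a new header cannot undo an earlier `max-age`, while a new URL always triggers a request.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBrowserCache:
    """Simplified model: entries are keyed by URL and trusted until they expire."""

    entries: dict = field(default_factory=dict)  # url -> expiry time in seconds

    def needs_request(self, url: str, now: float) -> bool:
        expires_at = self.entries.get(url)
        return expires_at is None or now >= expires_at

    def store(self, url: str, now: float, max_age: float) -> None:
        self.entries[url] = now + max_age


DAY = 24 * 60 * 60
cache = ToyBrowserCache()
cache.store("style.css", now=0, max_age=DAY)

# One hour later the browser does not contact the server at all,
# so a changed header on the server never reaches it.
same_name = cache.needs_request("style.css", now=3600)

# A URL the browser has never seen is always requested.
new_name = cache.needs_request("style.css?d41d8cd9", now=3600)

print(same_name, new_name)
```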

You can even look at the <script> tags of this page's HTML (GitHub).

@KerimG
Author

KerimG commented May 3, 2021

My initial post didn't have anything to do with server settings. One can mitigate the issue somewhat with HTTP headers but, as @oprypin pointed out correctly, that's not granular. And that's on the assumption that you can modify the HTTP headers.

I didn't think cache busting is such a controversial issue lol. It's a common thing with virtually every other framework.

@waylan
Member

waylan commented May 3, 2021

Wait, so you mean we need to have a rolling filename which changes with each build. That seems ridiculous. Do other static site generators actually do this?

@KerimG
Author

KerimG commented May 3, 2021

Yes. (see second strategy, which is the one I see most often). Files are cached by name, so that's what most frameworks work with. Not sure how this is "ridiculous". I've seen more ridiculous things. Like maintainers of popular static site generators being unaware of that concept and prematurely closing issues related to it 😂

Flask Cache-Busting
Webpack Cache-Busting
Laravel Cache-Busting

Jokes aside, I do think it would be a really cool feature to add.

@waylan
Member

waylan commented May 3, 2021

Those are web frameworks for dynamic content, not static site generators, and therefore didn't answer the question I asked. What does Sphinx or Jekyll do (to name a few)? That said, I couldn't help but notice that for Flask, you linked to an extension. Any reason this couldn't be implemented as a MkDocs plugin?

@KerimG
Author

KerimG commented May 3, 2021

Yikes. A site being static or not has little to do with its caching strategy. Static doesn't mean that you write the CSS/JS once and then never ever touch it again.

Jekyll: https://ultimatecourses.com/blog/cache-busting-jekyll-github-pages , https://ethanmarcotte.com/wrote/stupid-jekyll-tricks/

GatsbyJS: https://www.gatsbyjs.com/docs/caching/

@waylan
Member

waylan commented May 3, 2021

So GatsbyJS clearly includes support out-of-the-box. However, it appears that Jekyll does not. Those two links are to hacks by users to make it work for their specific needs. The second describes a solution which requires a custom non-standard server config, so that is a non-solution for most users. However, the first could easily be implemented with a few tweaks to the theme templates using the build_date_utc context variable (perhaps using build_date_utc.timestamp() or a call to build_date_utc.strftime with your format of choice). For that matter, a third-party theme could implement that with no changes to MkDocs at all.
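
As a rough illustration of what the `build_date_utc.timestamp()` suggestion would produce, here is a sketch in plain Python rather than in a theme template. The helper name and the `v` parameter name are assumptions made for the example. Note that this value changes on every build, even when the file's content is unchanged.

```python
from datetime import datetime, timezone


def per_build_url(url: str, build_date_utc: datetime) -> str:
    """Append the build time (whole seconds since the epoch) as a version parameter."""
    return f"{url}?v={int(build_date_utc.timestamp())}"


# Example build time; in a template this would come from the build context.
example_build = datetime(2021, 5, 3, 12, 0, 0, tzinfo=timezone.utc)
print(per_build_url("css/extra.css", example_build))
```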

@oprypin
Contributor

oprypin commented May 3, 2021

Wait, so you mean we need to have a rolling filename which changes with each build. That seems ridiculous

You made this assumption and stuck with it, but that's not what was suggested. Of course changing the file name every time is inferior. We have been talking about adding a hash of the file's content (sorry, the part about content was not explicitly written, though it should be clear anyway), so it would not change often.

So GatsbyJS clearly includes support out-of-the-box. However, it appears that Jekyll does not. Those two links are to hacks by users

That is true, but I don't think it is useful to say it. Is there any implied argument based on this? The only one I could guess would be (what I would definitely disagree with) "because there is one static framework not providing this solution (even though people need it), that means any static framework can disregard this".

For that matter, a third-party theme could implement that with no changes to MkDocs at all.

Why should each theme need to implement this, in an inferior way, if MkDocs could do it once? I just don't get it, what is the point of this comment?

I get that we are already in a state where it's almost impossible to get any improvement to MkDocs through your filter. And that mkdocs-material is taking over development of all features that MkDocs is missing, making it the only viable theme. But why should that be encouraged even more?

@waylan
Member

waylan commented May 5, 2021

First of all, it is clear that this request was initially misunderstood. I now understand that this is requesting, at a minimum, that URLs to (at least some) media files have a unique (per build) hash applied to them as a URL parameter (?version=hash). While this is a reasonable request and perhaps should not have been closed so hastily, I am not convinced we need to add a feature for the reason outlined at the end of this comment.

However, before we get to that, there seems to be a lot of frustration expressed in the comments above. I have attempted to address why I responded the way I did so that we can avoid these types of issues in the future.

Wait, so you mean we need to have a rolling filename which changes with each build. That seems ridiculous

You made this assumption and stuck with it, but that's not what was suggested. Of course changing the file name every time is inferior. We have been talking about adding a hash of the file's content (sorry, the part about content was not explicitly written, though it should be clear anyway), so it would not change often.

It was clear early on that the feature request was misunderstood. And yet no clear explanation was given. So of course, my assumption was not readjusted. @oprypin most of my disagreements with you are, in my opinion, based on your assertions with nothing to back them up except "of course everyone already knows this." Perhaps some helpful links and/or brief explanations would help alleviate that in the future.

So GatsbyJS clearly includes support out-of-the-box. However, it appears that Jekyll does not. Those two links are to hacks by users

That is true, but I don't think it is useful to say it. Is there any implied argument based on this? The only one I could guess would be (what I would definitely disagree with) "because there is one static framework not providing this solution (even though people need it), that means any static framework can disregard this".

It demonstrates that others do not see a need to include support despite your assertions to the contrary. True, it doesn't prove that no one needs this. But it does suggest that perhaps not everyone needs/cares about this. Assertions to the contrary do not persuade me, but tend to only make me dig in and push back harder.

For that matter, a third-party theme could implement that with no changes to MkDocs at all.

Why should each theme need to implement this, in an inferior way, if MkDocs could do it once? I just don't get it, what is the point of this comment?

This comment serves three purposes:

  1. If users really want/need this, nothing prevents them from getting this now.
  2. If third-party themes already are including hashes in their URLs and MkDocs suddenly adds them, that could create a problem. Therefore, adding support is more complicated than it might otherwise be. How do we provide a graceful transition for third-party themes which already add hashes to URLs?
  3. Given my view of the function MkDocs serves (see below), this fact demonstrates that we don't necessarily need to make any changes to MkDocs.

I get that we are already in a state where it's almost impossible to get any improvement to MkDocs through your filter. And that mkdocs-material is taking over development of all features that MkDocs is missing,

I think that it is great that a third-party theme is adding features. I see MkDocs as a basic framework upon which users can add their own themes and/or plugins to create a static site generator that meets their needs. In my opinion, the framework already exists so MkDocs is (nearly) feature complete. Most additional features users might want to add should be implemented via themes and/or plugins. Occasionally, we may need to make a minor adjustment to an API so that a new feature can be added via a plugin/theme, but unfortunately, most feature requests are not framed this way.

Yes, it is disappointing that only one theme is implementing various features. However, I see that as a failure of the other themes rather than a failure of MkDocs.

Given the above, my "filter" is to ask if a requested feature is absolutely necessary for every user in every situation. If it is, we provide the minimum necessary to give the users that feature. If it is not, then we simply provide a mechanism for users to get that feature through third-party themes and/or plugins. True, there are various existing features which do not meet that criterion. However, those features were added before we provided support for third-party themes or plugins. The goal moving forward is to move those features out of MkDocs and into themes and/or plugins. The only reason that hasn't happened yet is a lack of time on my part and a seeming lack of interest from others to help (people seem to be happy to submit PRs adding stuff, but never removing things).

So, coming back to the immediate issue, I'm not convinced that everyone needs this. Therefore, my inclination is to leave this as something which can be implemented by a third-party theme and/or plugin.

@waylan
Member

waylan commented May 5, 2021

Moving on, let's consider possible solutions:

  1. We leave this for themes to address. An argument could be made that we should update the built-in themes to include those hashes where appropriate. This would provide no clashes with any existing themes which already do this. However, unless we added a theme specific setting, this would lock in the built-in themes to this one method only. And it would not help any existing third-party themes which are not already providing a solution.
  2. We leave this for third-party plugins to address. Such a plugin might use the on_files event and modify File.url for specific types of files. This would provide a clean method for all themes. If a theme already provides hashes, that theme can document that their users do not need to enable a plugin to get the desired behavior. And for those who are not using such a theme, they can still get the behavior if they desire. Also, there is no lock-in. For example, a plugin could modify File.url, File.dest_path, and File.abs_dest_path to include the hash right in the filename rather than as a URL parameter.
  3. We add hash generation directly to MkDocs (presumably when we initially assign a value to File.url). This will create conflict for any existing themes which are manually adding a hash in the template (there would now be two hashes separated by an extra ?). It will also lock in one very specific method of addressing the issue. Some users may want to not have hashes at all or they may want to use a different method to invalidate browser cache and now that option is not so easily available to them.

It seems clear to me that option 2 is the most flexible and can meet the needs of the most users. Multiple different plugins could exist to serve different needs, or perhaps one highly configurable plugin could exist which includes various options. The beauty is that that is entirely up to the creators of the plugin. And as the plugin is generally being used by people who have a use for it, it is more likely to be maintained without adding any burden to the maintenance of MkDocs itself.
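
To make option 2 a little more concrete, here is a hedged sketch of the renaming step such a plugin might perform. It deliberately does not import MkDocs: `StandInFile` is a hypothetical stand-in that only mirrors the three attribute names mentioned above (`url`, `dest_path`, `abs_dest_path`), and the helper names, SHA-256, and the 8-character length are assumptions. It also assumes POSIX-style paths. Whether links emitted by a given theme would pick up the changed `url` is not something this sketch verifies.

```python
import hashlib
import posixpath
from dataclasses import dataclass


@dataclass
class StandInFile:
    """Hypothetical stand-in for a file object; not the real MkDocs class."""

    url: str
    dest_path: str
    abs_dest_path: str


def hashed_name(path: str, digest: str) -> str:
    """Turn 'css/extra.css' into 'css/extra.<digest>.css'."""
    root, ext = posixpath.splitext(path)
    return f"{root}.{digest}{ext}"


def add_hash_to_filename(file: StandInFile, content: bytes, length: int = 8) -> None:
    """Rewrite all three path attributes so the hash becomes part of the file name."""
    digest = hashlib.sha256(content).hexdigest()[:length]
    file.url = hashed_name(file.url, digest)
    file.dest_path = hashed_name(file.dest_path, digest)
    file.abs_dest_path = hashed_name(file.abs_dest_path, digest)


css = StandInFile("css/extra.css", "css/extra.css", "/site/css/extra.css")
add_hash_to_filename(css, b"body { color: black; }")
print(css.url)
```

In a real plugin, a function like this would presumably be called for each CSS or JS file from the on_files event, with the content read from the file's source path.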

Given the above, I am going to leave this issue closed.

@oprypin
Contributor

oprypin commented Jul 8, 2021

I consider this an issue affecting most users and I think MkDocs is the best place to solve it. So, reopening.
I don't have any immediate plan, though.

@oprypin oprypin reopened this Jul 8, 2021
@develop-Greenant

Agreed that this would be good to solve with a relatively simple mechanism if possible.

For those with the same issue (users don't see the latest content due to the local browser cache), there is a simple workaround using meta tags.

Note that this is not a solution to the more general issue of cache busting, but it can help in a pinch!

Add to <theme>/overrides/main.html:

  <!-- prevent caching -->
  <meta http-equiv="cache-control" content="no-cache, must-revalidate, post-check=0, pre-check=0" />
  <meta http-equiv="cache-control" content="max-age=0" />
  <meta http-equiv="expires" content="0" />
  <meta http-equiv="expires" content="Tue, 01 Jan 1980 1:00:00 GMT" />
  <meta http-equiv="pragma" content="no-cache" />
  <!-- aggressively cause a reload every X seconds -->
  <meta http-equiv="refresh" content="300">

Not all of these tags are strictly necessary, but together they aim to ensure:

  • compatibility with web servers (even those not fully compliant)
  • that content is not cached by the browser
  • that the page refreshes every 5 minutes, so that a page change shows up even without a user refresh.

Note:

  • these changes may negatively impact your score on search engines (as refresh can be abused)
  • the refresh is visible but not too obtrusive
  • load on web server will increase significantly

ToDo:

  • consider proper solutions as proposed above
  • append details of how reverse-proxy can mitigate some of the undesirable side-effects.

Comment:
Ultimately, I agree that solution number 2 above will be a good approach. I also think that some general advice about web server config and use of meta tags will help solve the issues for a lot of users.

In summary, this tip is definitely not a solution, just a partial (and practical) workaround!

@pideu-mh

pideu-mh commented Apr 7, 2022

I just want to let the participants of this discussion know that this is still a real-world problem.

I uploaded a new version of a documentation to a public website and noticed that the content was up-to-date, but the styling was off.

Long story short: after a lot of head scratching I noticed by inspecting the browser cache that the expiry time for the CSS files is 1 month(!) and that the browser was simply using the old, cached version together with the new content. By clearing the browser cache the problem was solved, however, I did not have to do that for - I don't know - 10+ years.

Thinking about it, this can really become a problem. If customers visit the page once in a while and I update the website in the meantime, it can happen that the customers have the same experience as me: content is up-to-date, but the styling is off.

If I am lucky, the customer complains and I can tell him to clear his browser cache, but this is not a good solution at all. I'd really like to avoid having this problem in the first place by fixing it in the web server or in the content that is served.

@meenzen

meenzen commented May 6, 2022

I've just encountered this problem in a way that hasn't been mentioned yet.

After deploying a new version of my docs, the newly added pages were showing as expected. However none of the new content could be found using the search. It turns out the browser was still using the now outdated search_index.json.

Because this file can be quite big, disabling caching is not an option. Instead you'll want to aggressively cache this file. Appending hashes would ensure that the browser uses the correct file in any caching setup.

@orientalpers

I want this feature too! Appending hashes to extra CSS/JS files is a great solution.

@wilhelmer
Contributor

Thanks to @kamilkrzyskow, the mkdocs-minify-plugin now supports cache busting via the cache_safe option.

You can even use this option without minifying any files.

@oprypin
Contributor

oprypin commented Oct 21, 2022

I would recommend to instead wait for MkDocs 1.5.0 which will have this feature.

bust

@wilhelmer
Contributor

I know that it's planned for 1.5.0, it's for those who can't wait 😉

We will deprecate the option as soon as 1.5.0 is released.

@andreportela

I am very interested in this feature.

Appending hashes to the end of cacheable resources when building a static site allows one to use an aggressive caching mechanism. That is helpful for someone like me who wants to build a static site, because I can put a CDN in front of my site, dial caching to the maximum, and not worry about cache invalidation.

This keeps the site blazing fast, keeps a minimum load on my server, and ensures I don't worry about users accessing stale content. Not having a way to aggressively cache search_index.json might be a deal breaker, depending on the context. For example, my static website will target users with a slow connection.
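
A sketch of the caching policy this describes, assuming hashed assets follow a `name.<hex>.ext` naming convention (that convention, the regular expression, and the function name are assumptions for illustration). Hashed files get a one-year, immutable `Cache-Control` value, while everything else is revalidated on each use.

```python
import re

# Assumed naming convention: name.<8 or more hex chars>.ext, e.g. "main.1a2b3c4d.css".
HASHED_NAME = re.compile(r"\.[0-9a-f]{8,}\.[A-Za-z0-9]+$")


def cache_control_for(path: str) -> str:
    """Pick a Cache-Control value: cache hashed assets for a year, revalidate the rest."""
    if HASHED_NAME.search(path):
        # 31536000 seconds = 365 days; safe because a changed file gets a new name.
        return "public, max-age=31536000, immutable"
    # "no-cache" still allows storing, but forces revalidation before reuse.
    return "no-cache"


for p in ("assets/main.1a2b3c4d.css", "index.html", "search/search_index.json"):
    print(p, "->", cache_control_for(p))
```

Under this policy an unhashed `search_index.json` would have to be revalidated on each use, which is why a hashed name for it matters for the use case above.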

Is #3018 still a thing? I also checked mkdocs-minify-plugin, but it seems stalled.

@wilhelmer
Contributor

mkdocs-minify-plugin isn't stalled – you should be able to use it with the cache_safe option set to true to append hashes to your JS and CSS files. JSON files aren't supported yet.

@oprypin
Contributor

oprypin commented Jun 11, 2023

For the file search_index.json in particular, the solution for it will need to be totally separate - that is, it cannot benefit from a common implementation, but that is also a positive because it can be solved without needing to solve the general problem.

One just needs to edit the implementation of the search plugin to make it possible to customize the file path for search_index.json.

index_path = 'search_index.json';

And accordingly also this will need to be solved in the material/search plugin.
https://github.com/squidfunk/mkdocs-material/blob/1c22ca42f25544f45e9dddac09266de0ee87f0a8/src/assets/javascripts/bundle.ts#L114

@oprypin
Contributor

oprypin commented Jun 18, 2023

@andrewschott

@develop-Greenant ingeniously wrote:

  <!-- prevent caching -->
  <meta http-equiv="cache-control" content="no-cache, must-revalidate, post-check=0, pre-check=0" />
  <meta http-equiv="cache-control" content="max-age=0" />
  <meta http-equiv="expires" content="0" />
  <meta http-equiv="expires" content="Tue, 01 Jan 1980 1:00:00 GMT" />
  <meta http-equiv="pragma" content="no-cache" />

Ty very much sir! This solved my issue with page updates not showing up until a browser refresh.

@oprypin oprypin modified the milestones: 1.5.0, 1.6.0 Jul 18, 2023
@oprypin
Contributor

oprypin commented Jul 18, 2023

Apologies, I didn't manage to give my pull request the due attention and live testing that it requires, and I'll have to just defer it until the release that is after the upcoming release. I hope to make the next-next release soon, though (~month)

@pawamoy
Sponsor Contributor

pawamoy commented Apr 17, 2024

This might already be supported by the minify plugin and its cache_safe feature, see https://github.com/byrnereese/mkdocs-minify-plugin.
