Cache busting
Caching CSS, JS, images and other website resources on the client is standard practice for quality websites. If you want your site to be fast, you should be doing browser caching using ETag and Cache-Control or Expires headers. If you are not yet caching your website’s assets client-side, see Google’s performance guide for in-depth info.
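As a rough illustration of what those headers look like, here is a minimal PHP sketch that builds far-future Cache-Control and Expires values. The helper name far_future_headers() and its parameters are hypothetical, made up for this example. Static files are usually served by the web server directly, so in practice you would likely configure this there; the PHP only shows the header values.

```php
<?php
// Hypothetical helper (not part of WordPress): builds far-future caching
// headers for a static asset. 31536000 seconds = 365 days.
function far_future_headers($seconds = 31536000, $now = null)
{
    $now = ($now === null) ? time() : $now;

    return array(
        'Cache-Control: public, max-age=' . $seconds,
        // Expires is the older equivalent; the date must be given in GMT.
        'Expires: ' . gmdate('D, d M Y H:i:s', $now + $seconds) . ' GMT',
    );
}

// Usage: send the headers before any output.
// foreach (far_future_headers() as $line) { header($line); }
```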
Unfortunately, caching creates a problem. When you update the CSS, JS, or other resources, the user’s browser won’t download the updated files until the specified caching period is over. For CSS and JS, I usually set the caching period to 1 year. Without cache busting, the user would have to wait 1 year or hit refresh (a.k.a. F5) to see the changes. And users won’t go around pushing F5 on every website they visit. That’s just silly :)
Luckily, invalidating (or busting) cached website resources, such as CSS & JS, is not hard: serve the asset under a new, unique name.
For example, if your CSS file was previously called style.css, you can rename the new version to style-v2.css and manually change the href of the <link> to style-v2.css in your HTML.
Automatic cache busting
Renaming the assets manually every time (as mentioned above) is doable if the project is small and you have the nerves. But it’s actually simple to automate.
Many web frameworks, such as Ruby on Rails, already provide automatic cache busting. If you’re using such a framework, you only have to configure it. There are also libraries and tools made for busting the cache. mod_pagespeed is one, and I’ve used it successfully in the past for exactly this purpose.
If you’d rather build your own simple automatic cache busting, I’ll explain my technique for doing it with WordPress and PHP.
How to: Automatic cache busting in WordPress
First of all, I have my asset (a CSS file) located at /wp-content/themes/ajk/css/main.css.
Add this auto_version() function into the functions.php of your theme:
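A minimal sketch of such a function, not necessarily identical to the one I use. It assumes the version number comes from filemtime() and that the theme lives under the web server's document root (so no subdirectory installs). The optional $doc_root parameter is a hypothetical addition that makes the function easy to try outside WordPress.

```php
<?php
/**
 * Sketch: put the file's last-modified timestamp before its extension,
 * e.g. /css/main.css -> /css/main.1361054840.css
 *
 * $doc_root is a hypothetical parameter for illustration; by default the
 * web server's document root is used.
 */
function auto_version($url, $doc_root = null)
{
    if ($doc_root === null) {
        $doc_root = isset($_SERVER['DOCUMENT_ROOT']) ? $_SERVER['DOCUMENT_ROOT'] : '';
    }

    // Keep only the path, so full URLs from get_template_directory_uri() work too.
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path) || strpos($path, '/') !== 0) {
        return $url;
    }

    $file = rtrim($doc_root, '/') . $path;
    if (!is_file($file)) {
        return $url; // unknown file: leave the URL untouched
    }

    // Insert the timestamp between the file name and the extension.
    return preg_replace('/\.([^.\/]+)$/', '.' . filemtime($file) . '.$1', $path);
}
```

If the file cannot be found on disk, the URL is returned unchanged, so a wrong path degrades to a normal, unversioned link instead of a broken one.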
Then add a <link> to the CSS file in the WordPress theme’s header.php, wrapping the URL in auto_version() like this:
<link rel="stylesheet" href="<?php echo auto_version(get_template_directory_uri() . '/css/main.css'); ?>">
And you should see this in your HTML source:
<link rel="stylesheet" href="/wp-content/themes/ajk/css/main.1361054840.css">
See that 1361054840? That’s a unique identifier (the file’s last-modified timestamp). It will change every time the file changes.
But this alone isn’t enough. The browser won’t load our CSS at all, because a CSS file with numbers like that in its name doesn’t exist on the server.
Add a rewrite rule to a .htaccess file (Apache only):
# CSS/JS auto-versioning
RewriteEngine On
RewriteRule ^(.*)\.[\d]{10}\.(css|js)$ $1.$2 [L]
(Note: I only have (css|js) written on the RewriteRule line there, but you should also cache and cache bust images, webfonts and so forth.)
Now when the browser asks for main.1361054840.css, Apache will serve the main.css file!
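To see what the rewrite rule does to a request path, the same pattern can be applied in PHP. This is only an illustration of the mapping; Apache does the real work, and strip_version() is a hypothetical name made up for this example.

```php
<?php
// Hypothetical helper mirroring the RewriteRule: drop the 10-digit timestamp.
function strip_version($path)
{
    // (.*)     = everything before the timestamp
    // \d{10}   = the timestamp itself
    // (css|js) = the file extension
    return preg_replace('/^(.*)\.\d{10}\.(css|js)$/', '$1.$2', $path);
}

echo strip_version('css/main.1361054840.css'), "\n"; // css/main.css
echo strip_version('css/main.css'), "\n";            // css/main.css (no timestamp, unchanged)
```

The {10} is safe for a long while: Unix timestamps have had ten digits since September 2001 and will keep them until the year 2286.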
How to: Automatic cache busting in other environments
If you understand PHP and Apache rewrite rule syntax, you can easily port this technique to other platforms. A co-worker recently did this for a .NET project based on the description above:
- He created a C# version of the auto_version() shown above
- He then created an HttpHandler in C#/.NET for recognizing the unique identifier and serving the actual resource file
Some notes
Build process busting
If you happen to have a build process for your project, changing the asset’s name and updating the URLs could be done in the build phase. No need to waste CPU cycles on-demand, like with the .htaccess method or an HttpHandler. For example, Ruby on Rails has a task for precompiling the assets, which will create “copies” of the asset under a new, unique name.
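A build step along those lines could be sketched in PHP as follows. This is an assumption-laden example, not how Rails does it: build_asset_manifest() is a hypothetical name, the "name-hash.ext" scheme is my own choice, and every asset is assumed to have a file extension.

```php
<?php
// Hypothetical build step: copy each asset to "<name>-<hash>.<ext>" and
// return a map from the original path to the fingerprinted copy.
function build_asset_manifest(array $assets)
{
    $manifest = array();

    foreach ($assets as $source) {
        $info = pathinfo($source);
        // First 8 hex characters of the MD5 of the file's contents.
        $hash = substr(md5_file($source), 0, 8);
        $target = $info['dirname'] . '/' . $info['filename'] . '-' . $hash . '.' . $info['extension'];

        if (!copy($source, $target)) {
            throw new RuntimeException("Could not copy $source to $target");
        }
        $manifest[$source] = $target;
    }

    return $manifest;
}
```

The templates would then look up the fingerprinted name in the returned map, so no rewrite rule is needed at request time.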
Hash instead of timestamp
Google’s guide for optimizing the cache tells you that a hash is more reliable than a last-modified timestamp. The timestamp method is easier to implement, but timestamps can differ between systems, or may not even get updated when you deploy or update. So if you want to go pro, you should definitely use a hash instead of the last-modified timestamp. My auto_version() is just a simple example and it gets the job done for me.
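For comparison, a hash-based variant might look like the sketch below. auto_version_hash() is a hypothetical name, and using the first 10 hex characters of an MD5 content hash is just one possible choice.

```php
<?php
// Sketch: version the URL by file contents instead of modification time.
// $path is a root-relative URL path, $doc_root the directory it maps to.
function auto_version_hash($path, $doc_root)
{
    $file = rtrim($doc_root, '/') . $path;
    if (!is_file($file)) {
        return $path; // unknown file: leave the path untouched
    }

    // The hash changes only when the file's contents change.
    $hash = substr(md5_file($file), 0, 10);

    // main.css -> main.<hash>.css
    return preg_replace('/\.([^.\/]+)$/', '.' . $hash . '.$1', $path);
}
```

Because the identifier is now hexadecimal rather than all digits, the rewrite pattern would have to match something like [0-9a-f]{10} instead of [\d]{10}.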
Query strings
Some people suggest adding a query string to the resource URL. Like here:
<link rel="stylesheet" href="/css/styles.css?v2">
Or:
<link rel="stylesheet" href="/css/styles.css?1234567890">
This will definitely bust the cache, but it might have some unwanted consequences: at least some proxies will not cache URLs that have query strings in them.
Summary
Pretty much every website should be optimized for speed, especially considering the rising use of mobile devices these days. Browser caching is one of the ingredients for speed, but it creates the problem of how to invalidate the cache. And this is easy: rename the cached resources & optionally automate this renaming.
I’m happy to take on any critique, comments, suggestions and the sort! Send me a tweet or comment below.