Determining how content on your site should be cached isn't a simple topic. Last time, I covered cache contexts and tags. Today, I'd like to get into a couple more advanced topics: The use of custom cache tags and of max-age.
Custom Cache Tags
Drupal's built-in selection of cache tags is large, and some contributed modules add additional tags appropriate to what they do, but for really refined control you might want to create your own custom cache tags. This is, surprisingly, quite easy. In fact, you don't have to do anything in particular to create the tag – you just have to start using it and, somewhere in your code, invalidate it:
As an example, to continue the scenario from part one about a page showing recent articles, there is one thing about this page that the tags we've already looked at don't quite cover. What if a new article gets created with a publish date that should be shown on your page? Or, maybe an article which isn't currently displayed has its publish date updated, and now it should start showing up? It's impractical to include a node-specific tag for every article that might possibly have to show up on your page, especially since those articles might not exist yet. But we do want the page to update to show new articles when appropriate.
The solution? A custom cache tag. The name of the tag doesn't matter much, but might be something such as my_module:article_date_published. That tag could be added on the page, and it could be invalidated (using the function above) in a node_insert hook for articles and in a node_update anytime that the Date Published field on an article gets changed. This might invalidate the cached version of your page a little more frequently than is strictly necessary (such as when an article's publish date gets changed to something that still isn't recent enough to have it show up on your custom page), but it certainly shouldn't miss any such updates.
This is a simple example of a custom cache tag, but they can be used for many other situations as well. The key is to figure out what conditions your content needs to be invalidated in and then start invalidating an appropriate custom tag when those conditions are met.
Rules for Cache Max-Age
Setting a maximum time for something to be cached is sort of a fallback solution – it's useful in situations where contexts and tags just can't quite accomplish what you need, but should generally be avoided if it can be. As I mentioned in a previous article, a great example of this is content which is shown on your site but which gets retrieved from a remote web service. Your site won't automatically know when the content on the remote site gets updated, but by setting a max-age of 1 hour on your caching of that content, you can be sure your site is never more than an hour out of date. This isn't ideal in cases where you need up-to-the-minute accuracy to the data from the web service, but in most scenarios some amount of potential "delay" in your site updating is perfectly acceptable, and whether that delay can be a full day or as short as a few minutes, caching for that time is better than not caching at all.
However, there is one big caveat to using max-age: It isn't directly compatible with the Internal Page Cache module that caches entire pages for anonymous users. Cache contexts and tags "bubble up" to the page cache, but max-age doesn't. The Internal Page Cache module just completely ignores the max-age set on any parts of the page. There is an existing issue on Drupal.org about potentially changing this, but until that happens, it's something that you'll want to account for in your cache handling.
For instance, maybe you have a block that you want to have cached for 15 minutes. Setting a max-age on that block will work fine for authenticated users, but the Internal Page Cache will ignore this setting and, essentially, cause the block to be cached permanently on any page it gets shown on to an anonymous user. That probably isn't what you actually want it to do.
You have a few options in this case.
First, you could choose to not cache the pages containing that block at all (using the "kill switch" noted in part one). This means you wouldn't get any benefit from using max-age, and would in fact negate all caching on that page, but it would guarantee that your content wouldn't get out of date. As with any use of the "kill switch," however, this should be a last resort.
Second, you could turn off the Internal Page Cache module. Unfortunately, it doesn't seem to be possible to disable it on a page-by-page basis (if you know a way, please drop us a line and we'll update this post), but if most of your pages need to use a max-age, this may be a decent option. Even with this module disabled, the Internal Dynamic Page Cache will cache the individual pieces of your page and give you some caching benefits for anonymous users, even if it can't do as much as both modules together.
My preferred option for this is actually to not use a max-age at all and to instead create a custom, time-based cache tag. For instance, instead of setting a max-age of 1 hour, you might create a custom cache tag of "time:hourly", and then set up a cron task to invalidate that tag every hour. This isn't quite the same as a max-age (a max-age would expire 1 hour after the content gets cached, while this tag would be invalidated every hour on the hour) but the caching benefits end up being similar, and it works for anonymous users.
Now that we've gotten an overview of how to determine what rules you should use to cache content on your site, it's time to get a little bit more technical. Next time, I'll be taking a look at how Drupal stores and retrieves cached data, which can be immensely useful to understanding why the cache works the way it does, and it's also quite helpful to know when fixing any caching-related bugs you might encounter. Watch this blog for all the details!