
Thoughts on an invalidation protocol

One of the requirements I'm coming up against a lot, especially in (but not limited to) accelerator deployments, is the need for greater control over the cache; Web developers want the full benefits of caching, but they can't predict when a response will become stale.

There are already a few ways to do this now: ESI <http://www.w3.org/TR/esi-invp> defines an XML format for invalidating the cache (AFAIK only Oracle implements it); many caches support a PURGE (or similar) method; the MediaWiki folks are using multicast HTCP <https://wikitech.leuksman.com/view/Multicast_HTCP_purging>; and I seem to recall a previous round of attempted standardisation that ended with a whimper <http://www.ietf.org/html.charters/OLD/webi-charter.html>.

There are at least three problems with these approaches:

1) They require the server to be aware of and push invalidations to the caches. While that's fine for a small deployment in the same place and administered by the same party, it becomes unworkable if there are a large number of caches, they're geographically disparate, and/or the origin servers are owned by different people than the caches.

2) They aren't compatible with the HTTP freshness model. To be able to invalidate something implies that it can be cached, and that means giving it some sort of freshness lifetime. However, the freshness lifetime you give something you know you can invalidate is probably a lot longer than the one you'd give something that you couldn't, so in practice people will give overly optimistic freshness lifetimes, which may cause problems with downstream caches.

3) They aren't fault-tolerant. If there's a communication failure, the cache holds a response it still considers fresh, and it will keep serving it despite the server's wish to invalidate it. While sometimes that's OK, in other situations it can be a problem.

I've been thinking about an invalidation protocol for Web caches that addresses these problems.

It would work by associating responses with an "event channel" which carries invalidations (to start with; other things like content updates might also be interesting).

While such a response is fresh, it would be served as normal; however, once it becomes stale, its freshness would be extended, provided that:

1) The cache is "in contact" with the channel (the meaning of which depends on the channel), and
2) The channel does not contain an invalidation event that applies to that response.

E.g., a response might have:
Cache-Control: max-age=30, event-channel="http://www.example.com/events/channel1"

This indicates that all caches -- whether or not they understand this protocol -- can consider the response fresh for 30 seconds; however, those that do understand it can extend its freshness for as long as they're in contact with the channel identified by "http://www.example.com/events/channel1" and that channel hasn't invalidated the response.
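To make the decision rule concrete, here's a small Python sketch of how a cache might decide whether a stored response can still be served. The names (Channel, may_serve) and the bookkeeping are purely illustrative assumptions, not part of any real cache's API:

```python
import time

class Channel:
    """An event channel the cache subscribes to (hypothetical sketch)."""
    def __init__(self):
        self.in_contact = True   # are we currently receiving events?
        self.invalidated = {}    # URI -> time the invalidation was seen

    def invalidate(self, uri, when=None):
        self.invalidated[uri] = time.time() if when is None else when

def may_serve(uri, stored_at, max_age, channel, now=None):
    """Return True if the cached response for `uri` may still be served."""
    now = time.time() if now is None else now
    if now - stored_at < max_age:
        return True   # fresh by ordinary HTTP rules; no channel needed
    # Stale: freshness may be extended only while in contact with the
    # channel, and only if no invalidation applies to this response.
    if channel is None or not channel.in_contact:
        return False  # fall back to default cacheability
    inv = channel.invalidated.get(uri)
    return inv is None or inv < stored_at
```

Note that losing contact with the channel simply stops the extension; the response then behaves as an ordinary stale response under HTTP's usual rules.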

This approach has a few interesting properties.

1) New caches can support invalidations without explicit coordination with the server; they only need to be able to subscribe to the channel and receive updates from it. Unless the channel mechanism requires it, the server doesn't need to keep any state about the client (see below).

2) Caches that don't understand this Cache-Control extension will still behave correctly.

3) If a cache loses touch with the channel, it can still fall back to "default" cacheability (30 seconds in this case, but it can be anything from 0 to a year, as per HTTP).

Effectively, the max-age reflects the maximum amount of time the server is comfortable serving that response; invalidations received during that time may not take effect (because of a downstream, invalidation-ignorant cache, or because of optimisations in that cache).

The other interesting thing is that since a channel is identified by a URI, any number of protocols can be used to get the events to the cache. If you want to use multicast or a message bus to transport them, you'll have to figure out a URI scheme for that; but, more interestingly, you can use an Atom <http://www.ietf.org/rfc/rfc4287> feed to do it over plain HTTP. E.g.,

   <?xml version="1.0" encoding="utf-8"?>
   <feed xmlns="http://www.w3.org/2005/Atom"
         xmlns:inv="TBD">
     <title>Example Invalidation Feed</title>
     <link rel="self" href="http://example.org/inv.atom"/>
     <updated>2003-12-13T18:30:02Z</updated>
     <author>
       <name>John Doe</name>
       <email>jdoe@xxxxxxxxxxx</email>
     </author>
     <id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>
     <inv:lifetime>2678400</inv:lifetime> <!-- 31 days -->
     <inv:precision>30</inv:precision> <!-- 30 seconds -->

     <entry>
       <title/>
       <link href="http://example.org/image.gif"/>
       <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
       <updated>2003-12-13T18:42:02Z</updated>
     </entry>

     <entry>
       <title/>
       <link href="http://example.org/other/thing.html"/>
       <id>urn:uuid:0777D8BE-5229-44C7-A766-1A8A50007190</id>
       <updated>2003-12-13T18:30:15Z</updated>
     </entry>
   </feed>

Again, there are several interesting properties here.

a) Since it's just HTTP, the Atom feed documents that represent the channel can themselves be cached, so that they can be shared efficiently (and a bit more reliably) between a cluster or hierarchy of caches.

b) Yes, the cache(s) will need to poll the Atom feed, but the poll frequency can be attenuated to make that workable while still delivering invalidations promptly; e.g., if a channel represents an entire Web site (or several sites), and a cache polls once every 10 seconds, that should be a reasonable load for the origin server.

c) Since it's just plain HTTP and an XML format, most any site can produce an event feed with widely available tools.
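As a rough illustration of point (c), a cache could consume such a feed with entirely standard tooling. The sketch below uses Python's xml.etree to pull (URI, updated) pairs out of a feed shaped like the example above; the embedded FEED string and the function name are illustrative assumptions, and a real implementation would also honour inv:lifetime and inv:precision:

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# A minimal feed in the shape of the example above (illustrative only).
FEED = """<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <link href="http://example.org/image.gif"/>
    <updated>2003-12-13T18:42:02Z</updated>
  </entry>
</feed>"""

def invalidations(feed_xml):
    """Extract (uri, updated) pairs from an invalidation feed.

    A cache would apply each pair against its store: any response for
    that URI stored before `updated` is no longer extendable.
    """
    root = ET.fromstring(feed_xml)
    out = []
    for entry in root.findall(ATOM + "entry"):
        link = entry.find(ATOM + "link")
        updated = entry.find(ATOM + "updated")
        if link is not None and updated is not None:
            out.append((link.get("href"), updated.text))
    return out
```

Fetching the feed itself is just an ordinary conditional GET, so the same HTTP machinery the cache already has can be reused for the channel.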

There are a lot of other issues to cover (e.g., invalidating groups of URIs, updating content, interaction with other freshness controls), but I would very much like to get feedback first on whether this is an interesting direction to go in, whether there are any huge problems I don't see, and whether this would be a useful facility in Squid.

In particular, this approach is "looser" than some other approaches: events may take some time to propagate (which is unavoidable in a distributed system, though here it may take longer than some are comfortable with); it certainly isn't atomic (e.g., one cache may have a different idea of a response's freshness than another); it falls back to the "default" cacheability on any kind of error in the invalidation channel; and so on.

Thoughts?

--
Mark Nottingham
mnot@xxxxxxxxxxxxx



