More on Using ETags and UUIDs for Internal Caching

A while back, I wrote about ETags and how they could be used within a web application for caching. Since then, I’ve had some time to use ETags and UUIDs in two applications: Iron Money and a yet-to-be-launched site.

Both of these sites use ETags and UUIDs for their resources: when a resource is created, it’s given a UUID and an ETag, and when it’s updated, the ETag is changed. They also use Memcached for caching resources. The Memcached key is the combination of the resource’s UUID and ETag, so when a resource is updated, it is simply stored again under a key built from the new ETag. Memcached already evicts objects that haven’t been fetched recently, so the entries for old ETags eventually fall out of the cache.
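To make the key scheme concrete, here is a minimal sketch in Python. It is not the code from either site: the dictionaries `cache` and `db` are stand-ins for Memcached and the database, and the helper names (`cache_key`, `create_resource`, `update_resource`) and the `uuid:etag` key layout are my own assumptions for illustration.

```python
import uuid

cache = {}  # assumption: a dict standing in for Memcached
db = {}     # assumption: a dict standing in for the database table


def make_etag():
    # Any value that changes on every write will do; a random hex string is used here.
    return uuid.uuid4().hex


def cache_key(resource_uuid, etag):
    # Hypothetical key layout: the UUID and ETag joined by a colon.
    return f"{resource_uuid}:{etag}"


def create_resource(data):
    resource_uuid = str(uuid.uuid4())
    etag = make_etag()
    db[resource_uuid] = {"etag": etag, "data": data}
    cache[cache_key(resource_uuid, etag)] = data
    return resource_uuid


def update_resource(resource_uuid, data):
    # A new ETag means a new cache key; the old entry is never deleted
    # explicitly and is left for the cache to evict on its own.
    etag = make_etag()
    db[resource_uuid] = {"etag": etag, "data": data}
    cache[cache_key(resource_uuid, etag)] = data
```

Note that nothing ever invalidates the old entry. It just stops being requested, because no reader will ever build that key again.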

When a resource is requested, both sites simply look up the ETag in the database and then check the cache to see if the resource is already stored. If it’s not, then they go through the regular process of fetching the resource from the database. This is particularly efficient because the database can easily have an index on the UUID and ETag, making it very cheap (less than a millisecond) to look up the most recent ETag before fetching from the cache.
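The read path described above can be sketched roughly as follows. Again, this is an illustration under assumptions: `cache` and `db` are plain dictionaries standing in for Memcached and the database, and `lookup_etag`, `load_from_db`, and `get_resource` are hypothetical names. The `db_fetches` list exists only to show when the expensive path is taken.

```python
cache = {}  # assumption: a dict standing in for Memcached
db = {"a1": {"etag": "v1", "data": {"name": "Checking"}}}  # stand-in for the database
db_fetches = []  # records every full (expensive) fetch, for demonstration only


def lookup_etag(resource_uuid):
    # In a real database this would be a cheap indexed lookup.
    return db[resource_uuid]["etag"]


def load_from_db(resource_uuid):
    # Stands in for the regular, more expensive process of building the resource.
    db_fetches.append(resource_uuid)
    return db[resource_uuid]["data"]


def get_resource(resource_uuid):
    etag = lookup_etag(resource_uuid)
    key = f"{resource_uuid}:{etag}"
    if key in cache:
        return cache[key]
    data = load_from_db(resource_uuid)
    cache[key] = data
    return data
```

Calling `get_resource("a1")` twice performs only one full fetch. Once the ETag stored in the database changes, the next call misses the cache and fetches again.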

For resources that are composed of other resources (say, a list of accounts), the process is the same: the ETag for the main resource is fetched (the /accounts/ ETag), and Memcached stores a list of UUIDs for the resource (all of the account UUIDs). Then, each resource is fetched individually with the same process (get the ETag, check Memcached for the resource). This means that the resources are not stored multiple times in the cache, but the entire list of resources can still be cached.
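Here is a rough sketch of how a list resource might be served under this scheme. The dictionaries are stand-ins for Memcached and the database tables, and every name in it (`get_list`, `get_resource`, the sample UUIDs and ETags) is hypothetical.

```python
cache = {}  # assumption: a dict standing in for Memcached

# Stand-ins for database tables.
resource_etags = {"a1": "v1", "a2": "v1"}
resource_rows = {"a1": {"name": "Checking"}, "a2": {"name": "Savings"}}
list_etags = {"/accounts/": "L1"}
list_rows = {"/accounts/": ["a1", "a2"]}


def get_resource(resource_uuid):
    # Same process as for any single resource: get the ETag, then check the cache.
    key = f"{resource_uuid}:{resource_etags[resource_uuid]}"
    if key not in cache:
        cache[key] = resource_rows[resource_uuid]
    return cache[key]


def get_list(path):
    # The list entry holds only UUIDs, so member resources are cached once each.
    key = f"{path}:{list_etags[path]}"
    uuids = cache.get(key)
    if uuids is None:
        uuids = list(list_rows[path])
        cache[key] = uuids
    return [get_resource(u) for u in uuids]
```

After one call to `get_list("/accounts/")`, the cache holds three entries: one for the list of UUIDs and one for each account.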

Overall, this system works very well; most resources stay in cache and there are no concerns regarding ACID compliance. However, one issue that I have run into is with the aforementioned lists of resources. Small lists are fine: it takes about a millisecond to get the ETag for the list, and then about a millisecond for each resource in it.

This, however, does not work well with very large lists of resources. Even with everything in cache, at about a millisecond per resource, a list of 100 resources takes roughly 100 milliseconds to assemble, and larger lists take proportionally longer.

So what’s the solution? If you’re willing to trade memory for speed, then there’s a rather simple one: store the ETag for each resource in the cached list, alongside its UUID. With both the UUID and the ETag on hand for every resource in the list, you can skip the per-resource ETag lookup and go straight to the cache, which makes everything faster.
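A minimal sketch of that change is below. The dictionaries stand in for Memcached and the database, all names are hypothetical, and the `etag_lookups` list is there only to count per-resource ETag queries. One assumption worth stating loudly: this only stays correct if the list’s own ETag is changed whenever any resource in the list changes, so that stale UUID and ETag pairs are never read.

```python
cache = {}         # assumption: a dict standing in for Memcached
etag_lookups = []  # records per-resource ETag queries, for demonstration only

# Stand-ins for database tables.
resource_etags = {"a1": "v1", "a2": "v1"}
resource_rows = {"a1": {"name": "Checking"}, "a2": {"name": "Savings"}}
list_etags = {"/accounts/": "L1"}
list_rows = {"/accounts/": ["a1", "a2"]}


def lookup_resource_etag(resource_uuid):
    etag_lookups.append(resource_uuid)
    return resource_etags[resource_uuid]


def get_list(path):
    list_key = f"{path}:{list_etags[path]}"
    pairs = cache.get(list_key)
    if pairs is None:
        # Cache miss: pay for the per-resource ETag lookups once,
        # then store (UUID, ETag) pairs instead of bare UUIDs.
        pairs = [(u, lookup_resource_etag(u)) for u in list_rows[path]]
        cache[list_key] = pairs
    results = []
    for resource_uuid, etag in pairs:
        # The cache key is built directly from the stored pair; no ETag query needed.
        key = f"{resource_uuid}:{etag}"
        data = cache.get(key)
        if data is None:
            data = resource_rows[resource_uuid]
            cache[key] = data
        results.append(data)
    return results
```

The first call pays for one ETag lookup per resource; later calls pay only for the list’s ETag and the cache reads. The extra memory is one ETag per list member.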
