Tuesday, June 9, 2009

Caching the non-existence of data

Generally, when you look an item up in a cache, you:
  • Check the cache for the item
  • If it's not found, go get the item from the backing store
  • Put the item into the cache, and return it to your caller
But what should the behavior be when the caller wants to look for a non-existent item?

A simple implementation is to use the same algorithm as above, except that the last step then becomes:
  • Return to your caller the fact that no such item was found.
This is a fine implementation, and probably works well for many cases.

However, if (a) it's expensive to go get the item from the backing store, and (b) requests for non-existent items are routine, then you may find this implementation unsatisfactory.

Instead, you might want to consider caching the non-existence of the item, by which I mean you could have your cache have a special cache entry which allows you to hold a "no such item with this key exists" token, so that when your caller asks for an item which does not exist, you can satisfy that request from your cache.

There are a variety of details that exist in this implementation, but I think that the basic idea is sound. For the cache that I maintain at work, I'm considering whether I need to pursue this idea, because I have some reason to believe that some of my performance issues are due to requests for non-existent data.

But first I'm going to try to figure out why my callers are asking for the non-existent data routinely, because I don't think that should be a common behavior, and might indicate a bug in the callers.

Yet it got me wondering: how do other caches behave? Is it fairly routine for a modern object cache implementation to cache information about non-existing objects? Or do most caches only cache information about existing objects?

1 comment:

  1. Caching the non-existance of an entity, generally referred to as negative caching, is pretty common actually.

    -Brian

    ReplyDelete