Thursday, October 28, 2010

Static cache

Let's look at a fairly common case: a key-value static cache. If there is no stored value for the specified key, the cache calls some external routine to get the value and adds it to an internal dictionary.

Here is a naive implementation of the cache. It is straightforward enough.

public class Cache<TKey, TValue>
{
  Func<TKey, TValue> _resolver;
  Dictionary<TKey, TValue> _cache = new Dictionary<TKey, TValue>();

  public Cache(Func<TKey, TValue> resolver)
  {
    _resolver = resolver;
  }

  public TValue this[TKey key]
  {
    get
    {
      TValue value;
      if (!_cache.TryGetValue(key, out value))
      {
        // cache miss: fetch the value from the external routine and store it
        value = _resolver(key);
        _cache.Add(key, value);
      }
      return value;
    }
  }
}

And somewhere in the client code:
class ClientCode
{
  static Cache<TKey, TValue> _cache = new Cache<TKey, TValue>(Resolver);
}
where Resolver is a user-defined function that fetches a value from the external source (TKey and TValue stand for whatever concrete types the client uses).
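For illustration, a concrete instantiation might look like this (the Resolver here is a made-up stand-in for a real external lookup):

class ClientCode
{
  // hypothetical resolver: real code would hit a database or a web service
  static string Resolver(int id)
  {
    return "value for " + id;
  }

  static Cache<int, string> _cache = new Cache<int, string>(Resolver);

  static void Main()
  {
    Console.WriteLine(_cache[42]); // miss: calls Resolver and stores the result
    Console.WriteLine(_cache[42]); // hit: served from the dictionary
  }
}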



Hopefully you see where this approach is weak: it is not thread-safe. The naive code is not protected against concurrent reads and writes. Let's try to fix our class.

At first glance, it seems enough to wrap the whole getter in a lock:

public TValue this[TKey key]
{
  get
  {
    lock (_monitor) // one shared lock guards every read and write
    {
      TValue value;
      if (!_cache.TryGetValue(key, out value))
      {
        value = _resolver(key);
        _cache.Add(key, value);
      }
      return value;
    }
  }
}

Here _monitor is just a private field initialized with a new object.

This solution makes our class thread-safe, but consider the following scenario: suppose our external routine (_resolver) is very slow. Then on every cache miss the whole _cache stays locked for a long time. Moreover, during a miss the cache is locked even for reading completely unrelated keys.
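To make the drawback concrete, here is a small sketch against the locked version (the keys and timings are invented for the demo). A request for a cheap key gets stuck behind a slow miss for an unrelated key:

var cache = new Cache<string, string>(key =>
{
  if (key == "slow") Thread.Sleep(5000); // simulate a slow external call
  return key.ToUpper();
});

new Thread(() => Console.WriteLine(cache["slow"])).Start();
Thread.Sleep(100); // give the first thread time to take the lock

// with the single shared lock, this waits the full five seconds,
// although resolving "fast" would take no time at all
Console.WriteLine(cache["fast"]);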

To resolve the read-locking issue, an inexperienced developer might try the double-checked locking pattern:

Invalid solution
public TValue this[TKey key]
{
  get
  {
    TValue value;
    // unlocked read: this is exactly what makes the pattern invalid here
    if (!_cache.TryGetValue(key, out value))
    {
      lock (_monitor)
      {
        if (!_cache.TryGetValue(key, out value))
        {
          value = _resolver(key);
          _cache.Add(key, value);
        }
      }
    }
    return value;
  }
}

While the double-checked locking pattern works well for lazy initialization, it should never be used for manipulating collections (I'll post about this in more detail later; in short, the write happens under the lock but the read does not, so a reader can observe the Dictionary in an inconsistent internal state).
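If you want to see it break, here is a crude repro sketch (nondeterministic, so it may need several runs or larger counts). One thread writes to a plain Dictionary under a lock, exactly like the invalid solution above, while the main thread reads without one; Dictionary<TKey, TValue> gives no guarantees for this, so the unlocked reads can throw, loop forever, or miss entries while the internal buckets are being resized:

var dict = new Dictionary<int, int>();
var monitor = new object();

// writer: adds entries under the lock
new Thread(() =>
{
  for (int i = 0; i < 1000000; i++)
    lock (monitor) { dict.Add(i, i); }
}).Start();

// reader: probes with no lock at all
for (int i = 0; i < 1000000; i++)
{
  int value;
  dict.TryGetValue(i % 1000, out value);
}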

So the next choice could be a reader-writer lock, but there is a better solution for this particular case. Let's look at the locking issue more deeply.
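For comparison, here is a sketch of what the reader-writer variant might look like (assuming a ReaderWriterLockSlim field named _rwLock in place of _monitor):

public TValue this[TKey key]
{
  get
  {
    TValue value;

    _rwLock.EnterReadLock(); // many readers may probe the cache at once
    try
    {
      if (_cache.TryGetValue(key, out value))
        return value;
    }
    finally { _rwLock.ExitReadLock(); }

    _rwLock.EnterWriteLock();
    try
    {
      // re-check: another thread may have added the value in the gap
      if (!_cache.TryGetValue(key, out value))
      {
        value = _resolver(key); // still executed under the write lock!
        _cache.Add(key, value);
      }
      return value;
    }
    finally { _rwLock.ExitWriteLock(); }
  }
}

Note that the slow _resolver call still runs under the write lock, blocking every other reader and writer, which is exactly why we keep looking.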

The issue is that the cached values are not actually coupled with each other, yet all of them share the same lock. Another issue is that the potentially slow routine (_resolver) is called under this shared lock. So let's separate the locks: one lock protects _cache itself, and every cached value gets its own individual lock, which is taken only while the value has not yet been fetched from the _resolver. Here is the resulting solution:

public class Cache<TKey, TValue>
{
  object _monitor = new object();
  Func<TKey, TValue> _resolver;
  Dictionary<TKey, CacheItem> _cache = new Dictionary<TKey, CacheItem>();

  public Cache(Func<TKey, TValue> resolver)
  {
    _resolver = resolver;
  }

  public TValue this[TKey key]
  {
    get
    {
      CacheItem item;
      lock (_monitor) // held only for the quick get-or-create step
      {
        if (!_cache.TryGetValue(key, out item))
        {
          item = new CacheItem();
          _cache.Add(key, item);
        }
      }
      // the potentially slow resolve happens outside the shared lock
      return item.GetValue(key, _resolver);
    }
  }

  internal class CacheItem
  {
    readonly object _itemMonitor = new object();
    private volatile bool _isInitialized;
    private TValue _value;

    public TValue GetValue(TKey key, Func<TKey, TValue> resolver)
    {
      if (!_isInitialized) // volatile read: safe to check outside the lock
      {
        lock (_itemMonitor)
        {
          if (!_isInitialized)
          {
            _value = resolver(key);
            _isInitialized = true; // volatile write publishes _value
          }
        }
      }
      return _value;
    }
  }
}
Let's track the flow:
  1. The _cache is protected by _monitor; however, it is locked only for a very quick operation: getting or creating a cache item.
  2. Every cache item has its own individual lock.
  3. Each cache item is protected from being initialized twice by the double-checked locking pattern, which fits perfectly here because it guards a single lazily initialized field rather than a collection.
Note: The class could be slightly simplified by using Lazy<T> from .NET 4.0.

public class Cache<TKey, TValue>
{
  object _monitor = new object();
  Func<TKey, TValue> _resolver;
  Dictionary<TKey, Lazy<TValue>> _cache = new Dictionary<TKey, Lazy<TValue>>();

  public Cache(Func<TKey, TValue> resolver)
  {
    _resolver = resolver;
  }

  public TValue this[TKey key]
  {
    get
    {
      Lazy<TValue> item;
      lock (_monitor)
      {
        if (!_cache.TryGetValue(key, out item))
        {
          // ExecutionAndPublication gives the same once-only guarantee
          // as the hand-written double-checked locking in CacheItem
          item = new Lazy<TValue>(() => _resolver(key), LazyThreadSafetyMode.ExecutionAndPublication);
          _cache.Add(key, item);
        }
      }
      return item.Value;
    }
  }
}
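A quick usage sketch of either final version (the resolver and timings are again invented): two slow misses for different keys now resolve in parallel instead of queuing behind one shared lock:

var cache = new Cache<int, string>(id =>
{
  Thread.Sleep(1000); // simulate a slow external source
  return "value for " + id;
});

// both resolves run concurrently: the shared lock is held only for the
// quick get-or-create step, and each item initializes under its own lock
new Thread(() => Console.WriteLine(cache[1])).Start();
new Thread(() => Console.WriteLine(cache[2])).Start();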

3 comments:

  1. Nice article. But it has the following problems:

    - The purpose of the double-checked lock in CacheItem.GetValue() is unclear. I don't know whether the double-checked lock works as expected in .NET, but it is broken in almost every programming language. See http://c2.com/cgi/wiki?DoubleCheckedLockingIsBroken for details. In short, the compiler, the execution environment, or the CPU can reorder the assignment to _value with the assignment to _isInitialized in CacheItem.GetValue(). In that case the following race condition is possible:
    1) Thread 1 sets _isInitialized to true under the lock.
    2) Thread 2 sees that _isInitialized is true outside the lock and returns an uninitialized value from CacheItem.GetValue().

    - It would be worth noting the potential problems of homegrown caches. Careless implementations of such caches usually lead to "Out of memory" exceptions and "99% of CPU in GC" lockups.

  2. 1. Once a CacheItem is initialized, there is no need to synchronize access to its value. You're right about careful usage of the double-checked locking pattern, so I've added the "volatile" modifier to the _isInitialized field. .NET guarantees that this combination of volatile and lock is thread-safe.
    2. I guess you mean careless usage rather than careless implementation. Sure, this cache is simplistic and has no eviction strategy. But it is still nicer than singletons.

  3. 1. Double-checked locking for each CacheItem is a useless premature optimization. It would be better to optimize the global _monitor lock inside the Cache implementation if profiling under a real-world load points to this "bottleneck". The most obvious optimization is splitting the Cache._cache dictionary into an array of dictionaries with distinct locks.

    2. It depends on the choice of cache purging strategy. The strategy can be implemented either inside the cache implementation or left as an exercise for cache users :) Users usually fail to implement a cache purging strategy correctly, so eventually their application will hang with "100% CPU in GC" and never-ending "Out of memory" exceptions.
