full-stack overflow

07 Oct 2018

Caching and the Least Recently Used Replacement Policy

Caching is the process of saving data in memory that we expect to use again soon. It provides a speed advantage, since a memory lookup is much faster than retrieving data from disk or over a network. In addition, caching data rather than fetching and processing it again frees up CPU cycles for serving more users.

The main thing to know is that caching (like most everything in computing) is a time-space tradeoff. If you have extra space, you can use it up to save time.

Caching is related to memoization: saving the result of expensive computations so that you do not have to perform them again. The general consensus from Wikipedia and StackOverflow is that caching is the more general term (it can apply to both input and output) and usually refers to saving data that another service has computed, where that data is non-deterministic (it varies with time).

Memoization would be saving the result of a factorial function so you don’t have to recalculate it for values you’ve already computed.
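
For example, a memoized factorial might look like the sketch below (the cache object and function names are just illustrative):

// Memoized factorial: results are stored by input, so repeat calls are O(1)
var factorialCache = {};

function factorial(n) {
  if (n in factorialCache) {
    return factorialCache[n]; // already computed: reuse it
  }
  var result = n <= 1 ? 1 : n * factorial(n - 1);
  factorialCache[n] = result;
  return result;
}

factorial(10); // computes and caches 1! through 10!
factorial(10); // instant: the answer comes straight from the cache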

Caching would be saving the result that’s going to change, say a weather report that’s updated every hour.

You’ve experienced caching yourself (your browser is doing it right now). If you’re a developer, you’ve likely made changes to an existing script, hit refresh, and … nothing happens. Your browser’s stashed a local copy that it’s accessing instead of requesting it fresh from the server.

Caches are valuable tools, not only for browsers, but also for servers. Adding a cache to an existing webserver can often more fully utilize its memory and allow you to serve more traffic with better performance, at the same cost.

Limitations

Caching is limited by memory (you’ll run out eventually!) and the time-expiration of information. If you’re a weather website and you get reports from a station every 5 minutes, it’s fine to cache for that long, but after 5 minutes, you better request it fresh.
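
In code, that freshness rule is usually just a timestamp check. Here's a rough sketch of the weather example, where the names and the fetchFromStation function are made up for illustration:

var TTL_MS = 5 * 60 * 1000; // reports are only good for 5 minutes
var weatherCache = {}; // stationId -> { report, fetchedAt }

function getReport(stationId, fetchFromStation) {
  var entry = weatherCache[stationId];
  if (entry && Date.now() - entry.fetchedAt < TTL_MS) {
    return entry.report; // still fresh: serve it from the cache
  }
  // Missing or expired: request it fresh and re-cache it
  var report = fetchFromStation(stationId);
  weatherCache[stationId] = { report: report, fetchedAt: Date.now() };
  return report;
}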

Eviction Strategies

An eviction strategy is simply: when the cache is full, and we add another item, how do we decide which element to remove? The best answer is: whichever element we’re never going to need again! This would, of course, require us to know the future, so it’s impossible. Instead, we can try to get close to the optimal based on probabilities.

There are a lot of strategies to remove items from a cache.

Aside from picking a victim at random, these strategies typically consider one of two factors:

  1. Frequency of use: count the number of times a cache item is requested; discard the item with the lowest count (a rough sketch of this idea follows the list)
  2. Time of last use: track an “age” for each element, updating it every time the element is used; discard the element that was used least recently
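
Here's that quick sketch of the frequency-based idea (often called LFU, least-frequently used); the object and function names are mine, for illustration only:

var lfu = { cache: {}, counts: {} };

function lfuGet(key) {
  if (key in lfu.cache) {
    lfu.counts[key]++; // count every request
    return lfu.cache[key];
  }
  return -1;
}

function lfuPut(key, value, capacity) {
  if (!(key in lfu.cache) && Object.keys(lfu.cache).length >= capacity) {
    // Full: discard the key with the lowest request count
    var coldest = Object.keys(lfu.counts).reduce(function (a, b) {
      return lfu.counts[a] <= lfu.counts[b] ? a : b;
    });
    delete lfu.cache[coldest];
    delete lfu.counts[coldest];
  }
  lfu.cache[key] = value;
  if (!(key in lfu.counts)) lfu.counts[key] = 0;
}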

Example: Least-Recently Used (LRU)

One common cache strategy is least-recently used. When the cache is full and we've gotta add something new, get rid of the item that was used least recently. This is what that might look like:

var LRUCache = function (capacity) {
  this.capacity = capacity;
  this.size = 0;
  this.cache = {}; // key -> value
  this.keysByAge = []; // keys ordered from most- to least-recently used
  this.hits = 0;
  this.misses = 0;
};

LRUCache.prototype.get = function (key) {
  if (key in this.cache) {
    // Promote key to most recently used (front of keysByAge)
    this.hits++;
    this.keysByAge = [key, ...this.keysByAge.filter((k) => k !== key)];
    return this.cache[key];
  } else {
    this.misses++;
    return -1;
  }
};

LRUCache.prototype.put = function (key, value) {
  if (key in this.cache) {
    // Key already cached: update its value and promote it
    this.cache[key] = value;
    this.keysByAge = [key, ...this.keysByAge.filter((k) => k !== key)];
  } else if (this.size < this.capacity) {
    this.cache[key] = value;
    this.keysByAge.unshift(key); // new keys also start at the front
    this.size++;
  } else {
    // evictAndPut: drop the least recently used key (the back of keysByAge)
    const oldest = this.keysByAge.pop();
    delete this.cache[oldest];
    this.cache[key] = value;
    this.keysByAge.unshift(key);
  }
};

We store key-value pairs in a hash table (a plain object), so the lookup is O(1). To track the age of entries, we keep a separate array of keys, keysByAge. Each time an element is used, we promote its key to the front of the array; each time a new element is added, its key also goes to the front.

In this way, keys are ordered from most-recently used to least-recently used. When we try to add an element and the cache is at capacity, we simply pop the last key off the array, delete its entry in the cache, and then add the new element as usual.

This solution is suboptimal, since each get and put is O(n): promoting a key rebuilds the whole array. The optimal solution pairs the hash table with a doubly-linked list of keys instead of an array, so both operations are O(1). The important thing to understand, though, is the concept at play. You're probably not going to code one of these yourself from scratch.
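
That said, if you want roughly constant-time operations in JavaScript without hand-rolling a linked list, you can lean on Map, which iterates keys in insertion order. Here's a rough sketch (the class name is mine, and it skips the hit/miss counters):

class LRUCacheMap {
  constructor(capacity) {
    this.capacity = capacity;
    this.map = new Map(); // Map remembers insertion order
  }

  get(key) {
    if (!this.map.has(key)) return -1;
    const value = this.map.get(key);
    // Delete and re-insert so this key becomes the newest entry
    this.map.delete(key);
    this.map.set(key, value);
    return value;
  }

  put(key, value) {
    if (this.map.has(key)) this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.capacity) {
      // The first key in iteration order is the least recently used
      this.map.delete(this.map.keys().next().value);
    }
  }
}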

Cache benefits: a little demo.

If you use the code above, you can test it like this:

var cache = new LRUCache(20);
for (var i = 0; i < 100; i++) {
  cache.put(i, true);
  cache.get(Math.floor(Math.random() * i));
}
console.log("Hit ratio: " + cache.hits / (cache.hits + cache.misses));

This is not very scientific: insert 100 elements into the cache, and after each insertion, access a random element that might exist (a key from 0 up to i). Then measure how many hits there are.

I got data roughly like this for 100 elements, varying the cache capacity. Plot these points and the curve looks logarithmic:

cache size    hit ratio
1             0.01
2             0.07
4             0.19
8             0.21
16            0.48
32            0.71
64            0.94

The big takeaway here is that if you've got the space, caching 32% of your elements means you'll have the element you need about 70% of the time! If it takes 50ms to make a data-heavy network request on a cache miss, that saves you 3.5 seconds per 100 requests, or an average of 35ms per request.

Caching lets us bargain with the time-space tradeoff: when time is important and space is inexpensive, a cache often offers a solution.

Caching in the real world

Like more or less everything that involves a doubly-linked list, don't write it yourself (but it's good to understand it). There are plenty of battle-tested tools that do this for you, like Memcached, Redis, or Python's functools.lru_cache.