Cacheback is an extensible caching library that refreshes stale cache items asynchronously using a Celery task. The key idea is that it’s better to serve a stale item (and populate the cache asynchronously) than to block the user while the cache is repopulated synchronously.
Using this library, you can rework your views so that all reads are from cache - which can be a significant performance boost.
A corollary of this technique is that cache stampedes are easily avoided, which prevents sudden surges of expensive reads when cached items become stale.
Cacheback provides a decorator for simple usage, a subclassable base class for more fine-grained control and helper classes for working with querysets.
Consider a view for showing a user’s tweets:
from django.shortcuts import render
from myproject.twitter import fetch_tweets

def show_tweets(request, username):
    return render(request, 'tweets.html',
                  {'tweets': fetch_tweets(username)})
This works, but the fetch_tweets function involves an HTTP round trip and is slow.
Performance can be improved by using Django’s low-level cache API:
from django.shortcuts import render
from django.core.cache import cache
from myproject.twitter import fetch_tweets

def show_tweets(request, username):
    return render(request, 'tweets.html',
                  {'tweets': fetch_cached_tweets(username)})

def fetch_cached_tweets(username):
    tweets = cache.get(username)
    if tweets is None:
        # Cache miss: fetch synchronously, then cache for 15 minutes
        tweets = fetch_tweets(username)
        cache.set(username, tweets, 60*15)
    return tweets
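Now tweets are cached for 15 minutes after they are first fetched, using the Twitter username as the key. This improves performance, but the approach has two shortcomings. On a cache miss the tweets are fetched synchronously, so the request blocks until the HTTP round trip completes. And when a cached item expires under load, many concurrent requests can miss at once and each run the expensive fetch, which is a cache stampede.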
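The stampede risk of the fetch-on-miss approach can be illustrated without Django. The following is a minimal sketch, assuming a plain dict as the cache and a hypothetical slow_fetch function standing in for fetch_tweets; the names and the 0.2 second delay are illustrative only. Several threads read an empty cache at the same moment, and the counter records how many of them ran the expensive fetch.

```python
import threading
import time

cache = {}                      # stand-in for the Django cache (assumption)
fetch_calls = 0                 # how many times the expensive fetch ran
calls_lock = threading.Lock()


def slow_fetch(username):
    """Hypothetical stand-in for an expensive remote call."""
    global fetch_calls
    with calls_lock:
        fetch_calls += 1
    time.sleep(0.2)             # simulate the HTTP round trip
    return [f"tweet from {username}"]


def fetch_cached(username):
    """Fetch-on-miss caching, as in the view above."""
    tweets = cache.get(username)
    if tweets is None:
        tweets = slow_fetch(username)
        cache[username] = tweets
    return tweets


def simulate_stampede(n_requests=5):
    """Send n_requests concurrent reads to an empty cache; return the fetch count."""
    global fetch_calls
    cache.clear()
    fetch_calls = 0
    barrier = threading.Barrier(n_requests)

    def request():
        barrier.wait()          # release all requests at the same moment
        fetch_cached("alice")

    threads = [threading.Thread(target=request) for _ in range(n_requests)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return fetch_calls
```

Calling simulate_stampede(5) will typically report more than one fetch, because every request that arrives before the first fetch completes sees a miss and starts its own.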
Now, consider an alternative implementation that uses a Celery task to repopulate the cache asynchronously instead of during the request/response cycle:
import datetime
from django.shortcuts import render
from django.core.cache import cache
from myproject.tasks import update_tweets

def show_tweets(request, username):
    return render(request, 'tweets.html',
                  {'tweets': fetch_cached_tweets(username)})

def fetch_cached_tweets(username):
    item = cache.get(username)
    if item is None:
        # Scenario 1: Cache miss - return empty result set and trigger a refresh
        update_tweets.delay(username, 60*15)
        return []
    tweets, expiry = item
    if expiry < datetime.datetime.now():
        # Scenario 2: Cached item is stale - return it but trigger a refresh
        update_tweets.delay(username, 60*15)
    return tweets
where the myproject.tasks.update_tweets task is implemented as:
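import datetime
from celery import shared_task
from django.core.cache import cache
from myproject.twitter import fetch_tweets

@shared_task
def update_tweets(username, ttl):
    tweets = fetch_tweets(username)
    # ttl is in seconds; store the expiry timestamp alongside the data
    expiry = datetime.datetime.now() + datetime.timedelta(seconds=ttl)
    # Cache for 30 days (2592000 s); staleness is decided by the stored expiry
    cache.set(username, (tweets, expiry), 2592000)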
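Some things to note: items are stored in the cache as (tweets, expiry) tuples with a cache timeout of 2592000 seconds (30 days), so staleness is decided by the stored expiry timestamp rather than by the cache backend's own expiry. As the comments indicate, there are two scenarios to handle: a cache miss and a stale item.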
This pattern of re-populating the cache asynchronously works well. Indeed, it is the basis for the cacheback library.
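The pattern can also be sketched without Django or Celery. The class below is a hedged illustration, not the cacheback implementation: AsyncRefreshCache, its wait method and the injectable clock are hypothetical names, a dict stands in for the cache, and a single-thread executor stands in for the Celery worker.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class AsyncRefreshCache:
    """Serve cached (possibly stale) values and refresh them in the background."""

    def __init__(self, fetch, lifetime, clock=time.monotonic):
        self.fetch = fetch            # expensive function: key -> value
        self.lifetime = lifetime      # seconds before an item counts as stale
        self.clock = clock            # injectable so that tests can control time
        self.store = {}               # key -> (value, expiry)
        self.worker = ThreadPoolExecutor(max_workers=1)  # stand-in for a Celery worker
        self.pending = []

    def _refresh(self, key):
        value = self.fetch(key)
        self.store[key] = (value, self.clock() + self.lifetime)

    def get(self, key, empty=None):
        item = self.store.get(key)
        if item is None:
            # Scenario 1: cache miss - return an empty result and queue a refresh
            self.pending.append(self.worker.submit(self._refresh, key))
            return empty
        value, expiry = item
        if expiry < self.clock():
            # Scenario 2: stale item - return it but queue a refresh
            self.pending.append(self.worker.submit(self._refresh, key))
        return value

    def wait(self):
        """Block until all queued refreshes have finished (useful in tests)."""
        for future in self.pending:
            future.result()
        self.pending.clear()
```

The caller never waits for fetch: a miss returns the empty value immediately, and a stale hit returns the old value while the worker replaces it.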
Here’s the same functionality implemented using a django-cacheback decorator:
from django.shortcuts import render
from myproject.twitter import fetch_tweets
from cacheback.decorators import cacheback

def show_tweets(request, username):
    return render(request, 'tweets.html',
                  {'tweets': cacheback(60*15, fetch_on_miss=False)(fetch_tweets)(username)})
Here the decorator simply wraps the fetch_tweets function - nothing else is needed. Cacheback ships with a flexible Celery task that can run any function asynchronously.
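To make the idea of wrapping an arbitrary function concrete, here is a minimal stand-alone sketch of such a decorator. It is not cacheback's code: async_cached, make_key and run_pending are hypothetical names, the cache key scheme is an assumption, and an in-process queue drained by run_pending stands in for the Celery task. The fetch_on_miss flag borrows its name from the example above; its semantics here are assumed.

```python
import functools
import time
from collections import deque

_cache = {}        # key -> (value, expiry); stand-in for the Django cache
_queue = deque()   # pending refresh jobs; stand-in for the task broker


def make_key(func, args, kwargs):
    """Derive a cache key from the function's identity and its arguments."""
    return (func.__module__, func.__qualname__, args, tuple(sorted(kwargs.items())))


def async_cached(lifetime, fetch_on_miss=True, clock=time.monotonic):
    """Cache func's results for `lifetime` seconds and refresh them via the queue."""
    def decorator(func):
        def refresh(args, kwargs):
            value = func(*args, **kwargs)
            _cache[make_key(func, args, kwargs)] = (value, clock() + lifetime)
            return value

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            item = _cache.get(make_key(func, args, kwargs))
            if item is None:
                if fetch_on_miss:
                    return refresh(args, kwargs)        # block and fetch now
                _queue.append((refresh, args, kwargs))  # fetch later
                return None
            value, expiry = item
            if expiry < clock():
                _queue.append((refresh, args, kwargs))  # stale: refresh later
            return value
        return wrapper
    return decorator


def run_pending():
    """Drain the queue the way a worker would; return the number of jobs run."""
    count = 0
    while _queue:
        refresh, args, kwargs = _queue.popleft()
        refresh(args, kwargs)
        count += 1
    return count
```

Because the key is built from the function's module, qualified name and arguments, the same decorator can wrap any function whose arguments are hashable, and no per-function task needs to be written.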
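To be clear, this implementation behaves like the hand-written version above. On a cache miss (such as the first request for a user's tweets), fetch_on_miss=False means the view does not block: an empty result is returned and an asynchronous refresh is triggered. Once a Celery worker has populated the cache, requests are served from it, and the first request after the 15-minute lifetime has passed is served the stale item while another refresh is triggered.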
Much of this behaviour can be configured by using a subclass of cacheback.Job. The decorator is only intended for simple use-cases. See the Sample usage and API documentation for more information.