TL;DR: We are releasing our fork of django-memoize with a feature that updates the cache with zero downtime (scroll to the end to see how to use it).

Sometimes API functions take a long time to run. A common practice is to pre-calculate the result and cache it so that subsequent API calls can read the result from the cache.

For example, one of Portcast's APIs returns the average prediction accuracy for the past 30 days. The calculation can take a couple of minutes because it goes through all the predictions made within the past month. Our API response time would take a hit if we did this at request time. Fortunately, this metric does not need to be real time. It is fine for this number to be updated only a couple of times a day.

In this case, we can calculate the 30-day average accuracy in advance and cache it for the next couple of hours until we calculate it again. All the accuracy API requests would look into the latest cache and return the cached values instead of doing the calculations on-the-fly.

We use a library called django-memoize to cache our results. It remembers what the result is when a function is called with a set of parameters. The behaviour of the library can be seen in the following example:

Python 3.7.4 (default, Jul  9 2019, 18:13:23) 
[Clang 10.0.1 (clang-1001.0.46.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from memoize import memoize, delete_memoized
>>> @memoize(timeout=60)
... def cached_function(x):
...     print('recalculating')
...     return x
... 
>>> print(cached_function('value'))
recalculating
value
>>> print(cached_function('value'))
value
>>> print(cached_function('value_2'))
recalculating
value_2
>>> delete_memoized(cached_function, 'value')
>>> print(cached_function('value'))
recalculating
value
  1. A timeout can be provided in seconds so that the cache is only valid for a period of time
  2. If the function has not been invoked with a given set of parameters before, it runs the function logic as is and returns the result
  3. If the function has been invoked with the exact same parameters before and the previous cache has not expired, it returns the cached result and the function logic is not invoked at all
  4. You can force delete a cached result before its expiration time so that the next call with those parameters runs the function logic again
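
To make these four behaviours concrete, here is a minimal in-process sketch of the same idea. It is not how django-memoize is implemented (the library stores results in Django's cache backend); the names `simple_memoize`, `simple_delete_memoized` and `_cache` are hypothetical and exist only for this illustration.

```python
import functools
import time

# Hypothetical in-process store: key -> (expires_at, result).
_cache = {}


def _make_key(func, args):
    # Identify a cache entry by the function's name and its positional arguments.
    return (func.__qualname__, args)


def simple_memoize(timeout):
    """Cache results per argument tuple for `timeout` seconds."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args):
            key = _make_key(func, args)
            entry = _cache.get(key)
            # Behaviour 3: a live entry is returned without running the function.
            if entry is not None and entry[0] > time.monotonic():
                return entry[1]
            # Behaviour 2: no entry (or an expired one) means we run the function.
            result = func(*args)
            # Behaviour 1: remember when this entry stops being valid.
            _cache[key] = (time.monotonic() + timeout, result)
            return result
        return wrapper
    return decorator


def simple_delete_memoized(func, *args):
    # Behaviour 4: drop the entry so the next call recalculates.
    _cache.pop(_make_key(func, args), None)


calls = []  # records every time the function body actually runs


@simple_memoize(timeout=60)
def cached_function(x):
    calls.append(x)
    return x


cached_function('value')      # runs the function
cached_function('value')      # served from cache
cached_function('value_2')    # new parameters, runs the function
simple_delete_memoized(cached_function, 'value')
cached_function('value')      # entry was deleted, runs the function again
print(calls)
```

The `calls` list plays the role of the `recalculating` print in the session above: it grows only when the function body really executes.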

Problem statement

The natural way of updating the cache in this case would be to delete the old cached result and then run the function again so that a new one is created.

>>> delete_memoized(cached_function, 'value')
>>> print(cached_function('value'))
recalculating
value 

This approach introduces downtime between the moment the old cache is deleted and the moment the new cache is ready. We cache the results of these functions precisely because they take a long time to run, so this downtime is not negligible: any request that arrives in that window finds no cached result and has to wait for the full calculation.
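
The size of that gap can be illustrated with a small timing sketch. It uses a plain dictionary as a stand-in for the real cache, and a 0.2 second sleep in place of a calculation that takes minutes; `slow_accuracy`, `get_accuracy`, `timed` and the returned value are all hypothetical placeholders, not part of django-memoize or of our API.

```python
import time

# Hypothetical stand-in for the real cache backend.
cache = {}


def slow_accuracy():
    # Stands in for a calculation that takes minutes in production.
    time.sleep(0.2)
    return 0.9  # placeholder value, not a real metric


def get_accuracy():
    # Serve from cache when possible, otherwise calculate and store.
    if 'accuracy' not in cache:
        cache['accuracy'] = slow_accuracy()
    return cache['accuracy']


def timed(func):
    # Return how long a single call to `func` takes, in seconds.
    start = time.perf_counter()
    func()
    return time.perf_counter() - start


get_accuracy()                        # warm the cache
warm_latency = timed(get_accuracy)    # served from cache

del cache['accuracy']                 # the "delete" half of delete-then-recalculate
cold_latency = timed(get_accuracy)    # this caller pays for the recalculation

print(f'warm: {warm_latency:.3f}s, cold: {cold_latency:.3f}s')
```

The first caller after the delete waits for the whole calculation, and in a real deployment every request that arrives before the new result is stored would either wait as well or trigger its own recalculation.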