Caching for Your Django Application Using Django-Redis | HashedIn


Technology - 11 Jun 2017
Pansul Bhatt

This is the eleventh post in the Django blog series. In this post we will go through how to assess which cache to use, and how to use it. The main inspiration for this post was to introduce a caching mechanism in the
Charcha Discussion Forum. The source code for the Charcha forum is available here.

Why is this required?

Caching is one of the most useful techniques developers use to improve the performance of an application. The challenge most sites face is that they have to serve multiple clients, and your application may have to show different data or templates to different clients depending on your use case. Let's take an example to explain this better: say you are an admin of Charcha, authorized to see all the comments and votes on every post. Now a new user comes along, and all he or she is authorized to do is see the post and the top comment. With this logic, your server has to make separate queries to the database and render different templates based on authorization (let's just say permissions from now on). This is a simple example from Charcha, but at larger scale such cases are really common. For a high-traffic website, you are essentially asking your server to handle x requests, each performing n queries, to generate y templates. We can all agree that for high-traffic sites this overhead can be pretty overwhelming. To counter it, we introduce caching.

What caching does is remove the burden of running the same queries again and again: it stores the result and sends it directly to the client. This all sounds like pure upside, but caching has its drawbacks too.

Django already comes with a cache framework that lets you cache whole pages, but that's not all it does: it provides several levels of cache granularity. In this post we will discuss which cache system is best suited for us, and we will tackle the advantages and disadvantages of each. Let's divide the caches we could use according to the aforementioned levels of granularity. Do note that Charcha might not really require caching at all if the maintenance cost is high; we already introduced some caching in the seventh post of this series.

Setting up Cache in Django

Setting up a cache in Django is exceedingly easy. All we have to do is define, in our settings.py (or common.py in Charcha), what type of caching we want to integrate, how long entries live, and where they are stored.

Let’s tackle all the levels of cache granularity that django provides us.

1. Using the Database

Django gives us the ability to save cached data in our database. This works swimmingly if we have a well-indexed database.

To set it up all we have to do is create the cache database table as given below:
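Django ships a management command for exactly this; run it from the project root (with the project's virtualenv active):

```shell
# Creates the table the database cache backend will read and write
python manage.py createcachetable
```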


This creates the table with the schema your Django app expects. Now all we have to do is set the backend in our cache settings and point it at the database table.
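A minimal sketch of the settings, assuming the (arbitrary) table name my_cache_table:

```python
# settings.py (common.py in Charcha)
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.db.DatabaseCache',
        # Must match the table created by `manage.py createcachetable`
        'LOCATION': 'my_cache_table',
    }
}
```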


And we are done. This is not as effective as the alternatives and would probably require a lot of adjustment once we start using a lot of tables. Since there are better options available, let's have a look at those.

2. Using Memcached

One of the most popular and efficient types of cache supported natively by Django is Memcached. As the name suggests, Memcached is a memory-based cache server. It can dramatically reduce the number of database queries a server has to make and can improve the performance of an application many times over.

Generally, database calls are fast, but not fast enough: a query takes CPU resources to process, and data is (usually) retrieved from disk. An in-memory cache like Memcached, on the other hand, takes very little CPU, and data is picked up directly from memory. It does not use a query structure like SQL; rather, Memcached looks everything up by key-value pairs, so for obvious reasons you go from a complexity of O(n) or O(n^2) down to O(1).

There are a few ways to get Memcached; you can install it (if you don't already have it) using the commands below:
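On a Debian-like system (package names differ on other platforms), something like:

```shell
# The memcached daemon itself
sudo apt-get install memcached
# The Python binding Django's memcached backend uses
pip install python-memcached
```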


Once you have Memcached installed, it is pretty easy to use: all we have to do is point to it in settings.py and we are done.
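A sketch of the settings as they looked on Django versions of this era (newer Django releases ship a PyMemcacheCache backend instead):

```python
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
        # memcached listens on port 11211 by default
        'LOCATION': '127.0.0.1:11211',
    }
}
```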


One of the most popular features of Memcached is its ability to share its cache over multiple servers. This means we can run Memcached daemons on multiple machines, and Django will treat the group of machines as a single cache. How?
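By listing every daemon under LOCATION; Django hashes each key to one of the servers (the addresses below are just examples):

```python
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
        # All the daemons together are treated as one logical cache
        'LOCATION': [
            '172.19.26.240:11211',
            '172.19.26.242:11211',
        ],
    }
}
```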


We can also check the behavior of our cache using:
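For instance, from `python manage.py shell`, Django's low-level cache API lets us poke at the cache directly (the key and value here are arbitrary):

```python
>>> from django.core.cache import cache
>>> cache.set('greeting', 'hello', timeout=30)  # store for 30 seconds
>>> cache.get('greeting')
'hello'
>>> cache.get('missing-key')  # a miss returns None
```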


It feels like we have been going over how great Memcached is, but it has a huge disadvantage. Because of its cache granularity level (being memory), if the server crashes you lose all your cached data as well, and that's where it hits you: the cache goes down with your server, and restarting leaves Memcached with a clean slate.

3. Using Django-redis

It's a valid point to note that Redis holds many advantages over Memcached, the only disadvantage being that Redis operates at a lower level of granularity than Memcached. Redis offers clustering, and unlike with Memcached, that support is provided out of the box. Being built in makes for a more robust solution that is easier to administer. It is also exceedingly fast and easy to use, and it uses the same key-value model as its opponent, so it's not going to be difficult to understand. Overall, I feel neither caching system holds that big a performance improvement over the other, so it boils down to how comfortable you are with each. Personally, I like Redis more, as it is easier to set up and gives us a wider range of possibilities. The area where Redis wins over its opponents is data persistence, restore, and high availability. This may not matter much unless your data is important.

So let's download Redis into our application and see it in action. You can install Redis using the commands below:
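On a Debian-like system, or from source (the release number below is simply one current at the time of writing):

```shell
# From the package manager:
sudo apt-get install redis-server

# Or build from source:
wget http://download.redis.io/releases/redis-3.2.9.tar.gz
tar xzf redis-3.2.9.tar.gz
cd redis-3.2.9
make
```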


Let's run the server now using:
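Either the packaged binary or the freshly built one:

```shell
redis-server
# or, from a source build:
src/redis-server
```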


Redis also provides us with an awesome CLI. We can get into this CLI and start seeing all the keys that are being stored as well.
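For example (KEYS * is fine for poking around in development, though it scans the whole keyspace and should be avoided in production):

```shell
redis-cli
# then, at the 127.0.0.1:6379> prompt:
#   KEYS *      lists every stored key
#   MONITOR     streams every command the server receives
```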


One of the ways Django can be integrated with Redis is through django-redis to execute its commands in Redis. We can install django-redis using:
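The package is on PyPI:

```shell
pip install django-redis
```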


django-redis is also easy to include, as all we have to do is add it to our settings.py. Do note that Redis by default runs on port 6379, so we point django-redis at that location in settings.py.
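A minimal configuration following the django-redis documentation (database 1 in the URL is an arbitrary choice):

```python
CACHES = {
    'default': {
        'BACKEND': 'django_redis.cache.RedisCache',
        # redis://host:port/db -- Redis listens on 6379 by default
        'LOCATION': 'redis://127.0.0.1:6379/1',
        'OPTIONS': {
            'CLIENT_CLASS': 'django_redis.client.DefaultClient',
        },
    }
}
```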


Charcha does not have that big an overhead, so adding django-redis might seem a bit like overkill (as did Memcached), but for implementation purposes, and for heavier-traffic sites, we want to give you an example of how to use django-redis.

Let’s try seeing how well our application runs when we make a lot of requests at the same time. We can use loadtest here to make concurrent calls and assess the performance of django-redis.


Before django-redis in loadtest

So the total time taken to load is about 7 seconds. This is pretty commendable, but it can be optimized a bit further. We need to start analysing why it takes this long and which views are getting called. For this we will use django-debug-toolbar.

When we look at what the code is doing, we can make decisions on how to improve the performance. I can see a lot of identical queries happening for the same requests. All I have to do is add:
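The original snippet isn't preserved here; one plausible shape for it, using Django's low-level cache API backed by django-redis (the model, key, and timeout are all illustrative):

```python
from django.core.cache import cache

def get_top_posts():
    # Serve the repeated query from Redis instead of the database
    posts = cache.get('top_posts')
    if posts is None:
        posts = list(Post.objects.order_by('-submission_time')[:20])
        cache.set('top_posts', posts, timeout=60)  # cache for a minute
    return posts
```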


and reap all the benefits of django-redis. You can see the output below for yourself. Stand back, ladies and gentlemen: benchmarking is happening.

After django-redis in loadtest

Yup, that's right: we have brought the response time down to about 2 seconds. The overhead is not that much, so using django-redis here is more of a call we need to take.

4. Using Varnish

Varnish is a caching HTTP reverse proxy. It always sits in front of your server, be it Apache or Nginx. The way Varnish works is by caching your static pages; the problem with Varnish is that it cannot cache dynamic content. If your site is not very dynamic, you can make good use of Varnish, but you should always check which views can be cached, and then cache them. Second, identify how long a delay you can tolerate before someone sees stale content. The best thing about Django and Varnish is how well they fit together.

Setting up Varnish is also really easy to do:
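On a Debian-like system, a sketch (the backend address assumes Django running on port 8000; a real deployment would use a VCL file instead of -b):

```shell
sudo apt-get install varnish
# Listen on port 80, forward cache misses to the Django server
sudo varnishd -a :80 -b 127.0.0.1:8000
```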


Let’s take a page and see its performance and how it is working (for this we will take the discussion page).

Before Varnish

How will Varnish help? What does it do? Basically, Varnish is going to sit between Django and your users. So all we have to do now is add the cache_page decorator to our view, and it is going to do everything for us. Let's try applying it to upvote_post and see what happens.
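A sketch of the decorator in place (the five-minute timeout is arbitrary, and the view body stands in for Charcha's existing upvote logic):

```python
from django.views.decorators.cache import cache_page

@cache_page(60 * 5)  # serve the cached response for five minutes
def upvote_post(request, post_id):
    ...  # Charcha's existing upvote logic
```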


After Varnish

What just happened? Varnish was waiting for the response from upvote_post, and when we made a server call to the function, it held on to the response. The next time we made the call, Varnish just sent the response back without having to hit the server view again.

To be more secure, we could include a cookie header with the request so that we have some security at this level as well.

This entire implementation is what is called the per-view cache. To explain it in layman's terms, we are storing/caching the responses from the views individually.

Varnish also has its own configuration language (VCL). It can be used for normalization in places where the endpoints do not differ based on the user's authentication. How?
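A VCL sketch: for paths that render the same for every user (static assets here, as an example), dropping the cookie header normalizes all requests onto one cached object:

```vcl
vcl 4.0;

# Django server as the backend (illustrative address)
backend default {
    .host = "127.0.0.1";
    .port = "8000";
}

sub vcl_recv {
    # Cookies make every user's request unique; for URLs that do not
    # vary per user, strip them so Varnish caches a single copy.
    if (req.url ~ "^/static/") {
        unset req.http.Cookie;
    }
}
```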


We can go ahead with Varnish for now and at least start caching the views. Varnish has a lot of documentation, and I urge you to read up on it.

Summary

Caching, as previously mentioned, is a way of reducing server load and improving the performance of your application. Although most caches work assiduously to reduce the dependency on your server, one must always keep the overhead of the cache in mind as well. Charcha might not require as many levels of caching as high-traffic sites do. If, say, Charcha became something like Stack Overflow, we could start adding caches to reduce server response time: something like Redis/django-redis as a store for objects and the results of DB queries, and Varnish for serving our static pages. It really depends on your use case and what you require your cache to do. For now, the caching implementation mentioned in the seventh post will do just fine for Charcha.
