In this post, we will go through how to assess which cache should be used and how to use it. The main inspiration for this post was the need to introduce a caching mechanism in the Charcha Discussion Forum. The source code for the Charcha forum is available here.
Why is this required?
Caching is one of the most useful techniques developers use to improve the performance of their applications. The challenge most sites face is that they have to serve multiple clients, and your application might have to show different data or templates to different clients based on your use-case.
Let’s take an example to explain this better. Say you are an admin of Charcha and you are authorized to see all the comments and votes given to all the blogs. Now a new user comes along, and all he/she is authorized to do is see the blog and the top comment. Under this logic, your server has to make separate queries to the database and generate different templates based on your authorization (let’s just say permissions from now onwards).
Now, this is just a simple example using Charcha; at a larger scale, these sorts of cases are really common. For a high-traffic website, you would basically be asking your server to handle X requests, making it perform N queries to generate Y templates. We can all agree that for high-traffic sites this overhead can be pretty overwhelming. To counter it, we introduce caching.
What caching does is remove the burden of running the same queries again and again; instead, it stores the result and sends it directly to the client. This all sounds like pure upside, but caching does have its drawbacks.
Django already comes with a cache system that lets you save pages, but that’s not all; Django does much more than that. It provides you with different levels of cache granularity. In this blog, we will discuss which cache system is best suited for us, and tackle the advantages and disadvantages of each cache as well. Let’s try dividing all the caches that could be used as per the aforementioned levels of cache granularity.
Do note that Charcha might not really require a cache at all if the maintenance cost is high. We have already introduced some caching, which has been mentioned in one of our earlier posts.
Setting up Cache in Django
Setting up a cache in Django is exceedingly easy. All we have to do is define what type of caching we want to integrate in our settings file (common.py in Charcha), how long it will live for, and where it will be stored.
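For instance, a minimal entry in the settings file might look like the following — the in-memory backend, the cache name, and the five-minute timeout are just placeholders for illustration:

```python
# settings/common.py — the three knobs mentioned above: BACKEND (what type),
# TIMEOUT (how long entries live), LOCATION (where they are stored)
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.locmem.LocMemCache",
        "LOCATION": "charcha-local",
        "TIMEOUT": 300,  # seconds a cached entry lives for
    }
}
```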
Let’s tackle all the levels of cache granularity that Django provides us.
1. Caching Database
Django provides us with the ability to save cached data to our database. This works swimmingly if we have a well-indexed database.
To set it up all we have to do is create the cache database table as given below:
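The original command is not preserved in this copy; Django ships a management command for this, which creates the table declared in your cache settings:

```
# Creates the cache table(s) declared in CACHES in your settings
python manage.py createcachetable
```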
This will create the table as per the expected configuration of your Django app. Now all we have to do is set the backend in our cache preference and set the location of the database table.
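A sketch of that configuration — `charcha_cache_table` is a hypothetical table name:

```python
# settings.py — store cached data in a database table
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.db.DatabaseCache",
        "LOCATION": "charcha_cache_table",  # name of the table to use
    }
}
```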
And we are done. This is not as effective as the other options and would probably require a lot of adjustments once we start using a lot of tables. Since there are better options available, let’s have a look at those.
2. Using Memcached
One of the most popular and efficient types of cache supported natively by Django is Memcached. As the name suggests, Memcached is a memory-based cache server. It can dramatically reduce the number of database queries a server has to make and increase the performance of an application by 10x.
Generally, database calls are fast but not fast enough, since a query takes CPU resources to process and data is (usually) retrieved from disk. On the other hand, an in-memory cache like Memcached takes very little CPU resources, and data can be picked up directly from memory. It is not a query-like structure: Memcached uses key-value pairs to get all the data, therefore for obvious reasons you go from a complexity of O(n) or O(n^2) to O(1).
There are a few ways to use Memcached. You could install it individually (if you don’t already have it) by using the command below:
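The original command is missing from this copy; the exact incantation depends on your platform, but a typical setup looks like this (`pymemcache` is one of several Python client libraries Django can use):

```
# Debian/Ubuntu (use `brew install memcached` on macOS)
sudo apt-get install memcached

# Python bindings so Django can talk to the memcached daemon
pip install pymemcache
```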
Once you have Memcached installed, it is pretty easy to use. All we have to do is call it in settings.py and we are done.
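A minimal configuration might look like the following. The backend name varies with your Django version and client library; `PyMemcacheCache` assumes Django 3.2+ with `pymemcache` installed:

```python
# settings.py — point Django's cache at a local memcached daemon
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.memcached.PyMemcacheCache",
        "LOCATION": "127.0.0.1:11211",  # memcached's default port
    }
}
```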
One of the most popular features of Memcached is its ability to share its cache over multiple servers. That means we can run Memcached daemons on multiple machines, and the program will treat the group of machines as a single cache. How?
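By listing every daemon under `LOCATION` — Django distributes keys across the pool for you. The two addresses below are hypothetical machines:

```python
# settings.py — a pool of two memcached machines treated as one cache
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.memcached.PyMemcacheCache",
        "LOCATION": [
            "172.19.26.240:11211",
            "172.19.26.242:11211",
        ],
    }
}
```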
We can also check the behavior of our Memcached server.
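The original snippet is missing here; one way to peek at a running memcached instance (an assumption on my part) is to query its stats interface:

```
# Ask the memcached daemon for its hit/miss counters and memory usage
echo stats | nc -w 1 127.0.0.1 11211
```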
It feels like we have been going over how great Memcached is, but it has a huge disadvantage. Due to its cache granularity level (being memory-based), if the server crashes you lose all your cached data as well, and that’s where it hits you. You basically go down with your server, and restarting your server leaves you with a clean slate in your cache.
3. Using Django-redis
It’s a valid point to note that Redis holds many advantages over Memcached, the only disadvantage being that Redis sits at a lower granularity level than Memcached. Redis offers clustering, and unlike Memcached, support is provided out-of-the-box. Being built-in provides a more robust solution that is easier to administer. It is also exceedingly fast and easy to use. It uses the same key-value pairs as its opponent, so it’s not going to be that difficult to understand. Overall, I feel that neither of these caching systems holds that big a performance improvement over the other, so it boils down to how comfortable you are with each of the two systems.
Personally, I like Redis more as it is easier to set up and gives us a wider range of possibilities. The areas where Redis wins over its opponents are its data persistence, restore, and high availability. This might not really make sense to use unless your data is important.
So let us get Redis into our application and see it in action. You can install Redis using the commands below:
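The original commands are missing; the steps below follow the standard from-source install (on Debian/Ubuntu, `sudo apt-get install redis-server` works too):

```
# Download and build Redis from source
wget http://download.redis.io/redis-stable.tar.gz
tar xzf redis-stable.tar.gz
cd redis-stable
make
```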
Let us run the server now using:
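With the build in place, starting the daemon is a single command:

```
# Start the Redis server (listens on port 6379 by default)
redis-server
```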
Redis also provides us with an awesome CLI. We can get into this CLI and start seeing all the keys which are getting stored as well.
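For example, with the server running you can list every stored key (run this inside `redis-cli`, or as a one-shot command):

```
# Show every key currently stored in Redis
redis-cli KEYS '*'
```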
One of the ways Django can be integrated with Redis is through django-redis, which lets Django execute its commands in Redis. We can install django-redis with pip, and it is just as easy to include: all we have to do is add it in our settings.py. Do note that Redis by default runs on port 6379, so that is the location we are going to point django-redis to.
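Putting that together, a configuration like the one below follows the django-redis documentation; the Redis database number (`/1`) is my assumption:

```python
# settings.py — after `pip install django-redis`
CACHES = {
    "default": {
        "BACKEND": "django_redis.cache.RedisCache",
        "LOCATION": "redis://127.0.0.1:6379/1",  # Redis's default port
        "OPTIONS": {
            "CLIENT_CLASS": "django_redis.client.DefaultClient",
        },
    }
}
```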
Charcha does not have a big overhead, so adding django-redis might seem a bit like overkill, and so did Memcached, but for implementation purposes, and for more heavy-traffic sites, we want to give you an example of how to use it.
Let’s try seeing how well our application runs when we make a lot of requests at the same time. We can use loadtest here to make concurrent calls and assess the performance of our application.
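The original invocation is not preserved; loadtest is an npm CLI, and a run against a local Charcha instance might look like this (the URL and request counts are illustrative):

```
# Install the loadtest CLI once, then fire 400 requests, 10 at a time
npm install -g loadtest
loadtest -n 400 -c 10 http://localhost:8000/
```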
So the total time taken to load is about 7 seconds. This is pretty commendable, but it can be optimized a bit further. We need to start analyzing why it takes this long and which views are getting called. For this, we would use a profiling tool.
When we look at what the code is doing, we can make decisions on how to improve the performance. I can see that a lot of identical queries are happening for the same requests. All I have to do is add:
and reap all the benefits of django-redis. You can see the output below for yourself. Stand back, ladies and gentlemen, benchmarking is happening.
Yup, that’s right. We have achieved a response time of 2 seconds. The overhead is not that much, so using django-redis here is more of a call we need to take.
4. Using Varnish
Varnish is a caching HTTP reverse proxy. It always sits in front of your web server, be it Nginx or anything else. The way Varnish works is that it helps in caching your static pages. The problem with Varnish is that it cannot cache dynamic content. If your site is not very dynamic, you can make good use of Varnish, but you should always check which views can be cached and then cache them. Second, identify how long a delay you can tolerate someone seeing stale content for. The best thing about Varnish is how well it and your existing stack fit together.
Setting up varnish is also really easy to do:
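The original commands are not preserved; on Debian/Ubuntu the install is a one-liner, after which you point Varnish's backend at your Django server in `/etc/varnish/default.vcl`:

```
# Debian/Ubuntu; package names may differ on other distros
sudo apt-get install varnish
```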
Let’s take a page and see its performance and how it is working (for this we will take the discussion page).
How does Varnish help? What does it do? Basically, Varnish is going to sit between your Django app and your users. So, all we have to do now is add the cache_page decorator on our view and it is going to do everything for us. Let’s try applying it to upvote_post and see what happens.
What just happened? Well, Varnish was waiting for the response from upvote_post, and when you made a server call to the function, it held on to the response. The next time we made the call, Varnish just sent back the stored response without having to hit the server view again.
To be more secure, we could add a cookie header with the request so we have some security at this level as well.
This entire implementation is what is called the per-view cache. To explain it in layman’s terms, we are basically storing/caching the responses from each view individually.
Varnish also has its own configuration language (VCL); this can be used for normalization in places where the endpoints do not differ based on the user’s authentication. How?
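The original example is not preserved; as a sketch, a VCL rule like the one below (the file extensions are my assumptions) strips cookies from requests for static assets, so every user shares one cached copy:

```vcl
sub vcl_recv {
    # Static assets look the same for every user, so drop the cookie
    # and let all users hit a single cache entry
    if (req.url ~ "\.(css|js|png|jpg|gif)$") {
        unset req.http.Cookie;
    }
}
```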
We can go ahead with Varnish for now and start caching the views at least. Varnish has a lot of documentation, and I urge you to read up on it.
Caching, as previously mentioned, is a way of reducing the server load and improving the performance of your application. Although most caches assiduously keep working to reduce the dependency on your server, one must always keep in mind the overhead of the cache as well.
Charcha might not require as many levels of caching as high-traffic sites do. If we take an example where Charcha becomes something like Stack Overflow, we can start adding caches to reduce the server response time: something like using django-redis as a store for objects and the results of DB queries, and using Varnish for serving our static pages. It really depends on your use-case and what you require your cache to do.
For now, in Charcha, the previous implementation for
caching as mentioned in our previous post would do just fine.