This is the seventh post in the Django blog series. In this post we will learn how to optimize a Django application’s performance. The inspiration for this post has mainly been the performance improvement of the Charcha Discussion Forum. The source code for the Charcha forum is available here.
For this post, like all preceding posts in this series, we are going to use Charcha’s codebase as an example and optimize its pages from there. Since the range of possible optimizations is exceedingly wide, we will tackle the domains one by one. The main focus will be on the backend (as it is a Django application at the end of the day), and the measurement will be done through django debug toolbar.
The way we are going to proceed is by measuring the performance of the application and seeing where we can optimize it, domain by domain. To assess the site we will use a bunch of tools, the most important ones being django debug toolbar and Chrome’s network tab. We are going to hit the front-end portion first and move forward from there.
Optimizing the Front-end:
Typically, when looking to optimize an application on the front-end, we should look at the entire stack to understand how our app is working. This approach should be followed in all cases, as it gives you a better perspective on how things are functioning.
The moment I got to know that Charcha renders its templates from the server, my focus turned towards caching and minification of the pages (to be honest, these should be present in all applications).
* Caching Approach:
Before we start writing any code, I want you to understand how caching works with server-side rendering and how powerful it really is. As you may know, whenever there is an endpoint which returns data, that data can generally be cached. So the main question is: can we cache HTML? Yes: by rendering on the server we can easily cache the output.
So what’s going to happen is that the first time, your client will get a response and that response will be cached; the next time the same request is made, your browser will not have to redo the rendering, and neither will your server. Now that we have an approach for caching, we can start implementing it. To perform this caching of static assets we are going to use whitenoise. To integrate whitenoise into our Django app, we just have to follow these steps:
- Configuring static files: In Django this is a pretty common step, where you need to add the path of your static files in your settings.py.
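A minimal sketch of that configuration (the paths are illustrative, not necessarily Charcha’s):

```python
# settings.py: tell Django where static files live and where
# collectstatic should gather them.
import os

BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))

STATIC_URL = "/static/"
STATIC_ROOT = os.path.join(BASE_DIR, "staticfiles")
```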
- Enable WhiteNoise: We need to add the whitenoise middleware to our settings.py. Do not forget about the ordering of your MIDDLEWARE_CLASSES: the whitenoise middleware should be above all other middleware apart from Django’s SecurityMiddleware.
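Following the WhiteNoise documentation, the middleware list would look something like this (only the ordering matters here):

```python
# settings.py: WhiteNoise sits directly below SecurityMiddleware
# and above everything else.
MIDDLEWARE_CLASSES = [
    "django.middleware.security.SecurityMiddleware",
    "whitenoise.middleware.WhiteNoiseMiddleware",
    # ... all other middleware ...
]
```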
With that, whitenoise is going to start serving all your static files, but our main purpose of caching is still not done. One thing to note here is that whitenoise also supports compression, which we can use.
- WhiteNoise compression and caching support: whitenoise has a storage backend which automatically takes care of compressing the static files and associates each file with a unique identifier so that these files can be cached forever. To use this feature, all we have to do is point STATICFILES_STORAGE at whitenoise’s storage backend in settings.py.
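That is a one-line setting, straight from the WhiteNoise documentation:

```python
# settings.py: compressed, forever-cacheable static files with
# content-hashed names.
STATICFILES_STORAGE = "whitenoise.storage.CompressedManifestStaticFilesStorage"
```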
- Handling a Content Delivery Network: Although we have enabled caching, we should still look at a CDN to gain maximum efficiency. As whitenoise provides the appropriate headers, our CDN can serve the cached files and make sure that for repeat requests we are not hitting the application again and again, but serving those requests itself. All we have to do is set a DJANGO_STATIC_HOST environment variable and access it in settings.py.
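A sketch of the settings change, following the WhiteNoise CDN instructions (the CDN URL itself is whatever your provider gives you):

```python
import os

# Point static URLs at the CDN when DJANGO_STATIC_HOST is set;
# fall back to serving from the app itself otherwise.
STATIC_HOST = os.environ.get("DJANGO_STATIC_HOST", "")
STATIC_URL = STATIC_HOST + "/static/"
```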
- WhiteNoise in development: This is one of the main parts, because whenever we run the runserver command, Django takes over and handles all the static files on its own, so we would not be using whitenoise as effectively as we want to. To use whitenoise in development, we need to disable Django’s static file handling and allow WhiteNoise to take over serving all the static files. So we need to edit our INSTALLED_APPS.
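Per the WhiteNoise docs, this is done by listing its runserver_nostatic app before Django’s staticfiles app:

```python
# settings.py: let WhiteNoise handle static files under runserver too.
INSTALLED_APPS = [
    "whitenoise.runserver_nostatic",  # must come before staticfiles
    "django.contrib.staticfiles",
    # ... the rest of the apps ...
]
```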
- Caching images: There are a few available settings which can be used to configure whitenoise. We are going to cache the images for a longer duration so as to gain some more performance. For this, all we have to do is add a whitenoise headers function which looks for all the images and sets long-lived cache headers for them. So we write something like:
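Something like the following, using WhiteNoise’s WHITENOISE_ADD_HEADERS_FUNCTION hook (the one-year max-age and the extension list are arbitrary choices):

```python
def cache_images_forever(headers, path, url):
    """Called by WhiteNoise for every static file; set a long-lived
    Cache-Control header on anything that looks like an image."""
    if path.endswith((".jpg", ".jpeg", ".png", ".gif", ".ico", ".svg")):
        headers["Cache-Control"] = "public, max-age=31536000"

# settings.py
WHITENOISE_ADD_HEADERS_FUNCTION = cache_images_forever
```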
Now our caching for the front-end is complete. Let’s move on to the second phase of front-end optimization and try to minify our files.
* Minifying the Files:
For minifying our files we are going to use the spaceless template tag. Although this will not minify our files per se, it still gives us a better result, as it helps reduce our page weight. How? Well, spaceless removes whitespace between HTML tags, including tab characters and newlines. This is really easy to implement in Django templates: all we need to do is open a spaceless tag and close it with endspaceless in base.html. As base.html is used everywhere as the basic skeleton, we can be sure it will be applied to all the templates that extend it.
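In base.html it would look roughly like this (the markup shown is just a skeleton):

```html
{% spaceless %}
<html>
  <body>
    {% block content %}{% endblock %}
  </body>
</html>
{% endspaceless %}
```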
Now that we are done with our front-end optimization let’s move to our back-end. This is perhaps the one place where we would be able to achieve the maximum amount of efficiency.
Optimizing the Back-end:
Ideally, the flow for optimizing your code is to move from your queries to how your Django application itself is functioning. For queries, we need to look at the number of queries, the amount of time each takes to execute, and how our Django ORM queries are working. Internally, Django already does a lot for us, but we can optimize its queries even further.
We need to start scavenging the code and check the total number of query calls per page. There are a few ways to do this, the most obvious one being logging all the queries in sql.log and reading them from there. Another way of accomplishing this is by using django debug toolbar.
django debug toolbar is a really powerful tool and extremely easy to use. It helps in debugging responses and provides us with a bunch of configurable panels that we could use. We are mainly going to use this to see how quick our queries are and if we are performing any redundant queries.
Django Debug Toolbar
So let’s first integrate the toolbar into Charcha. To install django debug toolbar we are going to follow this doc. It’s pretty straightforward and easy to implement. First we need to install the package using pip.
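Assuming pip, inside the project’s virtualenv:

```shell
pip install django-debug-toolbar
```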
One thing to note here is that if you are using a virtualenv, avoid using sudo; ideally we should not be using sudo anyway when inside a virtualenv. Now that django debug toolbar is installed, we need to add it to our INSTALLED_APPS.
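As per the toolbar’s installation doc:

```python
# settings.py
INSTALLED_APPS = [
    # ... existing apps ...
    "django.contrib.staticfiles",
    "debug_toolbar",
]
```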
Now we need to put the debug toolbar in our middleware. Since django debug toolbar’s middleware has no strict position in the middleware stack, we can add it wherever we see fit.
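For example, appended at the end of the stack:

```python
# settings.py
MIDDLEWARE_CLASSES = [
    # ... existing middleware ...
    "debug_toolbar.middleware.DebugToolbarMiddleware",
]
```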
Now we need to add the URL to configure the toolbar. We just need to add the URL under DEBUG mode; the rest django debug toolbar will take care of on its own.
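In urls.py, guarded by DEBUG (this follows the toolbar’s documented setup for the Django version in use at the time):

```python
from django.conf import settings
from django.conf.urls import include, url

urlpatterns = [
    # ... the site's existing URL patterns ...
]

if settings.DEBUG:
    import debug_toolbar
    urlpatterns += [
        url(r"^__debug__/", include(debug_toolbar.urls)),
    ]
```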
There are also a few configurations that we could use, but for now we are going to stick with the defaults, as nothing more is required (with the correct version this is done automatically).
Also, since we are running locally, we need to set INTERNAL_IPS for django debug toolbar to know which requests should display the toolbar.
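For local development that is simply:

```python
# settings.py: the toolbar only renders for requests from these IPs.
INTERNAL_IPS = ["127.0.0.1"]
```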
And now we test. Mostly our screen would look something like the image below.
We can now start checking the number of queries that are being run and how much time they are taking to execute as well.
This part is actually fairly easy, as all we have to do is set a benchmark based on the amount of data we have. We also need to consider the compounding effect: if we have a lot of data behind a lot of foreign keys, we might need to start using indexes there.
To show how we are going to refactor the code, we went ahead and measured the performance of all the pages. Most of the refactoring looks at how we can reduce the number of queries. As an example, we are going to look at the discussion page.
Now, according to the django debug toolbar, we are making 8 queries whenever we view a post. This would later start giving us problems if we don’t eliminate a few of them. We are going to approach query optimization in the following subparts:
Solving N+1 queries:
We can solve the problem of N+1 queries in Charcha simply by using select_related and prefetch_related. With these two functions we can get a tremendous performance boost as well. But first, we need to understand what exactly they do and how we can implement them in Charcha.
select_related should be used when the object you are selecting is a single object of a model, so something like a ForeignKey or a OneToOneField. Whenever you make such a call, select_related will do a join of the two tables and send back the result to you, thereby saving an extra query to the database.
Let’s see an example to better understand how we can integrate this in Charcha.
We are going to take the example of the
Post model which looks something like this:
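A simplified sketch of the relevant parts (the field names here are illustrative; see Charcha’s source for the real model):

```python
from django.db import models

class Votable(models.Model):
    # Abstract base model shared by votable things like posts
    # and comments.
    upvotes = models.IntegerField(default=0)
    downvotes = models.IntegerField(default=0)

    class Meta:
        abstract = True

class PostsManager(models.Manager):
    def get_post_with_my_votes(self):
        # The real manager annotates the queryset with the current
        # user's votes; elided here.
        return self.get_queryset()

class Post(Votable):
    title = models.CharField(max_length=120)
    author = models.ForeignKey("auth.User", on_delete=models.CASCADE)

    objects = PostsManager()
```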
Do note that we have a custom manager defined, so whichever query we need to execute can be defined in our manager. At first glance you can see that our Post class inherits from Votable, so we now need to see what is happening in that class.
Whenever we make a call to check the author of our Post, we make an extra call to the database. So now we go to our custom manager and change the way we are fetching the data.
Using the django debug toolbar, you can see that whenever we make a call like get_post_with_my_votes().author, we execute an extra query against the User table.
This is not required and can be rectified easily by using select_related. What do we have to do? Just add select_related to the query.
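In Charcha’s custom manager this amounts to a one-line change, sketched here with illustrative names:

```python
from django.db import models

class PostsManager(models.Manager):
    def get_post_with_my_votes(self):
        # select_related("author") joins the User table into the same
        # SQL query, so post.author no longer costs an extra lookup.
        return self.get_queryset().select_related("author")
```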
And that’s it. Our redundant query should be removed. Let’s check it using django debug toolbar.
We can use prefetch_related when we are going to get a set of things, so basically something like a ManyToManyField or a reverse ForeignKey. How does this help? Well, prefetch_related works by making a separate query for the related objects and joining them in Python, which avoids duplicating columns in the original result. As of now this is not really required, so we are going to let it be.
* Query in a loop:
Though this is not done anywhere in Charcha, it is a practice that a lot of people follow without realising its impact. Although it seems pretty basic, I still feel it deserves its own section.
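As an illustration of the anti-pattern (the Post/comment names are hypothetical, not Charcha’s code):

```python
# Anti-pattern: one query per post, N+1 queries in total.
for post in Post.objects.all():
    comments = list(post.comment_set.all())  # hits the DB every iteration

# Better: batch the related lookup into a single extra query.
for post in Post.objects.prefetch_related("comment_set"):
    comments = list(post.comment_set.all())  # served from the prefetch cache
```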
The above example is just a sample of how disastrous a query in a loop can be for a large dataset, and it can easily be solved by the previous discussion (on prefetch_related) we had.
* Denormalization:
I recommend using denormalization only if we have actual performance issues, not as premature optimization. We should check the queries before we start denormalizing, as it makes no sense to keep adding complexity if it is not impacting our performance.
The best place to explain the denormalization implemented in Charcha is the Votable model, as done in the 2nd blog of this series. Why the Votable table? Because we want to show the score, i.e. upvotes minus downvotes, and the number of comments on the home page. We could compute these with a join on the Votable table, but that would slow down our system. Instead, we add these counts as fields on the Votable model, which in turn reduces our calls.
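Sketched out, the denormalized Votable might look like this (the field names are illustrative):

```python
from django.db import models

class Votable(models.Model):
    # Counters maintained at write time so list pages never need
    # joins or aggregate queries.
    upvotes = models.IntegerField(default=0)
    downvotes = models.IntegerField(default=0)
    num_comments = models.IntegerField(default=0)

    class Meta:
        abstract = True

    @property
    def score(self):
        return self.upvotes - self.downvotes
```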
Now we can just inherit these properties in whichever models require them and move forward from there. Although this seems pretty easy to implement, it does have a drawback: every time a change is made, we have to update these fields. So instead of making a JOIN at read time, we now have a slightly more complex write path that keeps the counts in sync.
WBS for Tracking Comments Hierarchy
This is also a form of denormalization. Each comment needs a reference to its parent comment, and so on, so it basically forms a tree structure.
The problem here is that self-referential queries are exceedingly slow. So we can refactor this approach and add a new field called wbs which helps us track the comments as a tree. How does it work? Every comment gets a code, which is a dotted path. The first top-level comment has the code “.0001”, the second top-level comment has the code “.0002”, and so on. Now if someone responds to the first comment, the reply gets the code “.0001.0001”. This helps us avoid self-referential queries; we can simply query by wbs prefix instead.
The limitation of this field is that we only allow 9999 comments at each level, and the wbs tree can only go 6 levels deep, which is acceptable in our case. But when querying a large database, we would have to index this field as well. We will discuss this in the next section.
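The code-generation rule itself is simple enough to sketch in plain Python (the function name is mine, not Charcha’s):

```python
def next_wbs(parent_wbs, num_existing_replies):
    """Build the dotted wbs code for the next reply under parent_wbs.
    Top-level comments use parent_wbs = ""."""
    if num_existing_replies >= 9999:
        raise ValueError("at most 9999 comments per level")
    return "%s.%04d" % (parent_wbs, num_existing_replies + 1)
```

Fetching a whole subtree then becomes a simple prefix filter, something like Comment.objects.filter(wbs__startswith=".0001").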
* Indexes:
Indexes are one of the many standard DB optimization techniques, and Django provides us with tools to add them. Once we have identified which queries are taking a long time, we can use Meta.index_together to add the indexes from Django.
Before we start adding indexes, we need to identify where they should go, and to do that we will use django debug toolbar to see how fast we get our response. We are going to look at the post page from before and track its queries, and we will pick a query which we can easily optimize with indexes (given below).
Now, this particular query is taking 1.45s to execute. All we have to work out is the table and the field(s) we should add the index on. Since this query belongs to the Votes model, we are going to add the index there.
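A sketch of what that looks like on the model (the fields and the indexed pair are assumptions; pick the columns that appear in the slow query’s WHERE clause):

```python
from django.db import models

class Votes(models.Model):
    content_object = models.ForeignKey("Post", on_delete=models.CASCADE)
    voter = models.ForeignKey("auth.User", on_delete=models.CASCADE)
    type_of_vote = models.IntegerField()

    class Meta:
        # Composite index covering the lookup the slow query performs.
        index_together = [
            ("content_object", "voter"),
        ]
```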
And that’s all we have to do. Now we run our migrations and check our performance.
Now, this query is taking only 0.4 seconds to execute and that is how easy it is to implement indexes.
Django QuerySets are LAZY
One thing to note when using Django is that querysets are lazy: building a queryset does not cause any database activity, and Django only runs the query when the queryset is evaluated. So if we make a series of calls from the Post model that each evaluate the queryset, we can end up making three separate queries, which is not really required.
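A sketch of how that happens (reusing the get_post_with_my_votes call from earlier; exists and count are just examples of operations that each evaluate the queryset):

```python
posts = Post.objects.get_post_with_my_votes()  # no SQL yet: lazy

if posts.exists():                      # query 1
    total = posts.count()               # query 2
    titles = [p.title for p in posts]   # query 3: the actual fetch

# Better: evaluate once and reuse the cached result.
posts = list(Post.objects.get_post_with_my_votes())  # single query
if posts:
    total = len(posts)
```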
One thing we notice using the django debug toolbar is that almost every page makes a query to retrieve the Django session. The query that we are tackling is given below; since it is issued almost everywhere, we can simply cache the session once and reap the benefits everywhere else.
Django stores the sessions in our database and expects us to occasionally prune out old entries which we would not be doing.
So on each request, we are doing one SQL query to get the session data and another to grab the User object. This is really not required, and we can easily add a cache here to reduce the load on our server. But the question remains: which store should we use? From an operational point of view, introducing a distributed data store like redis is not really required here. Instead, we can simply use cookie-based sessions.
Do note that we would not be using cookie-based sessions for highly secure sites but Charcha does not need to be highly secure.
How do we implement it?
Using cookie-based sessions is very easy to do. We can simply follow the steps given in this link. All we have to do is set the session engine in our settings file (common.py as per Charcha) and see the magic happen. We will switch to storing our sessions in a signed cookie and easily remove a SQL query from every single request to our site.
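The whole change is one setting:

```python
# settings/common.py: store the session payload in a signed cookie
# instead of the database.
SESSION_ENGINE = "django.contrib.sessions.backends.signed_cookies"
```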
Now from the snapshot given below, we can see that our redundant query is removed.
Hence, we have accomplished our task of optimizing our site.
In this post, we discussed how one should proceed when optimizing a site. We tackled the domains one by one, discussed various techniques to optimize a Django site, and talked about tools like django debug toolbar which can be used for a site’s assessment.