At HashedIn, we commonly deploy Django-based applications on AWS Elastic Beanstalk. While EB is great, it does have some edge cases. Here is a list of things you should be aware of if you are deploying a Django application.
Aside: If you are starting a new project meant for Elastic Beanstalk, the Django Project Template can simplify the configuration.
Gotcha #1: Auto Scaling Group health check doesn’t work as you’d think
Elastic Beanstalk lets you configure a health check URL. The Elastic Load Balancer uses this URL to decide whether instances are healthy. The Auto Scaling group, however, does not use this health check.
So if an instance fails its health check for some reason, the Elastic Load Balancer marks it as unhealthy and stops routing traffic to it. The Auto Scaling group, however, still considers the instance healthy and doesn’t relaunch it.
Elastic Beanstalk keeps it this way to give you the chance to SSH into the machine and find out what is wrong. If the Auto Scaling group terminated the machine immediately, you wouldn’t have that option.
The fix is to configure the Auto Scaling group to use the Elastic Load Balancer health check. You can do this from a config file under .ebextensions by setting HealthCheckType to ELB on the AWSEBAutoScalingGroup resource.
Credits: EB Deployer Tips and Tricks
Gotcha #2: Custom logs don’t work with Elastic Beanstalk
By default, the wsgi account doesn’t have write access to the current working directory, so your log files won’t work. According to Beanstalk’s documentation, the trick is to write the log files under the /opt/python/log directory.
However, this doesn’t always work as expected. When Django creates the log file in that directory, the log file is owned by root, and hence Django cannot write to that file.
The trick is to run a small script as part of .ebextensions that fixes the ownership of the log file so that the wsgi account can write to it. Put this script in .ebextensions/logging.config.
With this change, you can write your custom log files to this directory. As a bonus, when you fetch logs using the Elastic Beanstalk console or the eb tool, your custom log files will also be downloaded.
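On the Django side, the settings only need to point a file handler at that directory. Here is a minimal sketch; the file name app.log, the logger name myapp and the helper build_logging are placeholders made up for illustration, not something Elastic Beanstalk requires.

```python
import logging
import logging.config
import os


def build_logging(log_dir="/opt/python/log"):
    """Return a Django-style LOGGING dict that writes to a file in log_dir."""
    return {
        "version": 1,
        "disable_existing_loggers": False,
        "formatters": {
            "simple": {"format": "%(asctime)s %(levelname)s %(name)s %(message)s"},
        },
        "handlers": {
            "app_file": {
                "class": "logging.FileHandler",
                # "app.log" is a hypothetical file name; use whatever suits your app
                "filename": os.path.join(log_dir, "app.log"),
                "formatter": "simple",
            },
        },
        "loggers": {
            # "myapp" is a placeholder for your own package's logger name
            "myapp": {"handlers": ["app_file"], "level": "INFO", "propagate": False},
        },
    }


# In settings.py, Django passes this dict to logging.config.dictConfig at startup
LOGGING = build_logging()
```

Taking the directory as a parameter makes it easy to point the same settings at a local folder during development.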
Gotcha #3: Elastic load balancer health check does not set host header
Django’s ALLOWED_HOSTS setting requires you to whitelist the host names that will be allowed. The problem is that the Elastic Load Balancer health check does not send a host name when it makes requests. Instead, it connects directly to the private IP address of your instance, so the Host header is the private IP address of your instance.
There are several not-so-optimal solutions to the problem:
Terminate the health check on Apache, for example by setting the health check URL to a static file served by Apache. The problem with this approach is that if Django isn’t working, the health check will not report a failure.
Use a TCP-based health check, which just checks whether port 80 is up. This has the same problem: if Django doesn’t work, the health check will not report a failure.
Set ALLOWED_HOSTS = ['*']. This disables Host checks altogether, opening up security issues. Also, if you mess up DNS, you can very easily send QA traffic to production.
A slightly better solution is to detect the internal IP address of the server and add it to ALLOWED_HOSTS at startup. Doing this reliably is a bit involved, though. A small startup script can do it, assuming your EB environment is Linux.
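One way to sketch this is to ask the EC2 instance metadata service for the private IP when the settings load. The code below is an illustration under that assumption, not a drop-in script. The helper names fetch_private_ip and with_private_ip are made up for this example, and www.example.com stands in for your real host name.

```python
import urllib.request

# Link-local address of the EC2 instance metadata service
METADATA_BASE = "http://169.254.169.254/latest"


def fetch_private_ip(timeout=1.0):
    """Return this instance's private IPv4 address, or None if unavailable."""
    try:
        # IMDSv2: request a short-lived session token first
        token_req = urllib.request.Request(
            METADATA_BASE + "/api/token",
            method="PUT",
            headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
        )
        with urllib.request.urlopen(token_req, timeout=timeout) as resp:
            token = resp.read().decode("ascii")
        # Use the token to read the private IPv4 address
        ip_req = urllib.request.Request(
            METADATA_BASE + "/meta-data/local-ipv4",
            headers={"X-aws-ec2-metadata-token": token},
        )
        with urllib.request.urlopen(ip_req, timeout=timeout) as resp:
            return resp.read().decode("ascii").strip()
    except OSError:
        # Not running on EC2, or the metadata service is unreachable
        return None


def with_private_ip(allowed_hosts, private_ip):
    """Return a copy of allowed_hosts with private_ip appended when known."""
    hosts = list(allowed_hosts)
    if private_ip and private_ip not in hosts:
        hosts.append(private_ip)
    return hosts


# In settings.py (commented out here so that importing this sketch makes no network call):
# ALLOWED_HOSTS = with_private_ip(["www.example.com"], fetch_private_ip())
```

The short timeout matters: outside EC2 the metadata address does not answer, and you don't want local development to hang at startup.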
Depending on your situation, this may be more work than you care about, in which case you can simply set ALLOWED_HOSTS to ['*'].
Gotcha #4: Apache Server on EB isn’t configured for performance
For performance reasons, you want text files to be compressed, usually using gzip. Internally, Elastic Beanstalk for Python uses the Apache web server, but it is not configured to gzip content.
This is easily fixed by adding yet another config file under .ebextensions that turns on compression in Apache.
Also, if you are versioning your static files, you may want to set strong cache headers. By default, Apache doesn’t set these headers. A similar configuration file can set Cache-Control headers on any static file that has a version number embedded in its name.
Gotcha #5: Elastic Beanstalk will delete RDS database when you terminate your environment
For this reason, always create the RDS instance independently of the Elastic Beanstalk environment. Pass the database connection details to the application as an environment variable. In your Django settings, use dj_database_url to parse the database credentials from the environment variable.
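As a rough sketch, the relevant part of settings.py could look like this. It assumes the dj_database_url package is installed and that the environment variable is named DATABASE_URL, which is the name dj_database_url.config() reads by default. The host name in the comment is a placeholder.

```python
import dj_database_url

# dj_database_url.config() reads the DATABASE_URL environment variable, e.g.
#   postgres://user:password@mydb.example.us-east-1.rds.amazonaws.com:5432/appdb
# (hypothetical host and credentials) and returns a Django database settings dict.
DATABASES = {
    # conn_max_age keeps connections open for reuse; 600 seconds is an arbitrary choice
    "default": dj_database_url.config(conn_max_age=600),
}
```

You can set DATABASE_URL as an environment property on the Elastic Beanstalk environment, so the credentials stay out of your source code.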