The farming equipment rental business is largely an unorganized sector. Suppliers (who own farming equipment and offer it on rent) mostly take orders in an ad-hoc fashion and simply maintain records in diaries.

 

This unorganized setup makes the timely availability of the required equipment almost always uncertain, which keeps farmers on edge and also delays the farming process.

 

In today’s tech-savvy world, we do have a solution for this. To tackle the problem, our client approached us with an idea they called “Uber for farmers”.

 

We started by building a reliable and scalable cloud-based platform to organize the farming equipment rental business. The platform aims to bridge the gap between farmers and suppliers through technology and to ensure reliable, timely, on-demand service.

 

Taking a cue from Uber and Ola, the popular cab-hailing services in India, the client came up with the idea of extending the concept to the farming sector: suppliers can list their equipment, farmers can book it online, and the system helps them track availability.

 

Within a span of three to six months, we were able to deliver the platform along with mobile apps for suppliers and farmers. Using the apps, farmers can search for and place orders for equipment, and suppliers can accept and fulfill them.

 

These apps allow suppliers to be more organized: they can track their orders, income, and inventory, and get monthly and yearly revenue reports.

 

Farmers get visibility into the equipment available in their neighborhood and can choose from different pricing models based on duration (per hour) or area (per acre). Later, we added more sophisticated features and apps as the business needed them.

What we have built 

Platform

Back-end application which houses the core business logic and hosts the web interface and the APIs consumed by the apps. Presently, it is a Python/Django-based monolithic web application with various coherent subsystems such as Order Management, Inventory Management, Pricing, Order Tracking, Onboarding, Notifications, and Promotions.

 

In addition to Django, we have used several Django and Python libraries, such as DRF (Django REST Framework), django-fsm, Arrow, and JinjaSQL.

 

Django-FSM

In any order management system, tracking the order lifecycle is important, and any transition in its state should be controlled and audited. A finite state machine (FSM) helps here by providing a controlled and predictable way to perform order state transitions.

 

django-fsm defines a Django field and decorators that help you easily define the complete state flow. Transitions between states can be guarded by user permissions and business rules, e.g.

# Order can be transitioned to the started state only when:
# 1. It is already in the confirmed state
# 2. The user who triggered this has the right permission
@transition(field=state, source='confirmed', target='started',
  permission=lambda instance, user: user.has_perm('order.can_start'))
def start(self):
  pass
See the django-fsm project on GitHub to read more.
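For context, here is a minimal, hypothetical sketch of how an order model using django-fsm might look end to end; the Order model, its fields, and the permission name are illustrative assumptions, not the actual project code.

from django.db import models
from django_fsm import FSMField, transition

class Order(models.Model):
    # Illustrative model; field and permission names are assumptions
    state = FSMField(default='new')

    @transition(field=state, source='new', target='confirmed')
    def confirm(self):
        pass  # side effects of confirmation (notifications, etc.) go here

    @transition(field=state, source='confirmed', target='started',
                permission='order.can_start')
    def start(self):
        pass  # side effects of starting the order go here

Calling order.start() when the order is not in the confirmed state raises django-fsm's TransitionNotAllowed exception, which is exactly the controlled, auditable behavior described above.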

 

DRF (Django REST Framework)

Building REST APIs requires serializing and deserializing data models to and from JSON. DRF is a solid Django plugin which helps in quickly building REST APIs on top of Django models.
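As a small illustration (the Order model and field names below are assumptions, not the project's actual schema), a DRF serializer and viewset can expose a model as a REST endpoint in a few lines:

from rest_framework import serializers, viewsets
from .models import Order  # hypothetical model

class OrderSerializer(serializers.ModelSerializer):
    class Meta:
        model = Order
        fields = ('id', 'state', 'created_at')  # illustrative fields

class OrderViewSet(viewsets.ModelViewSet):
    """Provides list/retrieve/create/update/delete endpoints for orders."""
    queryset = Order.objects.all()
    serializer_class = OrderSerializer

# Registered with a DRF router, e.g. router.register(r'orders', OrderViewSet)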

 

Celery

Celery is a Python-based framework for running background workers. The platform uses Celery to run both simple and workflow-oriented tasks such as Order Assignment, Notification, and Breakdown Checkup. Designing workflows with Celery took some effort, and we have done it for a few of these flows.

 

To discuss simple collaborative workflow modeling with Celery, let's take the case of order assignment and peek into how we modeled its workflow. For any new order, we have to find a supplier and assign the order to them. Following are the business steps for new order assignment.

 

 

  1. Find suppliers (near the order location, with the ordered equipment, available for the order schedule, with a good rating, etc.).
  2. Contact each supplier sequentially and wait for their response.
  3. If the supplier accepts within X minutes (where X can be configured from the backend), send a confirmation to the farmer and stop the process.
  4. Else, continue with the next supplier.
  5. If no one has confirmed, retry the same process after some time.

As you can see, the assignment process may take 2-10 minutes (assuming 5 suppliers and 2 minutes each) and shouldn't be done in the request thread. These steps also require waiting between contacts for a confirmation, which would block a worker thread if done with sleeps. So, we decided to break the flow into smaller tasks that hand off to one another in an almost recursive manner. Following are the logical Celery tasks:

 

  1. FindSupplierTask: Takes an order and simply finds the list of probable suppliers.
  2. ContactSupplierTask: Takes an order and a supplier, and contacts the supplier by sending a new-order notification.
  3. DeciderTask: Acts as a collaborator between ContactSupplierTask and FindSupplierTask. It takes an order and a list of suppliers and then contacts them one by one.

 

The implementation flow with these Celery tasks is:

  1. The order management system submits a new order to FindSupplierTask.
  2. FindSupplierTask finds the list of probable suppliers and then invokes DeciderTask with the order_id, the suppliers found, and the current index set to zero.
  3. DeciderTask first checks whether the order has been confirmed. If not, and the supplier list is exhausted, it stops and updates the status; otherwise it calls ContactSupplierTask with the order, the next supplier, and a timeout (which says when the decider should be called again).
  4. ContactSupplierTask then sends a notification to the supplier and schedules DeciderTask to be called after the ETA given by the decider in the form of the timeout.
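Below is a minimal sketch of this hand-off pattern with Celery; the task and helper names (find_suppliers, is_order_confirmed, and so on) are illustrative assumptions, not the actual project code.

from celery import shared_task

CONFIRMATION_WINDOW_SECONDS = 120  # the configurable "X minutes" window

@shared_task
def find_supplier_task(order_id):
    # Hypothetical helper applying the location/equipment/availability/rating filters
    supplier_ids = find_suppliers(order_id)
    decider_task.delay(order_id, supplier_ids, 0)

@shared_task
def decider_task(order_id, supplier_ids, index):
    if is_order_confirmed(order_id):       # hypothetical helper
        notify_farmer(order_id)            # hypothetical helper
        return
    if index >= len(supplier_ids):
        schedule_retry_later(order_id)     # hypothetical helper
        return
    contact_supplier_task.delay(order_id, supplier_ids, index)

@shared_task
def contact_supplier_task(order_id, supplier_ids, index):
    send_new_order_notification(order_id, supplier_ids[index])  # hypothetical helper
    # Instead of sleeping, re-run the decider after the confirmation window
    decider_task.apply_async(
        args=(order_id, supplier_ids, index + 1),
        countdown=CONFIRMATION_WINDOW_SECONDS,
    )

The countdown on apply_async is what replaces the blocking wait: each task finishes quickly, and the decider is simply re-scheduled after the confirmation window.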

 

This way, every task has a short life during which it performs its assigned work and then delegates to the next task for subsequent action. The decider task acts as a coordinator between the different tasks and takes decisions based on the output of the previous task.

 

These are very high-level details, but they give the basic idea. Celery is a powerful framework for running such background tasks and retrying them on failure with back-offs. Refer to the Celery documentation for more details.

 

Arrow

Any web application built for a global audience should handle date-times very carefully. As a rule of thumb, it is always better to store date-times in UTC and then convert them to the appropriate timezone, as per user preference, in the presentation layer. Python's built-in datetime is naive by default and verbose for such tasks. Arrow extends Python's datetime and provides helpful factories and fluent APIs for working with date-times and timezones.
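A small illustration of Arrow's fluent API (the timezone and format strings are just examples):

import arrow

utc_now = arrow.utcnow()                    # store timestamps in UTC
local = utc_now.to('Asia/Kolkata')          # convert in the presentation layer
print(local.format('YYYY-MM-DD HH:mm ZZ'))  # formatted in the user's timezone
print(utc_now.shift(hours=+2).humanize())   # "in 2 hours"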

 

JinjaSQL

The platform exposes some APIs to communicate business performance summaries to the apps. Since the data layer uses a relational schema with normalized tables, the required information is scattered across various tables and needs to be joined and extracted for these APIs.

 

Also, this data needs to be processed dynamically to produce digestible information. Using the ORM for such dynamic, processed data is complex and time-consuming. This is where we have used JinjaSQL (an in-house open source project), which allows writing raw SQL using Jinja templates and is therefore powerful for generating dynamic SQL.
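As a minimal sketch (the table and column names below are illustrative, not the platform's real schema), JinjaSQL turns a Jinja template into a parameterized query plus bind parameters:

from jinjasql import JinjaSql

j = JinjaSql()
template = """
SELECT supplier_id, SUM(amount) AS revenue
FROM orders
WHERE created_at >= {{ start_date }}
{% if equipment_type %}
  AND equipment_type = {{ equipment_type }}
{% endif %}
GROUP BY supplier_id
"""
query, bind_params = j.prepare_query(template, {
    "start_date": "2018-01-01",
    "equipment_type": "tractor",
})
# query is parameterized SQL; execute it with your DB cursor:
# cursor.execute(query, bind_params)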

Apps

We built mobile apps for the suppliers, hub managers, and management, and provided backend support for the farmer app. Since the target segment is rural, the mobile apps are currently available on Android only.

 

Architecturally, the apps are native Android, built on Android Architecture Components and the MVVM pattern, with a focus on minimizing bandwidth usage through a read-through cache approach. They use a SQLite database to store (cache) data on the device.

 

For updates generated from the app, we opted for a simple strategy: first apply the update on the server and then refresh the cached objects. This way we avoid conflicts and side effects of updates whose behavior is governed by changing business requirements.

Third Party Integrations

We have used the following third-party applications and tools for various needs of the project.

  1. Exotel: Exotel provides communication services like IVR, SMS, and OTP. To allow orders to be started, stopped, or breakdowns to be reported from basic phones or from areas with low network coverage, we integrated with Exotel for IVR calls. Operators can make toll-free IVR calls, enter their order codes, and select the operation they want to perform. The platform also uses Exotel to send SMS notifications in various languages. Exotel provides a nice dashboard for managing SMS templates and tracking the delivery status of sent SMSs.
  2. Two Factor: Our apps use an OTP-based authentication system. The platform integrates with TwoFactor to provide the OTP service.
  3. FCM (Firebase Cloud Messaging): To communicate updates to the supplier, such as new orders, start, stop, breakdown, cancellation, and acceptance, we use Google's FCM service, which provides a reliable mechanism for pushing updates to mobile devices (see the sketch after this list).
  4. AWS (S3, RDS, EC2, etc.): We use AWS as the cloud provider for hosting a scalable and fault-tolerant application. Following are a few of the AWS services we use.
    1. RDS (Relational Database Service): AWS RDS provides a fully managed relational database service with automated backups. We use MySQL on RDS for our database layer.
    2. S3 (Simple Storage Service): AWS S3 provides a scalable object store with eleven 9's of durability. We use it to host all public and private static assets.
    3. ElastiCache: Caching service used by the application to store frequently used objects and as the broker for Celery workers.
    4. Elastic Beanstalk: A PaaS-like offering from AWS that manages load balancing, auto-scaling, the runtime environment, etc. for your application. We use Elastic Beanstalk to host the platform services.
  5. Squealy: One obvious business need was a reporting service to easily generate and share reports with the concerned people in a timely manner. We used an in-house solution, Squealy, for these reporting needs. Squealy is a Django + JinjaSQL + Celery based open source web application which can be deployed on your own infrastructure; it provides an interface for writing SQL using the powerful syntax of JinjaSQL and generating reports from it. These reports can be produced in various forms like tables and charts, and can be downloaded or scheduled to be emailed periodically.
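As a hedged illustration of the FCM push path using the firebase_admin Python SDK (the payload fields, token handling, and service-account path are assumptions, not the platform's actual code):

import firebase_admin
from firebase_admin import credentials, messaging

# Path to the Firebase service-account key is an assumption for this sketch
cred = credentials.Certificate("path/to/service-account.json")
firebase_admin.initialize_app(cred)

def push_order_update(device_token, order_id, event):
    """Send a data-only FCM message to a supplier's device."""
    message = messaging.Message(
        data={"type": event, "order_id": str(order_id)},  # e.g. event = "NEW_ORDER"
        token=device_token,
    )
    return messaging.send(message)  # returns the FCM message id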

 

 

This is just a 10,000-foot view of the integrations we have and of what has been achieved. The system is live and functional, with new features continuously being discovered and developed on top. To discuss more, please feel free to reach out to us at contact@hashedin.com

Docker Compose is a tool for defining and running multi-container Docker applications. It allows you to create and test applications based on multifaceted software stacks and libraries. In this blog we explore ways to use docker-compose to manage deployments.

Need for Docker Compose

An application can consist of multiple tiers or sub-components. In a containerized deployment, each of these components should be deployed as an individual unit. For example, if an application consists of a database and a caching server, then the database and the caching server should be treated as individual components and deployed separately. A very simple philosophy is: "Each container should run only one process".

 

Running multiple containers using the docker CLI is possible but really painful. Scaling any individual component might also be a requirement, which adds more complexity to container management. Docker Compose is a tool which addresses this problem very efficiently. It uses a simple YAML file to describe the complete application and the dependencies between its components. It also provides a convenient way to monitor and scale individual components, which it calls services. In the following section, we will see how to use docker-compose to manage Charcha's production-ready deployment.

 

Using docker compose to manage Charcha’s production ready deployment

In the previous blog, create production-ready docker image, we created a production-ready Docker image of Charcha. We are going to use the same image in this discussion. Let's start with a simple compose file. For the production system we need the following:

 

  1. Database: As per our settings file, we need postgres.
  2. App Server: A production-ready app server to serve our Django app. We are going to use gunicorn for this.
  3. Reverse Proxy / Web Server: Our app server should run behind a reverse proxy to protect it from denial-of-service attacks; running gunicorn behind a reverse proxy is recommended. The reverse proxy will also perform a few additional things:
    3.1. Serve pre-gzipped static files for the application.
    3.2. SSL offloading/termination. Read this to understand the benefits.

 

Let's build each service step by step in a docker-compose.yml file created at the root of the project. For brevity, every step will only add the configs related to that step.

 

1) Create service for database

  version: '2'
  services:
    db:  # Service name
      # This is important, always restart this service if it gets stopped
      restart: always
      # Use the official postgres image
      image: postgres:latest
      # Expose the postgres port to be used by the web service
      expose:
        - 5432
      environment:
        - POSTGRES_PASSWORD=password
        - POSTGRES_USER=user
        - POSTGRES_DB=charcha

 

2) Create service for an app

To create our app service we are going to use the previously discussed [Dockerfile](/2017/05/02/create-production-ready-docker-image) for charcha. This service will run the DB migrations (and hence needs to be linked with the database service) and run a gunicorn application on port 8000.

Here is the config for this service (named web, since the nginx config and the final compose file refer to it by that name) which needs to be added to the previously created docker-compose file.

  web:
    build: .
    # For this service run init.sh
    command: sh ./init.sh
    restart: always
    # expose port for other containers
    expose:
      - "8000"
    # Link database container
    links:
      - db:db
    # export environment variables for this container
    # NOTE: In production, value of these should be replaced with
    # ${variable} which will be provided at runtime.
    environment:
      - DJANGO_SETTINGS_MODULE=charcha.settings.production
      - DATABASE_URL=postgres://user:password@db:5432/charcha
      - DJANGO_SECRET_KEY=ljwwdojoqdjoqojwjqdoqwodq
      - LOGENTRIES_KEY=${LOGENTRIES_KEY}
    # init.sh runs the DB migrations and then starts gunicorn, effectively:
    #   python manage.py migrate --no-input && gunicorn charcha.wsgi -b 0.0.0.0:8000

 

3) Create reverse proxy service

To create the reverse proxy service we are going to use the official nginx image and will mount the charcha/staticfiles folder into the nginx container. Before creating the docker-compose config for this service, we need the following things.

3.1 An SSL certificate:

Charcha's production settings have been configured to accept only HTTPS requests. Instead of adding the SSL certificate at the app server, we will add the certificate at nginx to offload SSL there, which gives a performance gain. Follow these steps to create a self-signed SSL certificate.

   $ mkdir -p deployment/ssl
   $ sudo openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout deployment/ssl/nginx.key -out deployment/ssl/nginx.crt
   $ openssl dhparam -out deployment/ssl/dhparam.pem 4096

 

3.2 An nginx config file:

  # On linking service, docker will automatically add
  # resolver for service name
  # Use upstream to resolve the service name.
   upstream backend {
       server web:8000;
   }
   server {
     # listen for HTTPS request
     listen 443 ssl;
     access_log  /var/log/nginx/access.log;
     server_name charcha.hashedin.com;
     ssl_certificate /etc/ssl/nginx.crt;
     ssl_certificate_key /etc/ssl/nginx.key;
     ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
     ssl_prefer_server_ciphers on;
     ssl_ciphers "EECDH+AESGCM:EDH+AESGCM:AES256+EECDH:AES256+EDH";
     ssl_ecdh_curve secp384r1;
     ssl_session_cache shared:SSL:10m;
     ssl_session_tickets off;
     add_header Strict-Transport-Security "max-age=63072000; includeSubdomains";
     add_header X-Frame-Options DENY;
     add_header X-Content-Type-Options nosniff;
     ssl_dhparam /etc/ssl/dhparam.pem;
     # Serve all pre-gzipped static files from the mounted volume
     location /static/ {
         gzip_static on;
         expires     max;
         add_header  Cache-Control public;
         autoindex on;
         alias /static/;
     }
     location / {
          # Set these headers to let the application know that the
          # request was made over HTTPS. By default, gunicorn reads the
          # X-Forwarded-Proto header to determine the scheme
         proxy_set_header Host $host;
         proxy_set_header X-Real-IP $remote_addr;
         proxy_set_header X-Forwarded-Proto $scheme;
         proxy_redirect off;
         # Forward the request to upstream(App service)
         proxy_pass http://backend;
     }
   }
   server {
    # Listen for HTTP request and redirect it to HTTPS
    listen 80;
    server_name charcha.hashedin.com;
    return 301 https://$host$request_uri;
   }

Now, let's define the nginx service in docker-compose.

  nginx:
    image: nginx:latest
    restart: always
    ports:
      - 80:80
      - 443:443
    links:
      - web:web
    volumes:
      # deployment is the folder where we have added few configurations in previous step
      - ./deployment/nginx:/etc/nginx/conf.d
      - ./deployment/ssl:/etc/ssl
      # attach staticfiles(folder created by collectstatic) to /static
      - ./charcha/staticfiles:/static

 

Finally, we have completed our docker-compose file and all the required configuration, and we are ready to start a production-like environment on a dev box. You can run the following steps to start playing with it.

  1. Run the services: docker-compose up -d
  2. Verify all services are in the running state with docker-compose ps; you should see output like:
    Name              Command              State          Ports
    -------------------------------------------------------------------------------
    charcha_db_1      docker-entrypoint.sh postgres   Up      5432/tcp
    charcha_nginx_1   nginx -g daemon off;            Up      0.0.0.0:443->443/tcp, 0.0.0.0:80->80/tcp
    charcha_web_1     sh ./init.sh                    Up      8000/tcp
    

Now, you can start accessing the application at https://charcha.hashedin.com.

IMP: The charcha.hashedin.com hostname is checked by the application, so you can't access it from localhost. Also, at this point, you don't have any DNS entry for it. A simple workaround is to use your /etc/hosts file for local name resolution: echo "127.0.0.1 charcha.hashedin.com" | sudo tee -a /etc/hosts

 

Additional tips to help in debugging

  1. To view the logs for all services, use docker-compose logs
  2. In case you want to see the logs of a particular service, use docker-compose logs <service-name>, e.g. docker-compose logs web
  3. To log into a running container, use docker exec -it <container-name> <bash/sh (depends on the image used)>

Here is the final docker-compose file

version: '2'
# define multiple services
services:
  # Web service which runs the gunicorn application
  web:
    # Create build using Dockerfile present in current folder
    build: .
    # For this service run init.sh
    command: sh ./init.sh
    restart: always
    # expose port for other containers
    expose:
      - "8000"
    # Link database container
    links:
      - db:db
    # export environment variables for this container
    # NOTE: In production, value of these should be replaced with
    # ${variable} which will be provided at runtime.
    environment:
      - DJANGO_SETTINGS_MODULE=charcha.settings.production
      - DATABASE_URL=postgres://user:password@db:5432/charcha
      - DJANGO_SECRET_KEY=ljwwdojoqdjoqojwjqdoqwodq
      - LOGENTRIES_KEY=${LOGENTRIES_KEY}
  nginx:
    image: nginx:latest
    restart: always
    ports:
      - 80:80
      - 443:443
    links:
      - web:web
    volumes:
      # deployment is the folder where we have added few configurations
      - ./deployment/nginx:/etc/nginx/conf.d
      - ./deployment/ssl:/etc/ssl
      - ./charcha/staticfiles:/static
  db:
    restart: always
    image: postgres:latest
    expose:
      - 5432
    environment:
      - POSTGRES_PASSWORD=password
      - POSTGRES_USER=user
      - POSTGRES_DB=charcha

Summary

In this deployment-with-docker-compose series, we have so far seen how to create a production-ready Docker image and use it with docker-compose. You can try this in production, with small changes (like reading environment variables instead of hard-coding them in the compose file), on a single large VM.

 

In coming blogs we will discuss the gaps in docker-compose and how container management services like ECS, Swarm, or Kubernetes can fill those gaps in a production environment.

Business workflows are very common in applications and often play the most critical role. In this post, we will explore the AWS SWF (Simple Workflow) service for handling business workflows.

 

Before digging into SWF, let's get on the same page about what I mean by a workflow. A workflow is a sequence of activities performed to achieve a goal. The sequence of activities may be dynamic, decided based on some input: usually the output of the previous activity, or perhaps an external signal. We often represent a workflow using flow diagrams. For example, an online taxi booking system might have a booking workflow like:

 

  1. Search for nearby taxis which match the taxi-type criteria of the search.
  2. If the search gives results, then:
    2.1. Send a notification about the order to all matching taxi drivers and wait a fixed time period for confirmation.
    2.2. If a confirmation came in time, send a confirmation to the customer. Else, after the timeout, send an SMS to the customer asking them to retry after some time.
  3. Else, send a message about the non-availability of taxis and ask the customer to retry after some time.

 

Of course, this is just the booking flow; in reality the complete order lifecycle is more complicated. Here is the flowchart for this.

[Flowchart: taxi booking workflow]

You can easily see the technical challenges involved in handling even this simple workflow, where we have to maintain state at each step to take the next decision. Let's see what AWS SWF provides to solve this.

AWS SWF is a reliable and scalable solution to run jobs that have parallel or sequential steps. It provides task coordination and state tracking, and lets you keep complete control over the decision making and the activity logic.

 

Terminologies for SWF:

Following are the key terms used in SWF.

  1. Worker: An application which performs some task. There are two types of workers in SWF: decider workers and activity workers. Decider workers are responsible for making decisions: they take the workflow's state history and return either the next activity task to perform or a decision to complete the workflow execution. The decider corresponds to the diamond boxes in a flow chart. Activity workers are responsible for performing the actual tasks.
  2. Tasks: SWF interacts with workers by handing them units of work called tasks. A task can be an activity task, performed by an activity worker; a decision task, performed by a decider; or a lambda task, a special activity task executed by an AWS Lambda function.
  3. WorkflowType: Every workflow in SWF needs to be registered with a name and version. This simply identifies the workflow; e.g. for the taxi booking system discussed above, we can have a workflow type named 'TaxiBookingWorkflow'.
  4. Domain: Provides a way to scope SWF resources within an AWS account. All tasks and workflows need to be associated with a domain.
  5. Workflow Starter: The application which kicks off the workflow execution. In our taxi booking app, the backend API handler for the booking request could be the workflow starter.

 

Brief on how an SWF-based application works

  1. Create a domain and a workflow in SWF, and register the activity types in the workflow.
  2. Start a workflow execution.
  3. Next, start the decider worker, which polls for decision tasks and decides the next step.
  4. Start the activity workers, which poll for activity tasks and perform the required work.

 

Here is a simple diagram explaining how SWF works.

[Diagram: how an SWF-based application works]

So, for the taxi booking system discussed above, we will:

  1. Register a domain (a name to scope all the related SWF entities), a workflow type (just an identifier for the booking workflow) and the activities with SWF.
  2. Create the decider: a Python program which keeps polling the decision-task queue and outputs the next step.
  3. Create the activity worker: a Python program which keeps polling the activity-task queue and executes tasks.

 

Let's see some code in action. For Python, there is no Flow Framework, so we will use the low-level boto3 APIs.

# swf.py
import boto3
DOMAIN = 'myGola'
DOMAIN_DESCRIPTION = 'Domain for My Taxi Booking Flow'
WORKFLOW = 'TaxiBookingFlow'
WORKFLOW_VERSION = '1.0'
# Period (in days, as a string) for which workflow execution history is retained
# in the domain. If "0" / "NONE", it is not retained. Max is 90 days.
WORKFLOW_RETENTION_PERIOD = '3'
WORKFLOW_DESCRIPTION = 'Taxi Booking Workflow'
TASKLIST = 'TaxiBookingTaskList'
ACTIVITIES = ['findNearByTaxi', 'sendOrderConfirmation', 'sendRetrySMS', 'multiCastBookingDetail']
ACTIVITY_VERSION = '1.0'
def swf_client():
    """ Get SFW client object
    """
    return boto3.client("swf")
def create_workflow(client):
    """ Create the workflow by registering the domain, workflow type and activity types
    """
    try:
        # Register a new Domain for the workflow. Domain will be used
        # to scope workflow resources. A single domain can have multiple workflows
        # and generally have all the workflows related to a single project.
        client.register_domain(name=DOMAIN,
            description=DOMAIN_DESCRIPTION,
            workflowExecutionRetentionPeriodInDays=WORKFLOW_RETENTION_PERIOD)
        # Register workflowType
        client.register_workflow_type(domain=DOMAIN,
            name=WORKFLOW,version=WORKFLOW_VERSION,
            description=WORKFLOW_DESCRIPTION)
        # Register each ActivityType
        for activity in ACTIVITIES:
            client.register_activity_type(
                domain=DOMAIN, name=activity,
                version=ACTIVITY_VERSION,
                # default max duration a worker can take to process a task of
                # this activity type. "NONE" means unlimited duration
                defaultTaskStartToCloseTimeout="NONE",
                # default maximum time before which a worker processing a task
                # of this type must report progress by calling RecordActivityTaskHeartbeat
                defaultTaskHeartbeatTimeout="NONE",
                # the default maximum duration that a task of this activity type can wait
                # before being assigned to a worker
                defaultTaskScheduleToStartTimeout="NONE",
                # default maximum total duration for a task of this activity type
                defaultTaskScheduleToCloseTimeout="NONE")
    except Exception as e:
        pass  # Log the exception here
def send_signal_to_workflow(client, driver_id, booking_id, execution_id):
    """Send signal to execution_id informing acceptance of request
    """
    try:
        client.signal_workflow_execution(
            domain=DOMAIN,
            workflowId=booking_id,
            runId=execution_id,
            signalName="OrderAccepted",
            input=driver_id)
    except Exception as e:
        pass  # Handle the error here
def start_workflow_execution(client, bookingDetail):
    """Start workflow execution for the given booking detail.
    We can have multiple executions for same workflow. In our case, at least
    one for every taxi booking. For starting a workflow execution, we need to
    provide a unique execution id. We will use booking id for this.
    """
    try:
        response = client.start_workflow_execution(domain=DOMAIN,
            workflowId=bookingDetail.bookingId,
            # Which workflow to start
            workflowType={
                "name": WORKFLOW,
                "version": WORKFLOW_VERSION
            },
            # What taskList to use. This will override the default task list
            # Specified in workflow registration.
            taskList={
                "name": TASKLIST
            },
            # Input as string for the workflow
            input=bookingDetail.bookingId)
        return response
    except Exception as e:
        pass  # Handle the error here
if __name__ == '__main__':
    client = swf_client()
    create_workflow(client)

# decider.py
from botocore.exceptions import ReadTimeoutError
from swf import swf_client
from swf import DOMAIN, TASKLIST, ACTIVITIES
def poll_for_decision_task(client):
    try:
        response = client.poll_for_decision_task(
            domain=DOMAIN,
            taskList={
                "name": TASKLIST
            },
            # Keep events in ascending order so that events[eventId - 1] indexing works
            reverseOrder=False)
        return response
    except Exception as e:
        pass  # Handle the error here
def schedule_next_activity(client, task_token, activity_id, activity_name, input=""):
    client.respond_decision_task_completed(
        taskToken=task_token,
        decisions=[
            {
                'decisionType': 'ScheduleActivityTask',
                'scheduleActivityTaskDecisionAttributes': {
                    'activityType': {
                        'name': activity_name,
                        'version': '1.0'
                    },
                    'activityId': activity_id,
                    'input': input
                }
            }])
def schedule_task_complete(client, task_token, result):
    client.respond_decision_task_completed(
        taskToken=task_token,
        decisions=[
            {
              'decisionType': 'CompleteWorkflowExecution',
              'completeWorkflowExecutionDecisionAttributes': {
                'result': result
              }
            }
        ])
def start_timer(client, taskId, timer, timerId):
    client.respond_decision_task_completed(
        taskToken=taskId,
        decisions=[
            {
              'decisionType': 'StartTimer',
              'startTimerDecisionAttributes': {
                'timerId': timerId,
                # Seconds to wait before firing the timer, i.e. how long the
                # taxi drivers get to accept the order before we time out
                'startToFireTimeout': timer
              }
            }
        ]
    )
def decider(client, response):
    """Decider which takes decision for the next step
    poll_for_decision_task make long HTTP connection
    for 60 sec, within this if there is no decision task,
    then this will return a response without taskToken.
    """
    if 'taskToken' not in response:
        print "Poll timed out without returning a task"
        return False
    # The response contains the list of history events. We will use the last
    # event from the event history. SWF records an event for everything,
    # including decision-task scheduling, so the history also contains
    # decision-task related start/stop events; we simply ignore those.
    event_history = [event for event in response['events'] if not event['eventType'].startswith('Decision')]
    last_event = event_history[-1]
    last_event_type = last_event['eventType']
    # workflowId was set to the booking id when the execution was started
    booking_id = response['workflowExecution']['workflowId']
    if last_event_type == 'WorkflowExecutionStarted':
        schedule_next_activity(client, response['taskToken'],
                               'findNearByTaxi-{0}'.format(booking_id),
                               'findNearByTaxi', input=booking_id)
    # The driver has accepted the booking request and hence a signal has
    # been sent to this workflow execution. We treat this signal as the
    # success case: cancel the timer and complete the workflow.
    elif last_event_type == 'WorkflowExecutionSignaled':
        client.respond_decision_task_completed(
                taskToken=response['taskToken'],
                decisions=[
                    {
                      'decisionType': 'CancelTimer',
                      'cancelTimerDecisionAttributes': {
                        'timerId': 'TimerToWaitMultiCastBookingDetailSignal-{0}'.format(booking_id),
                      }
                    },
                    {
                      'decisionType': 'CompleteWorkflowExecution',
                      'completeWorkflowExecutionDecisionAttributes': {
                        'result': 'booking confirmed'
                      }
                    }
                ]
            )
    # Last scheduled activity has been completed
    elif last_event_type == 'ActivityTaskCompleted':
        # eventIds are 1-based, so the scheduled event sits at index (eventId - 1)
        scheduled_event_index = last_event['activityTaskCompletedEventAttributes']['scheduledEventId'] - 1
        last_activity_data = response['events'][scheduled_event_index]
        last_activity_attrs = last_activity_data['activityTaskScheduledEventAttributes']
        last_activity_name = last_activity_attrs['activityType']['name']
        last_activity_result = last_event['activityTaskCompletedEventAttributes'].get('result')
        next_activity = None
        if last_activity_name == "findNearByTaxi":
            next_activity = "sendRetrySMS" if not last_activity_result else "multiCastBookingDetail"
        elif last_activity_name == "multiCastBookingDetail":
            # Wait for external signal till some time
            start_timer(client, response['taskToken'], "240",
                        'TimerToWaitMultiCastBookingDetailSignal-{0}'.format(booking_id))
            return
        elif last_activity_name in ["sendOrderConfirmation", "sendRetrySMS"]:
            # We will mark the workflow as complete here.
            next_activity = None
        if next_activity is not None:
            schedule_next_activity(client, response['taskToken'],
                                   '{0}-{1}'.format(next_activity, booking_id),
                                   next_activity, input=booking_id)
        else:
            schedule_task_complete(client, response['taskToken'], 'booking confirmed')
if __name__ == '__main__':
    client = swf_client()
    while True:
        try:
            response = poll_for_decision_task(client)
            decider(client, response)
        except ReadTimeoutError:
            pass

# worker.py
from botocore.exceptions import ReadTimeoutError
from swf import swf_client
from swf import DOMAIN, TASKLIST, ACTIVITIES
def poll_for_activity_task(client):
    try:
        response = client.poll_for_activity_task(
            domain=DOMAIN,
            taskList={
                "name": TASKLIST
            })
        return response
    except Exception as e:
        pass  # Log the error here
def find_nearby_taxi(booking_id):
    """Get details from the booking id and find list of
    taxis nearby the location of booking request.
    """
    # Run some logic to find details
    # Place the result some where in your cache
    # Return comma-separated driver_ids
    # NOTE: We can only return string in result
    pass
def send_order_confirmation(booking_id):
    """Send push notification and SMS to the user
    informing confirmation of booking with cab details
    """
    # Add logic to send message and notification
    pass
def send_retry_sms(booking_id):
    """Send message to client informing unavailability of taxi
    at this moment
    """
    # Add logic to send retrySMS
    pass
def multi_cast_booking_detail(booking_id):
    """Send notification for new booking to all taxi drivers
    """
    # Send booking details to all the drivers by using driver_ids
    pass
# Map SWF activity type names to the worker functions that implement them
activities = {
    "findNearByTaxi": find_nearby_taxi,
    "sendOrderConfirmation": send_order_confirmation,
    "sendRetrySMS": send_retry_sms,
    "multiCastBookingDetail": multi_cast_booking_detail
}
if __name__ == '__main__':
    client = swf_client()
    while True:
        try:
            response = poll_for_activity_task(client)
            if response and 'activityId' in response:
                # Look up the worker function for this activity type and run it
                activity_fn = activities[response['activityType']['name']]
                result = activity_fn(response['input'])
                client.respond_activity_task_completed(
                    taskToken=response['taskToken'],
                    result=result)
        except ReadTimeoutError:
            pass

# views.py
from django.http import HttpResponse
from swf import swf_client, start_workflow_execution, send_signal_to_workflow
def new_booking(request):
    """Handle new booking request
    """
    # Save booking record in DB
    booking_detail = save_booking(request)
    # start workflow execution for this order
    client = swf_client()
    execution_id = start_workflow_execution(client, booking_detail)["runId"]
    # save this executionId for later use
    save_workflow_execution(booking_detail.bookingId, execution_id)
    return HttpResponse(booking_detail.bookingId)
def accept_order(request):
    """Handle Accept booking order request from the driver's
    """
    booking_id = get_booking_id(request)
    execution_id = get_execution_id(booking_id)
    # check if order is still open
    if is_booking_open(booking_id):
        # send a signal to the workflow execution informing it of the acceptance;
        # get_driver_id is an assumed helper that extracts the driver from the request
        client = swf_client()
        send_signal_to_workflow(client, get_driver_id(request), booking_id, execution_id)
        return HttpResponse("Success")
    else:
        return HttpResponse("Failure", status=404)

 

Summary

Of course, the code sample doesn't cover the edge cases and all flows, like order cancellation by the customer. But hopefully it has given some idea of what AWS SWF is for and how to use it. The code may look verbose because of the use of low-level APIs, which could be improved with a higher-level library/framework such as the Flow Framework (currently available only for Java and Ruby). To learn more, refer to the official AWS SWF documentation and online references.

 

Docker is a great tool to containerize an application (containers allow you to package an application along with its runtime dependencies). At HashedIn, we have been using Docker for both internal and external projects and have learned good lessons from it. In this article, we will discuss a strategy for creating a production-ready Docker image, taking Charcha as the example.

Important checklist for creating a docker image

  1. Lightweight image: The application should be packaged with the minimal set of things required to run it. We should avoid putting in unnecessary build/dev dependencies.
  2. Never add secrets: Your application might need various secrets, like credentials to talk to S3, the database, etc. These are runtime dependencies of the application and should never be baked into the Docker image.
  3. Leverage Docker caching: Every statement (with a few exceptions) in a Dockerfile creates a layer (an intermediate image), and to make builds faster Docker tries to cache these layers. We should arrange our Dockerfile statements in a way that maximizes the use of the Docker cache.

Note: As per documentation

  1. For instructions other than ADD and COPY, docker usually just compares the instruction itself against the instructions used to create existing images to find a cache match.
  2. For the ADD and COPY instructions, the contents of the file(s) in the image are examined and a checksum is calculated for each file. During the cache lookup, the checksum is compared against the checksums in the existing images.

Since code changes far more frequently than its dependencies, it is better to add the requirements and install them before adding the codebase to the image.

Dockerfile for Charcha

Let's look at the Dockerfile for Charcha, which tries to adhere to the checklist above. Each instruction has been documented with inline comments describing its importance.

# Charcha is based on python 3.6, so let's choose a minimal base image for python.
# We will use an Alpine Linux based image, as these are much slimmer than other linux images.
# python:3.6-alpine is an official (developed/approved by the docker team) python image.
FROM python:3.6-alpine
# Creating working directory as charcha. Here we will add charcha codebase.
WORKDIR /charcha
# Add your requirements first, so that we can install requirements first
# Why? Requirements are not going to change very often in comparison to code
# so, better to cache this statement and all dependencies in this layer.
ADD requirements.txt /charcha
ADD requirements /charcha/requirements
# Install system dependencies, which are required by python packages
# We are using WebPusher for push notification which uses pyelliptic OpenSSL which
# uses `ctypes.util.find_library`. `ctypes.util.find_library` seems to be broken with current version of alpine.
# `ctypes.util.find_library` make use of gcc to search for library, and hence we need this during
# runtime.
# https://github.com/docker-library/python/issues/111
RUN apk add --no-cache gcc
# Package all libraries installed as build-deps, as few of them might only be required during
# installation and during execution.
RUN apk add --no-cache --virtual build-deps \
      make \
      libc-dev \
      musl-dev \
      linux-headers \
      pcre-dev \
      postgresql-dev \
      libffi \
      libffi-dev \
      # Don't cache pip packages
      && pip install --no-cache-dir -r /charcha/requirements/production.txt \
      # Find all the library dependencies which are required by python packages.
      # This technique is used in the creation of the python:alpine & slim images
      # https://github.com/docker-library/python/blob/master/3.6/alpine/Dockerfile
      && runDeps="$( \
      scanelf --needed --nobanner --recursive /usr/local \
              | awk '{ gsub(/,/, "\nso:", $2); print "so:" $2 }' \
              | sort -u \
              | xargs -r apk info --installed \
              | sort -u \
      )" \
      && apk add --virtual app-rundeps $runDeps \
      # Get rid of all unused libraries
      && apk del build-deps \
      # find_library is broken in alpine, looks like it doesn't take version of lib in consideration
      # and apk del seems to remove sim-link /usr/lib/libcrypto.so
      # Create sim-link again
      # TODO: Find a better way to do this more generically.
      && ln -s /usr/lib/$(ls /usr/lib/ | grep libcrypto | head -n1) /usr/lib/libcrypto.so
# Add charcha codebase in workdir
ADD . /charcha

Question: What would happen if we moved our ADD . /charcha statement up, just after WORKDIR /charcha? That way we wouldn't need to add the requirements separately.
Answer: As discussed above, your code changes much more frequently than the requirements files. Since, for ADD statements, docker computes a checksum from the file contents to match against its cache keys, there would be a very high chance of a cache miss (because of content changes). Once the cache is invalidated, all subsequent Dockerfile commands generate new images and the cache is not used. Hence, even if we hadn't updated our requirements, almost every build would end up reinstalling the dependencies.

This Dockerfile produces a production-ready image with a minimal set of dependencies. To play with this image locally, you can try the following steps:

  1. Build docker image: Create a docker image using above specified dockerfile.
    $ docker build --rm -t charcha:1.0 .
    

    The above command will create a docker image using the current directory as context and then tag the image as charcha:1.0. The command also tells docker to remove intermediate images. For more information on docker build refer to this link.

    Note: docker build is executed by the docker daemon, so the first thing the build process does is send the complete docker context (in our case, the entire content of the current directory) to the daemon. Your context path might contain unnecessary files like the .git folder, IDE-related files, etc., which are not at all required to build the image. So it is a best practice to add a .dockerignore file, which works much like .gitignore and lists the files and folders the daemon should ignore.
    Following is the dockerignore file for charcha.

    .git
    Dockerfile
    docker-compose.yml
    Procfile
    
  2. Create a container from docker image:
    
       # This command will run shell in interactive mode for charcha container and will land you in
       # /charcha directory, because we have defined /charcha as our workdir in dockerfile.
       $ docker run -p8000:8000 -it charcha:1.0 /bin/sh
       # Running commands inside container
       /charcha $ python manage.py migrate
       /charcha $ python manage.py makemigrations charcha
       /charcha $ python manage.py runserver 0.0.0.0:8000
    

Now charcha should be running (using local settings) in a docker container, and you can access it locally at http://localhost:8000. In coming blogs, we will discuss how to use docker-compose to automate what we have done here manually, and how to create a production-like environment locally.