One more reason to use Docker

Recently I was working on a project which includes Terraform and AWS stuff. While working on that I was using my local machine for terraform code testing and luckily everything was going fine. But when we actually want to test it for the production environment we got some issues there. Then, as usual, we started to dig into the issue and finally, we got the issue which was quite a silly one 😜. The production server Terraform version and my local development server Terraform version was not the same. 

After wasting quite a time on this issue, I decided to come up with a solution so this will never happen again.

But before jumping to the solution, let’s think is this problem was only related to Terraform or do we have faced the similar kind of issue in other scenarios as well.

Well, I guess we face a similar kind of issue in other scenarios as well. Let’s talk about some of the scenario’s first.

Suppose you have to create a CI pipeline for a project and that too with code re-usability. Now pipeline is ready and it is working fine in your project and then after some time, you have to implement the same kind of pipeline for the different project. Now you can use the same code but you don’t know the exact version of tools which you were using with CI pipeline. This will lead you to error elevation. 

Let’s take another example, suppose you are developing something in any of the programming languages. Surely that utility or program will have some dependencies as well. While installing those dependencies on the local system, it can corrupt your complete system or package manager for dependency management. A decent example is Pip which is a dependency manager of Python😉.

These are some example scenarios which we have faced actually and based on that we got the motivation for writing this blog.

The Solution

To resolve all this problem we just need one thing i.e. containers. I can also say docker as well but container and docker are two different things.

But yes for container management we use docker.

So let’s go back to our first problem the terraform one. If we have to solve that problem there are multiple ways to solve this. But we tried it to solve this using Docker.

As Docker says

Build Once and Run Anywhere

So based on this statement what we did, we created a Dockerfile for required Terraform version and stored it alongside with the code. Basically our Dockerfile looks like this:-

FROM alpine:3.8

MAINTAINER OpsTree.com

ENV TERRAFORM_VERSION=0.11.10

ARG BASE_URL=https://releases.hashicorp.com/terraform

RUN apk add --no-cache curl unzip bash \
    && curl -fsSL -o /tmp/terraform.zip ${BASE_URL}/${TERRAFORM_VERSION}/terraform_${TERRAFORM_VERSION}_linux_amd64.zip \
    && unzip /tmp/terraform.zip -d /usr/bin/

WORKDIR /opstree/terraform

USER opstree

In this Dockerfile, we are defining the version of Terraform which needs to run the code.
In a similar fashion, all other above listed problem can be solved using Docker. We just have to create a Dockerfile with exact dependencies which are needed and that same file can work in various environments and projects.

To take it to the next level you can also dump a Makefile as well to make everyone life easier. For example:-

IMAGE_TAG=latest
build-image:
    docker build -t opstree/terraform:${IMAGE_TAG} -f Dockerfile .

run-container:
    docker run -itd --name terraform -v ~/.ssh:/root/.ssh/ -v ~/.aws:/root/.aws -v ${PWD}:/opstree/terraform opstree/terraform:${IMAGE_TAG}

plan-infra:
    docker exec -t terraform bash -c "terraform plan"

create-infra:
    docker exec -t terraform bash -c "terraform apply -auto-approve"

destroy-infra:
    docker exec -t terraform bash -c "terraform destroy -auto-approve"

And trust me after making this utility available the reactions of the people who will be using this utility will be something like this:-

Now I am assuming you guys will also try to simulate the Docker in multiple scenarios as much as possible.

There are a few more scenarios which yet to be explored to enhance the use of Docker if you find that before I do, please let me know.

Thanks for reading, I’d really appreciate any and all feedback, please leave your comment below if you guys have any feedback.

Cheers till the next time.

Why I love pods in Kubernetes? Part – 1

When I began my journey of learning Kubernetes, I always thought why Kubernetes has made the pod its smallest entity, why not the container. But when I started diving deep in it I realized, there is a big rationale behind it and now I thank Kubernetes for making the Pod as an only object, not containers.

After being inspired by the working of a Pod, I would like to share my experience and knowledge with you guys.

Image result for kubernetes pod memes

What exactly Pod means?

The literal meaning of pod means the peel of pea which holds the beans and following the same analogy in Kubernetes pod means a logical object which holds a container or more than one container.
The bookish definition could be – a pod represents a request to execute one or more containers on the same node.

Why Pod?

The question that needs to be raised why pod?So let me clear this, pods are considered the fundamental building blocks of Kubernetes, because all the Kubernetes workloads, like DeploymentsReplicaSets or Jobs are eventually expressed in terms of pods.

Pods are the one and only objects in Kubernetes that results in the execution of containers which means No Pod No Containers !!!

Now after the context setting over pod I would like to answer my beloved question:- Why Pod over container??

My answer is why not 🙂 

Let’s take an example, suppose you have an application which generates two types of logs one is access log and other logs are error log. Now you have to add log shipper agent, In case of the container, you will install the log shipper in the container image. Now you got another request to add application monitoring in the application. So again you have to recreate the container image with APM agent in it.
Don’t you think this is quite an untidy way to do it? Of course, it is, why I have to add these things in my application image, it makes my image quite bulky and difficult to manage.

What if I tell you that Kubernetes has its own way of dealing situations like this. 

Yup the solution is a sidecar. Just like in real life if I have a two sitter bike and I want to take 3 persons on a ride, So I will add a sidecar in my bike to take 2 persons together on the ride.
In a similar fashion, I can do the same thing with Kubernetes as well. To solve the above problem I will just create 3 containers (application, log-shipper and APM agent) in the same pod. Now the question is how they will access the data between them and how the networking magic will happen.
The answer is quite simple containers withing the pod can share Pod IP address and can listen on localhost. For volume, we can share volumes also across the containers in a pod.

The architecture would be something like this:-

Related image

Now another interesting query arises that when to use sidecar and when not.

Just as shown in the above image we should not keep application and database as a sidecar in the same pod. The reason behind it is Kubernetes does not scale a container it scales a pod. So when autoscaling will happen it scales the application as well as database which could not be required.

Instead of that, we should keep log-shippers, health-check containers and monitoring agent as a sidecar because anyhow application will scale these agents also needs to be scaled with the application.

Now I am assuming you are also madly in love with the pods.

For diving deep in the pod stay tuned for the next part of this blog Why I love pods in Kubernetes? Part – 2. In my next part, I will discuss the different phases and lifecycle of the pod and how pod makes our life really smooth.
Thanks for reading, I’d really appreciate any and all feedback, please leave your comment below if you guys have any feedback.

Cheers till the next time.

Redis Zero Downtime Cluster Migration

A few days back I came across a problem of migrating a Redis Master-Slave setup to Redis Cluster. Initially, I thought it to be a piece of cake since I have been already working on Redis, but there was a hitch, “Zero Downtime Migration”. Also, the redis was getting used as a database, not as Caching Server. So I started to think of different ways for migrating Redis Master-Slave setup to Redis Cluster and finally, I came up with an idea of migration.
Before we jump to migration, I want to give an overview regarding when we can use Redis as a database, and how to choose which setup we should go with Master-Slave or Cluster mode.

Redis as a Database

Sometimes getting data from disks can be time-consuming. In order to increase the performance, we can put the requests those either need to be served first or rapidly in Redis memory and then the Redis service there will keep rest of the data in the main database. So the whole architecture will look like this:-

Image result for redis as database

Redis Master-Slave Replication

Beginning with the explanation about Redis Master-Slave. In this phenomenon, Redis can replicate data to any number of nodes. ie. it lets the slave have the exact copy of their master. This helps in performance optimizations.

I bet now you can use Redis as a Database.

Redis Cluster

A Redis cluster is simply a data sharding strategy. It automatically partitions data across multiple Redis nodes. It is an advanced feature of Redis which achieves distributed storage and prevents a single point of failure.

Replication vs Sharding

Replication is also known as mirroring of data. In replication, all the data get copied from the master node to the slave node.

Sharding is also known as partitioning. It splits up the data by the key to multiple nodes.

As shown in the above figure,  all keys 1, 2, 3, 4 are getting stored on both machine A and B.

In sharding, the keys are getting distributed across both machine A and B. That is, the machine A will hold the 1, 3 key and machine B will hold 2, 4 key.

I guess now everyone has a good idea about Redis working mechanism. So let’s start discussing the migration of Redis.

Migration

Unfortunately, redis doesn’t have a direct way of migrating data from Redis-Master Slave to Redis Cluster. Let me explain it to you why?

We can start Redis service in either cluster mode or standalone mode. Now your solution would be that we can change the Redis Configuration value on-fly(means without restarting the Redis Service) with redis-cli. Yes, you are absolutely correct we can change the Redis configuration on-fly but unfortunately, Redis Mode(cluster or standalone) can’t be decided on-fly, for that we have to restart the service.

I guess now you guys will understand my situation :).

For migration, there are multiple ways of doing it. However, we needed to migrate the data without downtime or any interruptions to the service.

We decided the best course of action was a steps process:-

  • Firstly we needed to create a different Redis Cluster environment. The architecture of the cluster environment was something like
  • The next step was to update all the services (application) to send all the write operations to both servers(cluster and master-slave). The read commands (GET) will still go to the old setup.
  • But still, we don’t have the guarantee that all non-expirable data would make it over. So we can run a step to iterate through all of the keys and DUMP/RESTORE them into the new setup. 
  • Once the new Redis Server looks good we could make the appropriate changes to the application to point solely to the new Redis Server.

I know the all steps are easy except the second step. Fortunately, redis provides a method of key scanning through which we can scan all the key and take a dump of it and then restore it in the new Redis Server.
To achieve this I have created a python utility in which you have to define the connection details of your old Redis Server and new Redis Server.

You can find the utility here.

https://github.com/opstree/redis-migration

I have provided the detail information on using this utility in the README file itself. I guess my experience will help you guys while redis migration.

Replication or Clustering?

I know most people have a query that when should we use replication and when clustering :).

If you have more data than RAM in a single machine, use Redis Cluster to shard the data across multiple databases.

If you have less data than RAM in a machine, set up a master-slave replication with sentinel in front to handle the fai-lover.

The main idea of writing this blog was to spread information about Replication and Sharding mechanism and how to choose the right one and if mistakenly you have chosen the wrong one, how to migrate it from :).

There are multiple factors yet to be explored to enhance the flow of migration if you find that before I do, please let me know to improve this blog.

I hope I explained everything and clear enough to understand.

Thanks for reading. I’d really appreciate any and all feedback, please leave your comment below if you guys have some feedbacks.

Happy Coding!!!!

Redis Load Testing

As I mentioned in my previous blog on Redis Best Practices that in my upcoming blog I will discuss about load testing on Redis so here I am ready with my blog in which I will explain how can we measure our Redis Performance. Although there are plenty of articles out there on this similar topic, but I want to share my experience as a DevOps. Also, I want to share the methods which we are implementing at our organization.

So, Load testing What, Why?

The first thought which comes to our mind is why do we need load testing, as our environment is already working fine. Or it is working fine since the first launch.

But Hey!!! let me tell you something it’s not simple as that, as we know everything has its own limits and knowing your limits is always helpful. When we are not ready to face the problem of the increasing load, our environment can easily collapse. There is saying as well

Prevention is better than cure

In simple words, it means it’s easier to stop something bad happening in the first place than to repair the damage after it happened.

So when I decided to test the redis performance, it was not quite easy to start, there are plenty of load testing frameworks were present but which to choose? So I did the comparison between various load testing framework. Although there is already Jmeter was available which is a popular load testing tool but I found it a bit complex to learn easily and rapidly and also it requires a hell lot of resources so I have chosen an awesome Python load testing framework, Locust, which is very lightweight and easy to setup load testing framework.

The golden rule before starting the load testing is that you should have metrics or a number in your mind which you want to achieve.

So I will be using my own organization load testing utility which we have created for Redis Load or Performance Testing.

You can find the code at- https://github.com/opstree/redis-load-test

So you can simply clone the git repo like this:-

git clone https://github.com/opstree/redis-load-test.git

As Locust is a python based project so it doesn’t have long dependencies list but surely it does have some dependency. So after repo cloning, we have to install these dependencies

cd Scripts
pip3 install -r requirments.txt

Once the dependency hurdle is crossed we can move to the step in which we will connect our utility to Redis. To achieve this we have a file called redis.json in the Scripts folder of the repo, you just have to update the redis details in that file. For example:-

{
    "redis_host": "10.1.1.100",
    "redis_port": "6379",
    "redis_password": ""
}

Once you are done with connection details, kaboom you are all set to use the performance testing utility. Just go on the terminal and run this command.

locust -f redis_get_set.py

The output will be something like this

Now go and open up the URL on the browser by http://your_ip:8089

The UI page will look like this

You will have two empty blocks there:-

Number of users to simulate:- Total number of user connection request which you want to make.

Hatch Rate:- How quickly you want to spawn users.

After filling these details you can simply start swarming and you can wait until it completes its execution. Once the execution will be completed you will have this kind of details.

This page is loading data in the form of statistics but you can also see the data in a beautiful graph format on the same UI. For example-

I have also provided detailed information in the README file as well of the repo.

One of the benefits which I feel important in doing load testing is that you can set up a performance baseline according to your environment. And yes, if you are not getting the desired output of your Redis Performance you can check out our blog on Redis Best Practices and Performance Tuning here.

The main idea of writing this blog was to encourage people to know the limitations of their environment and to make it ready for any kind of challenge.

I hope I explained everything clearly enough to understand. If you do have any questions or suggestions, please feel free to ask.

Cheers Till Next Time!!!

Linux Namespaces – Part 2

Before talking about the types of namespaces we are assuming that you have gone through our First Part of Linux Namespaces, if not you can check it here.

 

Types of Namespaces

So Basically we have seven types of Linux Namespaces:-
  1. CGroups:- Basically cgroups virtualize the view of process’s cgroups in /proc/[pid]/cgroups. Whenever a process creates a new cgroup it enters in a new namespace in which all current directories become cgroup root directories of the new namespace. So we can say that it isolates cgroup root directory.
  2. IPC(Interpolation Communication):- This namespace isolates interpolation communication. For example, In Linux, we have System V IPC(A communication mechanism) and Posfix (for message queues) which allows processes to exchange data in form of communication. So in simple words, we can say that IPC namespace isolates communication.
  3. Network:- This namespace isolates systems related to the network. For example:- network devices, IP protocols, Firewall Rules (That’s why we can use the single port with single service )
  4. Mount:- This namespace isolates mount points that can be seen by processes in each namespace. In simple words, you can take an example of filesystem mounting in which we can mount only one device or partition on a mount-point.
  5. PID:- This namespace isolates the PID. (In this child processes cannot see or trace the parent process but parent process can see or trace the child processes of the namespace. Processes in different PID namespace can have same PID.)
  6. User:- This namespace isolates security related identifier like group id and user id. In simple words, we can say that the process’s group and user id has full privilege inside the namespace but not outside the namespace.
  7. UTS:- This namespace provides the isolation on hostname and domain name. It means processes has a separate copy of domain name or hostname so while changing hostname or domain name it will not affect the rest of the system.

Namespace Management

This is the most advanced topic of Linux namespaces which should be done on kernel level. For the namespace management, you have to write a C program.
For management of namespace, we have these functions available in Linux:-
  • clone():-  If we use standalone clone() it will create a new process only, but if we pass one or more flags like CLONE_NEW*, then the new namespace will be created and child process will become the member of it.
  • setns():- This allows joining existing namespace. The namespace is specified by the file descriptor referenced to process.
  • unshare():- This allows calling process to disassociate from parts of current namespace. Basically, this function works on the processes that are being shared by other’s namespace as well for ex:- mount namespace.

Redis Best Practices and Performance Tuning

One of the thing that I love about my organization is that you don’t have to do the same repetitive work, you will always get the chance to explore some new technologies. The same chance came across to me a few days back when one of our clients was facing issue with Redis.
They were using the Redis Cluster with Sentinel for which they were facing issue regarding performance, whenever the connection request was high the Redis Cluster was not able to bear the load.
Since they were using a decent configuration of the server in terms of CPU and Memory but the result was the same. So now what????
The Answer was to tune the performance.

There are plenty of Redis performance articles out there, but I wanted to share my experience as a DevOps with Redis by creating an article which will include the most essential and important stuff that is needed for a Developer or a DevOps Engineer.

So let’s get started.

 TCP-KeepAlive

Keepalive is a method to allow the same TCP connection for HTTP conversation instead of opening a new one with each new request.

In simple words, if the keepalive is off the Redis will open a new connection for every request which will slow down its performance. If the keepalive is on then Redis will use the same TCP connection for requests.

Let’s see the graph for more details. The Red Bar shows the output when keepalive is on and Blue Bar shows the output when keepalive is off

For enabling the TCP keepalive, Edit the redis configuration and update this value.

vim /etc/redis/redis.conf
# Update the value to 0
tcp-keepalive 0

Pipelining

This feature could be your lifesaver in terms of Redis Performance. Pipelining facilitates a client to send multiple requests to the server without waiting for the replies at all and finally reads the reply in a single step.

For example:-

1

You can also see in the graph as well.

Pipelining will increase the performance of redis drastically.

Max-Connection

Max-connection is the parameter in which is used to define the maximum connection limit to the Redis Server. You can set that value accordingly (Considering your server specification) with the following steps.

sudo vim /etc/rc.local

# make sure this line is just before of exit 0.
sysctl -w net.core.somaxconn=65365

This step requires the reboot if you don’t want to reboot the server execute the same sysctl command on the terminal itself.

Overcommit Memory

Overcommit memory is a kernel parameter which checks if the memory is available or not. If the overcommit memory value is 0 then there is a chance that your Redis will get OOM (Out of Memory) error. So do me a favor and change its value to 1 by using the following steps

echo 'vm.overcommit_memory = 1' >> /etc/sysctl.conf

RDB Persistence and Append Only File

RDB persistence and Append Only File options are used to persist data on disk. If you are using the cluster mode of Redis then the RDB persistence and AOF is not required. So simply comment out these lines in redis.conf

sudo vim /etc/redis/redis.conf

# Comment out these lines
save 900 1
save 300 10
save 60 10000

rdbcompression no
rdbchecksum no

appendonly no

Transparent Huge Page(THP)

Most of the people are not aware of this term. Basically, For making the translation of physical and virtual memory kernel uses the concept of paging. This feature was defined to enhance the memory mapping process but somehow it slows down the databases which are memory based (for example – in the case of Redis). To overcome this issue you can disable THP.

sudo vim /etc/rc.local # Add this line before exit 0 echo never > /sys/kernel/mm/transparent_hugepage/enabled

As graph also shows the difference in performance. The Red Bar is showing THP disabled performance and Blue Bar is showing THP disabled performance.

Some Other Basic Measures in Redis Configuration

Config Option

Value

Description

maxmemory

70% of the system

maxmemory should be 70 percent of the system so that it will not take all the resource of the server.

maxmemory-policy

volatile-lru

It adds a random key with an expiry time

loglevel

notice

Loglevel should be “notice”, so that log will not take too much resource

timeout

300

There should be a timeout value as well in redis configuration which prevents redis from spending too much time on the connection. It closes the connection of the client if it is ideal for more than 300 seconds.

So now your redis is ready to give a killer performance. In this blog, we have discussed redis best practices and performance tuning.
There are multiple factors which are yet to be explored to enhance the performance of Redis if you find that before I do, please let me know to improve this blog.

In my next blog, I will discuss around how can we do Redis Performance Testing and how we are doing it in our Organisation.

Best Practices for Writing a Shell Script

I am a lazy DevOps Engineer. So whenever I came across the same task more than 2 times I automate that. Although now we have many automation tools, still the first thing that hit into our mind for automation is bash or shell script.
After making a lot of mistakes and messy scripts :), I am sharing my experiences for writing a good shell script which not only looks good but also it will reduce the chances of error.

The things that every code should have:-
     – A minimum effort in the modification.
     – Your program should talk in itself, so you don’t have to explain it.
     – Reusability, Of course, I can’t write the same kind of script or program again and again.

I am a firm believer in learning by doing. So let’s create a problem statement for ourselves and then try to solve it via shell scripting with best practices :). I would like to have solutions in the comment section of this blog.


Problem Statement:- Write a shell script to install and uninstall a package(vim) depending on the arguments. The script should tell if the package is already installed. If no argument is passed it should print the help page.

So without wasting time let’s start for writing an awesome shell script. Here is the list of things that should always be taken care of while writing a shell script.

Lifespan of Script

If your script is procedural(each subsequent steps relies on the previous step to complete), do me a favor and add set -e in starting of the script so that the script exists on the first error. For example:-

#!/bin/bash
set -e # Script exists on the first failure
set -x # For debugging purpose

Functions

Ahha, Functions are my most favorite part of programming. There is a saying

Any fool can write code that a computer can understand. Good programmers write code that humans can understand. 

To achieve this always try to use functions and name them properly so that anyone can understand the function just by reading its name. Functions also provide the concept of re-usability. It also removes the duplicating of code, how? let’s see this

#!/bin/bash 
install_package() {
   local PACKAGE_NAME="$1"
   yum install "${PACKAGE_NAME}" -y
}
install_package "vim"

Command Sanity

Usually, scripts call other scripts or binary. When we are dealing with commands there are chances that commands will not be available on all systems. So my suggestion is to check them before proceeding.

#!/bin/bash  
check_package() {
    local PACKAGE_NAME="$1"
    if ! command -v "${PACKAGE_NAME}" > /dev/null 2>&1
    then
           printf "${PACKAGE_NAME} is not installed.\n"
    else
           printf "${PACKAGE_NAME} is already installed.\n"
    fi
}
check_package "vim"

Help Page

If you guys are familiar with Linux, you have certainly noticed that every Linux command has its help page. The same thing can be true for the script as well. It would be really helpful to include –help flag.

#!/bin/bash  
INITIAL_PARAMS="$*"
help_function() {
   {
        printf "Usage:- ./script <option>\n"
        printf "Options:\n"
        printf " -a ==> Install all base softwares\n"
        printf " -r ==> Remove base softwares\n"
    }
}
arg_checker() {
     if [ "${INITIAL_PARAMS}" == "--help" ]; then
            help_function
     fi
}
arg_checker

Logging

Logging is the most critical thing for everyone whether he is a developer, sysadmin or DevOps. Debugging seems to be impossible without logs. As we know most applications generate logs for understanding that what is happening with the application, the same practice can be implemented for shell script as well. For generating logs we have a bash utility called logger.

#!/bin/bash 
DATE=$(date)
declare DATE
check_file() {
     local FILENAME="$1"
     if ! ls "${FILENAME}" > /dev/null 2>&1
     then
            logger -s "${DATE}: ${FILENAME} doesn't exists"
     else
           logger -s "${DATE}: ${FILENAME} found successfuly"
     fi
}
check_file "/etc/passwd"

Variables

I like to name my variables in Capital letters with an underscore, In this way, I will not get confused with the function name and variable name. Never give a,b,c etc. as a variable name instead of that try to give a proper name to a variable as well just like functions.

#!/bin/bash 
# Use declare for declaring global variables
declare GLOBAL_MESSAGE="Hey, I am a global message"
# Use local for declaring local variables inside the function
message_print() {
    local LOCAL_MESSAGE="Hey, I am a local message"
    printf "Global Message:- ${GLOBAL_MESSAGE}\n"
    printf "Local Message:- ${LOCAL_MESSAGE}\n"
}
message_print

Cases

Cases are also a fascinating part of shell script. But the question is when to use this? According to me if your shell program is providing more than one functionality basis on the arguments then you should go for cases. For example:- If your shell utility provides the capability of installing and uninstalling the software.

#!/bin/bash  
print_message() {
    MESSAGE="$1"
    echo "${MESSAGE}"
}
case "$1" in
   -i|--input)
      print_message "Input Message"
      ;;
   -o|--output)
        print_message "Output Message"
        ;;
   --debug)
       print_message "Debug Message"
       ;;
    *)
      print_message "Wrong Input"
      ;;
esac

In this blog, we have covered functions, variables, the lifespan of a script, logging, help page, command sanity. I hope these topics help you in your daily life while using the shell script. If you have any feedback please let me know through comments.
Cheers Till the next Time!!!!