Tuesday, July 16, 2019

Unix File Tree Part-1




Nature has its own way of reaching for perfection, and we should have the same instinct when we build things.
Dennis Ritchie, co-creator of Unix and an esteemed computer scientist, might well have had the same approach in mind for the Unix directory structure.

Why?

Before getting into the hierarchy of the Unix file tree, let's discuss why we need it. The need for a directory structure arises when multiple users handle multiple pieces of software along with their dependent files. Let me explain this with a couple of scenarios.

Scenario-1:

Consider a typical piece of software or package that needs several kinds of files to function properly:
  • Binary files
  • Configuration files
  • Log files
  • Data files
  • Metadata files during execution
  • Libraries
For now, let's say there is just one directory and I keep all of the dependent files in it.

$ ls
package-1.binary  package-1.conf  package-1.data  package-1.lib  package-1.log  package-1.tmp
Then another piece of software comes into the picture, with its own dependent files.

$ ls
package-1.binary  package-1.data  package-1.log  package-2.binary  package-2.data  package-2.log
package-1.conf    package-1.lib   package-1.tmp  package-2.conf    package-2.lib   package-2.tmp

With more software, things get messy: every new package adds another handful of files to the same directory, and managing them quickly becomes chaotic.


Scenario-2:

Suppose I am a system admin managing all of the software from scenario 1. To keep things organized, I create separate directories for each kind of dependent file.
  • Binary files --> /dir-1
  • Configuration files --> /dir-2
  • Log files --> /dir-3
  • Data files --> /dir-4
  • Meta files --> /dir-5
  • Libraries --> /dir-6
As the workload grows I will need more admins to help, and they won't be able to make sense of a naming scheme that only I understand.
To avoid this situation, Unix follows what is now commonly called "Convention over Configuration".
As the name suggests, an agreed convention takes priority over any individual's own configuration, so that everyone is on the same page and everyone who comes later follows the same layout.
Applied to our example, the convention looks like this:

  • Binary files --> /bin
  • Configuration files --> /etc
  • Log files --> /var/log
  • Data files --> /var
  • Meta files --> /tmp
  • Libraries --> /lib
This convention resulted in the Unix file tree:

$ tree -d -L 1
.
├── apps
├── bin
├── boot
├── dev
├── etc
├── home
├── lib
├── lib64
├── lost+found
├── media
├── mnt
├── opt
├── proc
├── root
├── run
├── sbin
├── snap
├── srv
├── sys
├── tmp
├── usr
└── var

22 directories

You might be wondering how Unix figures out where the binary, the configuration file, and the rest of a program's files are.
For binaries, this is where the PATH variable comes in:

$ echo $PATH
/home/dennis/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin

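PATH is an environment variable that lists the directories the shell searches, in order, for executable programs. In general, each executing process or user session has its own PATH setting. Configuration files, logs, and libraries are not looked up through PATH; programs find them at the conventional locations such as /etc and /var.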
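To make the lookup concrete, here is a minimal Python sketch of the search the shell performs. The function name find_executable is my own choice for illustration, and the sketch simplifies one detail: a real shell treats an empty PATH entry as the current directory, while this version skips it.

```python
import os


def find_executable(name, path_value):
    """Return the first executable file called `name` found in a PATH-style string.

    Directories are tried left to right, which is why earlier PATH entries
    take priority over later ones. Returns None when nothing matches.
    """
    for directory in path_value.split(os.pathsep):
        if not directory:
            continue  # simplification: ignore empty entries
        candidate = os.path.join(directory, name)
        # A match must be a regular file that the current user may execute.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    # Look up `ls` using the PATH of the current session.
    print(find_executable("ls", os.environ.get("PATH", "")))
```

The standard library already offers the same behaviour as shutil.which; the loop is spelled out here only to show that the first matching directory wins.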

So now we have a proper understanding of why we need a file tree.
For a deep dive into the significance of each of these directories, stay tuned for Unix File Tree Part-2.

Cheers!

Image Source: https://freestocks.org/fs/wp-content/uploads/2016/04/tree_trunk.jpg



Tuesday, July 9, 2019

Postfix Email Server integration with SES

Have you ever thought of setting up your web or application server with its own email server? When you set up an application, you will likely want an email server to handle incoming and outgoing mail for your domain. Before I get into the topic, I assume that you have some basic knowledge of AWS. Here I am going to explain how to set up a simple Postfix email server with AWS SES to handle all your email. For more information, please refer to the AWS SES documentation. Let's put it in a simple way. We have two phases in this implementation.

  1. Configure SES with Domain
  2. Configure postfix and integrate with SES on EC2
Configure SES with Domain
Amazon SES requires that you verify your email address or domain, to confirm that you own it and to prevent others from using it. When you verify an entire domain, you are verifying all email addresses from that domain, so you don't need to verify email addresses from that domain individually. For example, if you verify the domain example.com, you can send email from user1@example.com, user2@example.com, or any other user at example.com. Let's verify our domain name with SES.

  • Go to the AWS Management Console and click on SES.
  • Click on Domains in the top-left corner.
  • Click Verify a New Domain.



  • In the Verify a New Domain dialog box, for Domain, type the name of the domain that you registered using Route 53, and then choose Verify This Domain.
  • In the same dialog box, choose Use Route 53. Your Domain Verification and Email Receiving records will be updated in Route 53.



Note

If you don't see Use Route 53, your domain may not be registered with Route 53.


  • Once your domain is verified, you can use any email address from this domain as your sender address.
  • To establish a connection between Postfix and SES you will need SMTP credentials.
  • Choose SMTP Settings in the same SES console.
  • Choose Create My SMTP Credentials.
  • Give the user name and click Create.
  • Download the credentials; they will be used when you configure the server.


Configure postfix and integrate with SES on EC2

In this section you are going to install and configure Postfix on an EC2 instance.
    Prerequisites


  • You should have an EC2 machine up and running.
  • Open port 25 (SMTP) and port 22 (SSH) in the instance's security group.

Let's get started

Let's log in to the machine using PuTTY or an SSH client. We also need to create a DNS record for the machine in Route 53.



   Route53


  • Go to the AWS console and choose Route53.
  • Choose Hosted Zones and select the domain you wish to configure.
  • Click Create Record Set to add a new record set, then select A - IPv4 address for the resource type.
  • Add the subdomain name in the Name field and enter your EC2 IP as the record value.
  • Set the desired TTL.
  • Then click on the Create button.



Now we will install Postfix on our EC2 machine.

sudo apt-get update

sudo apt-get install postfix
      
Now we need to make some changes in the Postfix configuration files. Let's do it one by one.


To integrate Postfix with SES we need to add a few more lines to main.cf.



vim /etc/mailname
example.com


vim /etc/postfix/main.cf

mydestination = $myhostname, localhost.$mydomain, localhost, $mydomain

myhostname = example.com

myorigin = /etc/mailname

relayhost = [email-smtp.us-east-1.amazonaws.com]:587

smtp_sasl_auth_enable = yes

smtp_sasl_security_options = noanonymous

smtp_sasl_password_maps = hash:/etc/postfix/sasl_passwd

smtp_use_tls = yes

smtp_tls_security_level = encrypt

smtp_tls_note_starttls_offer = yes

smtp_tls_CAfile = /etc/ssl/certs/ca-certificates.crt
     

NOTE:

  The value of relayhost will change depending on the SES region you use.

Comment out the following line in master.cf by putting # in front of it:



vim /etc/postfix/master.cf
#-o smtp_fallback_relay=

Edit the file /etc/postfix/sasl_passwd; if it is not present, please create it:


vim /etc/postfix/sasl_passwd

[email-smtp.us-east-1.amazonaws.com]:587 <IAMUSERNAME>:<PASSWORD>

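NOTE: Add the SMTP username and password that you downloaded. The host and port at the start of the line must be written exactly as in the relayhost value in main.cf, including the SES region. Since this file holds credentials, it is a good idea to make it readable by root only. Save and close the file, then use the command below to create the hashmap database.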

sudo postmap /etc/postfix/sasl_passwd

Stop and Start Postfix:

sudo service postfix stop

sudo service postfix start
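
A region typo between main.cf and sasl_passwd is an easy mistake to make, and Postfix then has no credentials for the relay. The following is a rough Python sketch of a sanity check, not part of Postfix: the function names are made up for illustration, it ignores multi-line main.cf values, and it only covers the simple case where the sasl_passwd key is written exactly like relayhost.

```python
def parse_main_cf(text):
    """Parse simple 'name = value' lines from main.cf text.

    Simplification: comments and blank lines are skipped, and continuation
    lines (values spread over several lines) are not supported.
    """
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip()
    return settings


def sasl_keys(text):
    """Return the lookup key (first whitespace-separated field) of each entry."""
    keys = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            keys.append(line.split()[0])
    return keys


def relayhost_has_credentials(main_cf_text, sasl_passwd_text):
    """True when the relayhost from main.cf appears as a key in sasl_passwd."""
    relayhost = parse_main_cf(main_cf_text).get("relayhost")
    return relayhost is not None and relayhost in sasl_keys(sasl_passwd_text)
```

Reading both files and passing their contents to relayhost_has_credentials before restarting Postfix would catch a mismatch such as us-east-1 in one file and us-west-2 in the other.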
   

Tuesday, July 2, 2019

Speeding up Ansible Execution Part 2


MITOGEN


In the previous post, we discussed various ways to reduce ansible-playbook execution time. Those changes were mostly made in the Ansible config file, by adding or adjusting certain parameters. But as you may have noticed, those methods are not that effective in certain cases, and while using them we have to be cautious about the result, as they may affect Ansible's performance in one way or another.


Generally, the main culprit behind slow Ansible execution is the way Ansible runs on the hosts. It creates multiple SSH connections and does not fully utilize the available resources. To tackle this problem, Mitogen comes to the rescue!


Mitogen is a distributed programming library for Python. The Mitogen extension is a set of plug-ins for Ansible that enable it to operate via Mitogen, vastly improving its performance and enhancing its functional capability.

We all know about the strategies in Ansible: linear, free, and debug. Mitogen is simply set as the strategy in the config file, so it is just another strategy. We are not changing anything else in the Ansible config file, so no other parameter is affected; it only changes the way playbooks are executed on the hosts.

Now coming to the Mitogen installation part, we just have to download the package to a particular location and make some changes in the Ansible config file as shown below:

[defaults]
strategy_plugins = /path/to/mitogen/ansible_mitogen/plugins/strategy
strategy = mitogen_linear

We have to define the path where we have stored our Mitogen files and set the strategy to "mitogen_linear" under the [defaults] section of the config file, and we are good to go.
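
If you manage many control nodes, the same two settings can be applied with a script. This is only a sketch using Python's standard configparser: the helper name enable_mitogen and the plugin path in the usage note are placeholders, and rewriting the file this way drops any comments it contained.

```python
import configparser
import io


def enable_mitogen(cfg_text, plugin_dir):
    """Return ansible.cfg text with the Mitogen strategy set under [defaults].

    cfg_text is the current content of ansible.cfg (may be empty);
    plugin_dir is wherever the Mitogen strategy plugins were unpacked.
    """
    # interpolation=None keeps any '%' characters in existing values untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    if not parser.has_section("defaults"):
        parser.add_section("defaults")
    parser.set("defaults", "strategy_plugins", plugin_dir)
    parser.set("defaults", "strategy", "mitogen_linear")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()
```

For example, enable_mitogen(open("ansible.cfg").read(), "/path/to/mitogen/ansible_mitogen/plugins/strategy") returns the updated text, which you can review before writing it back.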

Now, after the Mitogen installation, when we run our playbook we will notice a clear reduction in the execution time.

Mitogen is fast for the following reasons:


  • One connection is created per target and system logs aren’t spammed with repeated authentication events.
  • A single network roundtrip is used to execute a step whose code already exists in RAM on the target.
  • Processes are aggressively reused, avoiding the cost of invoking Python and recompiling imports, saving 300-800 ms for every playbook step.
  • Code is cached in the RAM, which further increases the speed.
  • Generally, ansible repeatedly rewrites and extracts ZIP files to temporary directories in the target hosts, mitogen also reduces these rewrites.


      All the above-mentioned features make Ansible run faster.

      Mitogen is an extension for Ansible that cuts its execution time and is very easy to use. I think Mitogen is very underrated and one of a kind, and we should definitely give it a try.

      I hope I have explained everything well. Any suggestions or queries are highly appreciated.


Thanks !!!




Source:

https://mitogen.networkgenomics.com/ansible_detailed.html

Tuesday, June 25, 2019

Speeding up Ansible Execution Part 1



Knowledge of at least one software configuration management (SCM) tool is a must for any DevOps engineer, and Ansible is one of the most popular tools in this category. We are all aware of the ease Ansible provides, whether for infra provisioning, orchestration, or application deployment.

The reason for Ansible's vast popularity is the long list of modules it provides to support any level of automation. Moreover, it gives users the flexibility to create their own modules as per their requirements.

But the purpose of this blog is not to list the features Ansible provides; it is to show how we can speed up playbook execution. For a beginner, running Ansible is very easy and feels like a big time saver, but as you dive deeper you will find that running playbooks can keep you waiting for a considerable amount of time.

There are a lot of articles on the internet about speeding up Ansible execution, so I have decided to sum them up in this blog. With the following methods, we can reduce our execution time without compromising the overall performance of Ansible.

Before starting, I request you to make a small change in your Ansible configuration file (ansible.cfg). This change will help you track the time taken by the playbook execution, and it also lists the time taken by each task.

Just add these lines to your ansible.cfg file under the [defaults] section:

[defaults]
callback_whitelist = profile_tasks


Forks


When you are running your playbooks on various hosts, then you may have noticed that the number of servers where the playbook executes simultaneously is 5. You can increase this number inside the ansible.cfg file:

# ansible.cfg
[defaults]
forks = 10

or with the -f or --forks command-line option of ansible-playbook. We can increase or decrease this value as per our requirements.
With a higher forks value, keep the number of "local_action" or delegated steps limited, because those steps run on the Ansible server itself and many of them in parallel will hurt its performance.
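
To get a feel for what the forks value buys you, here is a back-of-the-envelope estimate in Python. It is a simplified model, not how Ansible schedules work internally: it assumes every host takes the same time for a task, and the function names are mine.

```python
import math


def batches_needed(host_count, forks=5):
    """Number of rounds needed when at most `forks` hosts run at the same time."""
    if forks < 1:
        raise ValueError("forks must be at least 1")
    return math.ceil(host_count / forks)


def estimated_task_seconds(host_count, seconds_per_host, forks=5):
    """Rough wall-clock time for one task, assuming uniform per-host duration."""
    return batches_needed(host_count, forks) * seconds_per_host


if __name__ == "__main__":
    # 50 hosts, 2 seconds per host for the task
    print(estimated_task_seconds(50, 2, forks=5))   # 20
    print(estimated_task_seconds(50, 2, forks=10))  # 10
```

Doubling forks from 5 to 10 halves the estimate for 50 hosts, but the gain flattens once forks reaches the number of hosts, and each extra fork costs CPU and memory on the control machine.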


Async


In Ansible, each task blocks the playbook, meaning the connections stay open until the task is done on each node, which in some cases takes a lot of time. For those particular tasks we can use "async", so that Ansible does not have to hold a connection open for the whole run of the task.

To launch a task asynchronously, we specify its maximum runtime in seconds (async) and how frequently we would like to poll for status (poll). If poll is not set, Ansible falls back to its configured default poll interval. With a poll value greater than 0, Ansible still waits for the task to finish or time out before moving on; to move to the next task without waiting at all, set poll to 0.

tasks:
- name: "name of the task"
  command: "command we want to execute"
  async: 40
  poll: 15

The only condition is that the subsequent tasks must not have a dependency on this task.


Free Strategy 


When running Ansible playbooks, you might have noticed that Ansible runs each task on every node before starting the next one. It will not move on to another task until the current task has completed on every node, which in some cases takes a lot of time.

By default, the strategy is set to "linear"; we can set it to "free".

---
- hosts: "hosts/groups"
  name: "name of the playbook"
  strategy: free

With the free strategy, each host runs through the playbook independently, without waiting for the other nodes to complete the current task.
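
The difference between the two strategies is easiest to see with numbers. Below is a small Python model with made-up task durations; it assumes forks is at least the number of hosts and ignores connection overhead, so treat it as an illustration rather than a prediction.

```python
def linear_total(durations):
    """Total time with the linear strategy.

    durations maps each host to a list of per-task seconds (same tasks on
    every host). Every task waits for its slowest host before the next starts.
    """
    task_count = len(next(iter(durations.values())))
    return sum(max(host[i] for host in durations.values()) for i in range(task_count))


def free_total(durations):
    """Total time with the free strategy: each host runs at its own pace."""
    return max(sum(host) for host in durations.values())


if __name__ == "__main__":
    example = {"web1": [1, 5, 1], "web2": [5, 1, 1]}  # hypothetical timings
    print(linear_total(example))  # 11
    print(free_total(example))    # 7
```

In the example, linear pays for the slowest host of every task (5 + 5 + 1), while free only pays for the slowest host overall (7). When all hosts are equally fast on every task, the two totals are the same.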

Fact gathering is enabled by default when a playbook runs, but sometimes we don't need the facts.
In those cases, we can disable fact gathering.
This has advantages when scaling Ansible in push mode with very large numbers of systems.

---
- hosts: "hosts/groups"
  name: "name of the playbook"
  gather_facts: no


Pipelining 


For each task, Ansible performs several SSH operations, which increases the total execution time. Pipelining reduces the number of SSH operations required to execute a module by running many Ansible modules without an actual file transfer. We just have to make this change in the [ssh_connection] section of the ansible.cfg file:

# ansible.cfg
[ssh_connection]
pipelining = True

Although this can result in a very significant performance improvement, pipelining is disabled by default because many distros enable requiretty in their sudoers configuration by default, and pipelining does not work with it.


Poll Interval


Whenever we run a task, Ansible polls to check whether the task has completed on the host. We can decrease this polling interval in ansible.cfg to improve performance, but a shorter interval increases CPU usage, so we need to adjust its value accordingly.
We just have to adjust this parameter in the [defaults] section of the ansible.cfg file:

[defaults]
internal_poll_interval = 0.001


So, these are the various ways to decrease our playbook execution time in Ansible. Generally we don't use all of these methods in a single setup; we pick them as per the requirement.

The main motive of this blog is to identify the factors that help in fine-tuning Ansible performance. There are many more factors that serve the same purpose, but here I have mentioned the most important ones.

I hope I have covered all the important aspects. Feel free to provide your valuable feedback.

Thanks !!!

Source:

https://mitogen.networkgenomics.com/ansible_detailed.html

Tuesday, June 11, 2019

What Without Internet






A few days ago I had a dream in which the internet had ceased to exist. When I woke up, I wondered what would happen if there were no internet for a day.


Sure, it would cause quite a bit of panic and uproar and it would be havoc for an organization to work without the internet, but if the internet resumed normally after 24 hours are over, things would return to normal pretty quickly.


Now, switch it off for a longer time, possibly a week or a month, that would have a more lasting impact, since, in that time, a significant number of people would find themselves unable to meet their obligations or do their business at all. This would be somewhat mitigated by the fact that the situation is a sort of a 'natural disaster', but still, those who really depend on the internet for their business would likely feel a lasting negative impact.
               
What if I told you that some organizations already work as if there were no internet? It's true: for security reasons they prefer not to use the public internet. Banks, space organizations, and many security agencies fall under this category.


Now a question arises: how do they manage regular updates and the installation of packages on their systems? The answer is quite simple: "the use of a satellite server".


Recently I got a task in this context, with the following requirements:

1. Your system has no internet connectivity, but one of the systems it can connect to does.
2. Set up a satellite server of your own in the local network.
3. Install packages and regular updates from it.


To do so, I prefer to use an FTP satellite server.


How to implement an FTP satellite server

Pre-requisites

An Ubuntu server and a non-root user with sudo privileges.
The system is configured with vsftpd.

Suppose we are installing Jenkins.


Make a directory pkg.jenkins.io in /var/www/html/




 Contents of pkg.jenkins.io

Put the host link of the Debian repository in a file under /etc/apt/sources.list.d




 

 Run the command sudo apt-get update





 Now run the command for installation of the package

 



“ The internet made fame wack and anonymity cool ”

So far, we have learned about setting up FTP for users with a local account. If you need to use an external authentication source, you might want to look into vsftpd's support for virtual users. This offers a rich set of options through the use of PAM, the Pluggable Authentication Modules, and is a good choice if you manage users in another system such as LDAP or Kerberos.

I hope I explained everything clearly enough. If you know a better way of implementing a satellite server, please help me improve this blog.

Thanks for reading my writing. I’d really appreciate any kind of feedback in the comments.

Cheers till next time!!!!

Tuesday, June 4, 2019

Redis Zero Downtime Cluster Migration

A few days back I came across the problem of migrating a Redis Master-Slave setup to Redis Cluster. Initially, I thought it would be a piece of cake since I had already been working with Redis, but there was a hitch: "Zero Downtime Migration". Also, Redis was being used as a database, not as a caching server. So I started to think about different ways of migrating a Redis Master-Slave setup to Redis Cluster, and finally I came up with an approach.
Before we jump to the migration, I want to give an overview of when we can use Redis as a database, and how to choose between a Master-Slave setup and Cluster mode.

Redis as a Database

Reading data from disk can be time-consuming. To improve performance, we can keep the data that needs to be served first or served quickly in Redis memory, while the rest of the data stays in the main database.

Redis Master-Slave Replication

Let's begin with Redis Master-Slave replication. In this setup, Redis can replicate data to any number of nodes, i.e. it lets each slave hold an exact copy of its master. This helps with read performance and gives you a standby copy of the data.

I bet you can now see how Redis can be used as a database.

Redis Cluster

A Redis cluster is simply a data sharding strategy. It automatically partitions data across multiple Redis nodes. It is an advanced feature of Redis which achieves distributed storage and prevents a single point of failure.

Replication vs Sharding

Replication is also known as mirroring of data. In replication, all the data gets copied from the master node to the slave nodes.

Sharding is also known as partitioning. It splits the data up by key across multiple nodes.

For example, with replication all of the keys 1, 2, 3, 4 are stored on both machine A and machine B.

With sharding, the keys are distributed across machines A and B. That is, machine A might hold keys 1 and 3 while machine B holds keys 2 and 4.
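
How does Redis Cluster decide which node holds a key? It hashes the key with CRC16 and takes the result modulo 16384 to get a hash slot, and each node owns a range of slots. Here is a minimal Python sketch of that mapping. The even split of slots across nodes in node_for_slot is my simplifying assumption (a real cluster lets you assign slot ranges freely), and hash tags (the {...} part of a key) are ignored.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16 as used by Redis Cluster (XMODEM variant: poly 0x1021, init 0)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Hash slot (0..16383) for a key, ignoring hash tags."""
    return crc16_xmodem(key.encode()) % 16384


def node_for_slot(slot: int, nodes):
    """Pick a node, assuming the 16384 slots are split evenly across `nodes`."""
    return nodes[slot * len(nodes) // 16384]


if __name__ == "__main__":
    print(key_slot("foo"))                             # 12182
    print(node_for_slot(key_slot("foo"), ["A", "B"]))  # B
```

With replication, every key lives on every machine; with this mapping, each key lives on exactly one master, which is why the cluster can hold more data than fits in the RAM of a single machine.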


I guess everyone now has a good idea of how Redis works. So let's start discussing the migration.

Migration

Unfortunately, Redis doesn't have a direct way of migrating data from a Master-Slave setup to Redis Cluster. Let me explain why.


We can start the Redis service in either cluster mode or standalone mode. You might suggest changing the Redis configuration on the fly (that is, without restarting the Redis service) with redis-cli. You are absolutely correct that Redis configuration can be changed on the fly, but unfortunately the Redis mode (cluster or standalone) can't; for that we have to restart the service.

I guess now you will understand my situation :).

There are multiple ways of doing the migration. However, we needed to migrate the data without downtime or any interruption to the service.

We decided the best course of action was a step-by-step process:
  • First, we needed to create a separate Redis Cluster environment.

  • The next step was to update all the services (applications) to send every write operation to both setups (cluster and master-slave). The read commands (GET) would still go to the old setup.
  • Even so, we had no guarantee that all non-expiring data would make it over. So we ran a step that iterates through all of the keys and DUMPs/RESTOREs them into the new setup.
  • Once the new Redis setup looked good, we could change the application to point solely to the new Redis setup.
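
The dual-write step can be pictured as a thin wrapper around two clients. This is a minimal sketch and not our production code: the class name DualWriteClient is made up, and it assumes both clients expose set, get, and delete the way redis-py clients do. Only three commands are covered.

```python
class DualWriteClient:
    """Send writes to both Redis setups; keep serving reads from the old one."""

    def __init__(self, old, new):
        self.old = old  # existing master-slave setup (source of truth for reads)
        self.new = new  # new cluster being filled

    def set(self, key, value):
        result = self.old.set(key, value)
        self.new.set(key, value)  # mirror the write to the new setup
        return result

    def delete(self, key):
        result = self.old.delete(key)
        self.new.delete(key)  # mirror deletes too, or stale keys survive
        return result

    def get(self, key):
        # Reads stay on the old setup until the new one is fully populated.
        return self.old.get(key)
```

Once the key copy is done and the new cluster looks healthy, switching over is a matter of replacing this wrapper with a plain client for the new cluster.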

I know all the steps are easy except the third one. Fortunately, Redis provides a key scanning command (SCAN) through which we can iterate over all the keys, take a dump of each, and restore it in the new Redis setup.
To achieve this I created a Python utility in which you define the connection details of your old and new Redis servers.

You can find the utility here.

https://github.com/opstree/redis-migration

I have provided detailed information on using this utility in the README file itself. I hope my experience will help you with your own Redis migration.
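
For readers who just want the idea, here is a stripped-down sketch of the scan, dump, and restore loop. It is not the code from the utility above. It assumes src and dst are redis-py style clients offering scan_iter, dump, pttl, exists, and restore, and the function name copy_keys is mine.

```python
def copy_keys(src, dst, pattern="*", batch=1000):
    """Copy keys from src to dst with DUMP/RESTORE, preserving TTLs.

    Keys that already exist in dst are skipped, on the assumption that
    they were put there by newer dual writes and must not be overwritten.
    Returns the number of keys copied.
    """
    copied = 0
    for key in src.scan_iter(match=pattern, count=batch):
        if dst.exists(key):
            continue  # already written by the application, keep the newer value
        payload = src.dump(key)
        ttl_ms = src.pttl(key)
        if payload is None or ttl_ms == -2:
            continue  # key expired or was deleted while we were scanning
        # PTTL returns -1 for keys without expiry; RESTORE uses 0 for that.
        ttl = 0 if ttl_ms < 0 else max(ttl_ms, 1)
        dst.restore(key, ttl, payload)
        copied += 1
    return copied
```

SCAN walks the keyspace in small batches instead of blocking the server the way KEYS would, which matters when the old setup is still serving live traffic.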

Replication or Clustering?

I know most people wonder when to use replication and when to use clustering :).
If you have more data than fits in the RAM of a single machine, use Redis Cluster to shard the data across multiple nodes.
If you have less data than the RAM of a single machine, set up master-slave replication with Sentinel in front to handle failover.

The main idea of this blog was to spread information about the replication and sharding mechanisms, how to choose the right one, and how to migrate if you have mistakenly chosen the wrong one :).

There are still multiple factors to explore to improve the migration flow. If you find them before I do, please let me know so I can improve this blog.

I hope I explained everything clearly enough.

Thanks for reading. I'd really appreciate any and all feedback; please leave a comment below.

Happy Coding!!!!

Tuesday, May 28, 2019

Where there is a shell, There is a way.


Well, as a DevOps engineer, I like to play around with shell scripts and shell commands, especially on a remote system, as it just adds some level of fun. But what's more thrilling than running shell scripts and commands on a remote server? Making them return dynamic web pages or JSON from that remote system.

Yes, for most of us it comes as a surprise that, just like PHP, JSP, or ASP, shell scripts can also return dynamic web pages. But as a wise man said a long time ago: "where there is a shell, there is a way".

Isn't PHP or JSP a better option for web development?

For a web developer ... yes. But as a DevOps engineer, I want to do everything possible from a shell script. And since we all know the power of shell scripts, having one as a server-side language is quite useful.

Why do we need this exactly?

Isn't "for fun" an obvious reason? But for those who want more than that, I've got some points:

  • We can use it as a time-series data exporter.
  • We might want an API that returns system info as JSON when we don't have access to PHP.
  • We might want to see the system information as a web page when we hit a URL.
  • It's not limited to system info; you can do whatever you want with it.
  • With the bare minimum installed on your machine, you can get the most out of it.

Let's get started

Let's get the boring part done first, i.e. configuring Apache.
I am assuming that Apache is installed on the system, as it is needed to serve your web pages. To let Apache serve your script, you need to enable the CGI module with these simple commands:
$ cd /etc/apache2/mods-enabled
$ sudo ln -s ../mods-available/cgi.load
And you are ready to go.
Now move to the directory where you are going to put your shell scripts.
$ cd /usr/lib/cgi-bin
Once in the directory, create a new file hello.sh
$ vim hello.sh
and write the following script
#!/bin/bash
echo "Content-type: text/html"
echo ""
echo "hello world! from shell script"
Make sure you make that file executable.
$ sudo chmod +x hello.sh
By now you probably have a pretty good idea of what your web page is going to display.
So restart the Apache server
$ sudo systemctl restart apache2.service

Let's take it to the next level

Now let's see what else we can do. Unlike PHP, Java, or Python, we don't have any framework for shell scripts, so we might have to work a bit. But that's the fun part, right?
So let's get started

We are simply going to display which users have /usr/sbin/nologin as their shell.

Here are the files that I created in the cgi-bin directory in order to display that data as a web page:

Header file
<!doctype html>
<html lang="en">
  <head>
    <!-- Required meta tags -->
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">

    <!-- Bootstrap CSS -->
    <link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/css/bootstrap.min.css" integrity="sha384-ggOyR0iXCbMQv3Xipma34MD+dH/1fQ784/j6cY/iJTQUOhcWr7x9JvoRxT2MZw1T" crossorigin="anonymous">

    <title>Hello, world!</title>
  </head>
  <body>
    <h1>All the user using /usr/sbin/nologin shell</h1>
 
 <table class="table">
  <thead>
    <tr>
      <th scope="col">Name</th>
      <th scope="col">User Id</th>
      <th scope="col">Group Id</th>
    </tr>
  </thead>
  <tbody>

Footer file

</tbody>
</table>

    <!-- Optional JavaScript -->
    <!-- jQuery first, then Popper.js, then Bootstrap JS -->
    <script src="https://code.jquery.com/jquery-3.3.1.slim.min.js" integrity="sha384-q8i/X+965DzO0rT7abK41JStQIAqVgRVzpbzo5smXKp4YfRvH+8abtTE1Pi6jizo" crossorigin="anonymous"></script>
    <script src="https://cdnjs.cloudflare.com/ajax/libs/popper.js/1.14.7/umd/popper.min.js" integrity="sha384-UO2eT0CpHqdSJQ6hJty5KVphtPhzWj9WO1clHTMGa3JDZwrnQq4sF86dIHNDz0W1" crossorigin="anonymous"></script>
    <script src="https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/js/bootstrap.min.js" integrity="sha384-JjSmVgyd0p3pXB1rRibZUAYoIIy6OrQ6VrjIEaFf/nJGzIxFDsf4x0xIM+B07jRM" crossorigin="anonymous"></script>
  </body>
</html>

hello.sh

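#!/bin/bash
# CGI response: a header line, then a blank line, then the body
echo "Content-type: text/html"
echo ""
cat header
# One table row (name, UID, GID) per user whose login shell is /usr/sbin/nologin
awk -F ':' '$7 == "/usr/sbin/nologin" {print "<tr><td>"$1"</td><td>"$3"</td><td>"$4"</td></tr>"}' /etc/passwd
cat footer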

So let's see what all those files are

The header and footer files basically contain the Bootstrap starter template, which gives you a prebuilt web template. In hello.sh we print those files using cat, and in between we run a shell command that picks out the users whose shell is /usr/sbin/nologin and turns them into table rows using awk.

So now when you hit the same URL, the output will look like this


I guess we now have the basic idea of how to use a shell script to display web pages for our needs. We can also use it as an API, as it can return JSON as well. How far to take it is up to the individual.
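
To spell out what the awk filter does and what a JSON flavour of the same endpoint could return, here is the logic as a small Python sketch. It is an illustration only, not something the shell setup above needs: the function names are mine, and it works on the text of /etc/passwd passed in as a string.

```python
import json


def nologin_users(passwd_text, shell="/usr/sbin/nologin"):
    """Return name/uid/gid for every passwd entry whose 7th field is `shell`."""
    users = []
    for line in passwd_text.splitlines():
        fields = line.split(":")
        # passwd format: name:password:uid:gid:comment:home:shell
        if len(fields) >= 7 and fields[6] == shell:
            users.append({"name": fields[0], "uid": int(fields[2]), "gid": int(fields[3])})
    return users


def cgi_response(passwd_text):
    """Build a CGI response: header line, blank line, then the JSON body."""
    body = json.dumps(nologin_users(passwd_text))
    return "Content-Type: application/json\n\n" + body
```

The shape is the same as in hello.sh: a Content-Type header, an empty line, and then the body, only with JSON instead of HTML table rows.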

Summary

So, in this blog, we saw how to get the most out of the bare minimum. The approach is not limited to a few use cases; it can be used to create an API that returns valuable information about the system or the services running on it. With some good scripting and some clever HTML template design, we can achieve a lot.
