Tuesday, February 19, 2019

Git Inside Out


Git is basically a content-addressable file system: you can insert any kind of data into Git, and Git will hand you back a unique key you can use later to retrieve that content. In this blog we will be learning #gitinsideout.

The Git object model has three types: blobs (for files), trees (for folders) and commits.

Objects are immutable (they are added but not changed) and every object is identified by its unique SHA-1 hash
A blob is just the contents of a file. By default, every new version of a file gets a new blob, which is a snapshot of the file (not a delta like many other versioning systems).
A tree is a list of references to blobs and trees.
A commit is a reference to a tree, a reference to parent commit(s) and some decoration (message, author).
Then there are branches and tags, which are typically just references to commits.

Git stores this data in the .git/objects directory.
After initialising a git repository, Git automatically creates .git/objects/pack and .git/objects/info, with no regular files yet. Once you commit some files, they show up in the .git/objects/ folder.


A blob stores the content of a file, and we can check its content with:
git cat-file -p <SHA for blob>
or git show <SHA for blob>
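A quick sketch in a throwaway repository (the SHA comes purely from the content, so it is the same on every machine):

```shell
# Store a file as a blob and read it back by its content address
cd "$(mktemp -d)" && git init -q .
echo "hello" > README.md
blob=$(git hash-object -w README.md)   # writes the blob, prints its SHA-1
echo "$blob"                           # ce013625030ba8dba906f756967f9e9ca394464a
git cat-file -t "$blob"                # -> blob
git cat-file -p "$blob"                # -> hello
```

The same content always hashes to the same blob SHA, which is exactly what makes Git content-addressable.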


A tree is a simple object that has a bunch of pointers to blobs and other trees - it generally represents the contents of a directory or sub-directory.
We can use git ls-tree to list the contents of a given tree object.
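For example, in a scratch repository with a single committed file, the commit's tree lists one blob entry (the user name/email here are throwaway values for the demo commit):

```shell
# A tree maps names to blob/tree SHAs; ls-tree prints mode, type, SHA and name
cd "$(mktemp -d)" && git init -q .
echo "hello" > README.md && git add README.md
git -c user.name=demo -c user.email=demo@example.com commit -qm "add readme"
git ls-tree HEAD        # HEAD resolves to the commit, then to its tree
# e.g. 100644 blob ce0136...    README.md
```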


The "commit" object links a physical state of a tree with a description of how we got there and why.

A commit is defined by a tree, parent(s), author, committer and comment.
All three object types (blob, tree, commit) are explained in detail below with the help of a pictorial diagram.
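A minimal sketch of that structure in a scratch repository (a first commit, so there is no parent line yet):

```shell
# A commit object is just text: a tree SHA, parent(s), author/committer, message
cd "$(mktemp -d)" && git init -q .
echo "hello" > README.md && git add README.md
git -c user.name=demo -c user.email=demo@example.com commit -qm "add readme"
git cat-file -t HEAD    # -> commit
git cat-file -p HEAD    # tree <SHA>, author, committer, then the message
```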

Often we make changes to our code and push them to SCM. Once, while making multiple changes, I thought it would be great if I could see the details of those changes through the local repository itself instead of going to the remote repository server. That pushed me to explore Git more deeply.

So I created a local remote with the help of a git bare repository, made some changes, and tracked those changes (type, content, size etc.).

The example below will help you understand the concept behind it.

Suppose we have cloned a repository named kunal:

Inside the folder where we have cloned the repository, go to the folder kunal then:

cd kunal/.git/

I added content (hello) to README.md and made several changes in the same repository:
adding README.md
updating Readme.md
adding 2 files modifying one
pull request
commit(adding directory).

Go to the refs folder inside .git and take the SHA value for the master head:

We can explore this commit object further with the help of cat-file, which shows the type and content of the tree and commit objects:

Now we can see a tree object inside the tree object. Drilling further into that tree object, we find that it in turn contains a blob object, as below:

Below is the pictorial representation for the same:

Pictorial Representation

A more elaborate representation of the same:

Below are the commands for checking the content, type and size of objects (blob, tree and commit):

kunal@work:/home/git/test/kunal# cat README.md

We can find the details of objects (size, type, content) with the help of #git cat-file.

git cat-file: provides content, type or size information for repository objects.

You can verify the content of a commit object and its type with git cat-file as below:

kunal@work:/home/git/test/kunal/.git # cat logs/refs/heads/master

Checking the content of blob objects (README.md, kunal and sandy):

As we can see, the first one is the commit adding the README, so it has a null parent (00000...000), and its unique SHA-1 is 912a4e85afac3b737797b5a09387a68afad816d6.

Below are the details we can fetch from the above SHA-1 with the help of git cat-file:

Consider one example of a merge:

I created a test branch, made changes, and merged it into master.


Here you can notice we have two parents because of the merge.

You can further see the content, size and type of repository #gitobjects like:


This is a pretty lengthy article, but I've tried to make it as transparent and clear as possible. Once you work through it and understand all the concepts shown here, you will be able to work with Git more effectively.
This explanation covers the tree data structure and the internal storage of objects. You can check the content (differences/commits) of files through the local .git repository, which stores each object under a unique SHA hash; this clarifies the internal working of Git.
Hopefully, this blog helps you understand Git inside out and aids in troubleshooting Git-related issues.

Thursday, February 14, 2019

My stint with Runc vulnerability

Today I was given a task to set up a new QA environment. I said it should be done quickly since we use Docker: I just needed to provision a VM and run the already available QA-ready Docker image on it. So I started, and guess what, today was not my day. I got the below error while running my app image.

docker: Error response from daemon: OCI runtime create failed: container_linux.go:344: starting container process caused "process_linux.go:293: copying bootstrap data to pipe caused \"write init-p: broken pipe\"": unknown.

I figured my Valentine's Day had gone for a toss. As usual, I took the help of the Google god to figure out what this issue was all about, and after a few minutes I found a blog pretty close to the issue I was facing.


Bang on, the issue was identified: a new runc vulnerability discovered a few days back.

The fix for this vulnerability was released by Docker on February 11, but the catch was that the fix makes Docker incompatible with the 3.13 kernel version.

While setting up the QA environment I had installed the latest stable version of Docker, 18.09.2, and since the kernel version was 3.10.0-327.10.1.el7.x86_64, Docker was not able to function properly.

So, as suggested in the blog, I upgraded the kernel to 4.x:

rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-2.el7.elrepo.noarch.rpm
yum repolist
yum --enablerepo=elrepo-kernel install kernel-ml
yum repolist all
awk -F\' '$1=="menuentry " {print i++ " : " $2}' /etc/grub2.cfg
grub2-set-default 0
grub2-mkconfig -o /boot/grub2/grub.cfg
And here we go: post that, everything worked like a charm.

So, a word of caution to everyone:
We have a major vulnerability in Docker, CVE-2019-5736; for more details, go through the link.
As a fix, upgrade your Docker to 18.09.2, and also make sure that you have kernel 4+ as suggested in the blog.
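A quick sanity check for both halves of the fix can be sketched like this (assumes a Linux shell on the host; the kernel-version parsing is the only real logic here):

```shell
# Is the running kernel new enough (4.x or later) for the patched Docker?
kmaj=$(uname -r | cut -d. -f1)
if [ "$kmaj" -ge 4 ]; then
    echo "kernel $(uname -r): OK for Docker 18.09.2"
else
    echo "kernel $(uname -r): upgrade before installing Docker 18.09.2"
fi
# Docker side of the fix (prints the server version if the daemon is up)
docker version --format '{{.Server.Version}}' 2>/dev/null || true
```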

Now I can go for my Valentine Party 👫

Tuesday, February 5, 2019

Using Ansible Dynamic Inventory with Azure can save the day for you.

As a DevOps Engineer, I always love to make things simple and convenient by automating them. Automation can be done on many fronts like infrastructure, software, build and release etc.

Ansible is primarily a software configuration management tool which can also be used as an infrastructure provisioning tool.
One of the things that I love about Ansible is its integration with different cloud providers. This integration makes things really loosely coupled; for example, we don't need to manage all of the cloud's information in Ansible (like instance metadata for targeting machines).

Ansible Inventory

Ansible uses the term inventory to refer to the set of systems or machines that our Ansible playbooks or commands work against. There are two ways to manage inventory:
  • Static Inventory
  • Dynamic Inventory
By default, the static inventory is defined in /etc/ansible/hosts, in which we provide information about the target systems. On most cloud platforms, when a server reboots it gets assigned a new public address, and again we have to update that in our static inventory, so this can't be a lasting option.
Luckily, Ansible supports the concept of dynamic inventory, in which we use external Python scripts and .ini files through which we can target machines dynamically without knowing their public or private addresses. Ansible provides such scripts and .ini files for cloud infrastructure platforms like Amazon, Azure, DigitalOcean and Rackspace.

In this blog, we will talk about how to configure dynamic inventory on the Azure Cloud Platform.

Ansible Dynamic Inventory on Azure

The first thing always required to run anything is the software and its dependencies, so let's install those first. We need the Azure Python modules, which we can install via pip:

$ pip install 'ansible[azure]'

After this, we need to download azure_rm.py

$ wget https://raw.githubusercontent.com/ansible/ansible/devel/contrib/inventory/azure_rm.py

Change the permissions of the file using the chmod command:

$ chmod +x azure_rm.py

Then we have to log in to Azure account using azure-cli

$ az login
To sign in, use a web browser to open the page https://aka.ms/devicelogin and enter the code XXXXXXXXX to authenticate.

The az login command output will provide you a unique code, which you have to enter on the web page shown.

As a best practice, we should always create an Active Directory app for different services or apps to restrict privileges. Once you are logged in to the Azure account, you can create an Active Directory app for Ansible:

$ az ad app create --password ThisIsTheAppPassword --display-name opstree-ansible --homepage ansible.opstree.com --identifier-uris ansible.opstree.com

Don't forget to change your password ;). Note down the appID from the output of the above command.

Once the app is created, create a service principal to associate it with.

$ az ad sp create --id appID

Replace the appID with actual app id and copy the objectID from the output of the above command.
Now we just need the subscription id and tenant id, which we can get by a simple command

$ az account show

Note down the id and tenantID from the output of the above command.

Let's assign a contributor role to service principal which is created above.

$ az role assignment create --assignee objectID --role contributor

Replace the objectID with the actual object id output.

All the Azure-side setup is done. Now we have to make some changes on our system.

Let's start with creating an azure home directory

$ mkdir ~/.azure 

In that directory, we have to create a credentials file

$ vim ~/.azure/credentials


Please replace the id, appID, password and tenantID with the values noted above.
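The file azure_rm.py reads is a small INI file; a sketch with placeholder values (key names per the Ansible Azure guide — replace each value with the IDs noted earlier):

```ini
[default]
subscription_id=<id from 'az account show'>
client_id=<appID>
secret=<app password>
tenant=<tenantID>
```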

All set!!!! Now we can test it with the below command:

$ python ./azure_rm.py --list | jq

and the output should look like this:

  "azure": [
  "westeurope": [
  "ansibleMasterNSG": [
  "ansiblelab": [
  "_meta": {
    "hostvars": {
      "ansibleMaster": {
        "powerstate": "running",
        "resource_group": "ansiblelab",
        "tags": {},
        "image": {
          "sku": "7.3",
          "publisher": "OpSTree",
          "version": "latest",
          "offer": "CentOS"
        "public_ip_alloc_method": "Dynamic",
        "os_disk": {
          "operating_system_type": "Linux",
          "name": "osdisk_vD2UtEJhpV"
        "provisioning_state": "Succeeded",
        "public_ip": "",
        "public_ip_name": "masterPip",
        "private_ip": "",
        "computer_name": "ansibleMaster",

Now you are ready to use Ansible in Azure with dynamic inventory. Good Luck :-)

Tuesday, January 29, 2019

Working With AWS KMS

Thousands of organizations use Amazon Web Services (AWS) to host their applications and manage their data in the cloud. The advantage of geographic availability, scalability and reliability make AWS a great choice.
Due to recent and more frequently occurring security breaches in a number of environments, it is necessary for us to take our data protection strategy seriously.

We all can agree that information security is always of paramount importance, whether data is stored on-premises or in the cloud.
In this article we will go through AWS KMS and how to use it in our existing AWS account.

AWS Key Management Service (AWS KMS) is a managed service that makes it easy for us to create, control, rotate, and use encryption keys.

It also centralizes key management, with one dashboard that offers creation, rotation, and lifecycle management functions.

AWS KMS Concept

1. Customer Master Key

Customer Master Keys (CMKs), or Master Encryption Keys (MEKs), are used to generate, encrypt, and decrypt the data keys (DKs) that you use outside of AWS KMS to encrypt your data. This strategy is known as envelope encryption. CMKs are created in AWS KMS and never leave AWS KMS unencrypted; they can only be accessed through AWS KMS.
The master keys are protected by FIPS 140-2 validated cryptographic modules.

2. Data Keys

Data keys are encryption keys that you can use to encrypt data, including large amounts of data and other data encryption keys.
You can use AWS KMS customer master keys (CMKs) to generate, encrypt, and decrypt data keys.

3. Encrypting Data

1. First of all, a Customer Master Key is created in the KMS console.
2. Then, to create a data key, AWS KMS uses the CMK to generate one. The operation returns a plaintext copy of the data key and a copy of the data key encrypted under the CMK.

3. Now that we have both the master key and the data key, we can use the data key to encrypt the data.
4. After using the plaintext data key to encrypt data, we remove it from memory and store the encrypted data key with the encrypted data, so it is available later to decrypt the data.

4. Decrypting Data

1. To decrypt your data, pass the encrypted data key to the Decrypt operation.

2. AWS KMS uses the CMK to decrypt the data key and returns the plaintext data key.
3. Use the plaintext data key to decrypt your data and then remove the plaintext data key from memory as soon as possible.

5. Envelope Encryption 

Envelope encryption is the practice of encrypting plaintext data with a data key, and then encrypting the data key under another key. AWS KMS uses the MEK to encrypt the data key (DEK).
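The flow can be sketched locally with openssl standing in for KMS. This is only the concept: in the real service, KMS performs the master-key steps server-side and the master key never leaves it.

```shell
# Envelope encryption in miniature (openssl as a stand-in for KMS)
cd "$(mktemp -d)"
printf 'payload\n' > data.txt
openssl rand -hex 32 > data.key                      # plaintext data key
# encrypt the data with the data key
openssl enc -aes-256-cbc -pass file:data.key -in data.txt -out data.enc
# wrap the data key under the "master" key
openssl enc -aes-256-cbc -pass pass:master-secret -in data.key -out data.key.enc
rm data.key              # discard the plaintext key; store data.enc + data.key.enc
# decrypt path: unwrap the data key first, then decrypt the data
openssl enc -d -aes-256-cbc -pass pass:master-secret -in data.key.enc -out data.key
openssl enc -d -aes-256-cbc -pass file:data.key -in data.enc -out out.txt
cat out.txt              # -> payload
```

Storing the wrapped data key alongside the ciphertext is exactly what step 4 of the encryption flow above describes.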
Hands On Lab: What we are going to do?

We will create a Customer Master Key in the AWS KMS console, upload a file to S3 using KMS master-key encryption, and then try to access the encrypted file.

Step-1: Creating Master Key in AWS-KMS
1. First of all, log in to the AWS Management Console, then go to the IAM dashboard and select Encryption Keys; this opens the AWS KMS console.
2. In AWS KMS console select the Region and click on Create Key.

3. Create an Alias for KMS Master Key and add a meaningful tag.

 4. Define Key Administrative and Usage Permissions.

5. Review the Policy and click on create.

6. You can see in the KMS console a new Master Key is created.

Step-2: Create a Bucket in S3

1. Go to S3 console in AWS and click on create a Bucket.
2. Specify the Bucket name and Region and click on create.

3. Once the bucket is created, try to upload some data in the next step.

Step-3: Upload data to Bucket created in S3

1. Click on Upload to upload a file to the S3 bucket created in the previous step.

2. Select the file and in the next step, define who can access the bucket and access permissions.

3. In the next step choose the storage class and Encryption method.

4. In the encryption method, select Encryption using AWS KMS master-key, and select the master key generated in the earlier step for data encryption.

5. Review and click on Upload. Once uploaded verify the object properties.
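The same upload can also be done from the AWS CLI; a hypothetical sketch (the bucket name and key alias are placeholders, and the CLI must already be configured with credentials):

```shell
# Server-side encryption with a specific KMS CMK instead of the console wizard
aws s3 cp report.csv s3://my-demo-bucket/report.csv \
    --sse aws:kms --sse-kms-key-id alias/my-master-key
# Objects encrypted this way decrypt transparently for callers with kms:Decrypt
aws s3 cp s3://my-demo-bucket/report.csv ./report-copy.csv
```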

6. Now try to access the uploaded data by clicking on download. You will see that you are able to download the file without any issue.

Step-4: Disable the Master Key
1. Now let's disable the Master Key from KMS console and check again.

2. Now try again to access the uploaded file in S3 after disabling the MK.

Step-5: Enable the Master Key

1. To enable the Master Key again go to KMS console and enable the MK.


Step-6: Try to access the S3 object with a different IAM user.

1. Try to access the file uploaded to the S3 bucket with a different IAM user who does not have usage access to the KMS master key.

What's Happening Behind the Scenes

1. Encryption Using KMS Customer Master Key

2. Decryption Using KMS Customer Master Key


KMS is a fully managed service: it automatically handles the availability, scalability, physical security, and hardware maintenance of the underlying Key Management Infrastructure (KMI).
With no up-front cost and usage-based pricing that starts at $1 per Customer Master Key (CMK) per month, KMS makes it easy for us to encrypt data stored in S3, EBS, RDS, Redshift, and any other AWS service that's integrated with KMS.

Monday, January 21, 2019

Log Parsing of Windows Servers on Instance Termination

As we all know, logs are a critical part of any system; they give you deep insight into your application, what your system is doing, and what caused an error. Depending on how logging is configured, logs may contain transaction history, timestamps, amounts debited/credited into a client's account, and a lot more.

In an enterprise-level application, your system spans multiple hosts, and managing the logs across them can be complicated. Debugging an error across hundreds of log files on hundreds of servers is very time-consuming, complicated, and not the right approach, so it is always better to move the logs to a centralized location.

Lately at my company I faced a situation which I assume is a very commonly faced scenario in Amazon's cloud, where we might have to retain application logs from multiple instances behind an Auto Scaling group. Let's take an example for better understanding.

Suppose your application is configured to log into the C:\Source\Application\web\logs directory. The application receives variable incoming traffic: sometimes the requests can be handled by 2 servers, other times it may require 20 servers to handle the traffic.

When there is a spike in traffic, EC2's smart Auto Scaling group uses the configuration and scales from 2 servers to many (according to the ASG policy), and during this phase the applications running in the newly launched EC2 instances also log into C:\Source\Application\web\logs. But when there's a drop in traffic, the ASG triggers a scale-down policy, resulting in the termination of instances, which also deletes all the log files inside the instances that the ASG launched during the high-traffic period.

Faced a similar situation? No worries, to retain the logs I figured out a solution.

Here in this blog, the motive is to sync the logs from dying instances at the time of their termination. This will be done using AWS services: the goal is to trigger a PowerShell script on the instance using SSM, which syncs the logs to an S3 bucket along with sufficient information about the dying instance. For this we will require two things:

1) Configuring SSM agent to be able to talk to Ec2 Instances
2) Ec2 Instances being able to write to S3 Buckets

For the tutorial we will be using Microsoft Windows Server 2012 R2 Base with the AMI ID: ami-0f7af6e605e2d2db5

A Blueprint of the scenario to be understood below:

1) Configuring SSM agent to be able to talk to Ec2 Instances

SSM Agent is installed by default on Windows Server 2016 instances and instances created from Windows Server 2003-2012 R2 AMIs published in November 2016 or later. Windows AMIs published before November 2016 use the EC2Config service to process requests and configure instances.

If your instance is a Windows Server 2003-2012 R2 instance created before November 2016, then EC2Config must be upgraded on the existing instances to use the latest version of EC2Config. By using the latest EC2Config installer, you install SSM Agent side-by-side with EC2Config. This side-by-side version of SSM Agent is compatible with your instances created from earlier Windows AMIs and enables you to use SSM features published after November 2016.

This simple script can be used to update EC2Config and then layer it with the latest version of the SSM agent. It also installs the AWS CLI, which is used to push the logged archives to S3:

# Run the setup only if it has not been done yet; download the installers into
# a temp folder, install EC2Config, the AWS CLI and the SSM agent, then clean up
if(!(Test-Path -Path C:\Scripts)){
    mkdir C:\Tmp
    cd C:\Tmp
    wget https://s3.ap-south-1.amazonaws.com/asg-termination-logs/Ec2Install.exe -OutFile Ec2Config.exe
    wget https://s3.amazonaws.com/ec2-downloads-windows/SSMAgent/latest/windows_amd64/AmazonSSMAgentSetup.exe -OutFile ssmagent.exe
    wget https://s3.amazonaws.com/aws-cli/AWSCLI64PY3.msi -OutFile awscli.msi
    wget https://s3.amazonaws.com/aws-cli/AWSCLISetup.exe -OutFile awscli.exe
    Invoke-Command -ScriptBlock {C:\Tmp\Ec2Config.exe /Ec /S /v/qn}
    sleep 20
    Invoke-Command -ScriptBlock {C:\Tmp\awscli.exe /Ec /S /v/qn}
    sleep 20
    Invoke-Command -ScriptBlock {C:\Tmp\ssmagent.exe /Ec /S /v/qn}
    sleep 10
    Restart-Service AmazonSSMAgent
    Remove-Item C:\Tmp -Force -Recurse
}

An IAM role is required for SSM-to-EC2 communication:
IAM instance role: Verify that the instance is configured with an AWS Identity and Access Management (IAM) role that enables the instance to communicate with the Systems Manager API.
Add instance profile permissions for Systems Manager managed instances to an existing role
  • Open the IAM console at https://console.aws.amazon.com/iam/.
  • In the navigation pane, choose Roles, and then choose the existing role you want to associate with an instance profile for Systems Manager operations.
  • On the Permissions tab, choose Attach policy.
  • On the Attach policy page, select the check box next to AmazonEC2RoleforSSM, and then choose Attach policy.
Now, navigate to Roles and select your role.

That should look like:

2) Ec2 Instances being able to write to S3 Buckets

An IAM Role is Required for Ec2 to be able to write to S3:

IAM instance role: Verify that the instance is configured with an AWS Identity and Access Management (IAM) role that enables the instance to communicate with the S3 API.

Add instance profile permissions for S3 access to an existing role
  • Open the IAM console at https://console.aws.amazon.com/iam/.
  • In the navigation pane, choose Roles, and then choose the existing role you want to associate with an instance profile for Systems Manager operations.
  • On the Permissions tab, choose Attach policy.
  • On the Attach policy page, select the check box next to AmazonS3FullAccess, and then choose Attach policy. 

That should look like:

This PowerShell script, saved as C:\Scripts\termination.ps1, will pick up log files from $SourcePathWeb and output them into C:\Users\Administrator\workdir, with an IP and date stamp so we can later recognize and identify which instance the logs originated from.
Make sure the S3 bucket name, --region, and source of the log files are changed according to your preferences.

# NOTE: $InstanceName, $SourcePathWeb and $DestFileWeb were defined elsewhere in
# the original script; the values below are illustrative assumptions.
$InstanceName = "webapp"
$SourcePathWeb = "C:\Source\Application\web\logs"
$Date = Get-Date -Format yyyy-MM-dd
# local IP from the EC2 instance-metadata endpoint (assumed URL)
$LocalIP = (curl http://169.254.169.254/latest/meta-data/local-ipv4 -UseBasicParsing).Content
$WorkDir = "C:\Users\Administrator\workdir\$InstanceName-$LocalIP-$Date\$Date"
$DestFileWeb = "$WorkDir\web-logs.zip"

if((Test-Path -Path $WorkDir)){
    Remove-Item $WorkDir -Force -Recurse
}

New-Item -Path $WorkDir -Type Directory

Add-Type -Assembly "system.io.compression.filesystem"
[io.compression.zipfile]::CreateFromDirectory($SourcePathWeb, $DestFileWeb)

C:\'Program Files'\Amazon\AWSCLI\bin\aws.cmd s3 cp C:\Users\Administrator\workdir s3://terminationec2 --recursive --exclude "*.ok" --include "*" --region us-east-1

If the above setup is done correctly, running the script manually should produce output suggesting success:

Check your S3 bucket to see if it has the synced logs. Now, because the focus of this blog is to trigger a PowerShell script on the instance using SSM that syncs the logs to an S3 bucket, we will try running the script through SSM > Run Command.

Select one of the instances that has the above script and configuration, and run the command. The output should be pleasing.

The AMI used by the ASG should have the above configuration (this can be achieved by creating an AMI from an EC2 instance with the above config and then adding it to the launch configuration of the ASG). The ASG we have here for the tutorial is named after my last name: "group_kaien".

Now, the last and most important step is configuring CloudWatch > Events > Rules.

Navigate to CloudWatch > Events > Rules and create a rule.

This would return the following JSON config:

"source": [
"detail-type": [
"EC2 Instance Terminate Successful",
"EC2 Instance-terminate Lifecycle Action"
"detail": {
"AutoScalingGroupName": [

On the right side, under Targets:

SSM Run Command:
  • Document: AWS-RunPowerShellScript
  • Target key: InstanceIds or tag:<Ec2TagName>
  • Target values: <Tag Value>

Configure parameters:
  • Commands: .\termination.ps1
  • WorkingDirectory: C:\Scripts
  • ExecutionTimeout: 3600 (default)
This makes sure that when a termination event happens, the PowerShell script runs and syncs the logs to S3. This is what our configuration looks like:


For more on setting up Cloudwatch Events refer :

Wait for the autoscaling policies to run so that new instances are created and terminated with the above configuration. The terminating instances will sync their logs to S3 before they are fully terminated. Here's the output on S3 for me after a scale-down activity.



Now, with the above, we have learned how to export logs to S3 automatically from a dying instance, with the correct date/time stamp as set in the termination.ps1 script, hence fulfilling the scope of this blog.
Stay tuned for more.
