Recap Amrita InCTF 2019 | Part 1

Amrita InCTF 10th Edition, is an offline CTF(Capture the Flag) event hosted by Amrita university at their Amritapuri campus 10 KM away from Kayamkulam in Kerala, India. In this year’s edition two people from Opstree got invited to the final round after roughly two months of solving challenges online. The dates for the final rounds were 28th,29th and 30th December 2019. The first two days comprised of talks by various people from the industry and the third day was kept for the final competition. In the upcoming three blog series starting now, we’d like to share all the knowledge, experiences and learning from this three day event.

Talk from Cisco

The hall was full of a little more than 300 people, among which a lot were college students all the way ranging from sophomore year up till final as well as pre-final year. Also, to our surprise there were roughly 50+  school students sitting ready to compete for the final event as well. The initial talk by CISCO was refreshing and very insightful for everyone present in the room. The talk majorly focused on how technology is changing lives all around the world be it with machine learning to help doctors treat faster or be it use drones to put off fire or IoT enabled system to provide efficient irrigation at remote areas. The speakers also made a point on how learning in a broader segment of technologies and tools serves longer than in depth knowledge of limited technology.
One thing that really stuck with me was that never learn a technology just for the sake of it or for the hype around it. But learn with a thought on how it can solve a problem around us.

Talk Title: Cyberoam startup and experiences -Hemanth Patel

Hemal Patel talked about his couple of startups and how he has always learned through failures. The talk was full of experiences and it is always serene to listen to someone telling about how they failed over and over again which eventually led them to succeed at whatever they are doing today. He talked about CyberRoam which is a Sophos Company, secures organizations with its wide range of product offerings at the network gateway. The talk went on to give us an overview of how business is done along different governments all around the world and how Entrepreneurship is so much more than just tackling a problem at a business level. And the how Cyberroam ended up making the product that they have today. 

Talk on Security by Cisco – Radhika Singh and Prapanch Ramamoorthy 

This was a wide range talk about a lot of things affecting us. We’ll try to list down most of it here. 

The talk started out with exploring Free/Open WiFi. Though it has a huge benefit of wifi being free it comes with a lot of risks as well. To name a few : 

→ Sniffing

→ Snooping 

→ Spoofing

These just to mention a few ways you can be compromised over a free WiFi. 

You can read up more on it here :

The talk also presented us with facts over data, how only 1% of the total data is generated via laptops and computers, Rest all are generated by smart phones, smart TVs as other IoT devices. Hence comes a very important point of securing IoT devices. 

It was pointed out during the talk that majority of the companies worry about security over the end of the entire IoT chain i.e. over the cloud etc. But not many people are caring about the edge devices and how lack of security measures here can compromise them. 

There was this really interesting case study about of IoT devices brought down the internet for the entire US east coast and how this attack was just meant to get some more time to submit an assignment at it’s initial days. Read more on this story from 2016 here

Hackers prefer to exploit IOT devices over cloud infrastructure.

Memes apart, The talk also focused on privacy vs security and how  Google’s dns resolution encryption helps in securing DNS based internet traffic on the world wide web. 

National Critical Information Infrastructure Protection Centre(NCIIPC)

National Critical Information Infrastructure Protection Centre (NCIIPC) is an organisation of the Government of India created under Sec 70A of the Information Technology Act, 2000 (amended 2008), through a gazette notification on 16th Jan 2014, based in New Delhi, India. It is designated as the National Nodal Agency in respect of Critical Information Infrastructure Protection. 

Representatives from this organization was there to speak at the event and they talked in detail about defining what is a CII (Critical Information Infrastructure) is and how any company with such infrastructure needs to inform the government about it. 

A CII is basically any Information Infrastructure (by any financial/medical etc institute) which if compromised can affect the national security of the country. And attacking any such infrastructure is an act of terrorism as defined by the article 66F in the IT Act,2018. 

They talked about some of the threats they deal with at the national level. They particularly talked about how BGP routing protocol which works on trust was compromised lately to route all Indian traffic via Pakistan servers/routers.

Image result for darknet dark web internet

One more interesting talk was about the composition of Internet.

How we think that the internet we see would comprise of 90% of the total internet but in reality it’s just 4%, bummer right? .  Deep web is the one which comprises of 90% of the total internet and as a matter of fact that no one completely knows about the DarkNet and it’s volume. Hence even the numbers mentioned above are as good as a guess.

 This was a very insightful talk and put a lot of things in perspective.

Digital Forensics – Beyond the Good Ol’ Data Recovery by Ajith Ravindran 

Image result for data forensics

This talk by Ajith Ravindran mainly focused on Computer forensics, which is the application of investigation and analysis techniques to gather and preserve evidence from a particular computing device in a way that is suitable for presentation in a court of law.

The majority of tips and tricks shared were about getting data from Windows based machines even after it is deleted from the system and how such data can be retrieved in order to show as proof for crimes.

Some of the tricks talked about are mentioned below : 

The prefetch files in Windows  gives us the list of files and executables last accessed and the number of times executed. 

Userassist allows investigators to see what programs were recently executed on a system.

Shellbags list down files that are accessed via a user at least once.

Master file table enables us to get a list of all the files in the system, or even entered the system via network of USB drives.

$usrnjrnl gives us information regarding all user activities in past 1-2 days.

Hiberfil.sys is a file the system creates when the computer goes into hibernation mode. Hibernate mode uses the Hiberfil.sys file to store the current state (memory) of the PC on the hard drive and the file is used when Windows is turned back on.

This was all from day 1 talk, Come back on next Tuesday for talks from Day 2. And as the final segment of this series we’ll be updating about attack/defense and jeopardy CTF experience.

Stay Tuned, Happy Blogging!

Log Everything as JSON

Logging and monitoring are like Tony Stark and his Iron Man suit, the two will go together. Similarly, logging and monitoring work best together because they complement each other well.

For many years, logs have been an essential part of troubleshooting application and infrastructure performance. But over the period of time we have realized that logs are not only meant for troubleshooting purposes, they can also be used for business dashboards visualization and performance analysis.

So logging application data in a file is great, but we need more.

Why JSON logging is best framework?

For understanding the greatness of the JSON logging framework, let’s understand this conversation between Anuj(A system Engineer) and Kartik(A Business Analyst).


A few days later Kartik complains that Web Interface is broken. Anuj scratches his head and takes a look at the logs and realizes that Developer has added an extra field to the log lines broked his custom parser.

I am sure anyone can face a similar kind of situation.

In this case, if Developer has designed the application to write logs as JSON, it would be a piece of cake for Anuj to create a parser for that because then he has to search fields on the basis of the JSON key and it doesn’t matter how many new fields are getting added in the logline.

The biggest benefit of logging in JSON is that it has a structured format. This makes possible to analyze application logs just like Big Data. It’s not just readable, but a database that can be queried for each and every field. Also, every programming language can parse it.

Magic with JSON logging

Recently, we have created a sample Golang application to get Code Build, Code Test and Deployment phase experience with Golang Applications. So while writing this application we have incorporated the functionality to write logs in JSON.
The sample logs are something like this:-

And while integrating ELK for logs analysis, the only parsing line we have to add in logstash is:-

 
filter {
    json {
        source => "message"
    }
}

After this, we don’t require any further parsing and we can add as many fields in the log file.

As you can see I have all fields available in Kibana like:- employee name, employee city and for this, we do not have to add some complex parsing in logstash or in any other tool. Also, I can create a beautiful Business Dashboard with this data.

Application Repository Link:-
https://github.com/opstree/Opstree-Go-WebApp

Conclusion

It will not take too long to migrate from text logging to JSON logging as there are multiple programming language log drivers are available. I am sure JSON logging will provide more flexibility to your current logging system.
If your organization is using any Log Management platform like Splunk, ELK, etc. I think JSON logging could be a companion of it.

Some of the popular logging drivers which support JSON output are:-

Golang:- https://github.com/sirupsen/logrus
Python:- https://github.com/thangbn/json-logging-python
Java:- https://howtodoinjava.com/log4j2/log4j-2-json-configuration-example/
PHP:- https://github.com/nekonomokochan/php-json-logger

I hope now we have a good understanding of JSON logging. So now it’s time to choose your logging wisely.


That’s all I have, thanks for reading, I’d really appreciate any and all feedback, please leave your comment below if you guys have any feedback or any queries.

Cheers till next time!!

Image Reference:- https://www.google.com/url?sa=i&source=images&cd=&cad=rja&uact=8&ved=2ahUKEwi0jJzD8t_mAhVOzDgGHc6ODNQQjB16BAgBEAM&url=https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DbPTZ43nM688&psig=AOvVaw0N3k7-bSq4OD4oMxPxYPrN&ust=1577881894317756

Redis Cluster: Setup, Sharding and Failover Testing

Watching cluster sharding and failover management is as gripping as visualizing a robotic machinery work.

My last blog on Redis Cluster was primarily focussed on its related concepts and requirements. I would highly recommend to go through the concepts first to have better understanding.

Here, I will straight forward move to its setup along with the behaviour of cluster when I intentionally turned down one Redis service on one of the node.
Let’s start from the scratch.

Redis Setup

Here, I will follow the approach of a 3-node Redis Cluster with Redis v5.0 on all the three CentOS 7.x nodes.

Setup Epel Repo

wget http://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
rpm -ivh epel-release-latest-7.noarch.rpm

Setup Remi Repo

yum install http://rpms.remirepo.net/enterprise/remi-release-7.rpm
yum --enablerepo=remi install redis

redis-server --version
Redis server v=5.0.5 sha=00000000:0 malloc=jemalloc-5.1.0 bits=64 build=619d60bfb0a92c36

3-Node Cluster Prerequisites

While setting up Redis cluster on 3 nodes, I will be following the strategy of having 3 master nodes and 3 slave nodes with one master and one slave running on each node serving redis at different ports. As shown in the diagram Redis service is running on Port 7000 and Port 7001

  • 7000 port will serve Redis Master
  • 7001 port will serve Redis Slave

Directory Structure

We need to design the directory structure to server both redis configurations.

tree /etc/redis
/etc/redis
`-- cluster
    |-- 7000
    |   `-- redis_7000.conf
    `-- 7001
        `-- redis_7001.conf

Redis Configuration

Configuration file for Redis service 1

cat /etc/redis/cluster/7000/redis_7000.conf
port 7000
dir /var/lib/redis/7000/
appendonly yes
protected-mode no
cluster-enabled yes
cluster-node-timeout 5000
cluster-config-file /etc/redis/cluster/7000/nodes_7000.conf
pidfile /var/run/redis_7000.pid

Configuration file for Redis service 2

cat /etc/redis/cluster/7001/redis_7001.conf
port 7001
dir /var/lib/redis/7001
appendonly yes
protected-mode no
cluster-enabled yes
cluster-node-timeout 5000
cluster-config-file /etc/redis/cluster/7001/nodes_7001.conf
pidfile /var/run/redis_7001.pid

Redis Service File

As we are managing multiple service on a single instance, we need to update service file for easier management of redis services.

Service management file for Redis service 1

cat /etc/systemd/system/redis_7000.service
[Unit]
Description=Redis persistent key-value database
After=network.target

[Service]
ExecStart=/usr/bin/redis-server /etc/redis/cluster/7000/redis_7000.conf --supervised systemd
ExecStop=/bin/redis-cli -h 127.0.0.1 -p 7000 shutdown
Type=notify
User=redis
Group=redis
RuntimeDirectory=redis
RuntimeDirectoryMode=0755
LimitNOFILE=65535

[Install]
WantedBy=multi-user.target

Service management file for Redis service 2

cat /etc/systemd/system/redis_7001.service
[Unit]
Description=Redis persistent key-value database
After=network.target

[Service]
ExecStart=/usr/bin/redis-server /etc/redis/cluster/7001/redis_7001.conf --supervised systemd
ExecStop=/bin/redis-cli -h 127.0.0.1 -p 7001 shutdown
Type=notify
User=redis
Group=root
RuntimeDirectory=/etc/redis/cluster/7001
RuntimeDirectoryMode=0755
LimitNOFILE=65535

[Install]
WantedBy=multi-user.target

Redis Service Status

Master Service
systemctl status redis_7000.service 
● redis_7000.service - Redis persistent key-value database
   Loaded: loaded (/etc/systemd/system/redis_7000.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2019-09-25 08:14:15 UTC; 30min ago
  Process: 2902 ExecStop=/bin/redis-cli -h 127.0.0.1 -p 7000 shutdown (code=exited, status=0/SUCCESS)
 Main PID: 2917 (redis-server)
   CGroup: /system.slice/redis_7000.service
           └─2917 /usr/bin/redis-server *:7000 [cluster]
systemd[1]: Starting Redis persistent key-value database...
redis-server[2917]: 2917:C 25 Sep 2019 08:14:15.752 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
redis-server[2917]: 2917:C 25 Sep 2019 08:14:15.752 # Redis version=5.0.5, bits=64, commit=00000000, modified=0, pid=2917, just started
redis-server[2917]: 2917:C 25 Sep 2019 08:14:15.752 # Configuration loaded
redis-server[2917]: 2917:C 25 Sep 2019 08:14:15.752 * supervised by systemd, will signal readiness
systemd[1]: Started Redis persistent key-value database.
redis-server[2917]: 2917:M 25 Sep 2019 08:14:15.754 * No cluster configuration found, I'm ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2
redis-server[2917]: 2917:M 25 Sep 2019 08:14:15.756 * Running mode=cluster, port=7000.
redis-server[2917]: 2917:M 25 Sep 2019 08:14:15.756 # Server initialized
redis-server[2917]: 2917:M 25 Sep 2019 08:14:15.756 * Ready to accept connections
Slave Service
systemctl status redis_7001.service
● redis_7001.service - Redis persistent key-value database
   Loaded: loaded (/etc/systemd/system/redis_7001.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2019-09-25 08:14:15 UTC; 30min ago
  Process: 2902 ExecStop=/bin/redis-cli -h 127.0.0.1 -p 7001 shutdown (code=exited, status=0/SUCCESS)
 Main PID: 2919 (redis-server)
   CGroup: /system.slice/redis_7001.service
           └─2919 /usr/bin/redis-server *:7001 [cluster]
systemd[1]: Starting Redis persistent key-value database...
redis-server[2919]: 2917:C 25 Sep 2019 08:14:15.752 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
redis-server[2919]: 2917:C 25 Sep 2019 08:14:15.752 # Redis version=5.0.5, bits=64, commit=00000000, modified=0, pid=2917, just started
redis-server[2919]: 2917:C 25 Sep 2019 08:14:15.752 # Configuration loaded
redis-server[2919]: 2917:C 25 Sep 2019 08:14:15.752 * supervised by systemd, will signal readiness
systemd[1]: Started Redis persistent key-value database.
redis-server[2919]: 2917:M 25 Sep 2019 08:14:15.754 * No cluster configuration found, I'm ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2
redis-server[2919]: 2917:M 25 Sep 2019 08:14:15.756 * Running mode=cluster, port=7001.
redis-server[2919]: 2917:M 25 Sep 2019 08:14:15.756 # Server initialized
redis-server[2919]: 2917:M 25 Sep 2019 08:14:15.756 * Ready to accept connections

Redis Cluster Setup

Redis itself provides cli tool to setup cluster.
In the current 3 node scenario, I opt 7000 port on all node to serve Redis master and 7001 port to serve Redis slave.

redis-cli --cluster create 172.19.33.7:7000 172.19.42.44:7000 172.19.45.201:7000 172.19.33.7:7001 172.19.42.44:7001 172.19.45.201:7001 --cluster-replicas 1

The first 3 address will be the master and the next 3 address will be the slaves. It will be a cross node replication, say, Slave of any Mater will reside on a different node and the cluster-replicas define the replication factor, i.e each master will have 1 slave.

>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica 172.19.42.44:7001 to 172.19.33.7:7000
Adding replica 172.19.45.201:7001 to 172.19.42.44:7000
Adding replica 172.19.33.7:7001 to 172.19.45.201:7000
M: ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2 172.19.33.7:7000
   slots:[0-5460] (5461 slots) master
M: 314038a48bda3224bad21c3357dbff8305735d72 172.19.42.44:7000
   slots:[5461-10922] (5462 slots) master
M: 19a2c81b7f489bec35eed474ae8e1ad787327db6 172.19.45.201:7000
   slots:[10923-16383] (5461 slots) master
S: 896b2a7195455787b5d8a50966f1034c269c0259 172.19.33.7:7001
   replicates 19a2c81b7f489bec35eed474ae8e1ad787327db6
S: 89206df4f41465bce81f44e25e5fdfa8566424b8 172.19.42.44:7001
   replicates ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2
S: 20ab4b30f3d6d25045909c6c33ab70feb635061c 172.19.45.201:7001
   replicates 314038a48bda3224bad21c3357dbff8305735d72
Can I set the above configuration? (type 'yes' to accept):

A dry run will showcase the cluster setup and ask for confirmation.

Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
..
>>> Performing Cluster Check (using node 172.19.33.7:7000)
M: ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2 172.19.33.7:7000
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
S: 20ab4b30f3d6d25045909c6c33ab70feb635061c 172.19.45.201:7001
   slots: (0 slots) slave
   replicates 314038a48bda3224bad21c3357dbff8305735d72
M: 314038a48bda3224bad21c3357dbff8305735d72 172.19.42.44:7000
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
M: 19a2c81b7f489bec35eed474ae8e1ad787327db6 172.19.45.201:7000
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
S: 89206df4f41465bce81f44e25e5fdfa8566424b8 172.19.42.44:7001
   slots: (0 slots) slave
   replicates ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2
S: 896b2a7195455787b5d8a50966f1034c269c0259 172.19.33.7:7001
   slots: (0 slots) slave
   replicates 19a2c81b7f489bec35eed474ae8e1ad787327db6
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

Check Cluster Status

Connect to any of the cluster node to check the status of cluster.

redis-cli -c -h 172.19.33.7 -p 7000
172.19.33.7:7000> cluster nodes
20ab4b30f3d6d25045909c6c33ab70feb635061c 172.19.45.201:7001@17001 slave 314038a48bda3224bad21c3357dbff8305735d72 0 1569402961000 6 connected
314038a48bda3224bad21c3357dbff8305735d72 172.19.42.44:7000@17000 master - 0 1569402961543 2 connected 5461-10922
19a2c81b7f489bec35eed474ae8e1ad787327db6 172.19.45.201:7000@17000 master - 0 1569402960538 3 connected 10923-16383
ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2 172.19.33.7:7000@17000 myself,master - 0 1569402959000 1 connected 0-5460
89206df4f41465bce81f44e25e5fdfa8566424b8 172.19.42.44:7001@17001 slave ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2 0 1569402960000 5 connected
896b2a7195455787b5d8a50966f1034c269c0259 172.19.33.7:7001@17001 slave 19a2c81b7f489bec35eed474ae8e1ad787327db6 0 1569402959936 4 connected

Redis cluster itself manages the cross node replication, as seen in the above screen, 172.19.42.44:7000 master is associated with 172.19.45.201:7001 slave.

Data Sharding

There are 16384 slots. These slots are divided by the number of servers.
If there are 3 servers; 1, 2 and 3 then

  • Server 1 contains hash slots from 0 to 5500.
  • Server 2 contains hash slots from 5501 to 11000.
  • Server 3 contains hash slots from 11001 to 16383.
redis-cli -c -h 172.19.33.7 -p 7000
172.19.33.7:7000> set a 1
-> Redirected to slot [15495] located at 172.19.45.201:7000
OK
172.19.45.201:7000> set b 2
-> Redirected to slot [3300] located at 172.19.33.7:7000
OK
172.19.33.7:7000> set c 3
-> Redirected to slot [7365] located at 172.19.42.44:7000
OK
172.19.42.44:7000> set d 4
-> Redirected to slot [11298] located at 172.19.45.201:7000
OK
172.19.45.201:7000> get b
-> Redirected to slot [3300] located at 172.19.33.7:7000
"2"
172.19.33.7:7000> get a
-> Redirected to slot [15495] located at 172.19.45.201:7000
"1"
172.19.45.201:7000> get c
-> Redirected to slot [7365] located at 172.19.42.44:7000
"3"
172.19.42.44:7000> get d
-> Redirected to slot [11298] located at 172.19.45.201:7000
"4"
172.19.45.201:7000>

Redis Cluster Failover

Stop Master Service

Let’s stop the Redis master service on Server 3.

systemctl stop redis_7000.service
systemctl status redis_7000.service
● redis_7000.service - Redis persistent key-value database
   Loaded: loaded (/etc/systemd/system/redis_7000.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Wed 2019-09-25 09:32:37 UTC; 23s ago
  Process: 3232 ExecStop=/bin/redis-cli -h 127.0.0.1 -p 7000 shutdown (code=exited, status=0/SUCCESS)
  Process: 2892 ExecStart=/usr/bin/redis-server /etc/redis/cluster/7000/redis_7000.conf --supervised systemd (code=exited, status=0/SUCCESS)
 Main PID: 2892 (code=exited, status=0/SUCCESS)

Cluster State (Failover)

While checking the cluster status, Redis master service running on server 3 at port 7000 is shown fail and disconnected.

At the same moment its respective slave gets promoted to master which is running on port 7001 on server 1.

redis-cli -c -h 172.19.33.7 -p 7000
172.19.45.201:7000> CLUSTER NODES
314038a48bda3224bad21c3357dbff8305735d72 172.19.42.44:7000@17000 master,fail - 1569403957138 1569403956000 2 disconnected
ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2 172.19.33.7:7000@17000 master - 0 1569404037252 1 connected 0-5460
896b2a7195455787b5d8a50966f1034c269c0259 172.19.33.7:7001@17001 slave 19a2c81b7f489bec35eed474ae8e1ad787327db6 0 1569404036248 4 connected
89206df4f41465bce81f44e25e5fdfa8566424b8 172.19.42.44:7001@17001 slave ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2 0 1569404036752 5 connected
20ab4b30f3d6d25045909c6c33ab70feb635061c 172.19.45.201:7001@17001 master - 0 1569404036000 7 connected 5461-10922
19a2c81b7f489bec35eed474ae8e1ad787327db6 172.19.45.201:7000@17000 myself,master - 0 1569404035000 3 connected 10923-16383

Restarting Stopped Redis

Now we will check the behaviour of cluster once we fix or restart the redis service that we intentionally turned down earlier.

systemctl start redis_7000.service
systemctl status redis_7000.service
● redis_7000.service - Redis persistent key-value database
   Loaded: loaded (/etc/systemd/system/redis_7000.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2019-09-25 09:35:12 UTC; 8s ago
  Process: 3232 ExecStop=/bin/redis-cli -h 127.0.0.1 -p 7000 shutdown (code=exited, status=0/SUCCESS)
 Main PID: 3241 (redis-server)
   CGroup: /system.slice/redis_7000.service
           └─3241 /usr/bin/redis-server *:7000 [cluster]

Cluster State (Recovery)

Finally, all redis service are back in running state. The master service that we turned down and restarted has now become slave to its promoted master.

redis-cli -c -h 172.19.33.7 -p 7000
172.19.45.201:7000> CLUSTER NODES 314038a48bda3224bad21c3357dbff8305735d72 172.19.42.44:7000@17000 slave 20ab4b30f3d6d25045909c6c33ab70feb635061c 0 1569404162565 7 connected ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2 172.19.33.7:7000@17000 master - 0 1569404162000 1 connected 0-5460 896b2a7195455787b5d8a50966f1034c269c0259 172.19.33.7:7001@17001 slave 19a2c81b7f489bec35eed474ae8e1ad787327db6 0 1569404163567 4 connected 89206df4f41465bce81f44e25e5fdfa8566424b8 172.19.42.44:7001@17001 slave ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2 0 1569404163000 5 connected 20ab4b30f3d6d25045909c6c33ab70feb635061c 172.19.45.201:7001@17001 master - 0 1569404162000 7 connected 5461-10922 19a2c81b7f489bec35eed474ae8e1ad787327db6 172.19.45.201:7000@17000 myself,master - 0 1569404161000 3 connected 10923-16383

It’s not done yet, further we can explore around having a single endpoint to point from the application. I will am currently working on that and soon will come up with the solution.
Apart from this monitoring the Redis Cluster will also be a major aspect to look forward.
Till then get your hands dirty playing around the Redis Cluster setup and failover.

Reference links:
Image: Google image search (blog.advids.co)

One more reason to use Docker

Recently I was working on a project which includes Terraform and AWS stuff. While working on that I was using my local machine for terraform code testing and luckily everything was going fine. But when we actually want to test it for the production environment we got some issues there. Then, as usual, we started to dig into the issue and finally, we got the issue which was quite a silly one 😜. The production server Terraform version and my local development server Terraform version was not the same. 

After wasting quite a time on this issue, I decided to come up with a solution so this will never happen again.

But before jumping to the solution, let’s think is this problem was only related to Terraform or do we have faced the similar kind of issue in other scenarios as well.

Well, I guess we face a similar kind of issue in other scenarios as well. Let’s talk about some of the scenario’s first.

Suppose you have to create a CI pipeline for a project and that too with code re-usability. Now pipeline is ready and it is working fine in your project and then after some time, you have to implement the same kind of pipeline for the different project. Now you can use the same code but you don’t know the exact version of tools which you were using with CI pipeline. This will lead you to error elevation. 

Let’s take another example, suppose you are developing something in any of the programming languages. Surely that utility or program will have some dependencies as well. While installing those dependencies on the local system, it can corrupt your complete system or package manager for dependency management. A decent example is Pip which is a dependency manager of Python😉.

These are some example scenarios which we have faced actually and based on that we got the motivation for writing this blog.

The Solution

To resolve all this problem we just need one thing i.e. containers. I can also say docker as well but container and docker are two different things.

But yes for container management we use docker.

So let’s go back to our first problem the terraform one. If we have to solve that problem there are multiple ways to solve this. But we tried it to solve this using Docker.

As Docker says

Build Once and Run Anywhere

So based on this statement what we did, we created a Dockerfile for required Terraform version and stored it alongside with the code. Basically our Dockerfile looks like this:-

FROM alpine:3.8

MAINTAINER OpsTree.com

ENV TERRAFORM_VERSION=0.11.10

ARG BASE_URL=https://releases.hashicorp.com/terraform

RUN apk add --no-cache curl unzip bash \
    && curl -fsSL -o /tmp/terraform.zip ${BASE_URL}/${TERRAFORM_VERSION}/terraform_${TERRAFORM_VERSION}_linux_amd64.zip \
    && unzip /tmp/terraform.zip -d /usr/bin/

WORKDIR /opstree/terraform

USER opstree

In this Dockerfile, we are defining the version of Terraform which needs to run the code.
In a similar fashion, all other above listed problem can be solved using Docker. We just have to create a Dockerfile with exact dependencies which are needed and that same file can work in various environments and projects.

To take it to the next level you can also dump a Makefile as well to make everyone life easier. For example:-

IMAGE_TAG=latest
build-image:
    docker build -t opstree/terraform:${IMAGE_TAG} -f Dockerfile .

run-container:
    docker run -itd --name terraform -v ~/.ssh:/root/.ssh/ -v ~/.aws:/root/.aws -v ${PWD}:/opstree/terraform opstree/terraform:${IMAGE_TAG}

plan-infra:
    docker exec -t terraform bash -c "terraform plan"

create-infra:
    docker exec -t terraform bash -c "terraform apply -auto-approve"

destroy-infra:
    docker exec -t terraform bash -c "terraform destroy -auto-approve"

And trust me after making this utility available the reactions of the people who will be using this utility will be something like this:-

Now I am assuming you guys will also try to simulate the Docker in multiple scenarios as much as possible.

There are a few more scenarios which yet to be explored to enhance the use of Docker if you find that before I do, please let me know.

Thanks for reading, I’d really appreciate any and all feedback, please leave your comment below if you guys have any feedback.

Cheers till the next time.

Unix File Tree Part-2

For those who have surfed straight to this blog, please check out the previous part of this series Unix File Tree Part-1 and those who have stayed tuned for this part, welcome back.In the previous part, we discussed the philosophy and the need for file tree. In this part, we will dive deep into the significance of each directory.

Image result for horizontal file tree linux

Dayum!! that’s a lot of stuff to gulp at once, we’ll kick out things one after the other.

Major directories

Let’s talk about the crucial directories which play a major role.

  • /bin: When we started crawling on Linux this helped us to get on our feet yes, you read it right whether you want to copy any file, move it somewhere, create a directory, find out date, size of a file, all sorts of basic operations without which the OS won’t even listen to you (Linux yawning meanwhile) happens because of the executables present in this directory. Most of the programs in /bin are in binary format, having been created by a C compiler, but some are shell scripts in modern systems.
  • /etc: When you want things to behave the way you want, you go to /etc and put all your desired configuration there (Imagine if your girlfriend has an /etc life would have been easier). whether it is about various services or daemons running on your OS it will make sure things are working the way you want them to.
  • /var: He is the guy who has kept an eye over everything since the time you have booted the system (consider him like Heimdall from Thor). It contains files to which the system writes data during the course of its operation. Among the various sub-directories within /var are /var/cache (contains cached data from application programs), /var/games(contains variable data relating to games in /usr), /var/lib (contains dynamic data libraries and files), /var/lock (contains lock files created by programs to indicate that they are using a particular file or device), /var/log (contains log files), /var/run (contains PIDs and other system information that is valid until the system is booted again) and /var/spool (contains mail, news and printer queues).
  • /proc: You can think of /proc just like thoughts in your brain which are illusions and virtual. Being an illusionary file system it does not exist on disk instead, the kernel creates it in memory. It is used to provide information about the system (originally about processes, hence the name). If you navigate to /proc The first thing that you will notice is that there are some familiar-sounding files, and then a whole bunch of numbered directories. The numbered directories represent processes, better known as PIDs, and within them, a command that occupies them. The files contain system information such as memory (meminfo), CPU information (cpuinfo), and available filesystems.
  • /opt: It is like a guest room in your house where the guest stayed for prolong period and became part of your home. This directory is reserved for all the software and add-on packages that are not part of the default installation.
  • /usr: In the original Unix implementations, /usr was where the home directories of the users were placed (that is to say, /usr/someone was then the directory now known as /home/someone). In current Unixes, /usr is where user-land programs and data (as opposed to ‘system land’ programs and data) are. The name hasn’t changed, but its meaning has narrowed and lengthened from “everything user related” to “user usable programs and data”. As such, some people may now refer to this directory as meaning ‘User System Resources’ and not ‘user’ as was originally intended.

Potato or Potaaato what is the difference? 

We’ll be discussing those directories which confuse us always, which have almost a similar purpose but still are in separate locations and when asked about them we go like ummmm…….

/bin vs /usr/bin vs /sbin vs /usr/local/bin

This might get almost clear out when I explained the significance of /usr in the above paragraph. Since Unix designers planned /usr to be the local directories of individual users so it contained all of the sub-directories like /usr/bin, /usr/sbin, /usr/local/bin. But the question remains the same how the content is different?

/usr/bin:

  • /usr/bin is a standard directory on Unix-like operating systems that contains most of the executable files that are not needed for booting or repairing the system. 
  • A few of the most commonly used are awk, clear, diff, du, env, file, find, free, gzip, less, locate, man, sudo, tail, telnet, time, top, vim, wc, which, and zip.

/usr/sbin:

  • The /usr/sbin directory contains non-vital system utilities that are used after booting.
  • This is in contrast to the /sbin directory, whose contents include vital system utilities that are necessary before the /usr directory has been mounted (i.e., attached logically to the main filesystem). 
  • A few of the more familiar programs in /usr/sbin are adduser, chroot, groupadd, and userdel. 
  • It also contains some daemons, which are programs that run silently in the background, rather than under the direct control of a user, waiting until they are activated by a particular event or condition such as crond and sshd.

I hope I have covered most of the directories which you might come across frequently and your questions must have been answered.
Now that we know about the significance of each UNIX directory, It’s time to use them wisely the way they are supposed to be.
Please feel free to reach me out for any suggestions.
Goodbye till next time!

References: https://www.tldp.org/LDP/Linux-Filesystem-Hierarchy/html/usr.htmlhttps://askubuntu.com/questions/130186/what-is-the-rationale-for-the-usr-directoryhttps://askubuntu.com/questions/308045/differences-between-bin-sbin-usr-bin-usr-sbin-usr-local-bin-usr-localhttp://index-of.es/Varios-2/How%20Linux%20Works%20What%20Every%20Superuser%20Should%20Know.pdf
https://imgflip.com/memegenerator

Speeding up Ansible Execution Part 2

MITOGEN


In the previous post, we discussed various ways to reduce the ansible-playbook execution time, those changes were mostly made in the ansible config file, by adding or adjusting certain parameters in the file. But as you may have noticed that those methods were not that effective in certain cases, while using those methods we have to be very cautious about the result as they may affect ansible performance in one way or the other.


Generally, for the slower ansible execution, the main culprit is the way ansible is executed on the hosts. It creates multiple SSH connections and does not fully utilize the available resources. To tackle this problem, MITOGEN came to rescue !!!

Mitogen is a distributed programming library for Python. The Mitogen extension is a set of plug-ins for Ansible that enable it to operate via Mitogen, vastly improving its performance and enhancing its functional capability.

We all know about the strategies in ansible – linear, free & debug., the mitogen is just defined in the strategy column of the config file, so it is just a strategy, we are not making any other changes in the config file of the ansible so it is not affecting any other parameter, it is just the way, playbooks will be executed on the hosts.

Now coming to the mitogen installation part, we just have to download this package at a particular location and make some changes in the ansible config file as shown below,

[defaults]

strategy_plugins = /path/to/mitogen/ansible_mitogen/plugins/strategystrategy = mitogen_linear

we have to define the path where we have stored our mitogen files, and mention the strategy as “mitogen_linear”, under the default section of the config file, and we are good to go.

Now, after the Mitogen installation part, when we run our playbook, we will notice a reasonable reduction in the execution time,

Mitogen is fast because of the following reasons,

  • One connection is created per target and system logs aren’t spammed with repeated authentication events.
  • A single network roundtrip is used to execute a step whose code already exists in RAM on the target.
  • Processes are aggressively reused, avoiding the cost of invoking Python and recompiling imports, saving 300-800 ms for every playbook step.
  • Code is cached in the RAM, which further increases the speed.
  • Generally, ansible repeatedly rewrites and extracts ZIP files to temporary directories in the target hosts, mitogen also reduces these rewrites.

      All the above-mentioned features make the ansible to run faster.

      Mitogen is another extension for ansible that provides a decrease in its execution time and it is very easy to use, I think MITOGEN is very underrated and one of its kind, and we should definitely give it a try.

      I hope I have explained everything well, any suggestion/queries are highly appreciated.


Thanks !!!

Source:

https://mitogen.networkgenomics.com/ansible_detailed.html

Speeding up Ansible Execution Part 1

The knowledge of one of the SCM tools is a must for any DevOps engineer, ANSIBLE is one of the popular tools in this category, we all are aware of the ease that Ansible provides whether it is infra provisioning, orchestration or application deployment.
The reason for the vast popularity of Ansible is the long list of modules it provides to support any level of automation, moreover it also gives users the flexibility to create their own modules as per their requirement.
But The purpose of this blog is not to mention the features that ansible provides, but to show how we can speed up our playbook execution in Ansible, as a beginner executing ansible, is very easy and it also feels like saving a lot of time with it, but as you dive deep into it, you will come to know that running ansible playbooks will engage you for a considerable amount of time.
There are a lot of articles available on the internet on how we can speed up our ansible execution, so I have decided to sum up those articles into my blog, with the following methods, we can reduce our execution time without compromising with the overall performance of Ansible.
Before starting, I request  you guys to make a small change in your ansible configuration file (ansible.cfg), this small change will help you in tracking the time it will take for the playbook execution, and it also lists out the time is taken by each task.
Just add these lines to your ansible.cfg file under default section,

[default]

callback_whitelist = profile_tasks

Forks

When you are running your playbooks on various hosts, then you may have noticed that the number of servers where the playbook executes simultaneously is 5. You can increase this number inside the ansible.cfg file:
# ansible.cfg

forks = 10


or with a command line argument to ansible-playbook with the -f or –forks options. We can increase or decrease this value as per our requirement.
while using forks we should use “local_action” or “delegated” steps limited in number, as with higher fork value it will affect the ansible-server’s performance.

Async

In ansible, each task blocks the playbook, meaning the connections stay open until the task is done on each node, which is some cases takes a lot of time, here we can use “async” for those particular tasks, with the help of this ansible will automatically move to another task without waiting for the task execution on each node.
To launch a task asynchronously, we need to specify its maximum runtime and how frequently we would like to poll for status, it’s default value in 10 sec.
tasks:

– name: “name of the task”  

command: “command we want to execute”     

async: 40    

poll: 15
The only condition is that the subsequent tasks must not have a dependency on this task.

Free Strategy 

When running Ansible playbooks, you might have noticed that the Ansible runs every task on each node one by one, it will not move to another task until a particular task is completed on each node, which will take a lot of time, in some cases.
By default, the strategy is set to “linear”, we can set it to free.

– hosts: “hosts/groups”  

name: “name of the playbook”  

strategy: free


It will run the playbook on each host independently, without waiting for each node to complete.
Facts gathering is the default feature while executing playbook, sometimes we don’t need it.
In those cases, we can disable facts gathering, This has advantages in scaling Ansible in push mode with very large numbers of systems.

– hosts: “hosts/groups”  

name: “name of the playbook”  

gather_facts: no

Pipelining 

For each task in Ansible, there are lots of ssh connection created, which results in increasing the total execution time. Pipelining reduces the number of ssh operations required to execute a module by executing many Ansible modules without an actual file transfer. We just have to make these changes in the ansible.cfg file,
# ansible.cfg Pipelining = True
Although this can result in a very significant performance improvement when enabled, Pipelining is disabled by default because requiretty is enabled by default for many distros.

Poll Interval

When we run any the task in Ansible, it starts polling to check if the task is completed on the host or not, we can decrease this polling interval time in ansible.cfg to increase its performance, but it will increase the CPU usage, so we need to adjust its value accordingly We just have to adjust this the parameter in the ansible.cfg file,
internal_poll_interval=0.001

so, these are the various ways to decrease our playbook execution time in Ansible, generally we don’t use all these methods in a single setup, we use these features as per the requirement, 
The main motive of writing this blog is to determine the factors which will help in fine-tuning the Ansible performance, and there are many more factors which serves the same purpose but here I am mentioning the most important parameters among them.
I hope I have covered all the important aspects of the blog, feel free to provide your valuable feedback.
Thanks !!!

Source:

https://mitogen.networkgenomics.com/ansible_detailed.html