Perfect Spot Instance’s Imperfections | part-II

Hello friends, if you are reading this blog, I assume that you have gone through the first part: https://blog.opstree.com/2019/11/05/perfect-spot-instances-imperfections-part-i/. If you haven’t, I suggest you go through that link before reading this one.

Now let’s recall the concept from the first part of this blog, which we are going to implement here.

We will create all the components related to this project (as shown in the figure above). We will also see how to create a spot fleet request wisely, by choosing parameters that fit our purpose and are less prone to interruption. I assume that you have already created a VPC, subnets (at least two), an Internet Gateway attached to the VPC, a Target Group, an AMI (with nginx running on port 80), a Launch Configuration using that AMI, an ASG based on that launch configuration (with tags for the on-demand instance and the target group attached), a Load Balancer (listening on HTTP and forwarding to the target group), and optionally a Route 53 record mapped to the Load Balancer DNS name.

I have made my servers public; you can make them private if you want. After this we are going to create an IAM role, a Spot Fleet request, Lambda functions, and then CloudWatch rules.

 

Let’s create IAM Role

We are going to create two IAM roles: one for Lambda and the other for the Spot Fleet request.

  1. Go to the IAM console and select Roles from the left navigation pane.
  2. Click on Create role and follow the screenshot below.

Here we are creating the role for the Lambda function.

  3. Click on Next: Permissions.
  4. Check AdministratorAccess to allow Lambda to access all AWS services.
  5. Click on Next: Tags.
  6. Add tags if you want, then click on Next: Review.
  7. Type your role name and then click on Create role. The role is created now.
  8. Come back to the IAM console and select Roles from the navigation pane.
  9. Click on Create role and then follow the screenshot below.

This role is for the Spot Fleet request.

  1. Follow steps 3 to 7 above.

So, by this point we have two IAM roles, let's say Lambda_role and Fleet_role.

Now comes the interesting part: placing the spot fleet request and understanding every related factor, so that we can customize our spot fleet requests wisely according to our needs.

Create Spot Fleet Request 


Move to the EC2 dashboard and click on Spot Requests. Then click on Request Spot Instances.

  1. You will see something like below:

Load Balancing Workloads: For instances of same size in any Availability zone.
Flexible Workloads: For instances of any size in any Availability zone.
Big Data Workloads: For instances of same size in single Availability zone.

Now decide among these three options based on your requirements; if you are just learning, leave the choice at its default.

The last option, ‘Defined duration workloads’, is a bit different. It gives you a way to reserve spot instances for a fixed duration of 1 to 6 hours. AWS ensures that you will not be interrupted during that defined duration, and because you won’t be interrupted you pay a slightly higher price than for regular spot instances.

So, for this option AWS has another pricing category called Spot Block. Under this model, pricing is based on the requested duration and the available capacity, and is typically 30% to 45% less than On-Demand, with an additional 5% off during non-peak hours for the region. Observe the differences below.

Let us start with a brief comparison of categories into which we can launch our instances.

Instance Type | Spot Instance Price | Spot Block Price for 1 hour | Spot Block Price for 6 hours
a1.medium     | $0.0084 per Hour    | $0.012 per Hour             | $0.016 per Hour
a1.large      | $0.0168 per Hour    | $0.025 per Hour             | $0.033 per Hour
c5.large      | $0.0392 per Hour    | $0.047 per Hour             | $0.059 per Hour
c1.medium     | $0.013 per Hour     | $0.065 per Hour             | $0.085 per Hour
t2.micro      | $0.0035 per Hour    | $0.005 per Hour             | $0.007 per Hour
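
If you want to place such a defined-duration (Spot Block) request programmatically, a minimal boto3 sketch is below. The AMI id, subnet and instance type are placeholders; adjust them to your own setup.

import boto3

ec2 = boto3.client('ec2')

# Request a single spot instance blocked for 6 hours (a "Spot Block").
# ami-xxxxxxxx and subnet-xxxxxxxx are placeholders -- replace with your own.
response = ec2.request_spot_instances(
    InstanceCount=1,
    Type='one-time',                 # Spot Blocks are one-time requests
    BlockDurationMinutes=360,        # 1 to 6 hours, in multiples of 60
    LaunchSpecification={
        'ImageId': 'ami-xxxxxxxx',
        'InstanceType': 't2.micro',
        'SubnetId': 'subnet-xxxxxxxx',
    },
)
print(response['SpotInstanceRequests'][0]['SpotInstanceRequestId'])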

2. Next we will configure our spot instances.

Launch Template: If you have a launch template you can select it here. One advantage of using a launch template is that you get the option to run part of your total capacity as on-demand instances. If you don’t have a launch template you can go with an AMI and specify all the other parameters.
AMI: Don’t have a launch template? Choose an AMI, but with this option you won’t have the feature of running part of your total capacity as on-demand instances.
Minimum Compute Unit: Specify how much capacity you need, either in terms of vCPU and memory or as instance types. As the name suggests, this is the minimum capacity we need for our purpose; AWS will choose similar instances based on this option.
Then you will have options to choose the VPC, Availability Zones and key pair. Under additional configuration you can choose security groups, an IAM instance profile, user data, tags and many more settings.

3. Next section is for defining target capacity.

Total target capacity: Specify how much capacity is needed. If you have chosen a launch template, you can also specify how much of the total capacity you want as on-demand.
Maintain target capacity: If you select this, AWS will always maintain your target capacity. Suppose AWS takes one of your instances back; under this option it will automatically place a request for one spot instance in order to maintain the target capacity. Selecting this option also allows you to modify the target capacity even after the spot fleet request has been created.
Maintain target cost for Spot: This optional setting lets you cap the maximum hourly amount you want to pay for your spot instances.

4. Next section is about customizing your spot fleet request.

If you don’t care much about how AWS is going to fulfill your request, leave everything at default and proceed to the next step; otherwise uncheck the ‘Apply recommendations’ field, which will appear like below:

As I have chosen c3.large as the minimum compute unit, AWS picks similar instances such as c3.large, or even instances with more memory than specified (but never less) at roughly similar prices. You can also add more instance types by clicking Select instance types; this strengthens your probability of getting spot instances. Now let’s see how!

Suppose you requested only the t2.small instance type. You will get it only when that instance type is available in the Availability Zone you specified. So, to increase your chances, specify as many Availability Zones and as many instance types as your workload allows. This increases the number of pools and ultimately the chances of success.

     Picture showing the instance pools an Availability Zone can contain.

Instance Pools: Think of a pool as a bag full of instances of the same type within an Availability Zone.
Every Availability Zone contains instance pools. Suppose you choose two Availability Zones and a total of three instance types; you will then have 6 instance pools, three from each Availability Zone. The more instance pools, the better the chance of getting spot instances.
Fleet allocation strategy: AWS allows you to choose the allocation strategy, i.e. the strategy through which your capacity is going to be fulfilled. One thing to keep in mind is that AWS will try to distribute your capacity evenly across your specified Availability Zones.
Lowest price: Instances are made available to you from the pools with the lowest price. Suppose you choose 2 Availability Zones and your capacity is 6; then 3 instances from each Availability Zone come from the cheapest pools.
Diversified across n number of pools: Suppose your capacity is 20, 2 Availability Zones are selected, and you have chosen diversified across 2 instance pools. Then 10 instances come from each Availability Zone, with the further restriction that 2 pools are used to fulfill the capacity of 10 in each Availability Zone. This keeps at least one pool available for you even if the other pool becomes unavailable, and hence reduces the risk of interruption.
Capacity Optimized: This option provides instances from the pools with the most available capacity. Suppose your capacity is 20 and 2 Availability Zones are selected; 10 instances from each Availability Zone come from the pools that are most likely to remain available.

5. Next section is for choosing the price and other additional configurations.

First of all, Uncheck the Apply defaults. Then, customize your additional settings under the heads:

IAM fleet role: This role lets the Spot Fleet service launch, tag and terminate instances on your behalf. Choose the default one or use the Fleet_role we created earlier.
Maximum price: This option allows you to keep the default maximum price, which is the on-demand price, or to set your own maximum price you are willing to pay for an instance per hour.
Next you can specify your spot fleet request validity period. Check Terminate instances on request expire.
Load balancing: If you check Receive traffic from one or more load balancers, you can choose the classic load balancer or the target group under which you want to launch your instances.

This is how you can customize a spot fleet request based on your requirements. Lastly, I advise you to visit the link below while deciding on suitable instance types; it will help you estimate interruption rates and your savings.

https://aws.amazon.com/ec2/spot/instance-advisor/
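
If you prefer to place the whole request from code instead of the console, a minimal, illustrative boto3 sketch is below. The role ARN, AMI id, subnets and instance types are placeholders; map them to whatever you chose in the walkthrough above.

import boto3

ec2 = boto3.client('ec2')

# Minimal spot fleet request: capacity of 2, diversified across pools,
# with interrupted instances automatically replaced ('maintain').
response = ec2.request_spot_fleet(
    SpotFleetRequestConfig={
        'IamFleetRole': 'arn:aws:iam::123456789012:role/Fleet_role',  # placeholder ARN
        'AllocationStrategy': 'diversified',
        'TargetCapacity': 2,
        'Type': 'maintain',
        'TerminateInstancesWithExpiration': True,
        'LaunchSpecifications': [
            {
                'ImageId': 'ami-xxxxxxxx',        # your nginx AMI
                'InstanceType': 'c3.large',
                'SubnetId': 'subnet-aaaaaaaa',    # first Availability Zone
            },
            {
                'ImageId': 'ami-xxxxxxxx',
                'InstanceType': 'c4.large',
                'SubnetId': 'subnet-bbbbbbbb',    # second Availability Zone
            },
        ],
    }
)
print(response['SpotFleetRequestId'])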

 

Let’s create Lambda function for our purpose

We are going to create two Lambda functions for our purpose. Follow the steps below:

  1. Keep an IAM role ready that allows the Lambda function to modify the ASG. I prefer an IAM role with admin permission because we are going to need it many times throughout this project.
  2. Head to the AWS Lambda console. On the left navigation pane click on Functions.
  3. Click on the Create function tab.
  4. Leave the default selection Author from scratch.
  5. Enter a function name of your choice.
  6. Select the Runtime as Python 3.7.
  7. Expand Choose or create an execution role and select the role you have created for this project.
  8. Click on Create function.
  9. Scroll down and put the code below into lambda_function.py.

    Copy code from below

import json
import boto3

def lambda_handler(event, context):
    # Triggered by the CloudWatch rule on a spot interruption notice.
    # It raises the desired capacity of the ASG named 'spot' to 1 so that
    # an on-demand instance is launched while the spot instance goes away.
    client = boto3.client('autoscaling')
    response = client.set_desired_capacity(
        AutoScalingGroupName='spot',   # replace with your ASG name
        DesiredCapacity=1
    )

This code increases the desired capacity of ASG with name ‘spot’ to 1.

It is triggered when AWS issues an interruption notice. As a result, an on-demand instance is launched to maintain a total capacity of two.
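
If you would rather not hardcode the ASG name in the function body, a small variant (using a hypothetical ASG_NAME environment variable set on the Lambda function) could look like this:

import os
import boto3

def lambda_handler(event, context):
    # ASG_NAME is a hypothetical environment variable you would configure on
    # the Lambda function; it falls back to 'spot' if not set.
    asg_name = os.environ.get('ASG_NAME', 'spot')
    boto3.client('autoscaling').set_desired_capacity(
        AutoScalingGroupName=asg_name,
        DesiredCapacity=1
    )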


10. In the upper right corner, click Save to save the function. Then come back to the AWS Lambda console, select Functions once more, and click Create function.

11. Give your function a name and the same execution role as before, and select Create function. This second function will keep the spot capacity at two.

12. Refer to the following snapshot.

Copy code from below

import boto3         ## Python sdk
import json

## If the number of spot instances is >= 2 and an on-demand instance is running,
## set the ASG desired capacity to 0 to terminate the running on-demand instance.
def asg_cap(fleet, od):
    print('in function',fleet)
    print('in function',od)
    if fleet >= 2 and od > 0:
        client = boto3.client('autoscaling')
        response = client.set_desired_capacity(
            AutoScalingGroupName='spot',
            DesiredCapacity=0
        )

##Beginning of the execution
def lambda_handler(event, context):
    cnt = 0
    ec3 = boto3.resource('ec2')
    fleet = 0
    od = 0
    instancename = []
    fleet_ltime = []
    od_ltime = []
    for instance in ec3.instances.all():    ##looping all instances
        print (instance.id)
        print (instance.state)
        print (instance.tags)
        print (instance.launch_time)
        abc = instance.tags                ##get tags of all instances
        ab = instance.state                ##get state of all instances
        print (ab['Name'])
        if ab['Name'] == 'running':        ## checks for the instances whose state is running 
            cnt += 1
            for tags in abc:
                if tags["Key"] == 'Name':  ## checks for tag key is 'Name'
                    instancename.append(tags["Value"])
                    inst = tags["Value"]
                    print (inst)
                    if inst == 'fleet':    ## checks if tag key 'Name' has value 'fleet'. Change 'fleet' to your own tag name       
                        fleet += 1
                        fleet_ltime.append(instance.launch_time)
                    if inst == 'Test':     ## checks if tag key 'Name' has value 'Test'. Change 'Test' to your own tag name
                        od += 1
                        od_ltime.append(instance.launch_time)
                    
    print('Total number of running instances: ', cnt)
    print(instancename)
    print('Number of spot instances: ', fleet)
    print('Number of on-demand instances: ', od)
    print('Launch time of Fleet: ', fleet_ltime)
    print('Launch time of on-demand: ', od_ltime)
    
    if od > 0:
        dt_od = od_ltime[0]
    else:
        dt_od = '0'
        
    if fleet > 1:
        dt_spot = fleet_ltime[0]
        dt_spot1 = fleet_ltime[1]
    elif (fleet > 0) and (fleet < 2):
        dt_spot = fleet_ltime[0]
        dt_spot1 = '0'
    else:
        dt_spot = '0'
        dt_spot1 = '0'
        
        
    if dt_od != '0':
        if dt_spot != '0':
            if dt_od > dt_spot:
                if dt_spot1 != '0':
                    if dt_od > dt_spot1:
                        print('On-Demand instance is Launched')
                        # do nothing
                    else:
                        print('Spot instance is Launched')
                        asg_cap(fleet, od)
                else:
                    print('Only 1 spot instance exist')
            else:
                print('1Spot instance is Launched')
                asg_cap(fleet, od)
        else:
            print('No spot instance exist')
    else:
        print('No On-Demand instance exist')
        
        
    ## modify the spot fleet request capacity to two    
    client1 = boto3.client('ec2')
    response = client1.modify_spot_fleet_request(
        SpotFleetRequestId='sfr-92b7b2f1-163b-498a-ae7c-7bd1b4fdb227', ## replace with your spot fleet request id
        TargetCapacity=2
    )

13. Save this function.
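
The spot fleet request id is hardcoded in the function above. If you would rather look it up programmatically (for example, picking the ids of the active requests), a rough boto3 sketch:

import boto3

ec2 = boto3.client('ec2')

# List all spot fleet requests and keep the ids of the active ones.
configs = ec2.describe_spot_fleet_requests()['SpotFleetRequestConfigs']
active_ids = [c['SpotFleetRequestId'] for c in configs
              if c['SpotFleetRequestState'] == 'active']
print(active_ids)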

So far we have created two Lambda functions. Now we are going to create CloudWatch rules that will call these Lambda functions on our behalf: one on the spot interruption notice and the other when an EC2 instance's state changes to running.

Let’s create Cloudwatch Rules.

Steps to create Cloudwatch Rules.

  1. Go to Cloudwatch console.
  2. Select Rules from left side navigation pane.
  3. Click on Create Rule.
  4. Follow the screenshot below.

We are creating rules here to trigger on interruption notice by AWS.

In the Targets area, select Lambda function and then pick the first function, the one that increases the desired capacity of the ASG to 1.
Save the first Rule.

  1. Now let’s create another CloudWatch rule.
  2. Click Create Rule and follow the screenshot.

After this, add a target of type Lambda function and set the function name to the second one we created, the longer one.

Save the second Rule.
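
For reference, the same two rules can also be created with boto3 (the CloudWatch Events API). The rule names and function ARNs below are placeholders; the event patterns are the standard ones for the spot interruption warning and the instance state-change notification. Note that each Lambda function also needs a resource-based permission allowing events.amazonaws.com to invoke it; the console adds this automatically, while with boto3 you would call the Lambda add_permission API yourself.

import json
import boto3

events = boto3.client('events')

# Rule 1: fire on the 2-minute spot interruption warning.
events.put_rule(
    Name='spot-interruption-rule',
    EventPattern=json.dumps({
        "source": ["aws.ec2"],
        "detail-type": ["EC2 Spot Instance Interruption Warning"]
    }),
    State='ENABLED',
)
events.put_targets(
    Rule='spot-interruption-rule',
    Targets=[{'Id': '1',
              'Arn': 'arn:aws:lambda:REGION:ACCOUNT_ID:function:first_function'}],
)

# Rule 2: fire when any EC2 instance enters the 'running' state.
events.put_rule(
    Name='instance-running-rule',
    EventPattern=json.dumps({
        "source": ["aws.ec2"],
        "detail-type": ["EC2 Instance State-change Notification"],
        "detail": {"state": ["running"]}
    }),
    State='ENABLED',
)
events.put_targets(
    Rule='instance-running-rule',
    Targets=[{'Id': '1',
              'Arn': 'arn:aws:lambda:REGION:ACCOUNT_ID:function:second_function'}],
)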

Now, to verify this automation, go to the spot fleet request you created with target capacity two. Select that request, click the Actions tab, click Modify capacity and replace 2 with 1. This will terminate one spot instance, and before that AWS will send the interruption notice. Observe the changes on the Auto Scaling group, instances and spot request dashboards. Wait a couple of minutes; if everything is right and according to our plan, you will again have two spot instances under your bag.

If you do not have two spot instances at the end, then something is not right. Cross-check the following:

  1. Check the name of the Auto Scaling group you created. Copy its name, go to the first Lambda function you created (the one that increases the desired capacity of the ASG to 1) and check whether the function contains the correct Auto Scaling group name. If not, paste the name you copied against the AutoScalingGroupName parameter.
  2. Check the tag Name value of the spot instances and note it down. Also check the tag Name of the on-demand instance which you configured while creating the Auto Scaling group and note that too. Now go to the second Lambda function and look at line number 40, where the tag Name value of the spot instance is given in single quotes; check that it matches yours. Then at line number 43 check that the tag Name of the on-demand instance matches yours.
  3. Go to Spot Requests, copy the spot fleet request id you created, go to line number 94 of the second Lambda function and make sure the id in single quotes matches yours.

Now test again; hopefully it will work. If you are still facing problems, or you have not created all the above resources manually, there is no need to worry.

I have created Terraform code which will create the whole infra needed for this project. However, after running this Terraform code successfully you will still have to make a few changes so that your infra functions properly. Below is the link to the GitHub repository; clone the repo and then follow the steps stated below:

Link: https://github.com/sah-shatrujeet/infra_spot_fleet_terraform.git

  1. Make sure you have Terraform version 0.12.8 installed.
  2. You must have the AWS CLI configured too.
  3. Before running the Terraform code, go to the folder where you cloned the repo, then go to infra-spot/infra/infra and open vars.tf in your favourite editor.
  4. Go through the files and change the default values as per your choice.
  5. Run terraform apply from the folder infra-spot/infra/infra. Follow the last three steps to make sure your infra will automate properly.
  6. Go to the Lambda console, select first_function and check whether the code contains the same Auto Scaling group name as the one created by Terraform. If not, update it to match yours.
  7. In the Lambda console select second_function and repeat the previous step; also check the tag Name of the on-demand and spot instances, which are ‘Test’ and ‘fleet’ respectively in the Terraform code. Make sure the tag Name of your on-demand and spot instances matches the tags mentioned in the code.
  8. Lastly, head to Spot Requests, note the request id and make sure it matches the SpotFleetRequestId in second_function.

I believe Terraform will do everything right for you. If you are still facing problems, I will be happy to resolve your queries.

 

Good to know

AWS reports show that the average frequency of interruption across all regions and instance types is less than 5%.

For any instance type, the on-demand price is the maximum spot price you will ever pay, and you can bid at most 10 times the on-demand price.

When you implement spot instance automation for your own project you will come across different scenarios; you might need to monitor more events and trigger actions based on them. Unfortunately, CloudWatch does not cover every AWS event, but AWS CloudTrail solves this: CloudTrail records everything you perform on AWS. To use it from CloudWatch, enable CloudTrail and then create a rule with service EC2 and event type AWS API Call via CloudTrail, adding any specific operation that is not available as a native CloudWatch event. However, I recommend you first go through the details of CloudTrail before implementing it. If you have any queries about implementing CloudTrail, ask in the comment section; I will be happy to help.

While implementing spot instances for a database, you can configure your spot instances so that their volumes are not deleted upon spot instance termination, assuring that you are not going to lose any data.
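
One way to do that, sketched below, is to set DeleteOnTermination to false in the block device mapping of the launch specification; the device name and volume size here are just illustrative values.

launch_specification = {
    'ImageId': 'ami-xxxxxxxx',
    'InstanceType': 'c3.large',
    'BlockDeviceMappings': [
        {
            'DeviceName': '/dev/xvda',
            'Ebs': {
                'VolumeSize': 100,
                # Keep the EBS volume around after the spot instance is reclaimed
                'DeleteOnTermination': False,
            },
        },
    ],
}

You would pass this dict as the LaunchSpecification (or one entry of LaunchSpecifications for a fleet) in the request calls shown earlier.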

Conclusion

With smart automation and monitoring we can run our production servers on spot instances with guaranteed failover and high availability. However, anyone who doesn't want to run any risk, or who doesn't have the right idea or resources to automate the interruption, can plan to:

  1. Run half or a portion of the total production servers on spot instances.
  2. Run development servers on spot instances without any worry.
  3. Run QA servers on spot instances.
  4. Use spot instances for extra capacity to ease the load on the main servers.
  5. Prefer spot instances for irregular, short-term tasks.

 

 

Perfect Spot Instance’s Imperfections | part-I

In this blog I am going to share my opinion on spot instances and why we should go for them. While I was going through the categories (on-demand, reserved, and spot) that AWS provides to launch our instances into, I found spot instances very fascinating and a little challenging.

What I found about spot instances is that they are normal EC2 instances. But what makes them different from the other two (on-demand, reserved)? What strategy does AWS use to make spot instances cheaper than the other two, and why? Let's understand this first.

With AWS continuously expanding their regions and the Availability Zones within them, they are left with a huge amount of unused capacity. How does AWS take advantage of this unused capacity? AWS floats its spare capacity on the market at a very low base price and allows us to bid on instances; the highest bidder gets the instance, but the price that person pays is only the market price. For example, if the market price is $1 for a t2.micro instance and you place a bid of $2, you will get the instance but pay only the market price, i.e. $1. Interesting? Let's make it more fascinating by comparing the prices of all three.

Discount | Type                | Details
0%       | On-demand Instances | No commitment from your side. You pay the most. Costs a fixed price per hour.
40%-60%  | Reserved Instances  | 1 year or 3 year commitment from your side. You save money from that commitment. Costs per plan.
60%-90%  | Spot Instances      | No commitment from AWS side. Ridiculously inexpensive. Costs based on availability.

With this information you must be thinking of trying out spot instances at least once. Since every interesting thing comes with a price, spot instances too have a downside: AWS can take spot instances back from you at any time. Upset? Don't be, because this blog is built precisely to overcome that downside. After all, you won't mind spending 5-10 minutes to save real dollars.

Let’s start then…

Now you know that AWS is ready to hand over its huge spare capacity at prices of our choice, but with a promise to take the capacity back whenever it wants, giving us a two-minute warning before interruption. We can manage the interruption wisely, and the proof is that some organizations are already taking full advantage of spot instances.

Before we go to the core concept let’s build some required concept that will help us to understand core concept with ease.

Spot Instance v/s Spot Fleet

With a normal spot instance request, you place a bid for a specific instance type in any one (or a specific) Availability Zone and hope you get it.

With spot fleets, you can request a number of different instance types that meet your requirements. Additionally, you can spread your spot fleet bid across multiple Availability Zones to increase the likelihood of getting your capacity fulfilled.

Interruption Notice

When AWS takes our spot instance back, it provides an interruption notice 2 minutes beforehand so that we can perform some actions.
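
The notice is also exposed on the instance itself through the instance metadata endpoint. A minimal polling sketch (running on the spot instance, assuming plain IMDSv1-style access to the standard 169.254.169.254 metadata address):

import time
import urllib.error
import urllib.request

# This path returns 404 until an interruption is scheduled, then returns a
# small JSON document with the action ("stop"/"terminate") and the time.
URL = "http://169.254.169.254/latest/meta-data/spot/instance-action"

while True:
    try:
        with urllib.request.urlopen(URL, timeout=2) as resp:
            print("Interruption scheduled:", resp.read().decode())
            break          # start draining connections, saving state, etc.
    except urllib.error.HTTPError:
        pass               # 404: no interruption scheduled yet
    time.sleep(5)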

Next, it is also necessary to know about CloudWatch rules. AWS provides event types on the basis of which you can perform actions such as triggering a Lambda function, sending a notification over mail or SMS, etc.

One of the event types that you can monitor is an EC2 instance state change to running.

Now, with this much knowledge we are good to go, and I will show you how you can automate the interruption to avoid the risk of downtime.

This is the main diagram showing all the components used to automate the interruption. Now suppose one of the spot instances is being interrupted and AWS is going to take that spot instance back. Let's see what happens then.

When AWS is about to take one spot instance back, it gives the interruption notice. A CloudWatch rule is created to monitor for that interruption notice, and then:

the Lambda function is triggered. And then:

the Lambda function increases the desired capacity of the Auto Scaling group to 1, due to which an on-demand instance gets launched into the Target Group while the interrupted spot instance gets terminated.

Now, when the on-demand instance is launched and its state changes to running,

another CloudWatch rule monitors that change, and

CloudWatch triggers another Lambda function, and then:

that Lambda function modifies the spot fleet request capacity to 2 (previously 1); this launches a spot instance into the same Target Group, and now

one more spot instance is launched. When its state changes to running,

the CloudWatch rule again comes into action upon the state change to running of the just-launched spot instance. Then

it again triggers the associated Lambda function, and

the Lambda function sets the desired capacity of the ASG back to 0, due to which the on-demand instance under the target group gets terminated. And finally we are left with the following:

we are back at the same place, i.e. 2 spot instances are always maintained under the Target Group.

Note: For the purpose of demonstration I have taken two instances initially; however, you can have any number of instances and customize this according to your needs and constraints. The whole infra is automated with Terraform, which will create and link everything presented above. The link to clone the repo is provided in the second part of this article.

Are you excited to implement this concept? I am equally excited to share the real implementation with you. With the next part coming very soon, I want you to try the implementation by yourself. In the second part I will help you to implement the whole concept. See you soon…

Redis Cluster: Setup, Sharding and Failover Testing

Watching cluster sharding and failover management is as gripping as visualizing a robotic machinery work.

My last blog on Redis Cluster was primarily focussed on its related concepts and requirements. I would highly recommend going through the concepts first for a better understanding.

Here, I will move straight to its setup, along with the behaviour of the cluster when I intentionally turn down one Redis service on one of the nodes.
Let's start from scratch.

Redis Setup

Here, I will follow the approach of a 3-node Redis Cluster with Redis v5.0 on all the three CentOS 7.x nodes.

Setup Epel Repo

wget http://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
rpm -ivh epel-release-latest-7.noarch.rpm

Setup Remi Repo

yum install http://rpms.remirepo.net/enterprise/remi-release-7.rpm
yum --enablerepo=remi install redis

redis-server --version
Redis server v=5.0.5 sha=00000000:0 malloc=jemalloc-5.1.0 bits=64 build=619d60bfb0a92c36

3-Node Cluster Prerequisites

While setting up the Redis cluster on 3 nodes, I will follow the strategy of having 3 masters and 3 slaves, with one master and one slave running on each node, serving Redis on different ports. As shown in the diagram, the Redis services run on port 7000 and port 7001.

  • 7000 port will serve Redis Master
  • 7001 port will serve Redis Slave

Directory Structure

We need to design the directory structure to serve both Redis configurations.

tree /etc/redis
/etc/redis
`-- cluster
    |-- 7000
    |   `-- redis_7000.conf
    `-- 7001
        `-- redis_7001.conf

Redis Configuration

Configuration file for Redis service 1

cat /etc/redis/cluster/7000/redis_7000.conf
port 7000
dir /var/lib/redis/7000/
appendonly yes
protected-mode no
cluster-enabled yes
cluster-node-timeout 5000
cluster-config-file /etc/redis/cluster/7000/nodes_7000.conf
pidfile /var/run/redis_7000.pid

Configuration file for Redis service 2

cat /etc/redis/cluster/7001/redis_7001.conf
port 7001
dir /var/lib/redis/7001
appendonly yes
protected-mode no
cluster-enabled yes
cluster-node-timeout 5000
cluster-config-file /etc/redis/cluster/7001/nodes_7001.conf
pidfile /var/run/redis_7001.pid

Redis Service File

As we are managing multiple services on a single instance, we need separate unit files for easier management of the Redis services.

Service management file for Redis service 1

cat /etc/systemd/system/redis_7000.service
[Unit]
Description=Redis persistent key-value database
After=network.target

[Service]
ExecStart=/usr/bin/redis-server /etc/redis/cluster/7000/redis_7000.conf --supervised systemd
ExecStop=/bin/redis-cli -h 127.0.0.1 -p 7000 shutdown
Type=notify
User=redis
Group=redis
RuntimeDirectory=redis
RuntimeDirectoryMode=0755
LimitNOFILE=65535

[Install]
WantedBy=multi-user.target

Service management file for Redis service 2

cat /etc/systemd/system/redis_7001.service
[Unit]
Description=Redis persistent key-value database
After=network.target

[Service]
ExecStart=/usr/bin/redis-server /etc/redis/cluster/7001/redis_7001.conf --supervised systemd
ExecStop=/bin/redis-cli -h 127.0.0.1 -p 7001 shutdown
Type=notify
User=redis
Group=redis
RuntimeDirectory=redis
RuntimeDirectoryMode=0755
LimitNOFILE=65535

[Install]
WantedBy=multi-user.target

Redis Service Status

Master Service
systemctl status redis_7000.service 
● redis_7000.service - Redis persistent key-value database
   Loaded: loaded (/etc/systemd/system/redis_7000.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2019-09-25 08:14:15 UTC; 30min ago
  Process: 2902 ExecStop=/bin/redis-cli -h 127.0.0.1 -p 7000 shutdown (code=exited, status=0/SUCCESS)
 Main PID: 2917 (redis-server)
   CGroup: /system.slice/redis_7000.service
           └─2917 /usr/bin/redis-server *:7000 [cluster]
systemd[1]: Starting Redis persistent key-value database...
redis-server[2917]: 2917:C 25 Sep 2019 08:14:15.752 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
redis-server[2917]: 2917:C 25 Sep 2019 08:14:15.752 # Redis version=5.0.5, bits=64, commit=00000000, modified=0, pid=2917, just started
redis-server[2917]: 2917:C 25 Sep 2019 08:14:15.752 # Configuration loaded
redis-server[2917]: 2917:C 25 Sep 2019 08:14:15.752 * supervised by systemd, will signal readiness
systemd[1]: Started Redis persistent key-value database.
redis-server[2917]: 2917:M 25 Sep 2019 08:14:15.754 * No cluster configuration found, I'm ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2
redis-server[2917]: 2917:M 25 Sep 2019 08:14:15.756 * Running mode=cluster, port=7000.
redis-server[2917]: 2917:M 25 Sep 2019 08:14:15.756 # Server initialized
redis-server[2917]: 2917:M 25 Sep 2019 08:14:15.756 * Ready to accept connections
Slave Service
systemctl status redis_7001.service
● redis_7001.service - Redis persistent key-value database
   Loaded: loaded (/etc/systemd/system/redis_7001.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2019-09-25 08:14:15 UTC; 30min ago
  Process: 2902 ExecStop=/bin/redis-cli -h 127.0.0.1 -p 7001 shutdown (code=exited, status=0/SUCCESS)
 Main PID: 2919 (redis-server)
   CGroup: /system.slice/redis_7001.service
           └─2919 /usr/bin/redis-server *:7001 [cluster]
systemd[1]: Starting Redis persistent key-value database...
redis-server[2919]: 2917:C 25 Sep 2019 08:14:15.752 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
redis-server[2919]: 2917:C 25 Sep 2019 08:14:15.752 # Redis version=5.0.5, bits=64, commit=00000000, modified=0, pid=2917, just started
redis-server[2919]: 2917:C 25 Sep 2019 08:14:15.752 # Configuration loaded
redis-server[2919]: 2917:C 25 Sep 2019 08:14:15.752 * supervised by systemd, will signal readiness
systemd[1]: Started Redis persistent key-value database.
redis-server[2919]: 2917:M 25 Sep 2019 08:14:15.754 * No cluster configuration found, I'm ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2
redis-server[2919]: 2917:M 25 Sep 2019 08:14:15.756 * Running mode=cluster, port=7001.
redis-server[2919]: 2917:M 25 Sep 2019 08:14:15.756 # Server initialized
redis-server[2919]: 2917:M 25 Sep 2019 08:14:15.756 * Ready to accept connections

Redis Cluster Setup

Redis itself provides a CLI tool to set up the cluster.
In the current 3-node scenario, I use port 7000 on every node to serve the Redis masters and port 7001 to serve the Redis slaves.

redis-cli --cluster create 172.19.33.7:7000 172.19.42.44:7000 172.19.45.201:7000 172.19.33.7:7001 172.19.42.44:7001 172.19.45.201:7001 --cluster-replicas 1

The first 3 addresses will be the masters and the next 3 addresses will be the slaves. It will be cross-node replication, i.e. the slave of any master will reside on a different node, and --cluster-replicas defines the replication factor, i.e. each master will have 1 slave.

>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica 172.19.42.44:7001 to 172.19.33.7:7000
Adding replica 172.19.45.201:7001 to 172.19.42.44:7000
Adding replica 172.19.33.7:7001 to 172.19.45.201:7000
M: ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2 172.19.33.7:7000
   slots:[0-5460] (5461 slots) master
M: 314038a48bda3224bad21c3357dbff8305735d72 172.19.42.44:7000
   slots:[5461-10922] (5462 slots) master
M: 19a2c81b7f489bec35eed474ae8e1ad787327db6 172.19.45.201:7000
   slots:[10923-16383] (5461 slots) master
S: 896b2a7195455787b5d8a50966f1034c269c0259 172.19.33.7:7001
   replicates 19a2c81b7f489bec35eed474ae8e1ad787327db6
S: 89206df4f41465bce81f44e25e5fdfa8566424b8 172.19.42.44:7001
   replicates ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2
S: 20ab4b30f3d6d25045909c6c33ab70feb635061c 172.19.45.201:7001
   replicates 314038a48bda3224bad21c3357dbff8305735d72
Can I set the above configuration? (type 'yes' to accept):

A dry run will showcase the cluster setup and ask for confirmation.

Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
..
>>> Performing Cluster Check (using node 172.19.33.7:7000)
M: ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2 172.19.33.7:7000
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
S: 20ab4b30f3d6d25045909c6c33ab70feb635061c 172.19.45.201:7001
   slots: (0 slots) slave
   replicates 314038a48bda3224bad21c3357dbff8305735d72
M: 314038a48bda3224bad21c3357dbff8305735d72 172.19.42.44:7000
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
M: 19a2c81b7f489bec35eed474ae8e1ad787327db6 172.19.45.201:7000
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
S: 89206df4f41465bce81f44e25e5fdfa8566424b8 172.19.42.44:7001
   slots: (0 slots) slave
   replicates ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2
S: 896b2a7195455787b5d8a50966f1034c269c0259 172.19.33.7:7001
   slots: (0 slots) slave
   replicates 19a2c81b7f489bec35eed474ae8e1ad787327db6
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

Check Cluster Status

Connect to any of the cluster nodes to check the status of the cluster.

redis-cli -c -h 172.19.33.7 -p 7000
172.19.33.7:7000> cluster nodes
20ab4b30f3d6d25045909c6c33ab70feb635061c 172.19.45.201:7001@17001 slave 314038a48bda3224bad21c3357dbff8305735d72 0 1569402961000 6 connected
314038a48bda3224bad21c3357dbff8305735d72 172.19.42.44:7000@17000 master - 0 1569402961543 2 connected 5461-10922
19a2c81b7f489bec35eed474ae8e1ad787327db6 172.19.45.201:7000@17000 master - 0 1569402960538 3 connected 10923-16383
ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2 172.19.33.7:7000@17000 myself,master - 0 1569402959000 1 connected 0-5460
89206df4f41465bce81f44e25e5fdfa8566424b8 172.19.42.44:7001@17001 slave ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2 0 1569402960000 5 connected
896b2a7195455787b5d8a50966f1034c269c0259 172.19.33.7:7001@17001 slave 19a2c81b7f489bec35eed474ae8e1ad787327db6 0 1569402959936 4 connected

Redis cluster itself manages the cross-node replication; as seen in the output above, the 172.19.42.44:7000 master is associated with the 172.19.45.201:7001 slave.

Data Sharding

There are 16384 hash slots, and these slots are divided among the servers.
If there are 3 servers (1, 2 and 3), then roughly:

  • Server 1 contains hash slots from 0 to 5500.
  • Server 2 contains hash slots from 5501 to 11000.
  • Server 3 contains hash slots from 11001 to 16383.
redis-cli -c -h 172.19.33.7 -p 7000
172.19.33.7:7000> set a 1
-> Redirected to slot [15495] located at 172.19.45.201:7000
OK
172.19.45.201:7000> set b 2
-> Redirected to slot [3300] located at 172.19.33.7:7000
OK
172.19.33.7:7000> set c 3
-> Redirected to slot [7365] located at 172.19.42.44:7000
OK
172.19.42.44:7000> set d 4
-> Redirected to slot [11298] located at 172.19.45.201:7000
OK
172.19.45.201:7000> get b
-> Redirected to slot [3300] located at 172.19.33.7:7000
"2"
172.19.33.7:7000> get a
-> Redirected to slot [15495] located at 172.19.45.201:7000
"1"
172.19.45.201:7000> get c
-> Redirected to slot [7365] located at 172.19.42.44:7000
"3"
172.19.42.44:7000> get d
-> Redirected to slot [11298] located at 172.19.45.201:7000
"4"
172.19.45.201:7000>
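
From an application you would normally use a cluster-aware client that follows these redirections for you. A small sketch, assuming the redis Python package version 4.1 or later (which ships redis.cluster.RedisCluster; older setups used the separate redis-py-cluster package):

from redis.cluster import RedisCluster

# Any reachable node works as the entry point; the client discovers the rest
# of the cluster topology and routes each key to the node owning its slot.
rc = RedisCluster(host="172.19.33.7", port=7000, decode_responses=True)

rc.set("a", 1)       # lands on the node owning the key's slot, no manual redirect
print(rc.get("a"))   # fetched back from the correct node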

Redis Cluster Failover

Stop Master Service

Let’s stop the Redis master service on Server 3.

systemctl stop redis_7000.service
systemctl status redis_7000.service
● redis_7000.service - Redis persistent key-value database
   Loaded: loaded (/etc/systemd/system/redis_7000.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Wed 2019-09-25 09:32:37 UTC; 23s ago
  Process: 3232 ExecStop=/bin/redis-cli -h 127.0.0.1 -p 7000 shutdown (code=exited, status=0/SUCCESS)
  Process: 2892 ExecStart=/usr/bin/redis-server /etc/redis/cluster/7000/redis_7000.conf --supervised systemd (code=exited, status=0/SUCCESS)
 Main PID: 2892 (code=exited, status=0/SUCCESS)

Cluster State (Failover)

While checking the cluster status, the Redis master service that we stopped (running on port 7000) is shown as failed and disconnected.

At the same moment its respective slave, running on port 7001 of another node, gets promoted to master.

redis-cli -c -h 172.19.33.7 -p 7000
172.19.45.201:7000> CLUSTER NODES
314038a48bda3224bad21c3357dbff8305735d72 172.19.42.44:7000@17000 master,fail - 1569403957138 1569403956000 2 disconnected
ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2 172.19.33.7:7000@17000 master - 0 1569404037252 1 connected 0-5460
896b2a7195455787b5d8a50966f1034c269c0259 172.19.33.7:7001@17001 slave 19a2c81b7f489bec35eed474ae8e1ad787327db6 0 1569404036248 4 connected
89206df4f41465bce81f44e25e5fdfa8566424b8 172.19.42.44:7001@17001 slave ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2 0 1569404036752 5 connected
20ab4b30f3d6d25045909c6c33ab70feb635061c 172.19.45.201:7001@17001 master - 0 1569404036000 7 connected 5461-10922
19a2c81b7f489bec35eed474ae8e1ad787327db6 172.19.45.201:7000@17000 myself,master - 0 1569404035000 3 connected 10923-16383

Restarting Stopped Redis

Now we will check the behaviour of the cluster once we fix or restart the Redis service that we intentionally turned down earlier.

systemctl start redis_7000.service
systemctl status redis_7000.service
● redis_7000.service - Redis persistent key-value database
   Loaded: loaded (/etc/systemd/system/redis_7000.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2019-09-25 09:35:12 UTC; 8s ago
  Process: 3232 ExecStop=/bin/redis-cli -h 127.0.0.1 -p 7000 shutdown (code=exited, status=0/SUCCESS)
 Main PID: 3241 (redis-server)
   CGroup: /system.slice/redis_7000.service
           └─3241 /usr/bin/redis-server *:7000 [cluster]

Cluster State (Recovery)

Finally, all Redis services are back in the running state. The master that we turned down and restarted has now become a slave of the newly promoted master.

redis-cli -c -h 172.19.33.7 -p 7000
172.19.45.201:7000> CLUSTER NODES
314038a48bda3224bad21c3357dbff8305735d72 172.19.42.44:7000@17000 slave 20ab4b30f3d6d25045909c6c33ab70feb635061c 0 1569404162565 7 connected
ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2 172.19.33.7:7000@17000 master - 0 1569404162000 1 connected 0-5460
896b2a7195455787b5d8a50966f1034c269c0259 172.19.33.7:7001@17001 slave 19a2c81b7f489bec35eed474ae8e1ad787327db6 0 1569404163567 4 connected
89206df4f41465bce81f44e25e5fdfa8566424b8 172.19.42.44:7001@17001 slave ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2 0 1569404163000 5 connected
20ab4b30f3d6d25045909c6c33ab70feb635061c 172.19.45.201:7001@17001 master - 0 1569404162000 7 connected 5461-10922
19a2c81b7f489bec35eed474ae8e1ad787327db6 172.19.45.201:7000@17000 myself,master - 0 1569404161000 3 connected 10923-16383

It's not done yet; next we can explore having a single endpoint for the application to point to. I am currently working on that and will soon come up with a solution.
Apart from this, monitoring the Redis cluster will also be a major aspect to look forward to.
Till then, get your hands dirty playing around with the Redis cluster setup and failover.

Reference links:
Image: Google image search (blog.advids.co)

Redis Cluster: Architecture, Replication, Sharding and Failover

Speed fascinates everyone, but only if it's under control.

It is well said and a proven fact that everyone needs to implement a cache at some point in their application lifecycle, and this has become our requirement too.

During the initial phase we placed Redis in Master-Slave mode, with the next phase involving a Sentinel setup to withstand master failover. I would like to throw some light on these architectures along with their pros and cons, to emphasise why I finally migrated to Redis Cluster.

Redis Master/Slave

Redis replication is a very simple to use and configure master-slave replication  that allows slave Redis servers to be exact copies of master servers.

What forced me to look for Redis Sentinel

When using Master-Slave architecture

  • There will be only one master with multiple slaves for replication.
  • All writes go to the master, which creates more load on the master node.
  • If the master goes down, the whole architecture is prone to SPOF (single point of failure).
  • M-S architecture does not help in scaling when your user base grows.
  • So we need a process to monitor the master in case of failure or shutdown; that is Sentinel.

Redis Sentinel

Initial Setup
Failover Handling

I was still concerned about the sharding of data for best performance, which is what led me to the concept below.

Concept of Redis Cluster

“A query that used to take an hour can run in seconds on cache”.

Redis Cluster is an active-passive cluster implementation that consists of master and slave nodes. The cluster uses hash partitioning to split the key space into 16,384 key slots, with each master responsible for a subset of those slots. 

Each slave replicates a specific master and can be reassigned to replicate another master or be elected to a master node as needed. 

Ports Communication

Each node in a cluster requires two TCP ports. 

  • One port is used for client connections and communications. This is the port you would configure into client applications or command line tools. 
  • The second required port is reserved for node-to-node communication, which occurs in a binary protocol and allows the nodes to discuss configuration and node availability.

Failover

When a master fails or is found to be unreachable by the majority of the cluster, as determined by the nodes' communication via the gossip port, the remaining masters hold a vote and elect one of the failing master's slaves to take its place.

Rejoining The Cluster

When the failing master eventually rejoins the cluster, it will join as a slave and begin to replicate another master.

Sharding

Redis shards data automatically across the servers.
Redis has a concept of hash slots in order to split the data: every key maps to one of 16384 slots, and these slots are divided among the servers. A tiny sketch of the slot calculation follows the list below.

If there are 3 servers (1, 2 and 3), then roughly:

  • Server 1 contains hash slots from 0 to 5500.
  • Server 2 contains hash slots from 5501 to 11000.
  • Server 3 contains hash slots from 11001 to 16383.
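
For the curious, the slot a key belongs to is CRC16(key) mod 16384. A minimal Python sketch of that calculation (ignoring the {...} hash-tag rule, and assuming the XModem CRC16 variant that Redis uses):

def crc16_xmodem(data: bytes) -> int:
    # CRC16-CCITT (XModem): polynomial 0x1021, initial value 0x0000.
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

def key_slot(key: str) -> int:
    # The node that owns this slot serves the key.
    return crc16_xmodem(key.encode()) % 16384

print(key_slot("foo"))   # compare with: redis-cli CLUSTER KEYSLOT foo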

6 Node M/S Cluster

In a 6-node cluster, 3 nodes will serve as masters and the other 3 nodes will be their respective slaves.

Here, the Redis service will be running on port 6379 on all servers in the cluster. Each master server replicates its keys to the respective Redis slave node assigned during the cluster creation process.

3 Node M/S Cluster

In a 3-node cluster, there will be 2 Redis services running on each server on different ports. All 3 nodes will serve as masters, with the Redis slaves on cross nodes.

Here, two Redis services will be running on each server on two different ports, and each master replicates its keys to its respective Redis slave running on another node.

WHAT IF Redis Goes Down

1 node goes down in a 6 node Redis Cluster

If one of the nodes goes down in a Redis 6-node cluster setup, its respective slave will be promoted to master.

In the above example, master Server 3 goes down and its slave Server 6 is promoted to master.

1 node goes down in a 3 node Redis Cluster

If one of the nodes goes down in a Redis 3-node cluster setup, its respective slave running on a separate node will be promoted to master.

In the above example, Server 3 goes down and the slave running on Server 1 is promoted to master.

Redis service goes down on one of the 3 node Redis Cluster

If the Redis service goes down on one of the nodes in a Redis 3-node cluster setup, its respective slave will be promoted to master.

Conclusion

This methodology protects the Redis cluster only in partial failure scenarios; if we want to survive a full failure we need to look at disaster recovery techniques as well.

Well, this implementation helped me sleep soundly while thinking of Redis availability, sharding and performance.

Enough reading; eager to know how this all works when it comes to implementation? Don't worry, my next blog, Redis Cluster: Setup, Sharding and Failover Testing, will guide you through the process.

Enjoy happy and safe DIWALI

One more reason to use Docker

Recently I was working on a project which involved Terraform and AWS. While working on it I was using my local machine for Terraform code testing, and luckily everything was going fine. But when we actually wanted to test it for the production environment, we ran into some issues. Then, as usual, we started to dig into the issue and finally found it, and it was quite a silly one 😜: the production server's Terraform version and my local development machine's Terraform version were not the same.

After wasting quite some time on this issue, I decided to come up with a solution so that this would never happen again.

But before jumping to the solution, let's think: was this problem only related to Terraform, or have we faced similar issues in other scenarios as well?

Well, I guess we face similar issues in other scenarios too. Let's talk about some of those scenarios first.

Suppose you have to create a CI pipeline for a project, and a reusable one at that. The pipeline is ready and working fine in your project, and then after some time you have to implement the same kind of pipeline for a different project. You can reuse the same code, but you don't know the exact versions of the tools you were using in the original CI pipeline. This will lead you straight into errors.

Let's take another example: suppose you are developing something in any programming language. Surely that utility or program will have some dependencies as well. Installing those dependencies on the local system can mess up your whole system or the package manager you use for dependency management. A decent example is pip, the dependency manager for Python 😉.

These are some example scenarios we have actually faced, and they are the motivation for writing this blog.

The Solution

To resolve all these problems we just need one thing, i.e. containers. I could also say Docker, but containers and Docker are two different things.

But yes, for container management we use Docker.

So let's go back to our first problem, the Terraform one. There are multiple ways to solve it, but we tried solving it using Docker.

As Docker says

Build Once and Run Anywhere

Based on this statement, what we did was create a Dockerfile for the required Terraform version and store it alongside the code. Basically our Dockerfile looks like this:

FROM alpine:3.8

MAINTAINER OpsTree.com

ENV TERRAFORM_VERSION=0.11.10

ARG BASE_URL=https://releases.hashicorp.com/terraform

# Download and install the pinned Terraform release
RUN apk add --no-cache curl unzip bash \
    && curl -fsSL -o /tmp/terraform.zip ${BASE_URL}/${TERRAFORM_VERSION}/terraform_${TERRAFORM_VERSION}_linux_amd64.zip \
    && unzip /tmp/terraform.zip -d /usr/bin/

# Create the non-root user referenced below (alpine does not ship one)
RUN adduser -D opstree

WORKDIR /opstree/terraform

USER opstree

In this Dockerfile, we are pinning the exact version of Terraform needed to run the code.
In a similar fashion, all the other problems listed above can be solved using Docker: we just have to create a Dockerfile with the exact dependencies that are needed, and that same file will work across environments and projects.

To take it to the next level, you can also drop in a Makefile to make everyone's life easier. For example:

IMAGE_TAG=latest
build-image:
    docker build -t opstree/terraform:${IMAGE_TAG} -f Dockerfile .

run-container:
    docker run -itd --name terraform -v ~/.ssh:/root/.ssh/ -v ~/.aws:/root/.aws -v ${PWD}:/opstree/terraform opstree/terraform:${IMAGE_TAG}

plan-infra:
    docker exec -t terraform bash -c "terraform plan"

create-infra:
    docker exec -t terraform bash -c "terraform apply -auto-approve"

destroy-infra:
    docker exec -t terraform bash -c "terraform destroy -auto-approve"

And trust me, after making this utility available, the reactions of the people using it will be something like this:

Now I am assuming you will also try to apply Docker to as many scenarios as possible.

There are a few more scenarios yet to be explored to enhance the use of Docker; if you find them before I do, please let me know.

Thanks for reading. I'd really appreciate any and all feedback; please leave a comment below.

Cheers till the next time.

Tuning Of ElasticSearch Cluster


Store, Search And Analyse!

Scenario

The first thing which comes to mind when I hear about logging solutions for my infrastructure is ELK (Elasticsearch, Logstash, Kibana).
But what happens when logs surge in quantity and hamper performance, which in Elasticsearch terms we may also call “a fall back”?
We need to get the situation under control and optimize our setup, and for that we need to tune Elasticsearch.

What Is ElasticSearch?

It is a Java-based, open-source project built over Apache Lucene and released under the Apache license. It has the ability to store, search and analyse documents in diverse formats.

A Bit Of History


Shay Banon, the founder of the Compass project, saw the need for a scalable search engine that could be used from languages other than Java.
Therefore, he started to build a whole new project, essentially the 3rd version of Compass, using JSON and an HTTP interface. Its first version was released in 2010.

ElasticSearch Cluster

Elasticsearch is a Java-based project which runs on Java Virtual Machines, wherein each JVM process is considered to be an Elasticsearch node. In order to support scalability, Elasticsearch uses the concept of a cluster, in which multiple nodes run on one or more host machines and are grouped together under a unique cluster name.
These clustered nodes hold the entire data in the form of documents and provide the functionality of indexing and searching those documents.

Types Of Nodes:-

  • Master Eligible-Node
    Masters are meant for cluster/admin operations like allocation, state maintenance, index/alias creation, etc
  • Data Node
    Data nodes hold data and perform data-related operations such as CRUD, search, and aggregations.
  • Ingest Node
    Ingest nodes are able to apply an ingest pipeline to a document in order to transform and enrich the document before indexing.
  • Tribe Node
    It is a special type of coordinating node that can connect to multiple clusters and perform search and other operations across all connected clusters.

Shards and Replicas

  • Shards: An index is further divided into multiple entities called shards.
  • Replicas: One or more copies of an index's shards are called replica shards, or simply replicas.

By default, every Elasticsearch index is allocated 5 primary shards and a single replica of each shard. That means for every index there will be 5 primary shards, and replicating each of them results in a total of 10 shards per index.
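
If you want different values, you can set them at index creation time. A small sketch using the Python requests library (the index name and the numbers are just examples; the settings calls later in this post use curl for the same kind of request):

import requests

# Create an index with 3 primary shards and 1 replica per shard.
resp = requests.put(
    "http://localhost:9200/my-logs",
    json={
        "settings": {
            "number_of_shards": 3,
            "number_of_replicas": 1,
        }
    },
)
print(resp.json())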


Types Of Tuning in ElasticSearch:- 

Index Performance Tuning

  • Use Bulk Requests
  • Use multiple workers/threads to send data to Elasticsearch
  • Unset or increase the refresh interval
  • Disable refresh and replicas for initial loads
  • Disable swapping
  • Give memory to the file-system cache
  • Use faster hardware
  • Indexing buffer size ( Increase the size of the indexing buffer – JAVA Heap Size )

Search Performance Tuning

  • Give memory to the file-system cache
  • Use faster hardware
  • Document modeling (documents should not be joined)
  • Search as few fields as possible
  • Pre-index data (give values to your search)
  • Shards might help with throughput, but not always
  • Use index sorting to speed up conjunctions
  • Warm up the file-system cache (index.store.preload)

Why Is ElasticSearch Tuning Required?

Out of the box, Elasticsearch gives you moderate performance for both search and ingestion of logs, maintaining a balance. But when service utilization or the service count within the infrastructure grows, logs grow in similar proportion. One could easily scale the cluster vertically, but that would increase the cost.
Instead, you can tune the cluster as per the requirement (search or ingestion) while staying within the cost constraints.

Tune-up

How do you handle ingestion of 20k logs per second?
For such a high volume of data ingestion into the Elasticsearch cluster, you will be somewhat compromising search performance. The starting step is to choose the right compute for the requirement; prefer memory over CPU. We are using m5.2xlarge (8 CPU / 32 GB) as data nodes and t3.medium (2 CPU / 4 GB) for the masters.

Elasticsearch Cluster Size
Master – 3 (HA – To avoid the split-brain problem) or 1 (NON-HA)
Data Node – 2

Configure JVM
A good rule of thumb for the JVM heap size of the cluster is 50% of the server's memory.
File: jvm.options
Path: /etc/elasticsearch/

             -Xms16g
             -Xmx16g

Update system file size and descriptors

             - ES_HEAP_SIZE=16g
             - MAX_OPEN_FILES=99999
             - MAX_LOCKED_MEMORY=unlimited


Dynamic APIs for tuning index performance
For the index performance parameters above, below are the dynamic APIs (zero-downtime configuration updates) for tuning them.

Updating Translog
A translog is included in every shard; it maintains persistence by recording every not-yet-committed index operation.
Changes that happen after one commit and before another will be lost in the event of process exit or hardware failure.
To prevent this data loss, each shard has a transaction log or write-ahead log associated with it. Any index or delete operation is written to the translog after being processed by the internal Lucene index.

async – In the event of hardware failure, all acknowledged writes since the last automatic commit will be discarded.

Setting the translog to async will increase index write performance, but it does not guarantee data recovery in case of hardware failure.

curl -H "Content-Type: application/json" -XPUT "localhost:9200/_all/_settings?timeout=180s" -d '
{
  "index.translog.durability" : "async"
}'

Timeout
Adjust the timeout of the operation according to the number of indices: the larger the number of indices, the higher the timeout value should be (the examples here use timeout=180s).

Reduce the number of replicas to a minimum

When a large amount of data has to be ingested (the same scenario as ours), we can set the number of replicas to ‘0‘. This is risky, as the loss of any shard will cause data loss because no replica exists for it. At the same time, index performance increases significantly because each document has to be indexed only once, without replicas.
After you are done with the load ingestion, you can revert to the original setting (see the example after the command below).

curl -H "Content-Type: application/json" -XPUT "localhost:9200/_all/_settings?timeout=180s" -d '
{
  "number_of_replicas": 0
}'
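
Once the load ingestion is done, the same API can be used to bring the replicas back; the value 1 below simply restores the single-replica default:

curl -H "Content-Type: application/json" -XPUT "localhost:9200/_all/_settings?timeout=180s" -d '
{
  "number_of_replicas": 1
}'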

Increase the Refresh Interval
Making newly indexed documents visible to search is an operation called refresh, and it is costly in terms of resources. Calling it too frequently can hurt index write performance.

By default, Elasticsearch refreshes indices every second, but only those indices that have received a search request in the last 30 seconds.
This is the most appropriate configuration if you have no or very little search traffic and want to optimize for indexing speed.

If your index does receive frequent search requests, Elasticsearch will refresh it every second. If you can afford to increase the time between when a document gets indexed and when it becomes visible, increasing index.refresh_interval to a greater value, e.g. 300s (5 minutes), might help improve indexing performance.

curl -H "Content-Type: application/json" -XPUT "localhost:9200/_all/_settings?timeout=180s" -d '
{
  "index" : { "refresh_interval" : "300s" }
}'

Decreasing the number of shards
The number of primary shards is a static index setting, so it cannot be changed on a live index through the _settings API; it is changed with the _shrink and _split APIs. As the names suggest, _split is used to increase the number of shards and _shrink to decrease it. Shrinking requires the source index to be made read-only first, with a copy of every shard present on a single node.
For example, shrinking a hypothetical index my_index into my_index_small with a single primary shard:

curl -H "Content-Type: application/json" -XPOST "localhost:9200/my_index/_shrink/my_index_small" -d '
{
  "settings": { "index.number_of_shards": 1 }
}'

When Logstash is Input
Logstash provides the following configurable options for tuning pipeline performance:

  • Pipeline Workers
  • Pipeline Batch Size
  • Pipeline Batch Delay 
  • Profiling the Heap

Configuring the above parameters helps increase the ingestion rate (index performance), as they control how Logstash feeds events into Elasticsearch.
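
A minimal sketch of the corresponding settings in logstash.yml; the values shown are illustrative and should be tuned to your hardware and event size:

File: logstash.yml
Path: /etc/logstash/

             # filter + output worker threads, usually the number of CPU cores
             pipeline.workers: 8
             # events each worker collects before running filters and outputs
             pipeline.batch.size: 250
             # milliseconds to wait for new events before flushing an undersized batch
             pipeline.batch.delay: 50

The Logstash heap itself can be profiled and raised via -Xms/-Xmx in /etc/logstash/jvm.options, much like the Elasticsearch JVM settings above.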

Summary

Elasticsearch tuning is a complex and critical task, as a wrong setting can seriously damage your cluster or even bring the whole thing down. So be careful while modifying any parameters in a production environment.

Elasticsearch tuning can add a lot of value to your logging system while still meeting the cost constraints.

Happy Searching!

References: https://www.elastic.co
Image References: https://docs.bonsai.io/article/122-shard-primer, https://innerlives.org/2018/10/15/image-magic-drawing-the-history-of-sorcery-ritual-and-witchcraft/

The Concept Of Data At Rest Encryption In MySQL

The word “data” has been crucial since the early 2000s, and over the span of these two decades it has only become more so. According to Forbes, Google believes that in the future every organisation will end up becoming a data company. And when it comes to data, security is one of the major concerns we have to face.

We have several common technologies for storing data today, like MySQL, Oracle, MSSQL, Cassandra, MongoDB, etc., and these will keep changing in the future. But according to DataAnyz, MySQL still has a 33% share of the market. So here we are with a technique to secure our MySQL data.

Before getting deeper into this article, let us look at the possible approaches (which can be combined) to secure MySQL data:

  1. MySQL server hardening
  2. MySQL application-level hardening
  3. MySQL data encryption in transit
  4. MySQL data at rest encryption
  5. MySQL disk encryption

You may explore all of these approaches, but in this article we will understand the concept of MySQL data at rest encryption, with a hands-on walkthrough too.

The concept of “Data at Rest Encryption” in MySQL was introduced in MySQL 5.7, initially with support for the InnoDB storage engine only, and it has evolved significantly since then. So let’s understand “Data at Rest Encryption” in MySQL.

What is “Data at Rest Encryption” in MySQL?

The concept of “data at rest encryption” uses a two-tier encryption key architecture, consisting of the two keys below:

  1. Tablespace keys: an encrypted key stored in the tablespace header
  2. Master key: used to decrypt the tablespace keys

So let’s understand how it works.

Let’s say we have MySQL running with the InnoDB storage engine and a tablespace encrypted using a key, referred to as the tablespace key. This key is in turn encrypted using the master key and stored in the tablespace header.

Now, when a request is made to access MySQL data, InnoDB uses the master key to decrypt the tablespace key present in the tablespace header. With the decrypted tablespace key, the tablespace is decrypted and made available for read/write operations.

Note: The decrypted version of a tablespace key never changes, but the master key can be rotated.

Data at rest encryption is implemented using the keyring_file plugin, which manages and stores the master key.

Now that we have understood the encryption and decryption flow, below are a few pros and cons of using data at rest encryption (DRE).

Pros:

  • Strong AES-256 encryption is used to encrypt the InnoDB tables
  • It is transparent to all applications, as no application code, schema, or data type changes are needed
  • Key management is not done by the DBA
  • Keys can be stored securely away from the data, and key rotation is very simple

Cons:

  • Encrypts only  InnoDB tables
  • Can’t encrypt  binary logs, redo logs, relay logs on unencrypted slaves, slow log, error log, general log, and audit log

Though we can’t encrypt binary logs, redo logs, or relay logs in MySQL 5.7, MariaDB has implemented a mechanism to encrypt undo/redo logs, binary logs/relay logs, etc. by enabling a few flags in the config file.

innodb_sys_tablespace_encrypt=ON
innodb_temp_tablespace_encrypt=ON
innodb_parallel_dblwr_encrypt=ON
innodb_encrypt_online_alter_logs=ON
innodb_encrypt_tables=FORCE
encrypt_binlog=ON
encrypt_tmp_files=ON

However, there are some limitations.

Let’s discuss these problems and a few solutions to them:

  1. On a host running MySQL, both the root user and the mysql user can access the keyring file present on the same system. To mitigate this, we can keep the keyring on a separate mount/unmount drive that can be unmounted after MySQL restarts (a rough sketch follows after this list).
  2. Data is not in encrypted form once it is loaded into RAM, so memory can be dumped and read.
  3. If MySQL is restarted with skip-grant-tables, it is havoc again, but this too can be mitigated by keeping the keyring on an unmounted drive.
  4. As the tablespace key remains the same, our security relies on master key rotation, which can be used to protect the master key.
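
A rough sketch of the mount/unmount idea from point 1, assuming a dedicated volume on a hypothetical device /dev/xvdf and keyring_file_data in my.cnf pointed at /mnt/mysql-keyring/keyring:

[root@mysql ~]# mkdir -p /mnt/mysql-keyring
[root@mysql ~]# mount /dev/xvdf /mnt/mysql-keyring      # attach the key volume
[root@mysql ~]# systemctl restart mysql                 # MySQL loads the master key at startup
[root@mysql ~]# umount /mnt/mysql-keyring               # detach once the key is in memory

Keep in mind that master key rotation rewrites the keyring file, so the volume must be mounted again before rotating the key.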

NOTE: Do not lose the master key file; without it we can’t decrypt the data and will suffer data loss.

Doing is learning, so let’s try it out.

As a prerequisite, we need a machine with a MySQL server up and running. For data at rest encryption to work, we need to enable the following.

Enable file-per-table with the help of the configuration file.

 
[root@mysql ~]#  vim /etc/my.cnf
[mysqld]

innodb_file_per_table=ON

Along with the above parameter, enable the keyring plugin and the keyring path. This parameter should always be at the top of the configuration file so that it is loaded early when MySQL starts up. The keyring plugin ships with MySQL server; we just need to enable it.

[root@mysql ~]#  vim /etc/my.cnf
[mysqld]
early-plugin-load=keyring_file.so
keyring_file_data=/var/lib/mysql/keyring-data/keyring
innodb_file_per_table=ON

Save the file and restart MySQL.

[root@mysql ~]#  systemctl restart mysql

We can check for the enabled plugin and verify our configuration.

mysql> SELECT plugin_name, plugin_status FROM INFORMATION_SCHEMA.PLUGINS WHERE plugin_name LIKE 'keyring%';
+--------------+---------------+
| plugin_name  | plugin_status |
+--------------+---------------+
| keyring_file | ACTIVE        |
+--------------+---------------+
1 row in set (0.00 sec)


Verify that the keyring plugin is running and check the keyring file location.

mysql>  show global variables like '%keyring%';
+--------------------+-------------------------------------+
| Variable_name      | Value                               |
+--------------------+-------------------------------------+
| keyring_file_data  | /var/lib/mysql/keyring-data/keyring |
| keyring_operations | ON                                  |
+--------------------+-------------------------------------+
2 rows in set (0.00 sec)

Verify that we have enabled file per table 

mysql> show global variables like 'innodb_file_per_table';
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| innodb_file_per_table | ON    |
+-----------------------+-------+
1 row in set (0.33 sec)

Now we will test our setup by creating a test DB with a table and inserting a value into the table using the commands below.

mysql> CREATE DATABASE test_db;
mysql> CREATE TABLE test_db.test_db_table (id int primary key auto_increment, payload varchar(256)) engine=innodb;
mysql> INSERT INTO test_db.test_db_table(payload) VALUES('Confidential Data');

After the test data is created successfully, run the command below from the Linux shell to check whether you are able to read the InnoDB data file of the table you created, i.e. before encryption.

Along with that, note that our keyring file is still empty before encryption is enabled.

[root@mysql ~]#  strings /var/lib/mysql/test_db/test_db_table.ibd
infimum
supremum
Confidential Data

 

At this point, if we check our keyring file, we will not find anything in it.

[root@mysql ~]#  cat /var/lib/mysql/keyring-data/keyring
[root@mysql ~]# 

Now let’s encrypt our table with the command below and check the InnoDB file and keyring file contents again.

mysql> ALTER TABLE test_db.test_db_table encryption='Y';
[root@mysql ~] strings /var/lib/mysql/test_db/test_db_table.ibd
0094ca6d-7ba9-11e9-b0d0-0800275716d42QMw

The above output makes it clear that the file data is no longer readable and the tablespace is encrypted. And since the keyring file was empty earlier, it must now contain some data.

Note: Take note of the master key and the file’s timestamp (we will implement key rotation shortly).

[root@mysql ~]  cat /var/lib/mysql/keyring-data/keyring
Keyring file version:1.0?0 INNODBKey-0094ca6d-7ba9-11e9-b0d0-0800275716d4-2AES???_gd?7m>0??nz??8M??7Yʹ:ll8@?0 INNODBKey-0094ca6d-7ba9-11e9-b0d0-0800275716d4-1AES}??x?$F?z??$???:??k?6y?YEOF
[root@mysql ~] ls -ltr /var/lib/mysql/keyring-data/keyring
-rw-r----- 1 mysql mysql 283 Sep 18 16:48 /var/lib/mysql/keyring-data/keyring

Given the known security concern of a compromised master key, we can use the master key rotation technique from time to time to protect our key.

mysql> alter instance rotate innodb master key;
Query OK, 0 rows affected (0.00 sec)

After this command, we can see that the keyring file’s timestamp has changed, which means we now have a new master key.

[root@mysql ~] ls -ltr /var/lib/mysql/keyring-data/keyring
-rw-r----- 1 mysql mysql 411 Sep 18 18:17 /var/lib/mysql/keyring-data/keyring

Some Useful Commands

Below are some helpful commands we may use in an encrypted system 

1. List All the tables with encryption enabled 

mysql> SELECT * FROM information_schema.tables WHERE create_options LIKE '%ENCRYPTION="Y"%' \G
*************************** 1. row ***************************
TABLE_CATALOG: def
TABLE_SCHEMA: test_db
TABLE_NAME: test_db_table
TABLE_TYPE: BASE TABLE
ENGINE: InnoDB
VERSION: 10
ROW_FORMAT: Dynamic
TABLE_ROWS: 0
AVG_ROW_LENGTH: 0
DATA_LENGTH: 16384
MAX_DATA_LENGTH: 0
INDEX_LENGTH: 0
DATA_FREE: 0
AUTO_INCREMENT: 2
CREATE_TIME: 2019-09-18 16:46:34
UPDATE_TIME: 2019-09-18 16:46:34
CHECK_TIME: NULL
TABLE_COLLATION: latin1_swedish_ci
CHECKSUM: NULL
CREATE_OPTIONS: ENCRYPTION="Y"
TABLE_COMMENT: 
1 row in set (0.02 sec)

2. Enable encryption for an InnoDB table

mysql> ALTER TABLE db.t1 ENCRYPTION='Y';

3. Disable encryption for an InnoDB table

mysql> ALTER TABLE t1 ENCRYPTION='N';

Conclusion:

We can encrypt data at rest using the keyring plugin, and control and manage it via master key rotation. Setting up encrypted MySQL data files is as simple as firing a few commands. An encrypted system is also transparent to services, applications, and users, with minimal impact on system resources. Going further, along with encryption of data at rest, we may also implement encryption in transit.

I hope you found this article informative and interesting. I’d really appreciate any and all feedback.