EFK 7.4.0 Stack on Kubernetes. (Part-1)

INTRODUCTION

In this article, we will learn how to set up a complete stack for your Kubernetes environment; it is a one-stop solution for logging, monitoring, alerting & authentication. This kind of solution gives your team visibility into your infrastructure and into each application.

So, what is the EFK Stack? “EFK” is the acronym for three open source projects: Elasticsearch, Fluentd, and Kibana. Elasticsearch is a search and analytics engine. Fluentd is an open source data collector that ingests data from multiple sources simultaneously, transforms it, and then sends it to a “stash” like Elasticsearch. Kibana lets users visualize the data in Elasticsearch with charts and graphs.

The EFK Stack is an evolution of the Elastic (ELK) Stack, with Fluentd replacing Logstash.

Overview of EFK Stack

To achieve this, we will be using the EFK stack version 7.4.0, composed of Elasticsearch, Fluentd, Kibana, Metricbeat, Heartbeat, APM-Server, and ElastAlert, on a Kubernetes environment. This article series will walk through a standard Kubernetes deployment, which, in my opinion, gives an overall better understanding of each step of installation and configuration.

PREREQUISITES

Before you begin with this guide, ensure you have the following available to you:

  • A Kubernetes 1.10+ cluster with role-based access control (RBAC) enabled
    • Ensure your cluster has enough resources available to roll out the EFK stack; if not, scale your cluster by adding worker nodes. We’ll be deploying a 3-Pod Elasticsearch cluster for each of the master and data roles (you can scale this down to 1 if necessary).
    • Every worker node will also run a Fluentd and Metricbeat Pod.
    • As well as a single Pod each of Kibana, Heartbeat, APM-Server & ElastAlert.
  • The kubectl command-line tool installed on your local machine, configured to connect to your cluster.
    Once you have these components set up, you’re ready to begin with this guide.
  • For the Elasticsearch cluster to store its data, create a StorageClass with your cloud provider; for an on-premise deployment, use NFS instead (a sketch follows this list).
  • Make sure you have applications running in your K8s cluster so you can see the EFK stack functioning end to end.
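
As a minimal sketch (assuming GCE persistent disks; the provisioner and parameters are placeholders you should swap for your own cloud provider or NFS provisioner), a StorageClass matching the elastic-cloud-disk name used later in this guide could be created like this:

cat <<'EOF' | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: elastic-cloud-disk
provisioner: kubernetes.io/gce-pd   # assumption: GCE; use your provider's provisioner
parameters:
  type: pd-ssd
EOF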

Step 1 – Creating a Namespace

Before we start the deployment, we will create a namespace. Kubernetes lets you separate objects running in your cluster using a “virtual cluster” abstraction called Namespaces. In this guide, we’ll create a logging Namespace into which we’ll install the EFK stack and its components.
To create the logging Namespace, use the yaml file below.

#logging-namespace.yaml
kind: Namespace
apiVersion: v1
metadata:
  name: logging
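
Apply the file and confirm the Namespace exists:

$ kubectl apply -f logging-namespace.yaml
$ kubectl get namespace logging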

Step 2 – Elasticsearch StatefulSet Cluster

To set up the monitoring stack, we will first deploy Elasticsearch; it will act as the database that stores all the data (metrics, logs and traces). The database will be composed of three scalable nodes connected together into a cluster, as recommended for production.

Here we will enable x-pack authentication to make the stack more secure against potential attackers.

Also, we will be using a custom Docker image which has the Elasticsearch repository-s3 plugin installed along with the required certs. This will be needed later for Snapshot Lifecycle Management (SLM).

Note: The same plugin can be used to take snapshots to AWS S3 and Alibaba Cloud OSS.

1. Build the Docker image from the Dockerfile below:

FROM docker.elastic.co/elasticsearch/elasticsearch:7.4.0
USER root
ARG OSS_ACCESS_KEY_ID
ARG OSS_SECRET_ACCESS_KEY
RUN elasticsearch-plugin install --batch repository-s3
RUN elasticsearch-keystore create
RUN echo $OSS_ACCESS_KEY_ID | /usr/share/elasticsearch/bin/elasticsearch-keystore add --stdin s3.client.default.access_key
RUN echo $OSS_SECRET_ACCESS_KEY | /usr/share/elasticsearch/bin/elasticsearch-keystore add --stdin s3.client.default.secret_key
RUN elasticsearch-certutil cert -out config/elastic-certificates.p12 -pass ""
RUN chown -R elasticsearch:root config/

Now let’s build the image and push to your private container registry.

docker build -t elasticsearch-s3oss:7.4.0 --build-arg OSS_ACCESS_KEY_ID=<key> --build-arg OSS_SECRET_ACCESS_KEY=<ID> .

docker push <registry-path>/elasticsearch-s3oss:7.4.0
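
As a quick sanity check (overriding the image entrypoint), you can confirm that the repository-s3 plugin made it into the image you just built:

docker run --rm --entrypoint elasticsearch-plugin elasticsearch-s3oss:7.4.0 list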

2. Setup the ElasticSearch master node:

The first nodes of the cluster we’re going to set up are the masters, which are responsible for controlling the cluster.

As the first k8s object, we’ll create a headless Kubernetes Service named elasticsearch-master (file elasticsearch-master-svc.yaml) that will define a DNS domain for the 3 Pods. A headless service does not perform load balancing and does not have a static IP.

#elasticsearch-master-svc.yaml
apiVersion: v1
kind: Service
metadata:
  namespace: logging
  name: elasticsearch-master
  labels:
    app: elasticsearch
    role: master
spec:
  clusterIP: None
  selector:
    app: elasticsearch
    role: master
  ports:
    - port: 9200
      name: http
    - port: 9300
      name: node-to-node

Next is a StatefulSet for the master nodes (elasticsearch-master.yaml), which describes the running service (Docker image, number of replicas, environment variables and volumes).

#elasticsearch-master.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  namespace: logging
  name: elasticsearch-master
  labels:
    app: elasticsearch
    role: master
spec:
  serviceName: elasticsearch-master
  replicas: 3
  selector:
    matchLabels:
      app: elasticsearch
      role: master
  template:
    metadata:
      labels:
        app: elasticsearch
        role: master
    spec:
      affinity:
        # Try to put each ES master node on a different node in the K8s cluster
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                  - key: app
                    operator: In
                    values:
                      - elasticsearch
                  - key: role
                    operator: In
                    values:
                      - master
                topologyKey: kubernetes.io/hostname
      # spec.template.spec.initContainers
      initContainers:
        # Fix the permissions on the volume.
        - name: fix-the-volume-permission
          image: busybox
          command: ['sh', '-c', 'chown -R 1000:1000 /usr/share/elasticsearch/data']
          securityContext:
            privileged: true
          volumeMounts:
            - name: data
              mountPath: /usr/share/elasticsearch/data
        # Increase the default vm.max_map_count to 262144
        - name: increase-the-vm-max-map-count
          image: busybox
          command: ['sysctl', '-w', 'vm.max_map_count=262144']
          securityContext:
            privileged: true
        # Increase the ulimit
        - name: increase-the-ulimit
          image: busybox
          command: ['sh', '-c', 'ulimit -n 65536']
          securityContext:
            privileged: true

      # spec.template.spec.containers
      containers:
        - name: elasticsearch
          image: <registry-path>/elasticsearch-s3oss:7.4.0
          ports:
            - containerPort: 9200
              name: http
            - containerPort: 9300
              name: transport
          resources:
            requests:
              cpu: 0.25
            limits:
              cpu: 1
              memory: 1Gi
          # spec.template.spec.containers[elasticsearch].env
          env:
            - name: network.host
              value: "0.0.0.0"
            - name: discovery.seed_hosts
              value: "elasticsearch-master.logging.svc.cluster.local"
            - name: cluster.initial_master_nodes
              value: "elasticsearch-master-0,elasticsearch-master-1,elasticsearch-master-2"
            - name: ES_JAVA_OPTS
              value: -Xms512m -Xmx512m
            - name: node.master
              value: "true"
            - name: node.ingest
              value: "false"
            - name: node.data
              value: "false"
            - name: cluster.remote.connect
              value: "false"
            - name: cluster.name
              value: prod
            - name: node.name
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            # parameters to enable x-pack security.
            - name: xpack.security.enabled
              value: "true"
            - name: xpack.security.transport.ssl.enabled
              value: "true"
            - name: xpack.security.transport.ssl.verification_mode
              value: "certificate"
            - name: xpack.security.transport.ssl.keystore.path
              value: elastic-certificates.p12
            - name: xpack.security.transport.ssl.truststore.path
              value: elastic-certificates.p12
          # spec.template.spec.containers[elasticsearch].volumeMounts
          volumeMounts:
            - name: data
              mountPath: /usr/share/elasticsearch/data

      # use the secret if pulling image from private repository
      imagePullSecrets:
        - name: prod-repo-sec
  # Here we are using the cloud storage class to store the data; make sure you have created the StorageClass as a prerequisite.
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes:
      - ReadWriteOnce
      storageClassName: elastic-cloud-disk
      resources:
        requests:
          storage: 20Gi

Now, apply these files to the K8s cluster to deploy the Elasticsearch master nodes.

$ kubectl apply -f elasticsearch-master.yaml \
                -f elasticsearch-master-svc.yaml
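
Before moving on, it is worth confirming that all three master Pods are scheduled and reach the Running state:

$ kubectl rollout status statefulset/elasticsearch-master -n logging
$ kubectl get pods -n logging -l app=elasticsearch,role=master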

3. Setup the ElasticSearch data node:

The second set of nodes we’re going to set up are the data nodes, which are responsible for hosting the data and executing the queries (CRUD, search, aggregation).

Here also, we’ll create a headless Kubernetes Service (file elasticsearch-data-svc.yaml, Service name elasticsearch) that will define a DNS domain for the 3 Pods.

#elasticsearch-data-svc.yaml
apiVersion: v1
kind: Service
metadata:
  namespace: logging 
  name: elasticsearch
  labels:
    app: elasticsearch
    role: data
spec:
  clusterIP: None
  selector:
    app: elasticsearch
    role: data
  ports:
    - port: 9200
      name: http
    - port: 9300
      name: node-to-node

Next is a StatefulSet for the data nodes (elasticsearch-data.yaml), which describes the running service (Docker image, number of replicas, environment variables and volumes).

#elasticsearch-data.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  namespace: logging 
  name: elasticsearch-data
  labels:
    app: elasticsearch
    role: data
spec:
  serviceName: elasticsearch-data
  # This is number of nodes that we want to run
  replicas: 3
  selector:
    matchLabels:
      app: elasticsearch
      role: data
  template:
    metadata:
      labels:
        app: elasticsearch
        role: data
    spec:
      affinity:
        # Try to put each ES data node on a different node in the K8s cluster
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                  - key: app
                    operator: In
                    values:
                      - elasticsearch
                  - key: role
                    operator: In
                    values:
                      - data
                topologyKey: kubernetes.io/hostname
      terminationGracePeriodSeconds: 300
      # spec.template.spec.initContainers
      initContainers:
        # Fix the permissions on the volume.
        - name: fix-the-volume-permission
          image: busybox
          command: ['sh', '-c', 'chown -R 1000:1000 /usr/share/elasticsearch/data']
          securityContext:
            privileged: true
          volumeMounts:
            - name: data
              mountPath: /usr/share/elasticsearch/data
        # Increase the default vm.max_map_count to 262144
        - name: increase-the-vm-max-map-count
          image: busybox
          command: ['sysctl', '-w', 'vm.max_map_count=262144']
          securityContext:
            privileged: true
        # Increase the ulimit
        - name: increase-the-ulimit
          image: busybox
          command: ['sh', '-c', 'ulimit -n 65536']
          securityContext:
            privileged: true
      # spec.template.spec.containers
      containers:
        - name: elasticsearch
          image: <registry-path>/elasticsearch-s3oss:7.4.0
          imagePullPolicy: Always
          ports:
            - containerPort: 9200
              name: http
            - containerPort: 9300
              name: transport
          resources:
            limits:
              memory: 4Gi
          # spec.template.spec.containers[elasticsearch].env
          env:
            - name: discovery.seed_hosts
              value: "elasticsearch-master.logging.svc.cluster.local"
            - name: ES_JAVA_OPTS
              value: -Xms3g -Xmx3g
            - name: node.master
              value: "false"
            - name: node.ingest
              value: "true"
            - name: node.data
              value: "true"
            - name: cluster.remote.connect
              value: "true"
            - name: cluster.name
              value: prod
            - name: node.name
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: xpack.security.enabled
              value: "true"
            - name: xpack.security.transport.ssl.enabled
              value: "true"  
            - name: xpack.security.transport.ssl.verification_mode
              value: "certificate"
            - name: xpack.security.transport.ssl.keystore.path
              value: elastic-certificates.p12
            - name: xpack.security.transport.ssl.truststore.path
              value: elastic-certificates.p12 
          # spec.template.spec.containers[elasticsearch].volumeMounts
          volumeMounts:
            - name: data
              mountPath: /usr/share/elasticsearch/data

      # use the secret if pulling image from private repository
      imagePullSecrets:
        - name: prod-repo-sec

# Here we are using the cloud storage class to store the data; make sure you have created the StorageClass as a prerequisite.
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes:
      - ReadWriteOnce
      storageClassName: elastic-cloud-disk
      resources:
        requests:
          storage: 50Gi

Now, apply these files to the K8s cluster to deploy the Elasticsearch data nodes.

$ kubectl apply -f elasticsearch-data.yaml \
                -f elasticsearch-data-svc.yaml
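
As before, a quick check that the data Pods and their PersistentVolumeClaims come up cleanly:

$ kubectl rollout status statefulset/elasticsearch-data -n logging
$ kubectl get pods -n logging -l app=elasticsearch
$ kubectl get pvc -n logging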

4. Generate the X-Pack passwords and store them in a k8s secret:

We enabled the x-pack security module above to secure our cluster, so we need to initialize the passwords. Execute the following command, which runs bin/elasticsearch-setup-passwords inside a data node container (any node would work), to generate the default users and passwords.

$ kubectl exec $(kubectl get pods -n logging | grep elasticsearch-data | sed -n 1p | awk '{print $1}') \
    -n logging \
    -- bin/elasticsearch-setup-passwords auto -b

Changed password for user apm_system
PASSWORD apm_system = uF8k2KVwNokmHUomemBG

Changed password for user kibana
PASSWORD kibana = DBptcLh8hu26230mIYc3

Changed password for user logstash_system
PASSWORD logstash_system = SJFKuXncpNrkuSmVCaVS

Changed password for user beats_system
PASSWORD beats_system = FGgIkQ1ki7mPPB3d7ns7

Changed password for user remote_monitoring_user
PASSWORD remote_monitoring_user = EgFB3FOsORqOx2EuZNLZ

Changed password for user elastic
PASSWORD elastic = 3JW4tPdspoUHzQsfQyAI

Note the elastic user password; we will add it to a k8s secret (efk-pw-elastic), which will be used by the other stack components to connect to the Elasticsearch data nodes for data ingestion.

$ kubectl create secret generic efk-pw-elastic \
    -n logging \
    --from-literal password=3JW4tPdspoUHzQsfQyAI
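
With security enabled, you can verify the cluster health from your workstation by port-forwarding one of the data Pods and authenticating as the elastic user (replace <elastic-password> with the password generated above):

$ kubectl port-forward elasticsearch-data-0 9200:9200 -n logging &
$ curl -u elastic:<elastic-password> "http://localhost:9200/_cluster/health?pretty"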

Step 3 – Kibana Setup

To launch Kibana on Kubernetes, we’ll create a ConfigMap (kibana-configmap) to provide our deployment with a config file holding all the required properties, a Service called kibana, and a Deployment consisting of one Pod replica (you can scale the number of replicas depending on your production needs), plus an Ingress that routes outside traffic to the Service inside the cluster. You need an Ingress controller for this step.

#kibana-configmap.yaml 
apiVersion: v1
kind: ConfigMap
metadata:
  name: kibana-configmap
  namespace: logging
data:
  kibana.yml: |
    server.name: kibana
    server.host: "0"
    # Optionally can define dashboard id which will launch on main Kibana Page.
    kibana.defaultAppId: "dashboard/781b10c0-09e2-11ea-98eb-c318232a6317"
    elasticsearch.hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
    elasticsearch.username: ${ELASTICSEARCH_USERNAME}
    elasticsearch.password: ${ELASTICSEARCH_PASSWORD}
---
#kibana-service.yaml 
apiVersion: v1
kind: Service
metadata:
  namespace: logging
  name: kibana
  labels:
    app: kibana
spec:
  selector:
    app: kibana
  ports:
    - port: 5601
      name: http
---
#kibana-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: logging 
  name: kibana
  labels:
    app: kibana
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kibana
  template:
    metadata:
      labels:
        app: kibana
    spec:
      containers:
        - name: kibana
          image: docker.elastic.co/kibana/kibana:7.4.0
          ports:
            - containerPort: 5601
          env:
            - name: SERVER_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: SERVER_HOST
              value: "0.0.0.0"
            - name: ELASTICSEARCH_HOSTS
              value: http://elasticsearch.logging.svc.cluster.local:9200
            - name: ELASTICSEARCH_USERNAME
              value: elastic
            - name: ELASTICSEARCH_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: efk-pw-elastic
                  key: password
            - name: XPACK_MONITORING_ELASTICSEARCH_USERNAME
              value: elastic
            - name: XPACK_MONITORING_ELASTICSEARCH_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: efk-pw-elastic
                  key: password
          volumeMounts:
          - name: kibana-configmap
            mountPath: /usr/share/kibana/config
      volumes:
      - name: kibana-configmap
        configMap:
          name: kibana-configmap
---
#kibana-ingress.yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: kibana
  namespace: logging
  annotations:
    kubernetes.io/ingress.class: "nginx"
spec:
  # Specify the tls secret.
  tls:
  - secretName: prod-secret
    hosts:
    - kibana.example.com
   
  rules:
  - host: kibana.example.com
    http:
      paths:
      - path: /
        backend:
          serviceName: kibana
          servicePort: 5601

Now, let’s apply these files to deploy Kibana to K8s cluster.

$ kubectl apply  -f kibana-configmap.yaml \
                 -f kibana-service.yaml \
                 -f kibana-deployment.yaml \
                 -f kibana-ingress.yaml
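
Check that the Deployment rolls out and the Ingress has been created:

$ kubectl rollout status deployment/kibana -n logging
$ kubectl get ingress kibana -n logging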

Now, open Kibana in your browser at https://kibana.example.com, the domain name we defined in our Ingress; alternatively, you can expose the kibana Service on a NodePort and access the dashboard that way.

Now, log in with the username elastic and the password generated before (and stored in the efk-pw-elastic secret), and you will be redirected to the index page:

Lastly, create a separate admin user with the superuser role to access the Kibana dashboard.
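
One way to do this (a sketch that assumes the port-forward from the earlier step; the admin username and password here are placeholders) is through the Elasticsearch security API, though you can equally create the user from the Kibana Management UI:

$ curl -u elastic:<elastic-password> -X POST "http://localhost:9200/_security/user/admin" \
    -H 'Content-Type: application/json' \
    -d '{"password": "<admin-password>", "roles": ["superuser"], "full_name": "Kibana Admin"}'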

Finally, we are ready to use the ElasticSearch + Kibana stack which will serve us to store and visualize our infrastructure and application data (metrics, logs and traces).

Next steps

In the following article [Collect Logs with Fluentd in K8s. (Part-2)], we will learn how to install and configure fluentd to collect the logs.

Prometheus-Alertmanager integration with MS-teams

As we know, monitoring our infrastructure is one of the critical components of infrastructure management; it ensures the proper functioning of our applications and infrastructure. But it is of no use if we are not getting notifications for alarms and threats in our system. As a good practice, if we funnel all of the notifications into a common workspace, it becomes much easier for our team to track the status and performance of our infrastructure.

Last week, all of a sudden, my company chose to migrate from Slack to MS Teams as its common chatroom, which meant that notifications would now also have to be configured for MS Teams. If you search a bit, you will find that Alertmanager has no direct integration for MS Teams the way it does for Slack. As a DevOps engineer I didn’t stop there; I looked for other solutions and found out that we need a proxy between ALERTMANAGER and MS Teams to forward alerts, so I proceeded to configure one.

There are a couple of tools, which we can use as a proxy, but I preferred to use prometheus-msteams, for a couple of reasons.

  • Well-structured documentation.
  • Easy to configure.
  • We have more control in hand: you can customise the alert notification and also configure it to send notifications to multiple channels on MS-teams.
    Despite the well-described documentation, I still faced some challenges, and it took half of my day.

How does it work?

Firstly, Prometheus sends an alert to ALERTMANAGER on the basis of the rules we configured in the Prometheus server. For instance, if memory usage of a server is more than 90%, it will generate an alert, and this alert will be sent to ALERTMANAGER by the Prometheus server. Afterwards, ALERTMANAGER will send this alert to prometheus-msteams, which in turn sends it in JSON format to the MS-teams channel.

How to Run and Configure prometheus-msteams

We have multiple options to run prometheus-msteams

  1. Running on standalone Server (Using Binary)
  2. Running as a Docker Container

Running on Server

Firstly, you need to download the binary from the latest release on the project’s releases page.

When you execute the binary with --help, you can see the available options with descriptions, much like man-pages, which helps you run prometheus-msteams.

prometheus-msteams server --help

You can run the prometheus-msteams service as follows.

./prometheus-msteams server \
    -l localhost \
    -p 2000 \
    -w "Webhook of MS-teams channel"

Explanation of the options above:

  • -l: the address prometheus-msteams listens on; the default is “0.0.0.0”. In the above example, prometheus-msteams listens on localhost.
  • -p: the port prometheus-msteams listens on; the default is 2000.
  • -w: the incoming webhook of the MS-teams channel goes here.

Now you know how to run prometheus-msteams on the server, let’s configure it with ALERTMANAGER.

Step 1 (Creating Incoming Webhook)

Create a channel in MS-teams where you want to send alerts. Click on Connectors (found in the options of the channel), then search for the ‘Incoming Webhook’ connector, from which you can create a webhook for this channel. An incoming webhook is used to send notifications from external services into the channel.

Step 2 (Run prometheus-msteams)

So far, you have an incoming webhook for the channel where you want to send notifications. Next, you need to set up prometheus-msteams and run it.

To keep your options open for the future, use a config.yml to provide the webhook; that way you can later add multiple webhooks and send alerts to multiple channels in MS-teams if you need to.

$ sudo nano /opt/prometheus-msteams/config.yml

Add webhooks as shown below. If you want to add another webhook, you can add it right after the first one.

connectors:
  - alert_channel: "WEBHOOK URL"

The next step is to add a template for custom notification.

$ sudo nano /opt/prometheus-msteams/card.tmpl

Copy the following content into your file, or modify the template as per your requirements. The template can be customized and uses the Go templating engine.

{{ define "teams.card" }}
{
  "@type": "MessageCard",
  "@context": "http://schema.org/extensions",
  "themeColor": "{{- if eq .Status "resolved" -}}2DC72D
                 {{- else if eq .Status "firing" -}}
                    {{- if eq .CommonLabels.severity "critical" -}}8C1A1A
                    {{- else if eq .CommonLabels.severity "warning" -}}FFA500
                    {{- else -}}808080{{- end -}}
                 {{- else -}}808080{{- end -}}",
  "summary": "Prometheus Alerts",
  "title": "Prometheus Alert ({{ .Status }})",
  "sections": [ {{$externalUrl := .ExternalURL}}
  {{- range $index, $alert := .Alerts }}{{- if $index }},{{- end }}
    { 
      "facts": [
        {{- range $key, $value := $alert.Annotations }}
        {
          "name": "{{ reReplaceAll "_" "\\\\_" $key }}",
          "value": "{{ reReplaceAll "_" "\\\\_" $value }}"
        },
        {{- end -}}
        {{$c := counter}}{{ range $key, $value := $alert.Labels }}{{if call $c}},{{ end }}
        {
          "name": "{{ reReplaceAll "_" "\\\\_" $key }}",
          "value": "{{ reReplaceAll "_" "\\\\_" $value }}"
        }
        {{- end }}
      ],
      "markdown": true
    }
    {{- end }}
  ]
}
{{ end }}

Create a prometheus-msteams user, and use --no-create-home and --shell /bin/false to prevent this user from logging into the server.

$ sudo useradd --no-create-home --shell /bin/false prometheus-msteams

Create a service file to run prometheus-msteams as a service:

$ sudo nano /etc/systemd/system/prometheus-msteams.service

The service file tells systemd to run prometheus-msteams as the prometheus-msteams user, with the configuration file located at /opt/prometheus-msteams/config.yml and the template file located in the same directory.

Copy the following content into prometheus-msteams.service file.

[Unit]
Description=Prometheus-msteams
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus-msteams
Group=prometheus-msteams
Type=simple
ExecStart=/usr/local/bin/prometheus-msteams server -l localhost -p 2000 --config /opt/prometheus-msteams/config.yml --template-file /opt/prometheus-msteams/card.tmpl

[Install]
WantedBy=multi-user.target

prometheus-msteams listens on localhost on port 2000, and you also have to provide the configuration file and the template.

To use the newly created service, reload systemd.

$ sudo systemctl daemon-reload

Now start prometheus-msteams.

$ sudo systemctl start prometheus-msteams.service

Check whether the service is running or not.

$ sudo systemctl status prometheus-msteams

Lastly, enable the service to start on boot.

$ sudo systemctl enable prometheus-msteams

Now that prometheus-msteams is up and running, we can configure ALERTMANAGER to send alerts to it.

Step 3 (Configure ALERTMANAGER)

Open alertmanager.yml file in your favorite editor.

$ sudo vim /etc/alertmanager/alertmanager.yml

You can configure ALERTMANAGER as shown below.

global:
  resolve_timeout: 5m

templates:
  - '/etc/alertmanager/*.tmpl'

receivers:
- name: alert_channel
  webhook_configs:
  - url: 'http://localhost:2000/alert_channel'
    send_resolved: true

route:
  group_by: ['critical','severity']
  group_interval: 5m
  group_wait: 30s
  receiver: alert_channel
  repeat_interval: 3h

In the above configuration, ALERTMANAGER sends alerts to prometheus-msteams, which is listening on localhost, and we set send_resolved so that resolved alerts are also sent.

The critical alert to MS-teams will look like below.

When alert resolved, it will look like below.

Note: The logs of prometheus-msteams are written to the /var/log/syslog file. In this file you will find every notification sent by prometheus-msteams. Apart from this, if something goes wrong and you are not getting notifications, you can debug using the syslog file.
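
If you want to verify the whole path without waiting for a real alert, you can post a minimal Alertmanager-style payload straight to prometheus-msteams (the field values below are only illustrative):

curl -X POST http://localhost:2000/alert_channel \
    -H 'Content-Type: application/json' \
    -d '{
          "version": "4",
          "status": "firing",
          "externalURL": "http://localhost:9093",
          "commonLabels": {"severity": "critical"},
          "alerts": [
            {
              "status": "firing",
              "labels": {"alertname": "TestAlert", "severity": "critical"},
              "annotations": {"summary": "This is only a test"},
              "startsAt": "2019-11-20T10:00:00Z"
            }
          ]
        }'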

As Docker Container

You can also run prometheus-msteams as a container on your system. All the configuration files of prometheus-msteams stay the same; you just need to run the following command.

docker run -d -p 2000:2000 \
    --name="promteams"  \
    -v /opt/prometheus-msteams/config.yml:/tmp/config.yml \
    -e CONFIG_FILE="/tmp/config.yml" \
    -v /opt/prometheus-msteams/card.tmpl:/tmp/card.tmpl \
    -e TEMPLATE_FILE="/tmp/card.tmpl" \
    docker.io/bzon/prometheus-msteams:v1.1.4

Now that you are all set to get alerts in an MS-teams channel, you can see that it isn’t as difficult as you originally thought. Of course, this is not the only way to get alerts on MS-teams; you can always use a different tool like prome2teams, etc. With this, I think we are ready to move ahead and explore other monitoring tools as well.

I hope this blog post explains everything clearly. I would really appreciate your feedback in the comments.

One more reason to use Docker – part II

Hey guys, we are back with one more reason to use Docker, part II. I hope you have explored our previous blog in this series of one more reason to use Docker; if not, I would suggest you read that one as well.

As we discussed in our previous blog, docker can be used in multiple scenarios, it all depends on the use case of what you want to do with it. It’s a shipping container that can run anything you want to run inside it.

It can either be database, elasticsearch, proxy, scheduled job or be an application.

Running binaries or trying out a new software

As a developer or DevOps engineer, you are always trying out some software or the other. Many a time you must have struggled while installing a new software package onto your machine or setting up a different environment for running an application.
Suppose I have written Terraform infra code for my infra setup; to test it I would need to install the terraform binary. On top of that, I also need a configuration management tool like Ansible on the same machine to test/run my Ansible role, which cannot be run without a proper Python setup!

Oh God!! So much struggle to install these binaries on the system 😥. And if you are a developer you have to go through a set of Jira processes 😜 and request the infra team for a new machine for a POC of the new software with the required packages.

The problem doesn’t end here: your system gets piled up with unused binaries once you are done with the POC. Cleaning that up is a whole different task.

It’s not always a pleasant experience to set things up after downloading the software. Time is of essence and sometimes all we are looking for is to fire a few commands and that’s it. The Docker model is a super simplistic way of running software binaries, which behind the scene takes care of getting the image and running it for you.

It’s not just about new software. Consider an example: you want to spin up a database server (MySQL) quickly on your laptop, or set up a Redis server for your team. Docker makes this dead simple. Want to run a MySQL server? All you need to do is:

docker run -d -p 3306:3306 tutum/mysql
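
The Redis server mentioned above is just as quick (the image tag here is only an example):

docker run -d -p 6379:6379 --name redis redis:5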

Now, in the case of Terraform, you simply need to pull the latest Docker image for Terraform and you are good to test/run your Terraform code (mount your working directory into the container so the binary can see your code):

docker pull hashicorp/terraform

# run the binary from a container, against the code in the current directory.

docker run -i -t -v $(pwd):/workspace -w /workspace hashicorp/terraform plan

It’s that simple; you could save hours of your time. Its use case is not limited to this, we can also use it for demos!!

In our organisation, we generally have weekend SHOA (Saturday Hands-On Activity) sessions, which are a great platform for learning and sharing knowledge in an interactive way.
We use Docker for showcasing demos, as Docker images are an ideal way to package and demo your tool or application. It is also a great way of conducting hands-on workshops. Normally participants get stuck setting up tools instead of working through the real workshop agenda; using Docker saves that time, and participants can use it to learn what they intended to learn from the workshop.

That’s the power of Docker: fire it up and you are on! I hope you liked it and will also try to use Docker in your own use cases.

As we discussed in our last blog, there are many scenarios of using docker which are yet to be explored.

Thanks for reading, I’d really appreciate your feedback. Cheers till the next time.

Image source: https://whyjava.files.wordpress.com/2017/05/docker.png?w=555

ERROR HANDLING IN ANSIBLE

INTRODUCTION –

Managing errors is one of the major challenges while working with any code, and the same goes for Ansible. It has its own ways of managing errors: whenever Ansible encounters an error it stops the execution by default, like most programming languages, and throws an error, and in most cases these errors leave the hosts in an undesirable state.

To keep servers from ending up in an undesirable state during execution, Ansible provides various options like ignore_errors, any_errors_fatal, and many more. But these parameters are constrained to particular cases and can’t be used everywhere. Also, this way we are not really managing the errors, we are just playing safe !!!

BLOCKS COME TO THE RESCUE !!!

Ansible’s solution to this problem is “BLOCKS”. Blocks are tasks that are logically grouped together to serve a purpose. Think of a BLOCK as the “try” section used for exception handling in most programming languages: we define the tasks that are more prone to errors inside a block.

EXAMPLE – Let’s take the example of installing Apache2 on Ubuntu to understand this better. Here we will be using the apt and service modules to install and start the apache2 service respectively. These two tasks will be defined under a single block, and the playbook will look something like this:

---
- name: exception handling in ansible
  hosts: web
  become: true
  tasks:
    - name: install and start the apache2 service
      block:
        - name: install apache2
          apt: 
            name: apache2
            update_cache: true
            state: present
   
        - name: enable and restart the apache2 service
          service:
            name: apache2
            enabled: true
            state: restarted
...
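
Assuming the playbook is saved as apache2_block.yml and your inventory defines the web group (both names are just placeholders), you run it the usual way:

ansible-playbook -i inventory apache2_block.yml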

Here you can see that multiple tasks are defined under a single block; this way it is easier to manage the code, as a single block can be managed more easily than individual tasks.

RESCUE & ALWAYS –

Now comes the error handling part. As I have already mentioned, the code which is more prone to throw errors is defined under blocks, and in case the block fails we have the “rescue” and “always” sections, which are more or less similar to “catch” and “finally” in other programming languages.

Now, let’s assume that for some reason the above block fails; in that case we can introduce “rescue” and “always” to manage the errors. For the sake of this example we are just printing a message to understand better, although we can use any module here. The updated playbook with rescue, always, and the debug module will look something like this:

---
- name: exception handling in ansible
  hosts: web
  become: true
  tasks:
    - name: install and start the apache2 service
      block:
        - name: install apache2
          apt: 
            name: apache2
            update_cache: true
            state: present
   
        - name: enable and restart the apache2 service
          service:
            name: apache2
            enabled: true
            state: restarted

      rescue: 
        - debug: 
            msg: "Hi!!! I will be executed when the block fails"
 
      always:
        - debug: 
             msg: "Hi!!! I will always execute."
...

Here, I have also created a situation so that a task defined in the block fails, and the tasks defined in “rescue” are then executed automatically (printing a message in our case). The tasks defined in “always” are executed every time. Now, after running the playbook, the output will look something like this:

Here, we can see that one of the tasks in the block fails, which leads to failure of the whole block, resulting in rescue and always being called, and the tasks defined in rescue and always are executed (the message is printed on the console output).

I hope this post clears up how the playbook got executed successfully in spite of the errors. This way it will be easier for users to write efficient and error-free playbooks.

Ansible has once again proven its worth!!!

Cheers !!! 

Ref- https://docs.ansible.com/ansible/latest/user_guide/playbooks_blocks.html

Perfect Spot Instance’s Imperfections | part-I

In this blog I am going to share my opinion on spot instances and why we should go for them. While I was going through the categories (on-demand, reserved, and spot) that AWS provides to launch our instances into, I found spot instances very fascinating and a little challenging.

What I found about spot instances is that they are normal EC2 instances. But what makes them different from the other two (on-demand, reserved)? What strategy does AWS use for spot instances to make them cheaper than the other two, and why? Let’s look at that first.

With AWS continuously expanding their regions and the Availability Zones in them, they are left with a huge amount of unused capacity. How does AWS take advantage of this unused capacity? AWS floats its spare capacity on a market at a very low base price and allows us to bid on instances; the person with the highest bid gets the instance, but the price that person pays is only the market price. For example, if the market price is $1 for a t2.micro instance and you place a bid of $2 on a t2.micro, you will get that instance, but the price you will pay is the market price, i.e. $1 only. Interesting?? Let’s make it more fascinating by comparing the prices of all three.

Discounts   Type                  Details
0%          On-demand Instances   No commitment from your side. You pay the most. Costs a fixed price per hour.
40%-60%     Reserved Instances    1-year or 3-year commitment from your side. You save money from that commitment. Costs per plan.
60%-90%     Spot Instances        No commitment from AWS side. Ridiculously inexpensive. Costs based on availability.

With this information you must be thinking of trying out spot instances at least once. Since every interesting thing comes at a price, spot instances too have a downside: “AWS can take spot instances back from you at any time”. Upset? Don’t be, because this blog is built around overcoming that downside. After all, you won’t mind spending 5-10 minutes just for savings in dollars.

Let’s start then…

Now you know that AWS is ready to give away its huge spare capacity at the price of our choice, but with a promise to take the capacity back when they want it, giving us a warning two minutes before the interruption. We can manage the interruption wisely, and the proof is that some organizations are already taking full advantage of spot instances.

Before we go to the core concept let’s build some required concept that will help us to understand core concept with ease.

Spot Instance v/s Spot Fleet

With a normal spot instance request, you place a bid for a specific instance type in any one specific Availability Zone and hope you get it.

With spot fleets, you can request a number of different instance types that meet your requirements. Additionally, you can spread your spot fleet bid across multiple Availability Zones to increase the likelihood of getting your capacity fulfilled.

Interruption Notice

When AWS takes our spot instance back, it provides an interruption notice 2 minutes in advance so that we can perform some actions.

Next, it is also necessary to know about CloudWatch Rules. AWS provides event types on the basis of which you can perform actions like triggering a Lambda function, sending a notification over mail or SMS, etc.

One of the event types that you can monitor is ec2 state change to running.

Now with this much knowledge we are good to go. And I will show you how you can automate the interruption to avoid the risk of downtime.

This is the main diagram stating all the components that are used to automate the interruption. Now suppose one of the spot instances has been interrupted and AWS is going to take that spot  instance back. Let’s see what happens then.

When AWS is about to take a spot instance back, it gives the interruption notice, which is matched by the CloudWatch rule created to monitor it, and then;

Lambda function is triggered. And then;

Lambda function increases the desired capacity of Auto scaling group to 1. Due to which an on-demand instance gets launched into Target Group and the interrupted spot instance gets terminated. 

Now when on-demand instance is launched and its state changes to running then,

Another cloudwatch rule monitor that change and

Cloudwatch will trigger another lambda function, and then:

Lambda function will modify the spot fleet request capacity to 2 which was previously 1, this will launch a spot instance in the same Target Group and now we will have

1 more spot instance is being launched and when its state changes to running  then,

Again cloudwatch rule comes in action upon state change to running of just launched spot instance. Then

This will again trigger associated lambda function and

Lambda function will set the desired capacity of ASG to 0 again due to which the on-demand instance under the target group will get terminated. And finally we will be left with the following:

Again we are at the same place, i.e. 2 spot instances are maintained under the Target Group at all times.
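
For reference, here is a minimal sketch of the first piece of this wiring using the AWS CLI; the rule name and Lambda ARN are hypothetical, and AWS emits the two-minute warning as an "EC2 Spot Instance Interruption Warning" event:

aws events put-rule \
    --name spot-interruption-warning \
    --event-pattern '{"source": ["aws.ec2"], "detail-type": ["EC2 Spot Instance Interruption Warning"]}'

aws events put-targets \
    --rule spot-interruption-warning \
    --targets 'Id=1,Arn=arn:aws:lambda:us-east-1:123456789012:function:scale-up-asg'

Remember that the Lambda function also needs a resource-based permission allowing events.amazonaws.com to invoke it.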

Note: For the purpose of demonstration, I have taken two instances initially; however, you can have any number of instances. You can customize this according to your needs and constraints. The whole infra is automated with Terraform, which will create and link everything presented above. The link to clone the repo is provided in the second part of this article.

Are you excited to implement this concept? I am equally excited to share the real implementation with you. With the next part coming very soon, I want you to try the implementation by yourself. In the second part I will help you to implement the whole concept. See you soon…

Redis Cluster: Setup, Sharding and Failover Testing

Watching cluster sharding and failover management is as gripping as visualizing a robotic machinery work.

My last blog on Redis Cluster was primarily focused on its related concepts and requirements. I would highly recommend going through the concepts first to have a better understanding.

Here, I will move straight to the setup, along with the behaviour of the cluster when I intentionally turn down one Redis service on one of the nodes.
Let’s start from scratch.

Redis Setup

Here, I will follow the approach of a 3-node Redis Cluster with Redis v5.0 on all the three CentOS 7.x nodes.

Setup Epel Repo

wget http://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
rpm -ivh epel-release-latest-7.noarch.rpm

Setup Remi Repo

yum install http://rpms.remirepo.net/enterprise/remi-release-7.rpm
yum --enablerepo=remi install redis

redis-server --version
Redis server v=5.0.5 sha=00000000:0 malloc=jemalloc-5.1.0 bits=64 build=619d60bfb0a92c36

3-Node Cluster Prerequisites

While setting up the Redis cluster on 3 nodes, I will be following the strategy of having 3 master and 3 slave processes, with one master and one slave running on each node, serving Redis on different ports. As shown in the diagram, the Redis services are running on port 7000 and port 7001.

  • 7000 port will serve Redis Master
  • 7001 port will serve Redis Slave

Directory Structure

We need to design the directory structure to serve both Redis configurations.

tree /etc/redis
/etc/redis
`-- cluster
    |-- 7000
    |   `-- redis_7000.conf
    `-- 7001
        `-- redis_7001.conf
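
On each node, the configuration directories and the data directories referenced below can be created up front (paths follow the layout above):

mkdir -p /etc/redis/cluster/7000 /etc/redis/cluster/7001
mkdir -p /var/lib/redis/7000 /var/lib/redis/7001
chown -R redis:redis /var/lib/redis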

Redis Configuration

Configuration file for Redis service 1

cat /etc/redis/cluster/7000/redis_7000.conf
port 7000
dir /var/lib/redis/7000/
appendonly yes
protected-mode no
cluster-enabled yes
cluster-node-timeout 5000
cluster-config-file /etc/redis/cluster/7000/nodes_7000.conf
pidfile /var/run/redis_7000.pid

Configuration file for Redis service 2

cat /etc/redis/cluster/7001/redis_7001.conf
port 7001
dir /var/lib/redis/7001
appendonly yes
protected-mode no
cluster-enabled yes
cluster-node-timeout 5000
cluster-config-file /etc/redis/cluster/7001/nodes_7001.conf
pidfile /var/run/redis_7001.pid

Redis Service File

As we are managing multiple services on a single instance, we need a service file per Redis service for easier management.

Service management file for Redis service 1

cat /etc/systemd/system/redis_7000.service
[Unit]
Description=Redis persistent key-value database
After=network.target

[Service]
ExecStart=/usr/bin/redis-server /etc/redis/cluster/7000/redis_7000.conf --supervised systemd
ExecStop=/bin/redis-cli -h 127.0.0.1 -p 7000 shutdown
Type=notify
User=redis
Group=redis
RuntimeDirectory=redis
RuntimeDirectoryMode=0755
LimitNOFILE=65535

[Install]
WantedBy=multi-user.target

Service management file for Redis service 2

cat /etc/systemd/system/redis_7001.service
[Unit]
Description=Redis persistent key-value database
After=network.target

[Service]
ExecStart=/usr/bin/redis-server /etc/redis/cluster/7001/redis_7001.conf --supervised systemd
ExecStop=/bin/redis-cli -h 127.0.0.1 -p 7001 shutdown
Type=notify
User=redis
Group=redis
RuntimeDirectory=redis
RuntimeDirectoryMode=0755
LimitNOFILE=65535

[Install]
WantedBy=multi-user.target
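
After placing both unit files, reload systemd and start the two services on every node:

systemctl daemon-reload
systemctl enable redis_7000.service redis_7001.service
systemctl start redis_7000.service redis_7001.service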

Redis Service Status

Master Service
systemctl status redis_7000.service 
● redis_7000.service - Redis persistent key-value database
   Loaded: loaded (/etc/systemd/system/redis_7000.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2019-09-25 08:14:15 UTC; 30min ago
  Process: 2902 ExecStop=/bin/redis-cli -h 127.0.0.1 -p 7000 shutdown (code=exited, status=0/SUCCESS)
 Main PID: 2917 (redis-server)
   CGroup: /system.slice/redis_7000.service
           └─2917 /usr/bin/redis-server *:7000 [cluster]
systemd[1]: Starting Redis persistent key-value database...
redis-server[2917]: 2917:C 25 Sep 2019 08:14:15.752 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
redis-server[2917]: 2917:C 25 Sep 2019 08:14:15.752 # Redis version=5.0.5, bits=64, commit=00000000, modified=0, pid=2917, just started
redis-server[2917]: 2917:C 25 Sep 2019 08:14:15.752 # Configuration loaded
redis-server[2917]: 2917:C 25 Sep 2019 08:14:15.752 * supervised by systemd, will signal readiness
systemd[1]: Started Redis persistent key-value database.
redis-server[2917]: 2917:M 25 Sep 2019 08:14:15.754 * No cluster configuration found, I'm ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2
redis-server[2917]: 2917:M 25 Sep 2019 08:14:15.756 * Running mode=cluster, port=7000.
redis-server[2917]: 2917:M 25 Sep 2019 08:14:15.756 # Server initialized
redis-server[2917]: 2917:M 25 Sep 2019 08:14:15.756 * Ready to accept connections
Slave Service
systemctl status redis_7001.service
● redis_7001.service - Redis persistent key-value database
   Loaded: loaded (/etc/systemd/system/redis_7001.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2019-09-25 08:14:15 UTC; 30min ago
  Process: 2902 ExecStop=/bin/redis-cli -h 127.0.0.1 -p 7001 shutdown (code=exited, status=0/SUCCESS)
 Main PID: 2919 (redis-server)
   CGroup: /system.slice/redis_7001.service
           └─2919 /usr/bin/redis-server *:7001 [cluster]
systemd[1]: Starting Redis persistent key-value database...
redis-server[2919]: 2917:C 25 Sep 2019 08:14:15.752 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
redis-server[2919]: 2917:C 25 Sep 2019 08:14:15.752 # Redis version=5.0.5, bits=64, commit=00000000, modified=0, pid=2917, just started
redis-server[2919]: 2917:C 25 Sep 2019 08:14:15.752 # Configuration loaded
redis-server[2919]: 2917:C 25 Sep 2019 08:14:15.752 * supervised by systemd, will signal readiness
systemd[1]: Started Redis persistent key-value database.
redis-server[2919]: 2917:M 25 Sep 2019 08:14:15.754 * No cluster configuration found, I'm ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2
redis-server[2919]: 2917:M 25 Sep 2019 08:14:15.756 * Running mode=cluster, port=7001.
redis-server[2919]: 2917:M 25 Sep 2019 08:14:15.756 # Server initialized
redis-server[2919]: 2917:M 25 Sep 2019 08:14:15.756 * Ready to accept connections

Redis Cluster Setup

Redis itself provides a CLI tool to set up the cluster.
In the current 3-node scenario, I opt for port 7000 on all nodes to serve the Redis masters and port 7001 to serve the Redis slaves.

redis-cli --cluster create 172.19.33.7:7000 172.19.42.44:7000 172.19.45.201:7000 172.19.33.7:7001 172.19.42.44:7001 172.19.45.201:7001 --cluster-replicas 1

The first 3 addresses will be the masters and the next 3 addresses will be the slaves. It will be cross-node replication, i.e. the slave of any master will reside on a different node, and --cluster-replicas defines the replication factor, i.e. each master will have 1 slave.

>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica 172.19.42.44:7001 to 172.19.33.7:7000
Adding replica 172.19.45.201:7001 to 172.19.42.44:7000
Adding replica 172.19.33.7:7001 to 172.19.45.201:7000
M: ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2 172.19.33.7:7000
   slots:[0-5460] (5461 slots) master
M: 314038a48bda3224bad21c3357dbff8305735d72 172.19.42.44:7000
   slots:[5461-10922] (5462 slots) master
M: 19a2c81b7f489bec35eed474ae8e1ad787327db6 172.19.45.201:7000
   slots:[10923-16383] (5461 slots) master
S: 896b2a7195455787b5d8a50966f1034c269c0259 172.19.33.7:7001
   replicates 19a2c81b7f489bec35eed474ae8e1ad787327db6
S: 89206df4f41465bce81f44e25e5fdfa8566424b8 172.19.42.44:7001
   replicates ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2
S: 20ab4b30f3d6d25045909c6c33ab70feb635061c 172.19.45.201:7001
   replicates 314038a48bda3224bad21c3357dbff8305735d72
Can I set the above configuration? (type 'yes' to accept):

A dry run will showcase the cluster setup and ask for confirmation.

Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
..
>>> Performing Cluster Check (using node 172.19.33.7:7000)
M: ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2 172.19.33.7:7000
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
S: 20ab4b30f3d6d25045909c6c33ab70feb635061c 172.19.45.201:7001
   slots: (0 slots) slave
   replicates 314038a48bda3224bad21c3357dbff8305735d72
M: 314038a48bda3224bad21c3357dbff8305735d72 172.19.42.44:7000
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
M: 19a2c81b7f489bec35eed474ae8e1ad787327db6 172.19.45.201:7000
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
S: 89206df4f41465bce81f44e25e5fdfa8566424b8 172.19.42.44:7001
   slots: (0 slots) slave
   replicates ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2
S: 896b2a7195455787b5d8a50966f1034c269c0259 172.19.33.7:7001
   slots: (0 slots) slave
   replicates 19a2c81b7f489bec35eed474ae8e1ad787327db6
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

Check Cluster Status

Connect to any of the cluster node to check the status of cluster.

redis-cli -c -h 172.19.33.7 -p 7000
172.19.33.7:7000> cluster nodes
20ab4b30f3d6d25045909c6c33ab70feb635061c 172.19.45.201:7001@17001 slave 314038a48bda3224bad21c3357dbff8305735d72 0 1569402961000 6 connected
314038a48bda3224bad21c3357dbff8305735d72 172.19.42.44:7000@17000 master - 0 1569402961543 2 connected 5461-10922
19a2c81b7f489bec35eed474ae8e1ad787327db6 172.19.45.201:7000@17000 master - 0 1569402960538 3 connected 10923-16383
ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2 172.19.33.7:7000@17000 myself,master - 0 1569402959000 1 connected 0-5460
89206df4f41465bce81f44e25e5fdfa8566424b8 172.19.42.44:7001@17001 slave ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2 0 1569402960000 5 connected
896b2a7195455787b5d8a50966f1034c269c0259 172.19.33.7:7001@17001 slave 19a2c81b7f489bec35eed474ae8e1ad787327db6 0 1569402959936 4 connected

Redis cluster itself manages the cross-node replication; as seen in the output above, the 172.19.42.44:7000 master is associated with the 172.19.45.201:7001 slave.

Data Sharding

There are 16384 slots. These slots are divided by the number of servers.
If there are 3 servers; 1, 2 and 3 then

  • Server 1 contains hash slots from 0 to 5500.
  • Server 2 contains hash slots from 5501 to 11000.
  • Server 3 contains hash slots from 11001 to 16383.
redis-cli -c -h 172.19.33.7 -p 7000
172.19.33.7:7000> set a 1
-> Redirected to slot [15495] located at 172.19.45.201:7000
OK
172.19.45.201:7000> set b 2
-> Redirected to slot [3300] located at 172.19.33.7:7000
OK
172.19.33.7:7000> set c 3
-> Redirected to slot [7365] located at 172.19.42.44:7000
OK
172.19.42.44:7000> set d 4
-> Redirected to slot [11298] located at 172.19.45.201:7000
OK
172.19.45.201:7000> get b
-> Redirected to slot [3300] located at 172.19.33.7:7000
"2"
172.19.33.7:7000> get a
-> Redirected to slot [15495] located at 172.19.45.201:7000
"1"
172.19.45.201:7000> get c
-> Redirected to slot [7365] located at 172.19.42.44:7000
"3"
172.19.42.44:7000> get d
-> Redirected to slot [11298] located at 172.19.45.201:7000
"4"
172.19.45.201:7000>

Redis Cluster Failover

Stop Master Service

Let’s stop the Redis master service on Server 3.

systemctl stop redis_7000.service
systemctl status redis_7000.service
● redis_7000.service - Redis persistent key-value database
   Loaded: loaded (/etc/systemd/system/redis_7000.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Wed 2019-09-25 09:32:37 UTC; 23s ago
  Process: 3232 ExecStop=/bin/redis-cli -h 127.0.0.1 -p 7000 shutdown (code=exited, status=0/SUCCESS)
  Process: 2892 ExecStart=/usr/bin/redis-server /etc/redis/cluster/7000/redis_7000.conf --supervised systemd (code=exited, status=0/SUCCESS)
 Main PID: 2892 (code=exited, status=0/SUCCESS)

Cluster State (Failover)

While checking the cluster status, the Redis master service running on server 3 at port 7000 is shown as failed and disconnected.

At the same moment its respective slave gets promoted to master which is running on port 7001 on server 1.

redis-cli -c -h 172.19.33.7 -p 7000
172.19.45.201:7000> CLUSTER NODES
314038a48bda3224bad21c3357dbff8305735d72 172.19.42.44:7000@17000 master,fail - 1569403957138 1569403956000 2 disconnected
ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2 172.19.33.7:7000@17000 master - 0 1569404037252 1 connected 0-5460
896b2a7195455787b5d8a50966f1034c269c0259 172.19.33.7:7001@17001 slave 19a2c81b7f489bec35eed474ae8e1ad787327db6 0 1569404036248 4 connected
89206df4f41465bce81f44e25e5fdfa8566424b8 172.19.42.44:7001@17001 slave ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2 0 1569404036752 5 connected
20ab4b30f3d6d25045909c6c33ab70feb635061c 172.19.45.201:7001@17001 master - 0 1569404036000 7 connected 5461-10922
19a2c81b7f489bec35eed474ae8e1ad787327db6 172.19.45.201:7000@17000 myself,master - 0 1569404035000 3 connected 10923-16383

Restarting Stopped Redis

Now we will check the behaviour of cluster once we fix or restart the redis service that we intentionally turned down earlier.

systemctl start redis_7000.service
systemctl status redis_7000.service
● redis_7000.service - Redis persistent key-value database
   Loaded: loaded (/etc/systemd/system/redis_7000.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2019-09-25 09:35:12 UTC; 8s ago
  Process: 3232 ExecStop=/bin/redis-cli -h 127.0.0.1 -p 7000 shutdown (code=exited, status=0/SUCCESS)
 Main PID: 3241 (redis-server)
   CGroup: /system.slice/redis_7000.service
           └─3241 /usr/bin/redis-server *:7000 [cluster]

Cluster State (Recovery)

Finally, all Redis services are back in the running state. The master service that we turned down and restarted has now become a slave to its promoted master.

redis-cli -c -h 172.19.33.7 -p 7000
172.19.45.201:7000> CLUSTER NODES
314038a48bda3224bad21c3357dbff8305735d72 172.19.42.44:7000@17000 slave 20ab4b30f3d6d25045909c6c33ab70feb635061c 0 1569404162565 7 connected
ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2 172.19.33.7:7000@17000 master - 0 1569404162000 1 connected 0-5460
896b2a7195455787b5d8a50966f1034c269c0259 172.19.33.7:7001@17001 slave 19a2c81b7f489bec35eed474ae8e1ad787327db6 0 1569404163567 4 connected
89206df4f41465bce81f44e25e5fdfa8566424b8 172.19.42.44:7001@17001 slave ff3e4300bec02ed4bd1be9af5d83a5b44249c2b2 0 1569404163000 5 connected
20ab4b30f3d6d25045909c6c33ab70feb635061c 172.19.45.201:7001@17001 master - 0 1569404162000 7 connected 5461-10922
19a2c81b7f489bec35eed474ae8e1ad787327db6 172.19.45.201:7000@17000 myself,master - 0 1569404161000 3 connected 10923-16383

It’s not done yet; further, we can explore having a single endpoint for the application to point to. I am currently working on that and will soon come up with a solution.
Apart from this, monitoring the Redis Cluster will also be a major aspect to look at.
Till then, get your hands dirty playing around with the Redis Cluster setup and failover.

Reference links:
Image: Google image search (blog.advids.co)

Redis Cluster: Architecture, Replication, Sharding and Failover

Speed fascinates everyone, but only if its under control.

It is well said and a proven fact that everyone needs to implement a cache at some point in their application lifecycle, and this has become our requirement too.

During the initial phase we ran Redis in a Master-Slave mode, with the next phase involving a Sentinel setup to withstand master failover. I would like to throw some light on their architectures along with the pros and cons, so I can put emphasis on why I finally migrated to Redis Cluster.

Redis Master/Slave

Redis replication is a very simple to use and configure master-slave replication  that allows slave Redis servers to be exact copies of master servers.

What forced me to look for Redis Sentinel

When using Master-Slave architecture

  • There will be only one Master with multiple slaves for replication.
  • All writes go to the Master, which creates more load on the master node.
  • If the Master goes down, the whole architecture is prone to SPOF (Single Point of Failure).
  • The M-S architecture does not help in scaling when your user base grows.
  • So we need a process to monitor the Master in case of failure or shutdown; that is Sentinel.

Redis Sentinel

Initial Setup
Failover Handling

I was still concerned about sharding the data for the best performance, described below.

Concept of Redis Cluster

“A query that used to take an hour can run in seconds on cache”.

Redis Cluster is an active-passive cluster implementation that consists of master and slave nodes. The cluster uses hash partitioning to split the key space into 16,384 key slots, with each master responsible for a subset of those slots. 

Each slave replicates a specific master and can be reassigned to replicate another master or be elected to a master node as needed. 

Ports Communication

Each node in a cluster requires two TCP ports. 

  • One port is used for client connections and communications. This is the port you would configure into client applications or command line tools. 
  • Second required port is reserved for node-to-node communication that occurs in a binary protocol and allows the nodes to discuss configuration and node availability.
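
For example, on a CentOS 7 host running firewalld and using the 3-node layout from the companion setup post (client ports 7000-7001, bus ports 17000-17001; adjust to your own layout), you would open both ranges:

firewall-cmd --permanent --add-port=7000-7001/tcp
firewall-cmd --permanent --add-port=17000-17001/tcp
firewall-cmd --reload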

Failover

When a master fails or is found to be unreachable by the majority of the cluster as determined by the nodes communication via the gossip port, the remaining masters hold a vote and elect one of the failing masters’ slaves to take its place. 

Rejoining The Cluster

When the failing master eventually rejoins the cluster, it will join as a slave and begin to replicate another master.

Sharding

Redis shards data automatically across the servers.
Redis has a concept of hash slots in order to split data; all the data is divided into slots.
There are 16384 slots. These slots are divided among the number of servers.

If there are 3 servers: 1, 2 and 3, then

  • Server 1 contains hash slots from 0 to 5500.
  • Server 2 contains hash slots from 5501 to 11000.
  • Server 3 contains hash slots from 11001 to 16383.
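
You can check which slot a given key maps to with the CLUSTER KEYSLOT command on any node (the port and key here are just examples):

redis-cli -p 7000 cluster keyslot user:1001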

6 Node M/S Cluster

In a 6 node cluster mode, 3 nodes will be serving as a master and the 3 node will be their respective slave.

Here, Redis service will be running on port 6379 on all servers in the cluster. Each master server is replicating the keys to its respective redis slave node assigned during cluster creation process.

3 Node M/S Cluster

In a 3 node cluster mode, there will be 2 redis services running on each server on different ports. All 3 nodes will be serving as a master with redis slave on cross nodes.

Here, two redis services will be running on each server on two different ports and each master is replicating the keys to its respective redis slave running on other node.

WHAT IF Redis Goes Down

1 node goes down in a 6 node Redis Cluster

If one of the node goes down in Redis 6-node cluster setup, its respective slave will be promoted as master.

In the above example, master Server3 goes down and its slave Server6 is promoted to master.

1 node goes down in a 3 node Redis Cluster

If one of the node goes down in Redis 3-node cluster setup, its respective slave running on the separate node will be promoted to master.

In above example, Server 3 goes down and slave running on Server1 is promoted to master.

Redis service goes down on one of the 3 node Redis Cluster

If redis service goes down on one of the node in Redis 3-node cluster setup, its respective slave will be promoted as master.

Conclusion

Although this methodology protects the Redis Cluster only in partial failover scenarios, if we want full failover we need to look at Disaster Recovery techniques as well.

Well this implementation helped me having a sound sleep while thinking of Redis availability, sharding and performance.

Enough of reading, eager to know how this all works when it comes to implementation? Don’t worry, my next blog, Redis Cluster: Setup, Sharding and Failover Testing, will guide you through the process.

Enjoy happy and safe DIWALI