A while back we got the requirement for working on Apache Druid. By working on Apache Druid, We mean setup, management, and monitoring. Since it was a new topic for us we started evaluating it and we actually find it has a lot of amazing features.
So for the people who don’t have an idea about Druid and just starting with Druid. Let me give a quick walk-through of it.
Logging is a critical part of monitoring and there are a lot of tools for logs monitoring like Splunk, Sumologic, and Elasticsearch, etc. Since Kubernetes is becoming so much popular now, and running multiple applications and services on a Kubernetes cluster requires a centralized, cluster-level stack to analyze the logs created by pods. One of the well-liked centralized logging solutions is the combination of multiple opensource tools i.e. Elasticsearch, Fluentd, and Kibana. In this blog, we will talk about setting up the logging stack on the Kubernetes cluster with our newly developed operator named “Logging Operator”.
Redis is a popular and opensource in-memory database that supports multiple data structures like strings, hashes, lists, and sets. But similar to other tools, we can scale standalone redis to a particular extent and not beyond that. That’s why we have a cluster mode setup in which we can scale Redis nodes horizontally and then distribute data among those nodes.
Since Kubernetes is becoming buzz technology and people are using it to manage their applications, databases, and middlewares at a single place. So in this blog, we will see how we can deploy the Redis cluster in production mode in the Kubernetes cluster and test failover.
When I set forth with my journey of containerization with docker, I have gone through a misconception that Overlay networking in docker can’t be set up without any orchestrator like Docker swarm, Kubernetes. But after spending some time with containers I realized that I was wrong, Orchestrators leverage the functionality of overlay networking but it is not true that we cannot use overlay networks without any swarm or Kubernetes.
In the modern world, the container is a fascinating technology, as it has revolutionized software development and delivery. Everyone is using containers because of its dynamic, scalable, and isolated nature.
People do use some orchestration software such as Kubernetes, Openshift, Docker Swarm, and AWS ECS, etc to run their production workloads on containers.
While tools like Kubernetes is becoming an essential need for modern cloud-based infrastructure, there is a high potential for cloud-native CI/CD. To achieve that there is a philosophical approach has emerged i.e. GitOps. As we have discussed the important principles of GitOps in our previous blog, So in this blog, we will see how to implement GitOps in our current DevOps processes, and finally GitOps implementation in a light manner. If you haven’t gone through our previous blog, here you can take a look at it.
Initially, we had the DevOps framework in which Development and Operation team collaborated to create an agile development ecosystem. Then a new wave came with the name of “DevSecOps” in which we integrated the security into the existing DevOps process. But nowadays a new terminology “GitOps” is getting famous because of its “Single Source of Truth” nature. Its fame has reached to this level that it was a trending topic at KubeCon.
Recently I was working on a project which includes Terraform and AWS stuff. While working on that I was using my local machine for terraform code testing and luckily everything was going fine. But when we actually want to test it for the production environment we got some issues there. Then, as usual, we started to dig into the issue and finally, we got the issue which was quite a silly one 😜. The production server Terraform version and my local development server Terraform version was not the same.
After wasting quite a time on this issue, I decided to come up with a solution so this will never happen again.
But before jumping to the solution, let’s think is this problem was only related to Terraform or do we have faced the similar kind of issue in other scenarios as well.
Well, I guess we face a similar kind of issue in other scenarios as well. Let’s talk about some of the scenario’s first.
Suppose you have to create a CI pipeline for a project and that too with code re-usability. Now pipeline is ready and it is working fine in your project and then after some time, you have to implement the same kind of pipeline for the different project. Now you can use the same code but you don’t know the exact version of tools which you were using with CI pipeline. This will lead you to error elevation.
Let’s take another example, suppose you are developing something in any of the programming languages. Surely that utility or program will have some dependencies as well. While installing those dependencies on the local system, it can corrupt your complete system or package manager for dependency management. A decent example is Pip which is a dependency manager of Python😉.
These are some example scenarios which we have faced actually and based on that we got the motivation for writing this blog.
To resolve all this problem we just need one thing i.e. containers. I can also say docker as well but container and docker are two different things.
But yes for container management we use docker.
So let’s go back to our first problem the terraform one. If we have to solve that problem there are multiple ways to solve this. But we tried it to solve this using Docker.
As Docker says
Build Once and Run Anywhere
So based on this statement what we did, we created a Dockerfile for required Terraform version and stored it alongside with the code. Basically our Dockerfile looks like this:-
In this Dockerfile, we are defining the version of Terraform which needs to run the code. In a similar fashion, all other above listed problem can be solved using Docker. We just have to create a Dockerfile with exact dependencies which are needed and that same file can work in various environments and projects.
To take it to the next level you can also dump a Makefile as well to make everyone life easier. For example:-
When I began my journey of learning Kubernetes, I always thought why Kubernetes has made the pod its smallest entity, why not the container. But when I started diving deep in it I realized, there is a big rationale behind it and now I thank Kubernetes for making the Pod as an only object, not containers.
After being inspired by the working of a Pod, I would like to share my experience and knowledge with you guys.
What exactly Pod means?
The literal meaning of pod means the peel of pea which holds the beans and following the same analogy in Kubernetes pod means a logical object which holds a container or more than one container. The bookish definition could be – a pod represents a request to execute one or more containers on the same node.
The question that needs to be raised why pod?So let me clear this, pods are considered the fundamental building blocks of Kubernetes, because all the Kubernetes workloads, like Deployments, ReplicaSets or Jobs are eventually expressed in terms of pods.
Pods are the one and only objects in Kubernetes that results in the execution of containers which means No Pod No Containers !!!
Now after the context setting over pod I would like to answer my beloved question:- Why Pod over container??
My answer is why not 🙂
Let’s take an example, suppose you have an application which generates two types of logs one is access log and other logs are error log. Now you have to add log shipper agent, In case of the container, you will install the log shipper in the container image. Now you got another request to add application monitoring in the application. So again you have to recreate the container image with APM agent in it. Don’t you think this is quite an untidy way to do it? Of course, it is, why I have to add these things in my application image, it makes my image quite bulky and difficult to manage.
What if I tell you that Kubernetes has its own way of dealing situations like this.
Yup the solution is a sidecar. Just like in real life if I have a two sitter bike and I want to take 3 persons on a ride, So I will add a sidecar in my bike to take 2 persons together on the ride. In a similar fashion, I can do the same thing with Kubernetes as well. To solve the above problem I will just create 3 containers (application, log-shipper and APM agent) in the same pod. Now the question is how they will access the data between them and how the networking magic will happen. The answer is quite simple containers within the pod can share Pod IP address and can listen on localhost. For volume, we can share volumes also across the containers in a pod.
The architecture would be something like this:-
Now another interesting query arises that when to use sidecar and when not.
Just as shown in the above image we should not keep application and database as a sidecar in the same pod. The reason behind it is Kubernetes does not scale a container it scales a pod. So when autoscaling will happen it scales the application as well as database which could not be required.
Instead of that, we should keep log-shippers, health-check containers and monitoring agent as a sidecar because anyhow application will scale these agents also needs to be scaled with the application.
Now I am assuming you are also madly in love with the pods.
For diving deep in the pod stay tuned for the next part of this blog Why I love pods in Kubernetes? Part – 2. In my next part, I will discuss the different phases and lifecycle of the pod and how pod makes our life really smooth. Thanks for reading, I’d really appreciate any and all feedback, please leave your comment below if you guys have any feedback.