If you have deployed the same Terraform code manually several times, you have probably asked yourself:
Can we automate the whole deployment process and replace the tedious manual steps with a few clicks?
Can we dynamically change the values in terraform.tfvars?
Can we restrict the regions we deploy to?
Can we limit the allowed VM types to keep costs under control?
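As a rough illustration of the last three questions, here is a minimal Python sketch that a pipeline step could run before `terraform plan`. It checks the requested region and VM size against allow-lists and renders the variables as JSON, which Terraform loads automatically from a file named `terraform.tfvars.json`. The allow-list contents and the variable names `location` and `vm_size` are assumptions made for this example, not values taken from any real project.

```python
import json

# Hypothetical allow-lists; replace them with whatever your organization permits.
ALLOWED_REGIONS = {"eastus", "westeurope"}
ALLOWED_VM_SIZES = {"Standard_B2s", "Standard_D2s_v3"}


def build_tfvars(region: str, vm_size: str) -> dict:
    """Validate the requested values and return a dict of Terraform variables."""
    if region not in ALLOWED_REGIONS:
        raise ValueError(f"region {region!r} is not in the allowed list")
    if vm_size not in ALLOWED_VM_SIZES:
        raise ValueError(f"VM size {vm_size!r} is not in the allowed list")
    # "location" and "vm_size" are assumed variable names for this sketch.
    return {"location": region, "vm_size": vm_size}


def render_tfvars_json(variables: dict) -> str:
    """Serialize the variables in the JSON form used by terraform.tfvars.json."""
    return json.dumps(variables, indent=2, sort_keys=True)


if __name__ == "__main__":
    # In a pipeline these two values would come from run-time parameters.
    print(render_tfvars_json(build_tfvars("eastus", "Standard_B2s")))
```

Failing fast in a script like this keeps a disallowed region or VM size from ever reaching the plan stage. Terraform's own variable `validation` blocks are another place where the same rule could live.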
In this article, we will touch upon these problems and try to resolve them in a way that the same concepts can also be applied to similar requirements.
So, let’s get started!
First of all, we need to know what Terraform and Azure DevOps are.
Talking about Terraform: HashiCorp Terraform is an infrastructure as code tool that lets you define both cloud and on-prem resources in human-readable configuration files that you can version, reuse, and share. You can then use a consistent workflow to provision and manage all of your infrastructure throughout its lifecycle. Terraform can manage low-level components like compute, storage, and networking resources, as well as high-level components like DNS entries and SaaS features.
While applying a deployment manifest, we found that some of the critical pods were not getting scheduled, whereas others were scheduled easily. I wanted to make sure that the critical pods were scheduled ahead of the other pods. I started exploring pod scheduling and came across one of the native solutions for it: Pod Priority and Priority Class. So in this blog, we’ll talk about Priority Class and Pod Priority and how we can use them for pod scheduling.
Pod Priority determines the importance of a pod relative to other pods. It is most helpful when critical pods cannot be scheduled because of resource capacity issues.
A Priority Class is a non-namespaced object that defines a priority. Priority Class objects can have any 32-bit integer value smaller than or equal to 1 billion. The higher the value, the higher the priority.
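To make the object concrete, here is a minimal Python sketch that builds a PriorityClass manifest as a dictionary and enforces the 1 billion upper bound described above. The helper name `priority_class_manifest` and the class name `critical-services` are hypothetical. The manifest fields follow the `scheduling.k8s.io/v1` API, and the metadata carries no namespace because the object is cluster-wide.

```python
import json

MAX_USER_PRIORITY = 1_000_000_000  # upper bound for user-defined classes
MIN_INT32 = -2**31                 # lower bound of a 32-bit integer


def priority_class_manifest(name, value, description="", global_default=False):
    """Return a PriorityClass manifest (as a dict) after validating `value`."""
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("value must be an integer")
    if not MIN_INT32 <= value <= MAX_USER_PRIORITY:
        raise ValueError("value must be a 32-bit integer no larger than 1 billion")
    return {
        "apiVersion": "scheduling.k8s.io/v1",
        "kind": "PriorityClass",
        "metadata": {"name": name},  # no namespace: the object is cluster-scoped
        "value": value,
        "globalDefault": global_default,
        "description": description,
    }


if __name__ == "__main__":
    # "critical-services" is a made-up name used only for illustration.
    manifest = priority_class_manifest(
        "critical-services", 1_000_000, "For pods that must be scheduled first"
    )
    print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and applied with `kubectl apply -f`. A pod then opts in by setting `priorityClassName` in its spec to the class name.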
Preemption allows higher-priority pods to evict lower-priority pods so that the higher-priority pods can be scheduled. It is enabled by default when we create a PriorityClass.
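The idea can be sketched with a deliberately simplified toy model in Python. This is not the Kubernetes scheduler's actual algorithm: it assumes a single node, considers only CPU, and evicts the lowest-priority pods first. The `Pod` class, the function name, and all the numbers are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Pod:
    name: str
    priority: int
    cpu: int  # requested CPU in millicores


def schedule_with_preemption(running, incoming, capacity):
    """Try to place `incoming` on one node with `capacity` millicores.

    Returns (placed, evicted). Only pods with a strictly lower priority than
    the incoming pod may be evicted, lowest priority first.
    """
    free = capacity - sum(p.cpu for p in running)
    if free >= incoming.cpu:
        return True, []  # enough room already, nothing to evict

    candidates = sorted(
        (p for p in running if p.priority < incoming.priority),
        key=lambda p: p.priority,
    )
    evicted = []
    for victim in candidates:
        evicted.append(victim)
        free += victim.cpu
        if free >= incoming.cpu:
            return True, evicted
    return False, []  # evictions cannot free enough room, so evict nothing


if __name__ == "__main__":
    running = [Pod("batch", 100, 600), Pod("web", 1000, 300)]
    critical = Pod("critical", 1_000_000, 500)
    placed, evicted = schedule_with_preemption(running, critical, capacity=1000)
    print(placed, [p.name for p in evicted])
```

In this run the node has only 100 millicores free, so the low-priority `batch` pod is evicted to make room and `web` is left alone. An incoming pod whose priority is lower than that of every running pod would stay pending instead.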
In this blog, we will create an active-active infrastructure on Microsoft Azure using Terraform and Jenkins.
Prime Reasons to have an active-active set-up of your infrastructure
Disaster recovery (DR) is an organization’s method of regaining access and functionality to its IT infrastructure after events such as a natural disaster, a cyber attack, or business disruptions like those caused by the COVID-19 pandemic.
Ensure business resilience: No matter what happens, a good DR plan can ensure that the business can return to full operations rapidly, without losing data or transactions.
Maintain competitiveness: Loyalty is rare and when a business goes offline, customers turn to competitors to get the goods or services they require. A DR plan prevents this.
Avoid data loss: The longer a business’s systems are down, the greater the risk that data will be lost. A robust DR plan minimizes this risk.
Maintain reputation: A business that has trouble resuming operations after an outage can suffer brand damage. For that reason, a solid DR plan is critical.
Before we dive deep into the SRE world, let’s talk about where SRE comes from. The concept of SRE was introduced in 2003 by Ben Treynor Sloss. In 2003, when the cloud wasn’t yet a thing, Google was one of the most prominent web companies, with a massive, distributed infrastructure. It faced several challenges simultaneously: maintaining the trust and reputation of its services, providing a smooth user experience with minimal downtime and latency, managing dozens of sprawling data centers, and so on. It needed to rely heavily on automation and therefore formulated strategies that led to automation at large scale. Small companies at that time could bear the loss of a few hours of downtime, but giants like Google could not afford it, because they were at the forefront of user experience. Building a team to help ensure the application’s availability and reliability was therefore an obvious outcome.
Today’s world is entirely internet-driven. In almost any field, we can get the product of our choice with one click.
Talking about e-commerce in DevOps terms, the entire application or website is based on a microservice architecture, i.e. a large application is split into smaller services to make it more scalable, more manageable, and more process-driven.
Hence, one of the important aspects of maintaining these smaller services is monitoring them.
One commonly used stack for this is the EFK stack (Elasticsearch, Fluentd, Kibana), used along with Kafka.
Kafka is an open-source event streaming platform that is currently used by many companies.
Question: Why use Kafka within EFK monitoring?
Answer: This is the first question that comes to many minds, so in this blog we’ll focus on why to use Kafka, what its benefits are, and how to integrate it with the EFK stack.