Are you searching for a service discovery or service mesh tool for your distributed environment?
Did you find any with an easy installation? Not yet? Calm down, because I've got one for you, and setting it up is a piece of cake!
A few days back we got a requirement where we had to set up multiple services on multiple servers, in cluster mode. So now the questions arise: how will the services be auto-discovered? How will we get the health status of each service? And, above all, how do we restrict users on different services? After a lot of research, I came across a tool named Consul. But then another stumbling block arose: HOW TO SET IT UP?
Your answer might be to just go ahead and download the binary on every server. If that's what you're thinking, then STOP! Doing it manually on plenty of servers is time-consuming and inefficient. So I thought of using a configuration management tool, none other than Ansible. There were roles already available in the market, but some had a hard-coded encryption key, some did not generate the bootstrap token, and they were not easy to understand. None of the roles fulfilled the requirement.
So, I thought of creating an Ansible role with features like enabling ACLs and generating a bootstrap token and an encryption key, all in easy-to-understand language.
In this blog, I have explained the OT-OSM Consul Ansible role.
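To give a feel for how a role of this kind would be consumed, here is a minimal playbook sketch; the role name and variable names below are assumptions for illustration, not the role's actual interface.

```yaml
# site.yml - hypothetical playbook applying a Consul role of this kind
- name: Install and configure a Consul cluster
  hosts: consul_servers
  become: true
  roles:
    - role: consul                  # role name is illustrative
      vars:
        consul_version: "1.15.2"    # example version
        consul_acl_enable: true     # turn on ACLs (variable name assumed)
        consul_bootstrap_expect: 3  # servers expected to form the cluster
```

A single `ansible-playbook -i inventory site.yml` run would then install and configure Consul on every host in the `consul_servers` group, instead of copying binaries around by hand.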
Whenever we discuss monitoring and alerting systems in DevOps, we often come across the TICK stack! What is the TICK stack? What is so special about it? How is it different from the ELK stack, Prometheus, Grafana, CloudWatch, or New Relic? I will try to answer all of these questions briefly, but my motivation for writing this blog is the alert-flooding issue I faced while testing my TICK stack.
Note: This blog is not about the detailed working of TICK and its setup.
What is TICK? What is special about it?
TICK is a complete collection of services provided by the InfluxData community to capture, store, stream, process, and visualize data, giving us a highly available and robust solution for monitoring and alerting. TICK is an abbreviation for:
Telegraf – It is a very lightweight server agent for scraping metrics from the system it runs on; it also has the capability to pull metrics from various third-party sources like Kafka, StatsD, etc.
InfluxDB – It is known as the heart of the TICK stack and, genuinely speaking, it is one of the most efficient, high-performance datastores for handling high volumes of time-series data. It is open source and uses an SQL-like query language.
Chronograf – It is the administrative user interface and visualization engine of the stack.
Kapacitor – It is the data-processing engine of the stack, which streams or batch-processes data from InfluxDB and is responsible for alerting.
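For readers who want to poke at the stack locally, here is a minimal sketch using Docker Compose and the official images; the version tags and ports are illustrative, and in practice Telegraf would have its own telegraf.conf mounted in so that its output points at InfluxDB.

```yaml
# docker-compose.yml - a throwaway local TICK stack for experimentation
version: "3"
services:
  influxdb:
    image: influxdb:1.8      # 1.x line, which Chronograf/Kapacitor pair with
    ports:
      - "8086:8086"          # InfluxDB HTTP API
  telegraf:
    image: telegraf:1.25     # collects host metrics; needs a telegraf.conf in practice
    depends_on:
      - influxdb
  chronograf:
    image: chronograf:1.10
    ports:
      - "8888:8888"          # Chronograf UI
    depends_on:
      - influxdb
  kapacitor:
    image: kapacitor:1.6
    environment:
      KAPACITOR_INFLUXDB_0_URLS_0: http://influxdb:8086  # point Kapacitor at InfluxDB
    depends_on:
      - influxdb
```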
Ever since Prometheus took on the role of monitoring systems, it has been the undisputed open-source leader for monitoring and alerting on Kubernetes and has become the go-to solution. While Prometheus gives some general guidance for achieving high availability, it has limitations when it comes to data retention, historical data retrieval, and multi-tenancy. This is where Thanos comes into play. In this blog post, we will discuss how to integrate Thanos with Prometheus in Kubernetes environments and why one should choose a particular approach. So let’s get started.
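The most common integration point is the Thanos sidecar, which runs next to Prometheus in the same pod, serves its recent data over gRPC, and uploads TSDB blocks to object storage. Below is a minimal sketch of the container section of such a pod spec; the image tags, volume name, and object-store config path are assumptions.

```yaml
# Fragment of a Prometheus pod/StatefulSet spec with a Thanos sidecar
containers:
  - name: prometheus
    image: prom/prometheus:v2.45.0            # version is illustrative
    args:
      - --storage.tsdb.path=/prometheus
      - --storage.tsdb.min-block-duration=2h  # equal min/max block duration, so the
      - --storage.tsdb.max-block-duration=2h  # sidecar can ship finished 2h blocks
    volumeMounts:
      - name: data                            # shared with the sidecar
        mountPath: /prometheus
  - name: thanos-sidecar
    image: quay.io/thanos/thanos:v0.32.5      # version is illustrative
    args:
      - sidecar
      - --tsdb.path=/prometheus               # read Prometheus' TSDB blocks
      - --prometheus.url=http://localhost:9090
      - --objstore.config-file=/etc/thanos/objstore.yml  # S3/GCS credentials (assumed path)
    volumeMounts:
      - name: data
        mountPath: /prometheus
```

A Thanos Query component then fans out to this sidecar (and to a Store Gateway for the uploaded blocks) to give one global, deduplicated view of the metrics.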
MongoDB is a popular NoSQL database that supports large as well as small datasets. As with any other database, a standalone MongoDB setup is straightforward, but when we have to build a replicated or sharded MongoDB cluster, certain complications arise, especially when we do these kinds of setups on orchestration platforms like Kubernetes.
There is a lot of complexity in setting up MongoDB on Kubernetes that people (including me) have faced for a long time, which I would like to highlight:
A standalone setup is pretty straightforward, but replicated and sharded clusters require additional Mongo configuration.
In the replicated scenario, separate configurations need to be managed for the leader and the followers (the primary and the secondaries, in MongoDB terms); a minimal sketch follows this list.
Monitoring and access management of MongoDB inside Kubernetes is a tricky part to handle.
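To make the replicated scenario concrete, here is a minimal sketch of a headless Service plus a three-member StatefulSet; the names, image tag, replica-set name, and storage size are all assumptions, and the replica set still has to be initiated once the pods are up.

```yaml
# Headless Service: gives each pod a stable DNS name for replica-set discovery
apiVersion: v1
kind: Service
metadata:
  name: mongo
spec:
  clusterIP: None
  selector:
    app: mongo
  ports:
    - port: 27017
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongo
spec:
  serviceName: mongo
  replicas: 3
  selector:
    matchLabels:
      app: mongo
  template:
    metadata:
      labels:
        app: mongo
    spec:
      containers:
        - name: mongo
          image: mongo:6.0          # image tag is illustrative
          command: ["mongod", "--replSet", "rs0", "--bind_ip_all"]
          ports:
            - containerPort: 27017
          volumeMounts:
            - name: data
              mountPath: /data/db
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 5Gi            # size is illustrative
```

Once all three pods are Running, `rs.initiate()` is run once from any member, listing the stable pod addresses (mongo-0.mongo, mongo-1.mongo, mongo-2.mongo) so the members can elect a primary.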
As we know, alerting is the most crucial part of any infrastructure, and it becomes even more challenging as the infrastructure grows, since we cannot monitor everything all the time. Every client wants to be notified by their own alerting system before their customers reach out and tell them, “Hey, this service is not working” or “I am not able to access the XYZ service”.
Alerting helps ensure that the system remains healthy, responsive, and secure. It is an important part of any system that keeps performance, availability, and efficiency high, and an operator might need to be notified of the event that triggered an alert.
We can set up alerts in many ways, but in this blog I will be focusing on setting up alerting through Azure Logic Apps.
Azure provides multiple options to send an alert to the end user, be it through email, Slack, PagerDuty, SMS, etc. In this blog, I will explain how to send an alert through email, Slack, and PagerDuty.
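To give a feel for what sits behind this, here is a minimal sketch of an HTTP-triggered Logic App workflow that forwards an alert to Slack. Logic Apps store workflow definitions as JSON; the same structure is written in YAML here only for readability, the Slack webhook URI is a placeholder, and the payload fields assume Azure Monitor’s common alert schema.

```yaml
# Sketch of a Logic App workflow definition (normally JSON) that relays alerts to Slack
definition:
  "$schema": "https://schema.management.azure.com/providers/Microsoft.Logic/schemas/2016-06-01/workflowdefinition.json#"
  contentVersion: "1.0.0.0"
  triggers:
    manual:                  # the "When a HTTP request is received" trigger
      type: Request
      kind: Http
  actions:
    Post_to_Slack:
      type: Http
      runAfter: {}
      inputs:
        method: POST
        uri: https://hooks.slack.com/services/REPLACE_ME   # placeholder incoming webhook
        body:
          # common-alert-schema fields: which rule fired and its current state
          text: "@{triggerBody()?['data']?['essentials']?['alertRule']} is @{triggerBody()?['data']?['essentials']?['monitorCondition']}"
  outputs: {}
```

An Azure Monitor action group pointed at this Logic App can then fan the same alert out to email, Slack, or PagerDuty as needed.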