Today’s world is entirely internet-driven: whatever the field, we can get any product of our choice with one click.
Looking at e-commerce in more DevOps terms, the entire application/website is typically built on a microservice architecture, i.e. a bulky application is split into smaller services to improve scalability and manageability and to make delivery more process-driven.
Hence, to maintain smaller services one of the important aspects is to enable their Monitoring.
One commonly known stack for this is the EFK stack (Elasticsearch, Fluentd, Kibana), often used along with Kafka.
Kafka is an open-source event streaming platform and is currently used by many companies.
Question: Why use Kafka within EFK monitoring?
Answer: Well, this is the first question that strikes many minds. Hence, in this blog we’ll focus on why to use Kafka, what its benefits are, and how to integrate it with the EFK stack.
Heroism is often a response to extreme events.
Event Driven Architecture:
Modern digital businesses operate on real-time events. Event-driven architecture is a design principle built around loose coupling and message-driven communication. This architecture lets applications publish events/messages that other applications and services can consume and then act upon.
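The loose coupling described above can be illustrated with a minimal in-memory publish/subscribe sketch. This is purely illustrative and is not Kafka’s API; the EventBus class, the topic name, and the handlers are made up for the example:

```python
# Minimal in-memory sketch of the publish/subscribe principle behind
# event-driven architecture (illustrative only -- not Kafka's API).
from collections import defaultdict
from typing import Callable

class EventBus:
    """Producers publish events to a topic; any number of loosely
    coupled consumers react, without knowing about each other."""
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]):
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict):
        # Every subscriber of the topic receives the same event.
        for handler in self._subscribers[topic]:
            handler(event)

bus = EventBus()
audit_log = []

# Two independent "services" consume the same "order-placed" event
# without knowing about each other -- the essence of loose coupling.
bus.subscribe("order-placed", lambda e: audit_log.append(("billing", e["id"])))
bus.subscribe("order-placed", lambda e: audit_log.append(("shipping", e["id"])))

bus.publish("order-placed", {"id": 42})
print(audit_log)  # [('billing', 42), ('shipping', 42)]
```

In a real deployment, Kafka plays the role of this bus, but durably and across processes: producers and consumers only agree on the topic, never on each other.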
Where are we Today?
Back when we started implementing microservices, we focused mostly on the service decoupling, communication, and security we would have to handle in such a system.
This year’s pandemic has forced businesses all around the world to adopt a “remote-first” approach to daily operations. Although our lives have been greatly disrupted by lockdown measures and the economic impact, we have managed to keep a balance in our social lives through online activities such as shopping, learning, messaging, and gaming.
Modern software design is also taking a remote-first mindset to ensure that users can collaborate and share information within each app, including the ability to interact with real-time data.
We are starting our journey to explore Kafka during this pandemic. Starting from “zero to hero“, this Kafka series will help you understand the Kafka concepts.
We likely know Kafka as a durable, scalable, and fault-tolerant publish-subscribe messaging system. Recently I got a requirement to efficiently monitor and manage our Kafka cluster, so I started looking at different solutions. Kafka Manager is an open-source tool introduced by Yahoo to manage and monitor an Apache Kafka cluster via a UI.
Before I share my experience of configuring Kafka Manager on Kubernetes, let’s go through its notable features.
As per the documentation on GitHub, below are the major features:
Manage multiple clusters.
Easy inspection of the cluster state.
Run preferred replica election.
Generate partition assignments, with the option to select the brokers to use.
Run reassignment of partitions (based on generated assignments).
Create a topic with optional topic configs (0.8.1.1 has different configs than 0.8.2+).
Delete a topic (only supported on 0.8.2+; remember to set delete.topic.enable=true in the broker config).
The topic list indicates topics marked for deletion (only supported on 0.8.2+).
Batch generate partition assignments for multiple topics, with the option to select the brokers to use.
Batch run reassignment of partitions for multiple topics.
Add partitions to an existing topic.
Update the config for an existing topic.
Optionally filter out consumers that do not have ids/, owners/, or offsets/ directories in ZooKeeper.
Optionally enable JMX polling for broker level and topic level metrics.
Prerequisites of Kafka Manager:
We should have Apache Kafka running along with Apache ZooKeeper.
Deployment on Kubernetes:
To deploy Kafka Manager on Kubernetes, we need to create a deployment file and a service file, as given below.
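The post references the deployment manifest without listing it; below is a minimal sketch of what kafka-manager.yaml might look like. The image path, namespace, and ZK_HOSTS value are illustrative assumptions to be adjusted to your environment:

```yaml
# kafka-manager.yaml -- illustrative sketch; the image path, namespace,
# and ZK_HOSTS value are assumptions, adjust them to your environment.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kafka-manager
  namespace: kafka            # namespace to isolate the application
spec:
  replicas: 1                 # number of pods to spin up
  selector:
    matchLabels:
      app: kafka-manager
  template:
    metadata:
      labels:
        app: kafka-manager
    spec:
      containers:
        - name: kafka-manager
          image: kafkamanager/kafka-manager:latest   # assumed image path
          ports:
            - containerPort: 8080                    # app port used in this post
          env:
            - name: ZK_HOSTS
              value: "zookeeper:2181"                # address of the running ZooKeeper
```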
After deployment, we should be able to access the Kafka Manager service via http://<node-ip>:8080. We use two files, kafka-manager.yaml and Kafka-manager-service.yaml, to achieve the above-mentioned setup. Let’s have a brief look at the different attributes used in these files.
Deployment configuration file:
namespace: a namespace to isolate the application within Kubernetes.
replicas: the number of pods to spin up.
image: the path of the Docker image to be used.
containerPort: the port on which you want to run your application.
environment: “ZK_HOSTS” provides the address of the already running ZooKeeper.
Service configuration file:
This file contains the details to create the Kafka Manager service on Kubernetes. For demo purposes, I have used the NodePort method to expose my service. As we are using Kubernetes as our underlying deployment platform, it is recommended not to use an external IP to access every service. We should either go with a LoadBalancer or use an Ingress (the recommended method) rather than exposing all microservices individually. To configure an Ingress, please refer to Kubernetes Ingress. Once we are able to access Kafka Manager, we can see screens similar to the ones below.
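For completeness, here is a sketch of what Kafka-manager-service.yaml could look like using the NodePort method described above. The namespace, ports, and nodePort value are illustrative assumptions:

```yaml
# Kafka-manager-service.yaml -- illustrative sketch; the nodePort value
# is an assumption within Kubernetes' default 30000-32767 range.
apiVersion: v1
kind: Service
metadata:
  name: kafka-manager
  namespace: kafka
spec:
  type: NodePort            # exposes the service on each node's IP
  selector:
    app: kafka-manager      # must match the deployment's pod labels
  ports:
    - port: 8080            # service port
      targetPort: 8080      # container port of the app
      nodePort: 30080       # port opened on every node (assumed value)
```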
To get broker-level and topic-level metrics, we have to enable JMX polling.
What we would generally do is set the relevant environment variable in the Kubernetes manifest, but somehow that does not work most of the time.
To resolve this, you need to update the JMX settings while creating your Docker image, as given below.
# Set the JMX options only if they were not already provided to the container
if [ -z "$KAFKA_JMX_OPTS" ]; then
  # Previous setting, kept for reference:
  # KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false "
  KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=$HOSTNAME -Djava.net.preferIPv4Stack=true"
  export KAFKA_JMX_OPTS
fi
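One way to bake this change into the image is to copy a patched copy of the launcher script over the original while building it. A hedged Dockerfile sketch follows; the base image and script path are assumptions and must be adjusted to match the Kafka image you actually build from:

```dockerfile
# Illustrative sketch only: the base image and the script location
# inside it are assumptions -- check your actual image layout.
FROM wurstmeister/kafka:latest

# kafka-run-class.sh is a locally patched copy of Kafka's launcher
# script containing the KAFKA_JMX_OPTS block shown above.
COPY kafka-run-class.sh /opt/kafka/bin/kafka-run-class.sh
RUN chmod +x /opt/kafka/bin/kafka-run-class.sh
```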
Deploying Kafka Manager on Kubernetes enables an easy setup, efficient manageability, and all-time availability. Managing a Kafka cluster over the CLI becomes a tedious task, and here Kafka Manager helps us focus more on using Kafka rather than investing our time in configuring and managing it. It is especially useful at the enterprise level, where system engineers can manage multiple Kafka clusters easily via the UI.