Prometheus-Alertmanager integration with MS-teams

As we know monitoring our infrastructure is one of the critical components of infrastructure management, which ensures the proper functioning of our applications and infrastructure. But it is of no use if we are not getting notifications for alarms and threats in our system. As a better practice, if we enable all of the notifications in a common work-space, it would be very helpful for our team to track the status and performance of our infrastructure.

Continue reading “Prometheus-Alertmanager integration with MS-teams”

AlertManager Integration with Prometheus

One day I got a call from one of my friend and he said to me that he is facing difficulties while setting up AlertManager with Prometheus. Then, I observed that most of the people face such issues while establishing a connection between AlertManager and receiver such as E-mail, Slack etc.

From there, I got motivation for writing this blog so AlertManager setup with Prometheus will be a piece of cake for everyone.

If you are new to AlertManager I would suggest you go through with our Prometheus blog.

What Actually AlertManager Is?

AlertManager is used to handle alerts for client applications (like Prometheus). It also takes care of alerts deduplicating, grouping and then routes them to different receivers such as E-mail, Slack, Pager Duty.

In this blog, we will only discuss on Slack and E-mail receivers.

AlertManager can be configured via command-line flags and configuration file. While command line flags configure system parameters for AlertManager,  the configuration file defines inhibition rules, notification routing, and notification receivers.

Architecture

Here is a basic architecture of AlertManager with Prometheus.

This is how Prometheus architecture works:-

  • If you see in the above picture Prometheus is scraping the metrics from its client application(exporters).
  • When the alert is generated then it pushes it to the AlertManager, later AlertManager validates the alerts groups on the basis of labels.
  • and then forward it to the receivers like Email or Slack.

If you want to use a single AlertManager for multiple Prometheus server you can also do that. Then architecture will look like this:-

Installation

Installation part of AlertManager is not a fancy thing, we just simply need to download the latest binary of AlertManager from here.

$ cd /opt/
$ wget https://github.com/prometheus/alertmanager/releases/download/v0.11.0/alertmanager-0.11.0.linux-amd64.tar.gz

After downloading, let’s extract the files.

$ tar -xvzf alertmanager-0.11.0.linux-amd64.tar.gz

So we can start AlertManager from here as well but it is always a good practice to follow Linux directory structure.

$ mv alertmanager-0.11.0.linux-amd64/alertmanager /usr/local/bin/

 Configuration

Once the tar file is extracted and binary file is placed at the right location then the configuration part will come. Although AlertManager extracted directory contains the configuration file as well but it is not of our use. So we will create our own configuration. Let’s start by creating a directory for configuration.

$ mkdir /etc/alertmanager/

Then the configuration file will take place.

$ vim /etc/alertmanager/alertmanager.yml

The configuration file for Slack will look like this:-

global:


# The directory from which notification templates are read.
templates:
- '/etc/alertmanager/template/*.tmpl'

# The root route on which each incoming alert enters.
route:
  # The labels by which incoming alerts are grouped together. For example,
  # multiple alerts coming in for cluster=A and alertname=LatencyHigh would
  # be batched into a single group.
  group_by: ['alertname', 'cluster', 'service']

  # When a new group of alerts is created by an incoming alert, wait at
  # least 'group_wait' to send the initial notification.
  # This way ensures that you get multiple alerts for the same group that start
  # firing shortly after another are batched together on the first
  # notification.
  group_wait: 3s

  # When the first notification was sent, wait 'group_interval' to send a batch
  # of new alerts that started firing for that group.
  group_interval: 5s

  # If an alert has successfully been sent, wait 'repeat_interval' to
  # resend them.
  repeat_interval: 1m

  # A default receiver
  receiver: mail-receiver

  # All the above attributes are inherited by all child routes and can
  # overwritten on each.

  # The child route trees.
  routes:
  - match:
      service: node
    receiver: mail-receiver

    routes:
    - match:
        severity: critical
      receiver: critical-mail-receiver

  # This route handles all alerts coming from a database service. If there's
  # no team to handle it, it defaults to the DB team.
  - match:
      service: database
    receiver: mail-receiver
    routes:
    - match:
        severity: critical
      receiver: critical-mail-receiver

receivers:
- name: 'mail-receiver'
  slack_configs:
  - api_url:  https://hooks.slack.com/services/T2AGPFQ9X/B94D2LHHD/jskljaganauheajao2
    channel: '#prom-alert'

   - name: 'critical-mail-receiver'
  slack_configs: 
  
  - api_url:   https://hooks.slack.com/services/T2AGPFQ9X/B94D2LHHD/abhajkaKajKaALALOPaaaJk  channel: '#prom-alert'

You just have to replace the channel name and api_url of the Slack with your information.

The configuration file for E-mail will look something like this:-

global:

templates:
- '/etc/alertmanager/*.tmpl'
# The root route on which each incoming alert enters.
route:
  # default route if none match
  receiver: alert-emailer

  # The labels by which incoming alerts are grouped together. For example,
  # multiple alerts coming in for cluster=A and alertname=LatencyHigh would
  # be batched into a single group.
  # TODO:
  group_by: ['alertname', 'priority']

  # All the above attributes are inherited by all child routes and can
  # overwritten on each.

receivers:
- name: alert-emailer
  email_configs:
  - to: 'receiver@example.com'
    send_resolved: false
    from: 'sender@example.com'
    smarthost: 'smtp.example.com:587'
    auth_username: 'sender@example.com'
    auth_password: 'IamPassword'
    auth_secret: 'sender@example.com'
    auth_identity: 'sender@example.com'

In this configuration file, you need to update the sender and receiver mail details and the authorization password of the sender.

Once the configuration part is done we just have to create a storage directory where AlertManger will store its data.

$ mkdir /var/lib/alertmanager

Then only last piece which will be remaining is my favorite part i.e creating service 🙂

$ vi /etc/systemd/system/alertmanager.service

The service file will look like this:-

[Unit]
Description=AlertManager Server Service
Wants=network-online.target
After=network-online.target

[Service]
User=root
Group=root
Type=Simple
ExecStart=/usr/local/bin/alertmanager \
    --config.file /etc/alertmanager/alertmanager.yml \
    --storage.tsdb.path /var/lib/alertmanager

[Install]
WantedBy=multi-user.target

Then reload the daemon and start the service

$ systemctl daemon-reload
$ systemctl start alertmanager
$ systemctl enable alertmanager

Now you are all set to fire up your monitoring and alerting. So just take a beer and relax until Alert Manager notifies you for alerts. All the best!!!!

Setting up MySQL Monitoring with Prometheus

One thing that I love about Prometheus is that it has a multitude of Integration with different services, both officially supported and the third party supported.
Let’s see how can we monitor MySQL with Prometheus.

Those who are the starter or new to Prometheus can refer to our this blog.

MySQL is a popular opensource relational database system, which exposed a large number of metrics for monitoring but not in Prometheus format. For capturing that data in Prometheus format we use mysqld_exporter.

I am assuming that you have already setup MySQL Server.

Configuration changes in MySQL

For setting up MySQL monitoring, we need a user with reading access on all databases which we can achieve by an existing user also but the good practice is that we should always create a new user in the database for new service.
CREATE USER 'mysqld_exporter'@'localhost' IDENTIFIED BY 'password' WITH MAX_USER_CONNECTIONS 3;
After creating a user we simply have to provide permission to that user.
GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'mysqld_exporter'@'localhost';

Setting up MySQL Exporter

Download the mysqld_exporter from GitHub. We are downloading the 0.11.0 version as per latest release now, change the version in future if you want to download the latest version.

wget https://github.com/prometheus/mysqld_exporter/releases/download/v0.11.0/mysqld_exporter-0.11.0.linux-amd64.tar.gz

Then simply extract the tar file and move the binary file at the appropriate location.
tar -xvf mysqld_exporter-0.11.0.linux-amd64.tar.gz
mv mysqld_exporter /usr/bin/
Although we can execute the binary simply, but the best practice is to create service for every Third Party binary application. Also, we are assuming that systemd is already installed in your system. If you are using init then you have to create init service for the exporter.

useradd mysqld_exporter
vim /etc/systemd/system/mysqld_exporter.service
[Unit]
Description=MySQL Exporter Service
Wants=network.target
After=network.target

[Service]
User=mysqld_exporter
Group=mysqld_exporter
Environment="DATA_SOURCE_NAME=mysqld_exporter:password@unix(/var/run/mysqd/mysqld.sock)"
Type=simple
ExecStart=/usr/bin/mysqld_exporter
Restart=always

[Install]
WantedBy=multi-user.target
You may need to adjust the socket location of Unix according to your environment
If you go and visit the http://localhost.com:9104/metrics, you will be able to see them.

Prometheus Configurations

For scrapping metrics from mysqld_exporter in Prometheus we have to make some configuration changes in Prometheus, the changes are not fancy, we just have to add another job for mysqld_exporter, like this:-
vim /etc/prometheus/prometheus.yml
  - job_name: 'mysqld_exporter'
    static_configs:
      - targets:
          - :9104
After the configuration changes, we just have to restart the Prometheus server.

systemctl restart prometheus

Then, if you go to the Prometheus server you can find the MySQL metrics there like this:-

So In this blog, we have covered MySQL configuration changes for Prometheus, mysqld_exporter setup and Prometheus configuration changes.
In the next part, we will discuss how to create a visually impressive dashboard in Grafana for better visualization of MySQL metrics. See you soon… 🙂

Prometheus Overview and Setup

Overview

Prometheus is an opensource monitoring solution that gathers time series based numerical data. It is a project which was started by Google’s ex-employees at SoundCloud. 

To monitor your services and infra with Prometheus your service needs to expose an endpoint in the form of port or URL. For example:- {{localhost:9090}}. The endpoint is an HTTP interface that exposes the metrics.

For some platforms such as Kubernetes and skyDNS Prometheus act as directly instrumented software that means you don’t have to install any kind of exporters to monitor these platforms. It can directly monitor by Prometheus.

One of the best thing about Prometheus is that it uses a Time Series Database(TSDB) because of that you can use mathematical operations, queries to analyze them. Prometheus uses SQLite as a database but it keeps the monitoring data in volumes.

Pre-requisites

  • A CentOS 7 or Ubuntu VM
  • A non-root sudo user, preferably one named prometheus

Installing Prometheus Server

First, create a new directory to store all the files you download in this tutorial and move to it.

mkdir /opt/prometheus-setup
cd /opt/prometheus-setup
Create a user named “prometheus”

useradd prometheus

Use wget to download the latest build of the Prometheus server and time-series database from GitHub.


wget https://github.com/prometheus/prometheus/releases/download/v2.0.0/prometheus-2.0.0.linux-amd64.tar.gz
The Prometheus monitoring system consists of several components, each of which needs to be installed separately.

Use tar to extract prometheus-2.0.0.linux-amd64.tar.gz:

tar -xvzf ~/opt/prometheus-setup/prometheus-2.0.0.linux-amd64.tar.gz .
 Place your executable file somewhere in your PATH variable, or add them into a path for easy access.

mv prometheus-2.0.0.linux-amd64  prometheus
sudo mv  prometheus/prometheus  /usr/bin/
sudo chown prometheus:prometheus /usr/bin/prometheus
sudo chown -R prometheus:prometheus /opt/prometheus-setup/
mkdir /etc/prometheus
mv prometheus/prometheus.yml /etc/prometheus/
sudo chown -R prometheus:prometheus /etc/prometheus/
prometheus --version
  

You should see the following message on your screen:

  prometheus,       version 2.0.0 (branch: HEAD, revision: 0a74f98628a0463dddc90528220c94de5032d1a0)
  build user:       root@615b82cb36b6
  build date:       20171108-07:11:59
  go version:       go1.9.2
Create a service for Prometheus 

sudo vi /etc/systemd/system/prometheus.service
[Unit]
Description=Prometheus

[Service]
User=prometheus
ExecStart=/usr/bin/prometheus --config.file /etc/prometheus/prometheus.yml --storage.tsdb.path /opt/prometheus-setup/

[Install]
WantedBy=multi-user.target
systemctl daemon-reload

systemctl start prometheus

systemctl enable prometheus

Installing Node Exporter


Prometheus was developed for the purpose of monitoring web services. In order to monitor the metrics of your server, you should install a tool called Node Exporter. Node Exporter, as its name suggests, exports lots of metrics (such as disk I/O statistics, CPU load, memory usage, network statistics, and more) in a format Prometheus understands. Enter the Downloads directory and use wget to download the latest build of Node Exporter which is available on GitHub.

Node exporter is a binary which is written in go which monitors the resources such as cpu, ram and filesystem. 

wget https://github.com/prometheus/node_exporter/releases/download/v0.15.1/node_exporter-0.15.1.linux-amd64.tar.gz

You can now use the tar command to extract : node_exporter-0.15.1.linux-amd64.tar.gz

tar -xvzf node_exporter-0.15.1.linux-amd64.tar.gz .

mv node_exporter-0.15.1.linux-amd64 node-exporter

Perform this action:-

mv node-exporter/node_exporter /usr/bin/

Running Node Exporter as a Service

Create a user named “prometheus” on the machine on which you are going to create node exporter service.

useradd prometheus

To make it easy to start and stop the Node Exporter, let us now convert it into a service. Use vi or any other text editor to create a unit configuration file called node_exporter.service.


sudo vi /etc/systemd/system/node_exporter.service
This file should contain the path of the node_exporter executable, and also specify which user should run the executable. Accordingly, add the following code:

[Unit]
Description=Node Exporter

[Service]
User=prometheus
ExecStart=/usr/bin/node_exporter

[Install]
WantedBy=default.target

Save the file and exit the text editor. Reload systemd so that it reads the configuration file you just created.


sudo systemctl daemon-reload
At this point, Node Exporter is available as a service which can be managed using the systemctl command. Enable it so that it starts automatically at boot time.

sudo systemctl enable node_exporter.service
You can now either reboot your server or use the following command to start the service manually:
sudo systemctl start node_exporter.service
Once it starts, use a browser to view Node Exporter’s web interface, which is available at http://your_server_ip:9100/metrics. You should see a page with a lot of text:

Starting Prometheus Server with a new node

Before you start Prometheus, you must first edit a configuration file for it called prometheus.yml.

vim /etc/prometheus/prometheus.yml
Copy the following code into the file.

# my global configuration which means it will applicable for all jobs in file
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute. scrape_interval should be provided for scraping data from exporters 
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute. Evaluation interval checks at particular time is there any update on alerting rules or not.

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'. Here we will define our rules file path 
#rule_files:
#  - "node_rules.yml"
#  - "db_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape: In the scrape config we can define our job definitions
scrape_configs:
  # The job name is added as a label `job=` to any timeseries scraped from this config.
  - job_name: 'node-exporter'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'. 
    # target are the machine on which exporter are running and exposing data at particular port.
    static_configs:
      - targets: ['localhost:9100']
After adding configuration in prometheus.yml. We should restart the service by

systemctl restart prometheus
This creates a scrape_configs section and defines a job called a node. It includes the URL of your Node Exporter’s web interface in its array of targets. The scrape_interval is set to 15 seconds so that Prometheus scrapes the metrics once every fifteen seconds. You could name your job anything you want, but calling it “node” allows you to use the default console templates of Node Exporter.
Use a browser to visit Prometheus’s homepage available at http://your_server_ip:9090. You’ll see the following homepage. Visit http://your_server_ip:9090/consoles/node.html to access the Node Console and click on your server, localhost:9100, to view its metrics.