From there, I got the motivation to write this blog, so that setting up AlertManager with Prometheus will be a piece of cake for everyone.
If you are new to AlertManager, I would suggest you first go through our Prometheus blog.
What Actually Is AlertManager?
- As the picture above shows, Prometheus scrapes metrics from its client applications (exporters).
- When an alert is generated, Prometheus pushes it to AlertManager, which then validates the alerts and groups them on the basis of their labels.
- AlertManager then forwards them to receivers such as Email or Slack.
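To make the grouping step a bit more concrete, here is a toy shell sketch of the idea. It is not how AlertManager is implemented; the group_alerts function, the input format and the sample alerts are all made up for illustration. It batches alerts that share the same alertname and cluster into a single line, roughly the way one notification would cover one group.

```shell
# Toy illustration only: batch alerts that share (alertname, cluster).
# Input:  one alert per line as "alertname cluster instance" (hypothetical format).
# Output: one line per group, in first-seen order, listing its instances.
group_alerts() {
  awk '{
    key = $1 "/" $2
    if (key in members) {
      members[key] = members[key] "," $3
    } else {
      order[++n] = key
      members[key] = $3
    }
  }
  END { for (i = 1; i <= n; i++) print order[i] ": " members[order[i]] }'
}

# Three sample alerts collapse into two groups.
printf '%s\n' \
  'LatencyHigh A node1' \
  'LatencyHigh A node2' \
  'DiskFull B node3' | group_alerts
```

The two LatencyHigh alerts from cluster A end up on one line, while the DiskFull alert from cluster B gets its own.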
$ cd /opt/
$ wget https://github.com/prometheus/alertmanager/releases/download/v0.11.0/alertmanager-0.11.0.linux-amd64.tar.gz
$ tar -xvzf alertmanager-0.11.0.linux-amd64.tar.gz
$ mv alertmanager-0.11.0.linux-amd64/alertmanager /usr/local/bin/
$ mkdir /etc/alertmanager/
$ vim /etc/alertmanager/alertmanager.yml
global:

# The directory from which notification templates are read.
templates:
- '/etc/alertmanager/template/*.tmpl'

# The root route on which each incoming alert enters.
route:
  # The labels by which incoming alerts are grouped together. For example,
  # multiple alerts coming in for cluster=A and alertname=LatencyHigh would
  # be batched into a single group.
  group_by: ['alertname', 'cluster', 'service']

  # When a new group of alerts is created by an incoming alert, wait at
  # least 'group_wait' to send the initial notification.
  # This way ensures that you get multiple alerts for the same group that start
  # firing shortly after another are batched together on the first
  # notification.
  group_wait: 3s

  # When the first notification was sent, wait 'group_interval' to send a batch
  # of new alerts that started firing for that group.
  group_interval: 5s

  # If an alert has successfully been sent, wait 'repeat_interval' to
  # resend them.
  repeat_interval: 1m

  # A default receiver
  receiver: mail-receiver

  # All the above attributes are inherited by all child routes and can
  # be overwritten on each.

  # The child route trees.
  routes:
  - match:
      service: node
    receiver: mail-receiver
    routes:
    - match:
        severity: critical
      receiver: critical-mail-receiver

  # This route handles all alerts coming from a database service. If there's
  # no team to handle it, it defaults to the DB team.
  - match:
      service: database
    receiver: mail-receiver
    routes:
    - match:
        severity: critical
      receiver: critical-mail-receiver

receivers:
- name: 'mail-receiver'
  slack_configs:
  - api_url: https://hooks.slack.com/services/T2AGPFQ9X/B94D2LHHD/jskljaganauheajao2
    channel: '#prom-alert'

- name: 'critical-mail-receiver'
  slack_configs:
  - api_url: https://hooks.slack.com/services/T2AGPFQ9X/B94D2LHHD/abhajkaKajKaALALOPaaaJk
global:

templates:
- '/etc/alertmanager/*.tmpl'

# The root route on which each incoming alert enters.
route:
  # default route if none match
  receiver: alert-emailer

  # The labels by which incoming alerts are grouped together. For example,
  # multiple alerts coming in for cluster=A and alertname=LatencyHigh would
  # be batched into a single group.
  # TODO:
  group_by: ['alertname', 'priority']

  # All the above attributes are inherited by all child routes and can
  # be overwritten on each.

receivers:
- name: alert-emailer
  email_configs:
  - to: 'email@example.com'
    send_resolved: false
    from: 'firstname.lastname@example.org'
    smarthost: 'smtp.example.com:587'
    auth_username: 'email@example.com'
    auth_password: 'IamPassword'
    auth_secret: 'firstname.lastname@example.org'
    auth_identity: 'email@example.com'
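Since a single indentation slip in alertmanager.yml can stop the service from starting, it may help to validate the file first. The sketch below assumes the amtool binary is on your PATH (the steps above only copy the alertmanager binary, so you would have to copy amtool yourself) and that your release supports its check-config subcommand, which older releases may not. The check_alertmanager_config wrapper is a hypothetical helper of mine, not part of AlertManager.

```shell
# Sketch: validate an AlertManager config file before (re)starting the service.
# Assumption: amtool is on PATH and supports the check-config subcommand.
check_alertmanager_config() {
  config_file="${1:-/etc/alertmanager/alertmanager.yml}"

  # Fail early if the file is missing or unreadable.
  if [ ! -r "$config_file" ]; then
    echo "config file not readable: $config_file" >&2
    return 1
  fi

  # Without amtool nothing can be validated; report that distinctly.
  if ! command -v amtool >/dev/null 2>&1; then
    echo "amtool not found on PATH; cannot validate $config_file" >&2
    return 2
  fi

  # amtool exits non-zero when the configuration is invalid.
  amtool check-config "$config_file"
}

# Example usage:
#   check_alertmanager_config /etc/alertmanager/alertmanager.yml && echo "config looks valid"
```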
$ mkdir /var/lib/alertmanager
$ vi /etc/systemd/system/alertmanager.service
[Unit]
Description=AlertManager Server Service
Wants=network-online.target
After=network-online.target

[Service]
User=root
Group=root
Type=simple
ExecStart=/usr/local/bin/alertmanager \
    --config.file /etc/alertmanager/alertmanager.yml \
    --storage.path /var/lib/alertmanager

[Install]
WantedBy=multi-user.target
$ systemctl daemon-reload
$ systemctl start alertmanager
$ systemctl enable alertmanager
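Once the service is started, a quick sanity check could look like the sketch below. It assumes AlertManager is listening on its default port 9093 (the unit file above does not override the listen address) and that curl is installed; alertmanager_up is a hypothetical helper name.

```shell
# Sketch: check that AlertManager answers on its HTTP port.
# Assumptions: default listen port 9093, curl installed.
alertmanager_up() {
  url="${1:-http://localhost:9093/}"
  # --fail makes curl exit non-zero on HTTP errors; --max-time avoids hanging.
  curl --silent --fail --max-time 5 --output /dev/null "$url"
}

# Example usage:
#   systemctl is-active alertmanager && alertmanager_up && echo "AlertManager is up"
```

If the check fails, `journalctl -u alertmanager` is usually the first place to look for the reason.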