In our previous blog, Prometheus at Scale – Part 1, we promised that the next post would cover implementing Cortex with Prometheus, so here we are. Before getting into the implementation, we suggest going through the first blog to understand why it is needed.
Previously we discussed how Prometheus is becoming a go-to option for people who want to implement event-based monitoring and alerting. Prometheus is quite easy to implement and manage. But when you have a large infrastructure to monitor, or the infrastructure has started to grow, you need to scale the monitoring solution as well.
A few days back we were in a similar situation: the infrastructure of one of our clients was growing, and they needed a resilient, scalable, and reliable monitoring system. Since they were already using Prometheus, we explored our options and came across an interesting project called "Cortex".
What is Cortex?
As we discussed in our previous blog, Prometheus has some scalability limitations. Cortex is a project originally created by Weaveworks to overcome those limitations. You can think of it as an extended version of Prometheus with a number of additional features:
- Horizontal Scaling: Cortex follows a microservice model, which means it can be deployed across multiple clusters, and multiple Prometheus servers can send data to a single Cortex endpoint. This model enables global aggregation of metrics.
- Highly Available: Each Cortex component can be scaled individually, which provides high availability across the services.
- Multi Tenant: When multiple Prometheus servers send data to Cortex, it keeps the data of each tenant isolated from the others.
- Long Term Storage: This is one of the key features of Cortex and it is built in. Cortex supports multiple storage backends for keeping data for long-term analytics, for example S3, GCS, MinIO, Cassandra, and Bigtable.
If we talk about the architecture of Cortex, it looks like this:
Cortex can be easily installed in Kubernetes using the Helm package manager. We will use the standard Helm chart created by the Cortex team, but first we have to install Consul inside the cluster as the data store.
$ helm repo add hashicorp https://helm.releases.hashicorp.com
$ helm search repo hashicorp/consul
$ helm install consul hashicorp/consul --set global.name=consul --namespace cortex
Verify that the Consul nodes are up and running using kubectl.
Now that the data store is in place, we need to configure the storage gateway to connect to a remote storage backend. We evaluated multiple storage solutions and decided to go ahead with an S3 bucket in AWS. A few points on why we decided S3 was the right fit:
- We were already using AWS for a few services.
- Our Kubernetes cluster was running in a local datacenter and Prometheus was configured at the same location, and we already had a bridge to AWS through AWS Direct Connect, so network bandwidth was no longer a concern.
We customized the default values file of Cortex according to our use case. You can find the values file here
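As a rough idea of what such a customization can look like, here is a minimal sketch of a values fragment, not our actual file. It assumes the chart passes everything under the config key to Cortex, that Cortex runs with blocks storage, and that Consul is reachable at consul-server:8500 (our guess based on global.name=consul). The bucket name and endpoint are hypothetical, and key names can differ between chart and Cortex versions, so check them against the chart's default values.

```yaml
# my-cortex-values.yaml (illustrative fragment only)
config:
  ingester:
    lifecycler:
      ring:
        kvstore:
          # Keep the ring state in the Consul installed earlier.
          store: consul
          consul:
            host: consul-server:8500   # assumed service name and port
  blocks_storage:
    # Ship metric blocks to S3 for long-term storage.
    backend: s3
    s3:
      bucket_name: my-cortex-metrics          # hypothetical bucket name
      endpoint: s3.ap-south-1.amazonaws.com   # hypothetical regional endpoint
      # Credentials are left out on purpose; supply them through
      # whatever mechanism your environment uses.
```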
$ helm repo add cortex-helm https://cortexproject.github.io/cortex-helm-chart
$ helm install cortex --namespace cortex -f my-cortex-values.yaml cortex-helm/cortex
With that, we are pretty much done with the first part of the setup, i.e. Consul and Cortex, and it is time to configure Prometheus to connect to Cortex.
Prometheus does not need many configuration changes; we only have to provide a remote write URL.
Remote write and remote read are Prometheus APIs for sending metric samples to, and reading them back from, a third-party system, in our case Cortex.
We just have to add this block to our prometheus.yaml file.
remote_write:
  - url: http://cortex.cortex/api/prom/push
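If several Prometheus servers write to the same Cortex and you want to use the multi-tenancy mentioned earlier, each server can identify its tenant with the X-Scope-OrgID HTTP header, which Cortex reads when authentication is enabled. A minimal sketch, assuming a Prometheus version that supports the headers field under remote_write (2.25 or later) and a hypothetical tenant name team-a:

```yaml
remote_write:
  - url: http://cortex.cortex/api/prom/push
    headers:
      # Tenant ID for this Prometheus server; "team-a" is a made-up example.
      X-Scope-OrgID: team-a
```

With authentication disabled in Cortex, this header is not required and all data lands in a single default tenant.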
If you are using Prometheus Operator, you have to configure this in the Prometheus custom resource (the remoteWrite field of its spec) instead.
Once we are done with the changes in Prometheus, we can go to the S3 bucket to validate that data is being created.
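For the Operator case, a minimal sketch of the relevant part of a Prometheus custom resource could look like the following. The resource name and namespace are hypothetical; only the remoteWrite entry matters here, and the rest of your existing spec stays as it is.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus        # hypothetical resource name
  namespace: monitoring   # hypothetical namespace
spec:
  # Equivalent of remote_write in a plain prometheus.yaml
  remoteWrite:
    - url: http://cortex.cortex/api/prom/push
```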
In this blog, we have seen the architecture of Cortex and how to integrate Cortex with Prometheus using S3 as the remote storage backend. In the next part, we will discuss how to use Cortex with Grafana to fetch data.
If you have any other ideas or suggestions around this approach, please share them in the comment section. Thanks for reading, I'd really appreciate your suggestions and feedback.
Opstree is an End to End DevOps solution provider