kubernetes, prometheus, monitoring, metrics

Get Kubernetes Cluster Metrics with Prometheus in 5 Minutes

Getting a Kubernetes cluster up and running is pretty easy these days. But once you start using the cluster and deploying applications, you can expect some issues over time. Kubernetes is a distributed system, which makes it hard to troubleshoot. You need a good monitoring solution, and because Prometheus is a CNCF project, just like Kubernetes, it is probably the best fit. In this post, I will show you how to get Prometheus running and start monitoring your Kubernetes cluster in 5 minutes.

Prometheus Operator

CoreOS introduced the operator concept: software that encodes the operational logic needed to run an application on Kubernetes. I wrote about the Elasticsearch operator and how it works a few months ago, so you might want to check it out. In my opinion, operators are the best way to deploy stateful applications on Kubernetes.

The CoreOS team also provides the Prometheus operator, which I will use for the deployment. Here is the official view of the operator workflow and relationships:

prometheus_operator_workflow

From the picture above you can see that you can create a ServiceMonitor resource, which tells Prometheus to scrape metrics from the pods behind a defined set of services. For example, if you have a frontend app that exposes Prometheus metrics on a port named web, all you need to do is create a ServiceMonitor, and the operator will configure the Prometheus server for you:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: frontend-app
  labels:
    app: frontend-app
spec:
  selector:
    matchLabels:
      app: frontend-app
  endpoints:
  - port: web
    interval: 10s
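
A ServiceMonitor selects Service objects by label, and the port value in its endpoints is the name of a Service port, not a number. As a minimal sketch, a Service that the ServiceMonitor above would pick up could look like this. The file name, the pod label, and port 8080 are my assumptions for illustration; adjust them to your app:

```shell
# Write a hypothetical Service manifest that the frontend-app ServiceMonitor would match.
cat > frontend-service.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: frontend-app
  labels:
    app: frontend-app      # matched by the ServiceMonitor's spec.selector
spec:
  selector:
    app: frontend-app      # assumed pod label
  ports:
  - name: web              # the port name the ServiceMonitor endpoint refers to
    port: 8080             # assumed port number
    targetPort: 8080
EOF
```

You would apply it with kubectl apply -f frontend-service.yaml in the same namespace as the ServiceMonitor. Since the endpoint does not set a path, Prometheus scrapes /metrics on that port.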

Installing the operator is pretty easy with Helm. Let's add the CoreOS repository and install it:

⚡ helm repo add coreos https://s3-eu-west-1.amazonaws.com/coreos-charts/stable/

⚡ helm install \
    --name prometheus-operator \
    --namespace monitoring \
    --set rbacEnable=false \
    coreos/prometheus-operator
     
⚡ kubectl get pods -n monitoring
NAME                                   READY     STATUS    RESTARTS   AGE
prometheus-operator-67f87d659c-rtpwq   1/1       Running   0          1m

When you install the Prometheus operator, you get new Custom Resource Definitions (CRDs). You can check them with this command:

⚡ kubectl get CustomResourceDefinition
NAME                                         AGE
alertmanagers.monitoring.coreos.com          1m
prometheuses.monitoring.coreos.com           1m
servicemonitors.monitoring.coreos.com        1m

As you can see, the Prometheus operator manages Alertmanager, Prometheus server, and ServiceMonitor resources.
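
To illustrate how these resources relate, here is a rough sketch of a Prometheus custom resource that selects ServiceMonitors by label. The resource name and file name are hypothetical, and this is not a complete spec. You don't need to apply it for this setup, because the chart installed in the next section creates its own Prometheus resource:

```shell
# Write an illustrative Prometheus resource; the operator turns such a resource
# into a running Prometheus server configured from the selected ServiceMonitors.
cat > prometheus-example.yaml <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: example            # hypothetical name
spec:
  replicas: 1
  serviceMonitorSelector:
    matchLabels:
      app: frontend-app    # picks up ServiceMonitors carrying this label
EOF
```

This is why the ServiceMonitor example earlier carries a label in its own metadata: the Prometheus resource finds ServiceMonitors through labels, the same way a ServiceMonitor finds Services.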

Prometheus Installation

For the Prometheus installation I will use the kube-prometheus Helm chart. This chart has a lot of options, so I encourage you to take a look at the default values file and override values where needed. Among other services, this chart installs Grafana and exporters ready to monitor your cluster. kube-prometheus is an umbrella chart with many dependencies, which you can find in its requirements file.
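
If you prefer to read the defaults locally, Helm can print them for you. A small sketch, assuming the coreos repository from the previous section is already added; the helper name dump_default_values and the output file name are my own:

```shell
# Save a chart's default values to a file for review.
# $1: chart name (defaults to coreos/kube-prometheus)
# $2: output file (defaults to default-values.yaml)
dump_default_values() {
  helm inspect values "${1:-coreos/kube-prometheus}" > "${2:-default-values.yaml}"
}
```

Call dump_default_values and open default-values.yaml to see which keys are worth overriding.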

I will enable persistent storage for all components and disable RBAC. You should have RBAC enabled, but in my test cluster it is not. I will expose Grafana with Ingress, so I disabled anonymous authentication and changed the admin password. This is my custom values file:

⚡ cat > custom-values.yaml <<EOF
global:
  rbacEnable: false

alertmanager:
  storageSpec:
    volumeClaimTemplate:
      spec:
        storageClassName: rbd
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi

prometheus:
  storageSpec:
    volumeClaimTemplate:
      spec:
        storageClassName: rbd
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi

grafana:
  auth:
    anonymous:
      enabled: "false"
  adminPassword: "YourPass123#"
  ingress:
    enabled: true
    annotations:
      kubernetes.io/ingress.class: nginx
      kubernetes.io/tls-acme: "true"
    hosts: 
      - grafana.test.akomljen.com
    tls:
      - secretName: grafana-tls
        hosts:
          - grafana.test.akomljen.com
  storageSpec:
    class: rbd
    accessMode: "ReadWriteOnce"
    resources:
      requests:
        storage: 10Gi
EOF

The next step is to install the kube-prometheus chart using the custom values file I created above:

⚡ helm install \
    --name mon \
    --namespace monitoring \
    -f custom-values.yaml \
    coreos/kube-prometheus

NOTE: Don't use prometheus as Helm release name! You might experience some issues if you do.

Wait a few minutes and the whole stack will be up and running. Check all pods in the monitoring namespace:

⚡ kubectl get pods -n monitoring
NAME                                       READY     STATUS    RESTARTS   AGE
alertmanager-mon-0                         1/2       Running   0          1m
mon-exporter-kube-state-77b4847f76-wxzcz   1/2       Running   0          1m
mon-exporter-kube-state-7cbfc65568-rxs54   1/2       Running   0          1m
mon-exporter-node-9bngl                    1/1       Running   0          1m
mon-exporter-node-d2hnb                    1/1       Running   0          1m
mon-exporter-node-l7fgh                    1/1       Running   0          1m
mon-exporter-node-rvxlg                    1/1       Running   0          1m
mon-grafana-969d44bff-ctmd2                2/2       Running   0          1m
prometheus-mon-0                           1/2       Running   0          1m
prometheus-operator-67f87d659c-rtpwq       1/1       Running   0          10m

⚡ kubectl get ingress -n monitoring
NAME          HOSTS                       ADDRESS   PORTS     AGE
mon-grafana   grafana.test.akomljen.com             80, 443   1m
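
Instead of re-running kubectl get pods until everything is ready, you could let kubectl block for you. This is a sketch that assumes your kubectl version includes the wait subcommand; the helper name wait_for_pods and the 300s default timeout are my choices:

```shell
# Block until every pod in the namespace is Ready, or fail after the timeout.
# $1: namespace (defaults to monitoring)
# $2: timeout (defaults to 300s)
wait_for_pods() {
  kubectl wait --for=condition=Ready pods --all \
    --namespace "${1:-monitoring}" --timeout="${2:-300s}"
}
```

Running wait_for_pods monitoring 600s returns once all pods report Ready and exits with a non-zero status if the timeout is reached first.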

When you log in to Grafana, these dashboards are available by default:

  • Deployment
  • Pods
  • Nodes
  • StatefulSet
  • Kubernetes Capacity Planning
  • Kubernetes Cluster Health
  • Kubernetes Cluster Status
  • Kubernetes Control Plane Status
  • Kubernetes Resource Requests

Of course, you can always update them, or create a completely new dashboard if you need to. The example below shows what the node view looks like:

prometheus_monitoring_3

If you want to access the other services, you can forward their ports to localhost, for example:

# Alert manager
⚡ kubectl port-forward -n monitoring alertmanager-mon-0 9093

# Prometheus server
⚡ kubectl port-forward -n monitoring prometheus-mon-0 9090

Once the Prometheus server is forwarded to your localhost, you can also check alerts at http://localhost:9090/alerts. You could also use Ingress to expose those services, but they don't have authentication, so you would need something like OAuth Proxy in front.
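
With the port-forward to the Prometheus server running, you can also query it from the command line through the Prometheus HTTP API. A minimal sketch; the helper name prom_query and the PROM_URL variable are my own, and it assumes curl is installed:

```shell
# Run an instant PromQL query against the port-forwarded Prometheus server.
# $1: PromQL expression; PROM_URL overrides the default base URL.
prom_query() {
  curl --silent --get "${PROM_URL:-http://localhost:9090}/api/v1/query" \
    --data-urlencode "query=$1"
}
```

For example, prom_query up returns a JSON document listing every scrape target, with a value of 1 for targets that are up and 0 for targets that are down.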

Summary

It is almost impossible not to run into issues with a Kubernetes cluster once you start using it. This monitoring setup will help you along the way. Of course, this is only one part of monitoring, and it covers the cluster only. Many cloud native applications support Prometheus out of the box, so getting application metrics should be easy. I will cover this in a future blog post. Stay tuned.