Thursday, January 16, 2020

Using Prometheus and Grafana to monitor Kubernetes pods' CPU




Once a system is up and running, we usually want to benchmark it. That means we run some stress tests and check the performance of the system. The performance is measured by various metrics, for example: transactions per second, CPU usage, and memory usage.

However, in a complex, microservices-based Kubernetes system, we have many moving parts, and it is hard to pinpoint the part that is the current bottleneck. This is where Prometheus and Grafana can assist us.

In this post we will review the deployment of Prometheus and Grafana on a Kubernetes cluster.
This includes:

  • Prometheus Kubernetes permissions
  • Prometheus configuration
  • Prometheus deployment and service
  • Grafana deployment and service
  • Grafana predefined dashboards


Prometheus


Prometheus is an open-source, community-driven monitoring system and time series database.
To deploy it as part of a Kubernetes cluster, we need to create a service account with permissions to access Kubernetes resources. This includes a service account, a cluster role, and a cluster role binding.

Prometheus Permissions


Service account:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus-service-account
  namespace: default

Cluster role:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-role
rules:
  - apiGroups: [""]
    resources:
      - nodes
      - nodes/proxy
      - services
      - endpoints
      - pods
    verbs: ["get", "list", "watch"]
  - apiGroups:
      - extensions
    resources:
      - ingresses
    verbs: ["get", "list", "watch"]
  - nonResourceURLs: ["/metrics"]
    verbs: ["get"]

Cluster role binding:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus-role-binding
subjects:
  - kind: ServiceAccount
    name: prometheus-service-account
    namespace: default
roleRef:
  kind: ClusterRole
  name: prometheus-role
  apiGroup: rbac.authorization.k8s.io
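
Assuming the three manifests above are saved to a file (the file name below is just an example), we can apply them and verify that the service account indeed receives the required permissions:

kubectl apply -f prometheus-rbac.yaml

# should print "yes" once the cluster role binding takes effect
kubectl auth can-i list pods \
    --as=system:serviceaccount:default:prometheus-service-account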

Prometheus Configuration


The Prometheus configuration is saved in a Kubernetes ConfigMap. It includes various options, such as the scrape interval (how often the targets are sampled) and the cAdvisor job, which scrapes the cAdvisor endpoint embedded in the kubelet binary.

The ConfigMap is the following:

apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
data:
  prometheus.yaml: |-
    global:
      scrape_interval: 5s
      evaluation_interval: 5s
    
    scrape_configs:
      - job_name: 'kubernetes-apiservers'
    
        kubernetes_sd_configs:
        - role: endpoints
        scheme: https
    
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    
        relabel_configs:
        - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
          action: keep
          regex: default;kubernetes;https
    
      - job_name: 'kubernetes-nodes'
    
        scheme: https
    
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    
        kubernetes_sd_configs:
        - role: node
    
        relabel_configs:
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)
        - target_label: __address__
          replacement: kubernetes.default.svc:443
        - source_labels: [__meta_kubernetes_node_name]
          regex: (.+)
          target_label: __metrics_path__
          replacement: /api/v1/nodes/${1}/proxy/metrics
    
    
      - job_name: 'kubernetes-pods'
    
        kubernetes_sd_configs:
        - role: pod
    
        relabel_configs:
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
          action: keep
          regex: true
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
          action: replace
          target_label: __metrics_path__
          regex: (.+)
        - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
          action: replace
          regex: ([^:]+)(?::\d+)?;(\d+)
          replacement: $1:$2
          target_label: __address__
        - action: labelmap
          regex: __meta_kubernetes_pod_label_(.+)
        - source_labels: [__meta_kubernetes_namespace]
          action: replace
          target_label: kubernetes_namespace
        - source_labels: [__meta_kubernetes_pod_name]
          action: replace
          target_label: kubernetes_pod_name
    
      - job_name: 'kube-state-metrics'
        static_configs:
          - targets: ['kube-state-metrics.kube-system.svc.cluster.local:8080']
    
      - job_name: 'kubernetes-cadvisor'
    
        scheme: https
    
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    
        kubernetes_sd_configs:
        - role: node
    
        relabel_configs:
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)
        - target_label: __address__
          replacement: kubernetes.default.svc:443
        - source_labels: [__meta_kubernetes_node_name]
          regex: (.+)
          target_label: __metrics_path__
          replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
    
      - job_name: 'kubernetes-service-endpoints'
    
        kubernetes_sd_configs:
        - role: endpoints
    
        relabel_configs:
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
          action: keep
          regex: true
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
          action: replace
          target_label: __scheme__
          regex: (https?)
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
          action: replace
          target_label: __metrics_path__
          regex: (.+)
        - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
          action: replace
          target_label: __address__
          regex: ([^:]+)(?::\d+)?;(\d+)
          replacement: $1:$2
        - action: labelmap
          regex: __meta_kubernetes_service_label_(.+)
        - source_labels: [__meta_kubernetes_namespace]
          action: replace
          target_label: kubernetes_namespace
        - source_labels: [__meta_kubernetes_service_name]
          action: replace
          target_label: kubernetes_name
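
Note that the 'kubernetes-pods' job in the configuration above only scrapes pods that explicitly opt in through annotations. As a hypothetical illustration (the pod name, image, and port below are placeholders), a pod exposing metrics on port 8080 would be annotated like this:

apiVersion: v1
kind: Pod
metadata:
  name: my-app
  annotations:
    prometheus.io/scrape: "true"   # required by the keep rule of the kubernetes-pods job
    prometheus.io/port: "8080"     # the port on which the metrics are exposed
    prometheus.io/path: "/metrics" # optional, /metrics is the default
spec:
  containers:
    - name: my-app
      image: my-app:latest
      ports:
        - containerPort: 8080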

Prometheus Deployment & Service


Next, we create the Prometheus deployment and a service that allows access to it. Notice that we are not using persistent storage, since the data is usually needed only for the duration of the benchmark testing; if required, change the deployment to use persistent storage. Also notice that the Prometheus service is exposed as a NodePort, allowing easy access to the service on port 30030.

Prometheus deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      configid: prometheus-container
  template:
    metadata:
      labels:
        configid: prometheus-container        
    spec:
      serviceAccountName: prometheus-service-account
      containers:
        - name: prometheus
          image: prom/prometheus:v2.15.2
          imagePullPolicy: IfNotPresent
          args:
            - "--config.file=/etc/prometheus/prometheus.yaml"
            - "--storage.tsdb.path=/prometheus/"
          volumeMounts:
            - name: prometheus-config
              mountPath: /etc/prometheus
      volumes:
        - name: prometheus-config
          configMap:
            defaultMode: 420
            name: prometheus-config

Prometheus service:

apiVersion: v1
kind: Service
metadata:
  name: prometheus-service
spec:
  selector:
    configid: prometheus-container
  type: NodePort
  ports:
      - port: 80
        targetPort: 9090
        name: http
        protocol: TCP
        nodePort: 30030
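
Assuming the deployment and the service are saved to files (the file names below are arbitrary), we can apply them and reach the Prometheus GUI on port 30030 of any cluster node:

kubectl apply -f prometheus-deployment.yaml
kubectl apply -f prometheus-service.yaml

# quick readiness check; replace <NODE-IP> with the IP of any kubernetes node
curl -s http://<NODE-IP>:30030/-/ready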

Once Prometheus is deployed, we can access the service to view the collected statistics.
For example, to view the CPU usage of all MongoDB containers, we can use the following PromQL query:

rate(container_cpu_usage_seconds_total{pod=~".*mongo.*"}[1m])

and get a graph of the CPU usage rate of each MongoDB container.
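
If a pod runs more than one container, it may also be convenient to aggregate the rate per pod (excluding the pod-level cgroup series, whose container label is empty); a possible variation of the query above is:

sum by (pod) (rate(container_cpu_usage_seconds_total{pod=~".*mongo.*", container!=""}[1m]))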




Grafana


Now that we have the statistics in Prometheus, we can use Grafana as the GUI to view and analyze the data.

First, we create a ConfigMap that configures Prometheus as the data source of Grafana:

apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-datasources-config
data:
  datasource-prometheus.yaml: |-
    apiVersion: 1
    datasources:
    - name: "prometheus"
      access: "proxy"
      editable: true
      orgId: 1
      type: "prometheus"
      url: "http://prometheus-service"
      version: 1


Then we create a deployment that uses this ConfigMap:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      configid: grafana-container
  template:
    metadata:
      labels:
        configid: grafana-container        
    spec:
      containers:
        - name: grafana
          image: grafana/grafana:6.5.2-ubuntu
          imagePullPolicy: IfNotPresent
          volumeMounts:
            - name: grafana-datasources-config
              mountPath: /etc/grafana/provisioning/datasources
      volumes:
        - name: grafana-datasources-config
          configMap:
            name: grafana-datasources-config

and a service to enable access to the Grafana GUI.
The service is exposed using NodePort 30040.

apiVersion: v1
kind: Service
metadata:
  name: grafana-service
spec:
  selector:
    configid: grafana-container
  type: NodePort
  ports:
      - port: 80
        targetPort: 3000
        name: http
        protocol: TCP
        nodePort: 30040
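
After applying the Grafana deployment and service, the GUI should be available on port 30040 of any cluster node (the default credentials, unless changed, are admin/admin). As a quick sanity check, we can also confirm that the Prometheus data source was provisioned:

# replace <NODE-IP> with the IP of any kubernetes node
curl -s http://admin:admin@<NODE-IP>:30040/api/datasources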


The Grafana GUI can now be used to access the Prometheus data.



Grafana Predefined Dashboards


But wait...
What if we have worked hard to create a custom dashboard in Grafana, and then we want to reinstall the solution on the Kubernetes cluster? Is our dashboard lost?

To overcome this, we can use Grafana's provisioning capability.
Once our custom dashboard is ready, open the dashboard in the Grafana GUI, click the cog icon at the top of the page, and select the "JSON Model" tab.

This JSON should be saved to a ConfigMap that is used upon Grafana startup to load a provisioned dashboard.
Actually, we will use two ConfigMaps.
The first ConfigMap, the "provisioning dashboard location", instructs Grafana to look for custom dashboards in a specific folder, and the second ConfigMap includes our JSON Model.

The "provisioning dashboard location" ConfigMap is:

apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-dashboards-config
data:
  dashboards-provider.yaml: |-
    apiVersion: 1
    
    providers:
      #  a unique provider name
      - name: 'predefined-dashboards'
        #  org id. will default to orgId 1 if not specified
        orgId: 1
        #  name of the dashboard folder. Required
        folder: ''
        #  folder UID. will be automatically generated if not specified
        folderUid: ''
        #  provider type. Required
        type: file
        #  disable dashboard deletion
        disableDeletion: false
        #  enable dashboard editing
        editable: true
        #  how often Grafana will scan for changed dashboards
        updateIntervalSeconds: 10
        #  allow updating provisioned dashboards from the UI
        allowUiUpdates: false
        options:
          #  path to dashboard files on disk. Required
          path: /predefined-dashboards


And the JSON Model ConfigMap is:

apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-predefined-dashboards-config
  labels:    
    app.kubernetes.io/instance: bouncer
    app.kubernetes.io/name: bouncer
data:
  dashboard-bouncer.json: |-
    {
   ... The JSON Model text 
   ... NOTICE THE INDENTATION OF THE TEXT
    }
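
Since the JSON Model must be indented consistently under the ConfigMap key, it may be easier to let kubectl generate the manifest from the exported JSON file; a possible approach (the file names are arbitrary, and the labels shown above would still need to be added manually) is:

# generate the ConfigMap manifest from the exported JSON model
# (on newer kubectl versions use --dry-run=client)
kubectl create configmap grafana-predefined-dashboards-config \
    --from-file=dashboard-bouncer.json \
    -o yaml --dry-run > grafana-predefined-dashboards-config.yaml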

Lastly, we need to update the Grafana deployment to use these ConfigMaps:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      configid: grafana-container
  template:
    metadata:
      labels:
        configid: grafana-container        
    spec:
      containers:
        - name: grafana
          image: grafana/grafana:6.5.2-ubuntu
          imagePullPolicy: IfNotPresent
          volumeMounts:
            - name: grafana-dashboards-config
              mountPath: /etc/grafana/provisioning/dashboards
            - name: grafana-datasources-config
              mountPath: /etc/grafana/provisioning/datasources
            - name: grafana-predefined-dashboards-config
              mountPath: /predefined-dashboards
      volumes:
        - name: grafana-predefined-dashboards-config
          configMap:
            name: grafana-predefined-dashboards-config
        - name: grafana-dashboards-config
          configMap:
            name: grafana-dashboards-config
        - name: grafana-datasources-config
          configMap:
            name: grafana-datasources-config
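
As before, the file names below are arbitrary. Note that provisioned dashboards are loaded on Grafana startup, so if only the ConfigMaps changed, the Grafana pod has to be restarted:

kubectl apply -f grafana-dashboards-configmaps.yaml
kubectl apply -f grafana-deployment.yaml

# needed only when the deployment spec itself did not change (kubectl 1.15+)
kubectl rollout restart deployment grafana-deployment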


Summary


We have used Prometheus and Grafana to assist us in analyzing the performance of Kubernetes pods.
In this post, CPU was measured using the "container_cpu_usage_seconds_total" counter, but many other metrics are available, such as memory, disk, and network usage. See this article for examples.
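
For example, the memory working set of the same MongoDB containers (again excluding the pod-level series) can be viewed with a query such as:

container_memory_working_set_bytes{pod=~".*mongo.*", container!=""}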
