In this post we will configure an alert in Grafana for restarting pods.
This relates to Grafana version 11.5.1. Other versions might have a different syntax.
Follow the next steps:
- Open the Grafana GUI
- Click on Alerting, then Alert rules
- Click on New alert rule
- Enter rule name: restarting pods
- Select Prometheus as the data source
- Make sure you have kube-state-metrics installed on the Kubernetes cluster. If it is not installed, install it using:
helm repo add kube-state-metrics https://kubernetes.github.io/kube-state-metrics
helm repo update
helm install kube-state-metrics kube-state-metrics/kube-state-metrics \
--namespace kube-system \
--create-namespace
- Use the following PromQL query:
sum by (namespace, pod) (increase(kube_pod_container_status_restarts_total[5m])) > 1
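- Leave the threshold as: A > 0. The query itself already keeps only pods that restarted more than once, so any series it returns should fire the alert
- Under Configure no data and error handling, set the Alert state if no data or all values are null to Normal. The query returns no series when no pod is restarting, so no data is the healthy state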
- In the email notification message, use the following summary:
Pod {{ $labels.pod }} restarted
- In the email description, use the following text:
Pod {{ $labels.pod }} in namespace {{ $labels.namespace }} restarted more than once in the last 5 minutes.
Current value: {{ $values.A.Value }}
- Make sure to configure an email provider such as SendGrid to enable sending emails from Grafana
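
Before relying on the alert, it can help to check what the PromQL query returns outside of Grafana. The following is a minimal Python sketch, not part of the Grafana setup itself. It builds an instant-query URL for the Prometheus HTTP API (/api/v1/query) and extracts the namespace, pod and value from the JSON response. The function names build_query_url, restarting_pods and fetch are made up for this illustration, and the Prometheus address is an assumption that depends on your cluster.

```python
import json
import urllib.parse
import urllib.request

# The alert expression from the steps above.
QUERY = (
    "sum by (namespace, pod) "
    "(increase(kube_pod_container_status_restarts_total[5m])) > 1"
)


def build_query_url(base_url: str, query: str = QUERY) -> str:
    """Build the instant-query URL of the Prometheus HTTP API for the expression."""
    params = urllib.parse.urlencode({"query": query})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def restarting_pods(response: dict) -> list[tuple[str, str, float]]:
    """Extract (namespace, pod, value) tuples from an instant-vector response."""
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response.get('error', 'unknown error')}")
    rows = []
    for sample in response["data"]["result"]:
        labels = sample["metric"]
        # Each sample value is a [timestamp, "value-as-string"] pair.
        _timestamp, value = sample["value"]
        rows.append((labels.get("namespace", ""), labels.get("pod", ""), float(value)))
    return rows


def fetch(base_url: str) -> list[tuple[str, str, float]]:
    """Run the query against a live Prometheus server (hypothetical helper)."""
    with urllib.request.urlopen(build_query_url(base_url), timeout=10) as resp:
        return restarting_pods(json.load(resp))
```

Assuming Prometheus is reachable at http://localhost:9090 (for example through a port-forward, which is an assumption about your environment), calling fetch("http://localhost:9090") lists the pods that would currently trigger the alert. An empty list means no pod restarted more than once in the last 5 minutes, which matches the Normal state configured for no data.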