Wednesday, June 17, 2020

Report a Grafana HeatMap graph from a GO application




In the previous post, Report prometheus metrics from a GO application, we created a simple counter report from a Go application.
In this post we will review heatmap reporting from a Go application.

The thing to notice is that the Prometheus convention for pre-bucketed counters is cumulative: each bucket also counts everything in the smaller buckets. Grafana, on the other hand, expects each bucket to count only the values that fall within that bucket's own range.

For example, assume we have the following statistics of response time per request:
  • 10 requests had a response time of 100 ms
  • 10 requests had a response time of 200 ms
  • 10 requests had a response time of 300 ms
  • 10 requests had a response time of 400 ms

The Prometheus pre-bucketed counters will be:
  • response_time{le="100"} 10
  • response_time{le="200"} 20
  • response_time{le="300"} 30
  • response_time{le="400"} 40
While Grafana expects:
  • response_time{le="100"} 10
  • response_time{le="200"} 10
  • response_time{le="300"} 10
  • response_time{le="400"} 10

To solve this, we implement the non-cumulative buckets on our own.
First, we configure an array of 10 buckets.
Each bucket is 500 ms wide, so we represent a heatmap of 0-5000 ms in steps of 500 ms.


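package main

import (
	"log"
	"net/http"
	"strconv"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

var (
	heatmap *prometheus.CounterVec // one counter per handler and bucket
	buckets []int64                // upper bound of each bucket, in milliseconds
)

func main() {
	heatmap = promauto.NewCounterVec(
		prometheus.CounterOpts{
			Name: "response_time",
			Help: "response time for each HTTP handler",
		},
		// Both labels set in addHandlerFunc must be declared here.
		[]string{"handler", "le"},
	)

	// Upper bounds: 500, 1000, ..., 5000.
	buckets = make([]int64, 10)
	var total int64
	for i := 0; i < len(buckets); i++ {
		total += 500
		buckets[i] = total
	}

	http.Handle("/metrics", promhttp.Handler())
	addHandlerFunc("/foo", fooHandler)
	addHandlerFunc("/bar", barHandler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}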


Next, we assign each request to the matching bucket according to its response time:


func addHandlerFunc(pattern string, handler func(http.ResponseWriter, *http.Request)) {
	http.HandleFunc(pattern, func(w http.ResponseWriter, r *http.Request) {
		startTime := time.Now()
		handler(w, r)
		passedTime := time.Since(startTime)

		// Increment only the single bucket that this response time falls into.
		labels := prometheus.Labels{
			"handler": pattern,
			"le":      getBucket(passedTime),
		}
		heatmap.With(labels).Inc()
	})
}



The function that locates the bucket is:


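func getBucket(passedTime time.Duration) string {
	// The bucket bounds are in milliseconds, so compare in milliseconds.
	millis := passedTime.Milliseconds()
	for i := 0; i < len(buckets); i++ {
		if millis <= buckets[i] {
			return strconv.FormatInt(buckets[i], 10)
		}
	}
	// Slower than the largest bucket bound.
	return "INF"
}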


And so, we get a nice heatmap in Grafana, showing the histogram of the response times.





Final Notes


In this post we have reviewed how to work around a compatibility issue between Prometheus and Grafana by using our own bucket implementation.

Maybe in one of the next versions this issue will be solved on either the Prometheus or the Grafana side.
