Wednesday, January 29, 2020

GO Shared Library




Many projects are based on a microservices architecture and GO based development. Such a setup usually creates a need for common code shared among the microservices.
In this post we will review the GO shared library options, including:

  1. Single project & internal package
  2. Multiple projects & public shared package
  3. Multiple projects & internal shared package


1. Single Project & Internal Package


Let's start with a simple project, which includes a package.
Following the How to Write Go Code article, we get the following files structure:

go-sample-project/
  go.mod
  main.go
  greet/
    greeting.go

The project includes the main.go file, which calls a function from the greet package.


package main

import (
   "fmt"
   "github.com/alonana/go-sample-project/greet"
)

func main() {
   fmt.Printf("Hello, %v\n", greet.RandomGreet())
}



And the greeting.go in the greet package includes the shared code.


package greet

import "math/rand"

var names = []string{"Mister", "Lady", "Alice", "Bob"}

func RandomGreet() string {
   return names[rand.Intn(len(names))]
}



In addition, the project is using GO modules, so a go.mod file is included:


module github.com/alonana/go-sample-project
go 1.12
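
At this point the project can be built and run from the module root using the standard Go tooling, for example:

go build ./...
go run .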



Some points that should be handled for a shared package:

  • An exposed API must use names starting with an upper case letter. For example, RandomGreet is exported, while the names array is internal to the greet package (see the sketch after this list).
  • All the files in the same folder should have the same package name.
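
As a minimal illustration of the naming rule, a hypothetical addition to the greet package might look like this (FormalGreet and decorate are made-up names, not part of the original project):

package greet

// FormalGreet is exported (upper case), so importing projects can call it.
func FormalGreet(name string) string {
   return decorate(name)
}

// decorate is unexported (lower case), so it is visible only inside the greet package.
func decorate(name string) string {
   return "Dear " + name
}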


2. Multiple Projects & Public Shared Package


As we progress in our development, we find that the greet package should be shared among multiple microservices. Since we want to avoid code duplication, we create a separate project for the greet package, and use it in each project.

Let's first create the greet shared project, using the following files structure:

greet/
  go.mod
  greeting.go

The greeting.go remains the same; however, as this is a new project, we need to create its own go.mod file.


module github.com/alonana/greet
go 1.12


Once this is done, we should upload the project to GitHub.
For example, by using the following commands:


git init
git add go.mod
git add greeting.go
git commit -m "First commit"
git remote add origin https://github.com/alonana/greet.git
git push -u origin master


Next, we update the sample project: we delete greeting.go from it, and update the go.mod to use the public version of the greet package.


module github.com/alonana/go-sample-project
go 1.12
require github.com/alonana/greet v0.0.0-20200129080101-0619d8064721
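
Rather than copying the pseudo-version manually, we can let the Go tooling resolve it; running the following from the sample project folder should add (or update) the require line in go.mod:

go get github.com/alonana/greet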


And update the import instructions in the sample project main.go:


package main

import (
   "fmt"
   "github.com/alonana/greet"
)

func main() {
   fmt.Printf("Hello, %v\n", greet.RandomGreet())
}



3. Multiple Projects & Internal Shared Package


A shared package hosted on GitHub is nice, but it includes an overhead.
After each update to the shared package, we need to push the change to GitHub, and update the projects that use the package to import the new version. That's fine if the package is stable and seldom updated. However, if the shared package is often updated, this overhead is cumbersome.

To avoid this, we can import the package directly by a relative path. 
This is done by updating the go.mod of the sample project:


module github.com/alonana/go-sample-project
go 1.12
require github.com/alonana/greet v0.0.0-00010101000000-000000000000
replace github.com/alonana/greet => ../greet


This assumes that both the sample project and the shared package project reside in the same parent folder. If that is not the case, an absolute path can be used as well.
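
For example, a hypothetical replace directive using an absolute path (the path is illustrative only):

replace github.com/alonana/greet => /home/developer/work/greet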



Final Notes


In this post we've examined methods for using a shared package.

I myself have used the replace method in the go.mod file. It works great: any update to the shared package files immediately affects the projects that use it. I strongly recommend this method.

Notice that each of these methods requires different handling when building a related docker image. For example, the replace method requires copying the shared library into the source folder of the built project.

Wednesday, January 22, 2020

Dynamic Function Call using Reflection in GO





I've had a case where a GO based server received requests over the network.
The requests include a request type in the URL, and the parameters as JSON in the request body.

I could use the echo library for this.
The echo library asks each handler to handle the data conversion itself, for example:


func handler(c echo.Context) error {
  u := new(User)
  if err := c.Bind(u); err != nil {
    return err
  }
  // do things with the User input
  return nil
}


But something felt wrong...
Why should I manually convert the input to a specific type?
Instead I would prefer to send the input structure directly to the function.
Something like this:


func handler(u User) {
  // do things with the User input
}

So, I've decided to create a reflection based caller to implement this.
The reflection API activates the related function and handles the type conversion.
This is very similar to a Java based Spring REST controller.


The Dynamic Activator


The dynamic activator handles casting the bytes input into the function's input structure, as well as the activation of the function itself. It holds a list of registered functions:


type Dynamic struct {
   functions map[string]interface{}
}

func (d *Dynamic) Init() {
   d.functions = make(map[string]interface{})
}


To register a function, we add the pointer of the function:


func (d *Dynamic) Register(function interface{}) {
   fullName := runtime.FuncForPC(reflect.ValueOf(function).Pointer()).Name()
   sections := strings.Split(fullName, ".")
   name := sections[len(sections)-1]
   d.functions[name] = function
}


And to run the function, we use GO reflection.
The reflection loads the type of the function parameter from the function itself, and then uses that type to de-serialize the JSON byte array into the related fields.


func (d *Dynamic) Execute(name string, data []byte) {
   function := d.functions[name]
   functionValue := reflect.ValueOf(function)
   functionType := functionValue.Type()
   parameterType := reflect.New(functionType.In(0))
   err := json.Unmarshal(data, parameterType.Interface())
   if err != nil {
      panic(err)
   }

   parameterValue := parameterType.Elem()
   transactionResult := functionValue.Call([]reflect.Value{parameterValue})

   returnValue := transactionResult[0].Interface()
   if returnValue != nil {
      err := returnValue.(error)
      panic(err)
   }
}

That's all, so simple.


Usage Example


Let's see how we can use the Dynamic Activator.
First, let's create two functions.
Notice:

  • Each function receives a different type of input structure
  • The input parameter structures must use exported (upper case) fields to enable JSON serialization


type Parameters1 struct {
   Price int `json:"price"`
}

type Parameters2 struct {
   Id     string `json:"id"`
   Amount int    `json:"amount"`
}

func func1(p *Parameters1) error {
   fmt.Printf("the price is %v\n", p.Price)
   return nil
}

func func2(p *Parameters2) error {
   if p.Id == "-1" {
      return errors.New("bad id")
   }
   fmt.Printf("the amount is %v\n", p.Amount)
   return nil
}

Now let's register the functions, and dynamically activate each one.


d := Dynamic{}
d.Init()
d.Register(func1)
d.Register(func2)
d.Execute("func1", []byte(`{"price":1}`))
d.Execute("func2", []byte(`{"id":"5","Amount":7}`))
d.Execute("func2", []byte(`{"id":"-1","Amount":2}`))

and the output is:


the price is 1
the amount is 7
panic: bad id
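
For completeness, here is a sketch of wiring the dynamic activator behind an echo route, in the spirit of the handler shown at the top of the post. It assumes that Dynamic, func1 and func2 live in the same main package, and uses the labstack echo v4 import path:

package main

import (
   "io/ioutil"
   "net/http"

   "github.com/labstack/echo/v4"
)

func main() {
   d := Dynamic{}
   d.Init()
   d.Register(func1)
   d.Register(func2)

   e := echo.New()
   // The request type comes from the URL, the parameters come from the JSON body.
   e.POST("/:name", func(c echo.Context) error {
      body, err := ioutil.ReadAll(c.Request().Body)
      if err != nil {
         return err
      }
      d.Execute(c.Param("name"), body)
      return c.NoContent(http.StatusOK)
   })
   e.Logger.Fatal(e.Start(":8080"))
}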


Final Thoughts


Despite the many advantages of GO over Java, GO is younger than Java, and still has a way to go before reaching the maturity of Java and SPRING.
Maybe in one of the next versions, the GO echo library developers will take this step forward, and even start the next GO-SPRING...



Redis Persistence



Using a Redis server for cache and for microservices interaction is great.
You can use a Redis cluster to ensure high availability in case one of your instances goes down.
See, for example, the related posts I've published on this blog.


But what about system restart?

Even with a multiple instance cluster, restarting the entire Redis cluster means that the data is lost, since by default Redis stores the data in memory only.
This is where Redis persistence saves the day.

Redis has two types of persistence:

  1. RDB - Redis Database Backup file
  2. AOF - Append Only File


RDB


RDB stores snapshots of the data stored in Redis memory to a binary file named dump.rdb.
The RDB can be configured to run every X seconds if at least Y keys were updated.

The related Redis configuration is:

# location of the backup files
dir /data

# create backup once 300 seconds passed if at least 1000 keys were changed
save 300 1000

# The filename where to dump the DB
dbfilename dump.rdb

The RDB creation is actually performed by a forked child process that handles the backup in the background, so except for the forking time itself (which might be significant for a Redis instance with high memory consumption), the impact is low.

Using RDB means that we have a backup of the database once in a while, but in case of a system crash, all of the updates since the last backup are lost.
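
A snapshot can also be triggered manually from the CLI, which is useful right before a planned restart; BGSAVE forks and saves in the background:

127.0.0.1:6379> BGSAVE
Background saving started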


AOF


AOF writes every update to a text log file in append mode. Upon restart, Redis replays the commands to rebuild the state in memory.

In case the Redis process was terminated in the middle of AOF append, the last record in the file might be corrupt, but Redis can overcome this issue automatically.

The AOF can be configured to commit to the disk using one of the following modes:

  • always - slow and safe, write to disk on each key update
  • everysec - write to disk once a second
  • no - trust the OS to commit to the disk, usually once in ~30 seconds


The related configuration is:
# location of the AOF files
dir /data

appendonly yes
appendfilename "appendonly.aof"

# rewrite the AOF in the following cases
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb

# automatic recover/ignore a corrupt AOF last record
aof-load-truncated yes

# appendfsync always
# appendfsync everysec
# appendfsync no

Using AOF means that you gain a higher durability level than RDB, but the performance cost and the file size are higher. Use AOF if data loss should be avoided as much as possible at a reasonable cost.
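
The fsync policy can also be changed at runtime, and an AOF rewrite can be triggered manually, for example:

127.0.0.1:6379> CONFIG SET appendfsync everysec
OK
127.0.0.1:6379> BGREWRITEAOF
Background append only file rewriting started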


Which method to use?


Redis documentation recommends the following:
  • Use RDB only if data loss is not critical
  • Otherwise use RDB and AOF
Why use both?
The Redis engineers claim that RDB is still required for manual backups, which you should perform yourself (out of the scope of the Redis process).


Final Notes


To investigate the configuration of an active Redis instance, use the CONFIG GET command, for example:

127.0.0.1:6379> CONFIG GET save
1) "save"
2) "60 1"

Alternatively, use CONFIG GET * to view the entire configuration.


Thursday, January 16, 2020

Using prometheus and grafana to monitor kubernetes pods CPU




Once a system is up and running, we usually want to benchmark it. That means we run some stress tests, and check the performance of the system. The performance is measured by various means, for example: transactions per second, CPU usage, and memory usage.

However, in a complex microservices kubernetes based system, we have many moving parts, and it is hard to pinpoint the one that is the current bottleneck. This is where prometheus and grafana can assist us.

In this post we will review deployment of prometheus and grafana on a kubernetes cluster.
This includes:

  • Prometheus kubernetes permissions
  • Prometheus configuration
  • Prometheus deployment and service
  • Grafana deployment and service
  • Grafana predefined dashboards


Prometheus


Prometheus is an open source community driven monitoring system and time series database.
To deploy it as part of a kubernetes cluster we need to create a service account with permissions to access kubernetes resources. This includes: service account, cluster role, and cluster role binding.

Prometheus Permissions


Service account:

apiVersion: v1
kind: ServiceAccount
metadata:
  name:   prometheus-service-account
  namespace: default

Cluster role:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-role
rules:
  - apiGroups: [""]
    resources:
      - nodes
      - nodes/proxy
      - services
      - endpoints
      - pods
    verbs: ["get", "list", "watch"]
  - apiGroups:
      - extensions
    resources:
      - ingresses
    verbs: ["get", "list", "watch"]
  - nonResourceURLs: ["/metrics"]
    verbs: ["get"]

Cluster role binding:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus-role-binding
subjects:
  - kind: ServiceAccount
    name: prometheus-service-account
    namespace: default
roleRef:
  kind: ClusterRole
  name: prometheus-role
  apiGroup: rbac.authorization.k8s.io

Prometheus Configuration


The prometheus configuration is saved in a kubernetes ConfigMap. It includes various options, such as the scraping interval (sample the targets every X seconds), and the cAdvisor job, which runs as part of the kubelet binary.

The ConfigMap is the following:

apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
data:
  prometheus.yaml: |-
    global:
      scrape_interval: 5s
      evaluation_interval: 5s
    
    scrape_configs:
      - job_name: 'kubernetes-apiservers'
    
        kubernetes_sd_configs:
        - role: endpoints
        scheme: https
    
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    
        relabel_configs:
        - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
          action: keep
          regex: default;kubernetes;https
    
      - job_name: 'kubernetes-nodes'
    
        scheme: https
    
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    
        kubernetes_sd_configs:
        - role: node
    
        relabel_configs:
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)
        - target_label: __address__
          replacement: kubernetes.default.svc:443
        - source_labels: [__meta_kubernetes_node_name]
          regex: (.+)
          target_label: __metrics_path__
          replacement: /api/v1/nodes/${1}/proxy/metrics
    
    
      - job_name: 'kubernetes-pods'
    
        kubernetes_sd_configs:
        - role: pod
    
        relabel_configs:
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
          action: keep
          regex: true
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
          action: replace
          target_label: __metrics_path__
          regex: (.+)
        - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
          action: replace
          regex: ([^:]+)(?::\d+)?;(\d+)
          replacement: $1:$2
          target_label: __address__
        - action: labelmap
          regex: __meta_kubernetes_pod_label_(.+)
        - source_labels: [__meta_kubernetes_namespace]
          action: replace
          target_label: kubernetes_namespace
        - source_labels: [__meta_kubernetes_pod_name]
          action: replace
          target_label: kubernetes_pod_name
    
      - job_name: 'kube-state-metrics'
        static_configs:
          - targets: ['kube-state-metrics.kube-system.svc.cluster.local:8080']
    
      - job_name: 'kubernetes-cadvisor'
    
        scheme: https
    
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    
        kubernetes_sd_configs:
        - role: node
    
        relabel_configs:
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)
        - target_label: __address__
          replacement: kubernetes.default.svc:443
        - source_labels: [__meta_kubernetes_node_name]
          regex: (.+)
          target_label: __metrics_path__
          replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
    
      - job_name: 'kubernetes-service-endpoints'
    
        kubernetes_sd_configs:
        - role: endpoints
    
        relabel_configs:
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
          action: keep
          regex: true
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
          action: replace
          target_label: __scheme__
          regex: (https?)
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
          action: replace
          target_label: __metrics_path__
          regex: (.+)
        - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
          action: replace
          target_label: __address__
          regex: ([^:]+)(?::\d+)?;(\d+)
          replacement: $1:$2
        - action: labelmap
          regex: __meta_kubernetes_service_label_(.+)
        - source_labels: [__meta_kubernetes_namespace]
          action: replace
          target_label: kubernetes_namespace
        - source_labels: [__meta_kubernetes_service_name]
          action: replace
          target_label: kubernetes_name

Prometheus Deployment & Service


Next, we need to create the prometheus deployment, and a service allowing access to it. Notice that we are not using persistent storage, since the data is usually needed only during the benchmark testing; if required, change the deployment to use persistent storage. Also notice that the prometheus service is exposed as a NodePort, allowing easy access to the service on port 30030.

Prometheus deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      configid: prometheus-container
  template:
    metadata:
      labels:
        configid: prometheus-container        
    spec:
      serviceAccountName: prometheus-service-account
      containers:
        - name: prometheus
          image: prom/prometheus:v2.15.2
          imagePullPolicy: IfNotPresent
          args:
            - "--config.file=/etc/prometheus/prometheus.yaml"
            - "--storage.tsdb.path=/prometheus/"
          volumeMounts:
            - name: prometheus-config
              mountPath: /etc/prometheus
      volumes:
        - name: prometheus-config
          configMap:
            defaultMode: 420
            name: prometheus-config

Prometheus service:

apiVersion: v1
kind: Service
metadata:
  name: prometheus-service
spec:
  selector:
    configid: prometheus-container
  type: NodePort
  ports:
      - port: 80
        targetPort: 9090
        name: http
        protocol: TCP
        nodePort: 30030

Once prometheus is deployed, we can access the service to view the collected statistics.
For example, to view the CPU usage of all MongoDB containers, we can use the following PromQL query:

rate(container_cpu_usage_seconds_total{pod=~".*mongo.*"}[1m])

and get the following graph:




Grafana


Now that we have the statistics in prometheus, we can use grafana as the GUI to view and analyze the data.

First, we create the ConfigMap that configures prometheus as the data source of grafana:

apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-datasources-config
data:
  datasource-prometheus.yaml: |-
    apiVersion: 1
    datasources:
    - name: "prometheus"
      access: "proxy"
      editable: true
      orgId: 1
      type: "prometheus"
      url: "http://prometheus-service"
      version: 1


And then we create a deployment, which uses the ConfigMap:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      configid: grafana-container
  template:
    metadata:
      labels:
        configid: grafana-container        
    spec:
      containers:
        - name: grafana
          image: grafana/grafana:6.5.2-ubuntu
          imagePullPolicy: IfNotPresent
          volumeMounts:
            - name: grafana-datasources-config
              mountPath: /etc/grafana/provisioning/datasources
      volumes:
        - name: grafana-datasources-config
          configMap:
            name: grafana-datasources-config

and a service to enable access to the grafana GUI.
The service is exposed using NodePort 30040.

apiVersion: v1
kind: Service
metadata:
  name: grafana-service
spec:
  selector:
    configid: grafana-container
  type: NodePort
  ports:
      - port: 80
        targetPort: 3000
        name: http
        protocol: TCP
        nodePort: 30040


The grafana GUI can now be used to access the prometheus data.



Grafana Predefined Dashboards


But wait...
What if we have worked hard, created a custom dashboard in grafana, and then want to reinstall the solution on the kubernetes cluster? Is our dashboard lost?

To overcome this, we can use the grafana provisioning capability.
Once our custom dashboard is ready, open the dashboard in the grafana GUI, click on the cog icon at the top of the page, and select the "JSON Model" tab.

This JSON should be saved to a ConfigMap that will be used upon grafana startup to load a provisioned dashboard.
Actually, we will use 2 ConfigMaps.
The first ConfigMap, the "provisioning dashboard location", instructs grafana to look for custom dashboards in a specific folder, and the second ConfigMap includes our JSON Model.

The "provisioning dashboard location" ConfigMap is:

apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-dashboards-config
data:
  dashboards-provider.yaml: |-
    apiVersion: 1
    
    providers:
      #  an unique provider name
      - name: 'predefined-dashboards'
        #  org id. will default to orgId 1 if not specified
        orgId: 1
        #  name of the dashboard folder. Required
        folder: ''
        #  folder UID. will be automatically generated if not specified
        folderUid: ''
        #  provider type. Required
        type: file
        #  disable dashboard deletion
        disableDeletion: false
        #  enable dashboard editing
        editable: true
        #  how often Grafana will scan for changed dashboards
        updateIntervalSeconds: 10
        #  allow updating provisioned dashboards from the UI
        allowUiUpdates: false
        options:
          #  path to dashboard files on disk. Required
          path: /predefined-dashboards


And the JSON Model ConfigMap is:

apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-predefined-dashboards-config
  labels:
    app.kubernetes.io/instance: bouncer
    app.kubernetes.io/name: bouncer
data:
  dashboard-bouncer.json: |-
    {
   ... The JSON Model text 
   ... NOTICE THE INDENTATION OF THE TEXT
    }

Last, we need to update the grafana deployment to use these ConfigMaps:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      configid: grafana-container
  template:
    metadata:
      labels:
        configid: grafana-container        
    spec:
      containers:
        - name: grafana
          image: grafana/grafana:6.5.2-ubuntu
          imagePullPolicy: IfNotPresent
          volumeMounts:
            - name: grafana-dashboards-config
              mountPath: /etc/grafana/provisioning/dashboards
            - name: grafana-datasources-config
              mountPath: /etc/grafana/provisioning/datasources
            - name: grafana-predefined-dashboards-config
              mountPath: /predefined-dashboards
      volumes:
        - name: grafana-predefined-dashboards-config
          configMap:
            name: grafana-predefined-dashboards-config
        - name: grafana-dashboards-config
          configMap:
            name: grafana-dashboards-config
        - name: grafana-datasources-config
          configMap:
            name: grafana-datasources-config


Summary


We have used prometheus and grafana to assist us in analyzing the performance of kubernetes pods.
In this post the CPU was measured using the "container_cpu_usage_seconds_total" counter, but many other counters are available, such as memory, disk and network. See this article for example.
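
For example, a memory query in the same spirit, assuming the cAdvisor metric container_memory_working_set_bytes is scraped as in the configuration above, could be:

container_memory_working_set_bytes{pod=~".*mongo.*"}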

Thursday, January 9, 2020

Using NGINX auth_request to dynamically proxy to multiple backend servers




Last week I've had to use NGINX as a reverse proxy for 2 microservices: backend A, and backend B.
However, I needed more than a simple reverse proxy.
Unlike the standard supported NGINX routing based on the request properties, such as URL and query string regex matching, I wanted to use a dynamic backend selection.

I wanted NGINX to dynamically select a different backend server per request, based on a request to another microservice. The same request might be routed to the backend A microservice upon first access, but later might need to be routed to backend server B. The decision is entirely the responsibility of application code running in another microservice.



The expected flow is:

  1. The browser accesses NGINX
  2. The NGINX sends a request to the router microservice
  3. The router microservice returns the current target for the request, for example: Backend A
  4. NGINX proxies the request to the related backend microservice: A or B

First, I thought about using NGINX Lua, but I found it not recommended for production environments, and quite complicated.

Then I've noticed the NGINX support for auth_request, especially this section in the NGINX documentation:


Syntax:  auth_request_set $variable value;
Default: —
Context: http, server, location
Sets the request variable to the given value after the authorization request completes. The value may contain variables from the authorization request, such as $upstream_http_*.


WOW!! Just what I need!
So, I've created the following NGINX configuration:

...

http {

  ...
  
  server {

    location /router {
      internal;
      proxy_set_header originalHost $host;
      proxy_set_header originalUrl $request_uri;
      proxy_pass_request_body off;
      proxy_set_header Content-Length "";
      proxy_pass http://router-service/router;
    }

    location / {
      auth_request /router;
      auth_request_set $validate_targetHost $upstream_http_targetHost;
      proxy_set_header Host $host;
      proxy_pass "${validate_targetHost}.default.svc.cluster.local${request_uri}";
    }
  }
}


The NGINX configuration contains 2 sections:

  1. The location /
    This includes a configuration to send an authorization request to /router, get the targetHost header from the response, and use it to select the backend microservice as part of the proxy_pass directive.
  2. The location /router
    This includes a configuration to send the request host and URL to the router microservice, without any body in the request.

The router service is a nodeJS express server.
Here is a snippet of the related code:


import express from 'express'
const app = express()

...


app.post('/router', (req, res) => {
  const originalUrl = req.header('originalUrl')
  const originalHost = req.header('originalHost')

  const targetHost = ...  // set targetHost to backendA or backendB based on application logic
  res.set('targetHost', targetHost)
  res.json({})
})

It looks like I've made it, but then I got an error from NGINX:

no resolver defined to resolve backendA while sending to client

Huh?
I've checked ping to the backendA service, and it worked fine.
It turned out that NGINX does not use the standard DNS configuration (see here), so I had to manually configure the DNS resolution for it.

First, I run the following script as part of NGINX startup:

echo resolver $(awk 'BEGIN{ORS=" "} $1=="nameserver" {print $2}' /etc/resolv.conf) ";" > /etc/nginx/resolvers.conf
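
On a typical kubernetes cluster, the generated file contains a single resolver line pointing to the cluster DNS service, something like (the nameserver address is illustrative):

resolver 10.96.0.10 ;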

and then, I've added the /etc/nginx/resolvers.conf to the nginx.conf file:

...

http {
   
  include         /etc/nginx/resolvers.conf;
  ...

And it worked!


Summary


This post covers how to use auth_request in NGINX to dynamically select the proxy backend.
We have presented how to implement a dynamic router microservice based on NodeJS express, and how to configure NGINX DNS resolvers.


Thursday, January 2, 2020

Monitoring a GO application using pprof



Geth Performance Issue


Recently I've had a performance issue with geth, which is a GoLang based binary, so I decided to check its bottlenecks. I've found that there are flags for geth to enable pprof monitoring:

--pprof             Enable the pprof HTTP server
--pprofaddr value   pprof HTTP server listening interface (default: "127.0.0.1")
--pprofport value   pprof HTTP server listening port (default: 6060)

So, I've used the default pprof address and port, and only added the --pprof flag.
But what now?
What is pprof, and how can I use it?

What is pprof?


pprof is a tool by google, enabling "visualization and analysis of profiling data".
Just like the ethereum developers did for the geth binary, it can be easily integrated into any Go application, by following the steps in the pprof package documentation page.
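
A minimal sketch of such an integration, assuming the application imports the net/http/pprof package and exposes the profiling endpoints on port 6060:

package main

import (
   "log"
   "net/http"
   _ "net/http/pprof" // registers the /debug/pprof handlers on the default mux
)

func main() {
   // Serve the pprof endpoints in the background.
   go func() {
      log.Println(http.ListenAndServe("127.0.0.1:6060", nil))
   }()

   // ... the rest of the application runs here
   select {}
}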

pprof is based on profiles, as described in the package documentation:
"
A Profile is a collection of stack traces showing the call sequences that led to instances of a particular event, such as allocation. Packages can create and maintain their own profiles; the most common use is for tracking resources that must be explicitly closed, such as files or network connections.
"

and you can monitor several resources such as:

goroutine    - stack traces of all current goroutines
heap         - a sampling of memory allocations of live objects
allocs       - a sampling of all past memory allocations
threadcreate - stack traces that led to the creation of new OS threads
block        - stack traces that led to blocking on synchronization primitives
mutex        - stack traces of holders of contended mutexes


Once you have pprof enabled, you can open a browser to your application using the URL: http://127.0.0.1:6060/debug/pprof, and investigate the application's CPU, memory, and more.

How to use pprof?


I've found that the best method for the actual investigation is to follow these steps.

First, integrate pprof into the application (in case it is not already there) as described above.
Then, run the application and wait until it starts having performance issues.
Once the application CPU is high, run the following command:

go tool pprof http://127.0.0.1:6060

This samples the application for 30 seconds, and then provides an interactive prompt to analyze the sampled activity. For example, running the top command produces the following output:

Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 230ms, 100% of 230ms total
      flat  flat%   sum%        cum   cum%
      30ms 13.04% 13.04%       30ms 13.04%  syscall.Syscall
      20ms  8.70% 21.74%       20ms  8.70%  runtime.cgocall
      20ms  8.70% 30.43%       20ms  8.70%  runtime.pcvalue
      20ms  8.70% 39.13%       20ms  8.70%  runtime.usleep
      10ms  4.35% 43.48%       10ms  4.35%  cmpbody
      10ms  4.35% 47.83%       20ms  8.70%  github.com/ethereum/go-ethereum/log.escapeString
      10ms  4.35% 52.17%       10ms  4.35%  net/textproto.(*Reader).ReadMIMEHeader
      10ms  4.35% 56.52%       10ms  4.35%  runtime.(*itabTableType).find
      10ms  4.35% 60.87%       10ms  4.35%  runtime.decoderune
...


While the top command might be OK for simple applications, I've found the block diagram much more effective. To create a diagram, run the svg command. Notice that if you get the error:

Failed to execute dot. Is Graphviz installed? Error: exec: "dot": executable file not found in $PATH

Then you need to install Graphviz:

sudo apt-get install graphviz

Once Graphviz is installed, the svg command produces an image of the call hierarchy and costs, for example:




While investigating the call stack, I noticed that some of the nodes are dropped due to logic described by the pprof flags:

--nodefraction= This option provides another mechanism for discarding nodes from the display. If the cumulative count for a node is less than this option's value multiplied by the total count for the profile, the node is dropped. The default value is 0.005; i.e. nodes that account for less than half a percent of the total time are dropped. A node is dropped if either this condition is satisfied, or the --nodecount condition is satisfied.
--edgefraction= This option controls the number of displayed edges. First of all, an edge is dropped if either its source or destination node is dropped. Otherwise, the edge is dropped if the sample count along the edge is less than this option's value multiplied by the total count for the profile. The default value is 0.001; i.e., edges that account for less than 0.1% of the total time are dropped.

Since I wanted more information, I've used these flags to select more nodes to be displayed:
go tool pprof  -edgefraction 0 -nodefraction 0 -nodecount 100000 http://127.0.0.1:6060

And then I was able to fully understand the performance issues.

Summary


In this post, we've demonstrated how to use pprof to investigate the performance of a GO based application. It is a great and simple tool that can be easily integrated into any GO application.