Thursday, January 2, 2020

Monitoring a Go application using pprof



Geth Performance Issue


Recently I've had a performance issue with geth, which is a binary written in Go, so I decided to check its bottlenecks. I've found that geth has flags to enable pprof monitoring:

--pprof             Enable the pprof HTTP server
--pprofaddr value   pprof HTTP server listening interface (default: "127.0.0.1")
--pprofport value   pprof HTTP server listening port (default: 6060)

So, I've used the default pprof address and port, and only added the --pprof flag.
But what now?
What is pprof, and how can I use it?

What is pprof?


pprof is a tool by Google, enabling "visualization and analysis of profiling data".
Just like the Ethereum developers did for the geth binary, it can be easily integrated into any Go application, by following the steps in the pprof package documentation page.

pprof is based on profiles, as described in the package documentation:
"
A Profile is a collection of stack traces showing the call sequences that led to instances of a particular event, such as allocation. Packages can create and maintain their own profiles; the most common use is for tracking resources that must be explicitly closed, such as files or network connections.
"

and you can monitor several resources such as:

goroutine    - stack traces of all current goroutines
heap         - a sampling of memory allocations of live objects
allocs       - a sampling of all past memory allocations
threadcreate - stack traces that led to the creation of new OS threads
block        - stack traces that led to blocking on synchronization primitives
mutex        - stack traces of holders of contended mutexes


Once you have pprof enabled, you can open a browser to your application using the URL http://127.0.0.1:6060/debug/pprof, and investigate the application's CPU, memory, and more.

How to use pprof?


I've found that the best method for the actual investigation is to follow these steps.

First, integrate pprof into the application (in case it is not already there) as described above.
Then, run the application and wait until it starts having performance issues.
Once the application's CPU usage is high, run the following command:

go tool pprof http://127.0.0.1:6060/debug/pprof/profile

This samples the application's CPU for 30 seconds, and then provides an interactive prompt to analyze the sampled activity. For example, running the command top provides the following output:

Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 230ms, 100% of 230ms total
      flat  flat%   sum%        cum   cum%
      30ms 13.04% 13.04%       30ms 13.04%  syscall.Syscall
      20ms  8.70% 21.74%       20ms  8.70%  runtime.cgocall
      20ms  8.70% 30.43%       20ms  8.70%  runtime.pcvalue
      20ms  8.70% 39.13%       20ms  8.70%  runtime.usleep
      10ms  4.35% 43.48%       10ms  4.35%  cmpbody
      10ms  4.35% 47.83%       20ms  8.70%  github.com/ethereum/go-ethereum/log.escapeString
      10ms  4.35% 52.17%       10ms  4.35%  net/textproto.(*Reader).ReadMIMEHeader
      10ms  4.35% 56.52%       10ms  4.35%  runtime.(*itabTableType).find
      10ms  4.35% 60.87%       10ms  4.35%  runtime.decoderune
...


While the top command might be OK for simple applications, I've found the diagram of blocks much more effective. To create a diagram, run the command svg. Notice that if you get the error:

Failed to execute dot. Is Graphviz installed? Error: exec: "dot": executable file not found in $PATH

Then you need to install Graphviz:

sudo apt-get install graphviz

Once Graphviz is installed, the svg command produces an image of the call hierarchy and its costs, for example:




While investigating the call stack, I noticed that some of the nodes are dropped, due to the logic described by these pprof flags:

--nodefraction= This option provides another mechanism for discarding nodes from the display. If the cumulative count for a node is less than this option's value multiplied by the total count for the profile, the node is dropped. The default value is 0.005; i.e. nodes that account for less than half a percent of the total time are dropped. A node is dropped if either this condition is satisfied, or the --nodecount condition is satisfied.
--edgefraction= This option controls the number of displayed edges. First of all, an edge is dropped if either its source or destination node is dropped. Otherwise, the edge is dropped if the sample count along the edge is less than this option's value multiplied by the total count for the profile. The default value is 0.001; i.e., edges that account for less than 0.1% of the total time are dropped.

Since I wanted more information, I've used these flags to select more nodes to be displayed:

go tool pprof -edgefraction 0 -nodefraction 0 -nodecount 100000 http://127.0.0.1:6060/debug/pprof/profile

And then I was able to fully understand the performance issues.

Summary


In this post, we've demonstrated how to use pprof to investigate the performance of a Go-based application. It is a great and simple tool that can be easily integrated into any Go application.


