Geth Performance Issue
Recently I've had a performance issue with geth, which is a binary written in Go, so I decided to check where its bottlenecks are. I found that geth has flags to enable pprof monitoring:
--pprof                Enable the pprof HTTP server
--pprofaddr value      pprof HTTP server listening interface (default: "127.0.0.1")
--pprofport value      pprof HTTP server listening port (default: 6060)
So, I've used the default pprof address and port, and only added the --pprof flag.
But what now?
What is pprof, and how can I use it?
What is pprof?
Just like the Ethereum developers did for the geth binary, it can be easily integrated into any Go application by following the steps in the pprof package documentation page.
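As a rough sketch of what that integration looks like (the listen address below simply mirrors geth's defaults and is an assumption of this example), importing net/http/pprof for its side effects and starting an HTTP server is usually all it takes:

package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/* handlers on the default mux
)

func main() {
	// Serve the pprof endpoints on the same address and port geth uses by default.
	go func() {
		log.Println(http.ListenAndServe("127.0.0.1:6060", nil))
	}()

	// ... the rest of the application's work goes here ...
	select {}
}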
pprof is based on profiles, as described in the package documentation:
"
A Profile is a collection of stack traces showing the call sequences that led to instances of a particular event, such as allocation. Packages can create and maintain their own profiles; the most common use is for tracking resources that must be explicitly closed, such as files or network connections.
"and you can monitor several resources such as:
goroutine - stack traces of all current goroutines
heap - a sampling of memory allocations of live objects
allocs - a sampling of all past memory allocations
threadcreate - stack traces that led to the creation of new OS threads
block - stack traces that led to blocking on synchronization primitives
mutex - stack traces of holders of contended mutexes
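These named profiles can also be read programmatically through the runtime/pprof package, without going through the HTTP server. The following is a minimal sketch (the choice of the goroutine profile and writing to stdout are just examples) of dumping one of them:

package main

import (
	"os"
	"runtime/pprof"
)

func main() {
	// Look up the built-in "goroutine" profile and write a human-readable
	// dump (debug=1) to stdout; the other profile names listed above work the same way.
	if p := pprof.Lookup("goroutine"); p != nil {
		p.WriteTo(os.Stdout, 1)
	}
}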
Once you have pprof enabled, you can open http://127.0.0.1:6060/debug/pprof in a browser; this index page links to the individual profiles, and from there you can investigate the application's CPU, memory, and more.
How to use pprof?
I've found the best way to do the actual investigation is to follow these steps.
First, integrate pprof into the application (in case it is not already there), as described above.
Then, run the application and wait until it starts having performance issues.
Once the application CPU is high, run the following command:
go tool pprof http://127.0.0.1:6060
This samples the application for 30 seconds and then provides an interactive prompt for analyzing the captured activity. For example, running the top command produces the following output:
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 230ms, 100% of 230ms total
      flat  flat%   sum%        cum   cum%
      30ms 13.04% 13.04%       30ms 13.04%  syscall.Syscall
      20ms  8.70% 21.74%       20ms  8.70%  runtime.cgocall
      20ms  8.70% 30.43%       20ms  8.70%  runtime.pcvalue
      20ms  8.70% 39.13%       20ms  8.70%  runtime.usleep
      10ms  4.35% 43.48%       10ms  4.35%  cmpbody
      10ms  4.35% 47.83%       20ms  8.70%  github.com/ethereum/go-ethereum/log.escapeString
      10ms  4.35% 52.17%       10ms  4.35%  net/textproto.(*Reader).ReadMIMEHeader
      10ms  4.35% 56.52%       10ms  4.35%  runtime.(*itabTableType).find
      10ms  4.35% 60.87%       10ms  4.35%  runtime.decoderune
...
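In this output, flat is the time spent in the function itself, while cum also includes the time spent in the functions it calls. To make that concrete, here is a small, purely hypothetical program (the function names are made up for illustration and have nothing to do with geth): if it were profiled, parent would show a tiny flat value but a cum value close to hotLoop's, because nearly all of the work happens in its callee.

package main

import "fmt"

// hotLoop does the actual computation, so the samples land here: high flat time.
func hotLoop() int {
	sum := 0
	for i := 0; i < 1000000000; i++ {
		sum += i % 7
	}
	return sum
}

// parent does almost nothing itself (low flat time), but its cum time
// includes hotLoop's time because it calls it.
func parent() int {
	return hotLoop()
}

func main() {
	fmt.Println(parent())
}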
While the top command might be OK for simple applications, I've found the block diagram much more effective. To create a diagram, run the svg command. Notice that if you get the error:
Failed to execute dot. Is Graphviz installed? Error: exec: "dot": executable file not found in $PATH
Then you need to install Graphviz:
sudo apt-get install graphviz
Once Graphviz is installed, the svg command produces an image of the call hierarchy and its costs, for example:
While investigating the call stack, I noticed that some of the nodes are dropped due to the logic described by these pprof flags:
--nodefraction= This option provides another mechanism for discarding nodes from the display. If the cumulative count for a node is less than this option's value multiplied by the total count for the profile, the node is dropped. The default value is 0.005; i.e. nodes that account for less than half a percent of the total time are dropped. A node is dropped if either this condition is satisfied, or the --nodecount condition is satisfied.
--edgefraction= This option controls the number of displayed edges. First of all, an edge is dropped if either its source or destination node is dropped. Otherwise, the edge is dropped if the sample count along the edge is less than this option's value multiplied by the total count for the profile. The default value is 0.001; i.e., edges that account for less than 0.1% of the total time are dropped.
Since I wanted more information, I've used these flags to select more nodes to be displayed:
go tool pprof -edgefraction 0 -nodefraction 0 -nodecount 100000 http://127.0.0.1:6060
And then I was able to fully understand the performance issues.