Linux Kernel GCOV - tool analysis
Nicholas Mc Guire
Distributed & Embedded Systems Lab
SISE, Lanzhou University, Lanzhou, P.R. China
mcguire@lzu.edu.cn, http://dslab.lzu.edu.cn
February 8, 2006
Contents
1. Kernel gcov support - tool analysis
  1.1. Source
  1.2. patch file
  1.3. Patch analysis
  1.4. Architecture dependent changes
  1.5. Architecture support
  1.6. Basic technology
    1.6.1. -fprofile-arcs
    1.6.2. -ftest-coverage
  1.7. Building for 2.4.X
    1.7.1. patch the kernel
    1.7.2. patch the modutils
  1.8. building for 2.6.X
    1.8.1. Applying the Patch
  1.9. Configuration
    1.9.1. Update Lilo
    1.9.2. Update GRUB
  1.10. Runtime Configuration
  1.11. Data acquisition
    1.11.1. File content
  1.12. Extracting profiling data
  1.13. Checking Code Coverage
  1.14. Kernel Optimization
    1.14.1. X86 Note
  1.15. encountered problems
  1.16. Performance Impact
  1.17. RT-performance impact
  1.18. General Issues
    1.18.1. Authors
    1.18.2. License
    1.18.3. Patch status
    1.18.4. Related work
  1.19. Conclusion
2. List of Acronyms
Version  Author             Date         Comment
1.0      Nicholas Mc Guire  Jan 2005     First shot
1.1      Georg Schiesser    18 Jan 2005  converted to TEX document
1.2      Nicholas Mc Guire  Jan 2006     2.6 revision
This gcov introduction/manual is released under FDL V1.2 [1]. All software used for this
session is available under the GPL V2 license [2].
1. Kernel gcov support - tool analysis
In the framework of Work Package 5 - Boot-Time Optimization, of "A Comparative
Study on Real-time enhanced Linux Variants" conducted for Siemens CT SE2,
Muenchen, research on existing tools to analyze boot-times was performed. In this
article, derived from analysis notes, we describe the tool's basics and usage. The intention
of this article is to provide practical guidance for engineers using these tools and to
provide the conceptual basics so that these free-software tools are no longer black boxes.
For a general introduction to runtime debugging in embedded systems we refer you to [6].
gcov is one of the well-known tools for user-space applications, so extending it into
kernel space seems a quite natural thing to do. In this article we describe the tool
analysis, gcov usage, and data acquisition for the 2.4.25 and 2.6.14 kernels.
A brief introduction to the core technology concept and its application to user-space
processes and libraries is given.
Feedback to mcguire@lzu.edu.cn is always appreciated. The latest version of this
document is available at http://dslab.lzu.edu.cn.
This manual assumes a default installation of Slackware [3] 10.0 or 10.1 - though it
should apply to more or less any current distribution.
1.1. Source
lcov-1.4.tar.gz (not strictly required) http://sourceforge.net/projects/ltp ->
gcov-2.6.X.patch.gz
Note that some patches use the naming scheme linux-2.6.X-gcov.patch.gz.
Dependencies: none
1.2. patch file
• /drivers/gcov/gcov-core.c:
The gcov core functions for initializing logging of code coverage data
• /drivers/gcov/gcov-proc.c:
The proc interface built under /proc/gcov
• /include/linux/gcov.h:
GCOV related macros and function prototypes; struct bb is declared here and is
highly gcc version dependent.
Because this code is quite compiler dependent, gcov-core.c is a bit of a mess: basically it
contains the same function set for three different compiler versions, selected by ifdefs.
Note that the actual instrumentation is done by gcc's -fprofile-arcs and -ftest-coverage
flags; the kernel patch only needs to make the data accessible. You can actually compile
a kernel with CFLAGS_KERNEL=-fprofile-arcs -ftest-coverage even without the
patch applied; it would only fail in the linking stage with an unresolved symbol
__bb_init_func, which is exactly what gcov-core provides.
1.3. Patch analysis
The gcov interface is cleanly encapsulated in /drivers/gcov - this code is not really
architecture dependent. All gcov related parts are cleanly ifdef'ed in the code, so turning
off gcov support leaves no side effects.
The main changes of the patch are in the configuration (Kconfig for 2.6.X) and in the
Makefiles - the patch is not very invasive at code level (though it does of course change
the runtime behavior of every function).
Fundamentally gcov-core.c is based on a doubly linked list of struct bb (basic blocks)
in which all data is collected - see include/linux/gcov.h for details on struct bb.
One such list is initialized for each module. This is also one source of the performance
penalty, as these lists can become quite large and thus increase the number of cache misses.
1.4. Architecture dependent changes
The only really architecture specific issue is the sections added for the constructor and
destructor functions. Even this is not strongly architecture dependent, but it cannot be
placed in any architecture independent file.
/arch/i386/kernel/head.S
.section ".ctors","aw"
.globl __CTOR_LIST__
.type __CTOR_LIST__,@object
__CTOR_LIST__:
.section ".dtors","aw"
.globl __DTOR_LIST__
.type __DTOR_LIST__,@object
__DTOR_LIST__:
Changes to the module interface
Every module has to know where its gcov related constructor and destructor functions are
located - so the struct module has two additional fields:
const char *ctors_start; /* Pointer to start of .ctors-section */
const char *ctors_end;   /* Pointer to end of .ctors-section */
which are used in the load_module, sys_init_module and free_module functions
(kernel/module.c) to initialize the gcov data acquisition:
• load_module:
modindex = find_sec(hdr, sechdrs, secstrings, ".ctors");
mod->ctors_start = (char *)sechdrs[modindex].sh_addr;
mod->ctors_end = (char *)(mod->ctors_start +
                          sechdrs[modindex].sh_size);
• sys_init_module:
if (mod->ctors_start && mod->ctors_end) {
    do_global_ctors(mod->ctors_start, mod->ctors_end, mod);
}
• free_module:
if (mod->ctors_start && mod->ctors_end)
    remove_bb_link(mod);
1.5. Architecture support
Supported architectures are i386, ia64, ppc, ppc64 and s390. It should not be too difficult
to extend the patch to further architectures, as none of the patch components is actually
architecture dependent.
1.6. Basic technology
gcov is a test coverage program. It operates in conjunction with GCC's -fprofile-arcs
and -ftest-coverage. The main goals are code coverage and hot-spot location. With
some limits you can use gcov as a profiling tool to help discover where your optimization
efforts will best affect your code. The main limitation is that gcov results are specific to
the system load: optimization is thus done for a specified load profile and generally
implies decreasing performance in other system load scenarios. The main questions
kernel-gcov can answer are:
• how often each line of code executes
• what lines of code are actually executed
• how much computing time each section of code uses
The flags used to enable profiling have the following effects:
1.6.1. -fprofile-arcs
During execution the program records how many times each branch is executed and
how many times it is taken. This data is stored in the filename.da file in
/proc/gcov/PATH_IN_KERNEL_TREE/ for every kernel file filename. This profiling data
is collected by instrumenting the functions. The results from -fprofile-arcs are what
can later be used to optimize the system by feeding basic-block information back with
-fbranch-probabilities (-ftest-coverage is not needed for this purpose). The
instrumentation is not as simple as the one for KFI (Kernel Function Instrumentation), which
simply adds entry and exit code to every function (see -finstrument-functions in the
GCC manual and KFI [?]); -fprofile-arcs is more selective and far more light-weight.
For each function GCC creates a program flow graph, then finds a spanning tree for the graph
(that is, it eliminates loops and redundant paths). Only arcs that are not on the spanning tree
have to be instrumented by adding code to count how often these arcs are executed. An arc
that is not on the spanning tree is an entry or exit point into the function in question. When
an arc is the only exit or only entrance to a block it is directly instrumented; if not, a new
basic block is created and instrumented.
Since spanning tree creation starts with block 0, low numbered arcs are more likely to end up
on the spanning tree than high numbered arcs. This causes most instrumented arcs to be at
the end, which implies an asymmetric distortion of the kernel code - thus timing information
gained from gcov instrumented kernels is most likely not reliable. For details on this see
the file gcc/gcov.c in the gcc source tree.
The actual code change done in the files is to initialize an array with one entry per arc
found in the file and then to instrument each arc with a 64 bit counter (implemented as
two 32 bit values).
addl $1, .LPBX2+OFFSET
adcl $0, .LPBX2+OFFSET+4
The LPBX2 array (conforming to the gcov .da file format) can be quite large (e.g.
kernel/sched.c 13k, drivers/ide/ide-disk.c 11k); the overall kernel size is increased
by roughly 60%.
The overhead of the instrumentation code itself is thus not too severe - the cache side effects
are more dramatic (see the section on performance impact below).
1.6.2. -ftest-coverage
This flag tells gcc to dump the profiling data to files for the gcov code-coverage utility.
For an introduction to the gcov utility see the gcov info pages (info gcov). Note that
-ftest-coverage in user-space causes profiling data to be generated on program exit only,
while kernel gcov continuously updates the arc counters accessible via /proc/gcov/. This
has the side effect that you never get a consistent trace state when looking at multiple files.
The practical consequence of the non-synchronous file generation via the proc files is that you
can't trace the effects of short running programs. To see this effect we cleared the counters
by writing to /proc/gcov/vmlinux and then immediately copied all the files - this copy
operation alone creates significant counts throughout the entire kernel tree, thus distorting
any application related counts. For long running applications the impact can be considered
negligible, but for short running applications the file copy operation must be taken into account.
1.7. Building for 2.4.X
GCOV support in the 2.4.X kernel series is still a bit experimental - for the 2.6.X series it
looks like a solid tool (see the later section on 2.6.X kernel).