2. Writing a New Valgrind Tool

			Valgrind Technical Documentation

2.1. Introduction

So you want to write a Valgrind tool? Here are some instructions that may help. They were last updated for Valgrind 3.2.2.

2.1.1. Tools

The key idea behind Valgrind's architecture is the division between its "core" and "tool plug-ins".

The core provides the common low-level infrastructure to support program instrumentation, including the JIT compiler, low-level memory manager, signal handling and a scheduler (for pthreads). It also provides certain services that are useful to some but not all tools, such as support for error recording and suppression.

But the core leaves certain operations undefined, which must be filled in by tools. Most notably, tools define how program code should be instrumented. They can also call certain functions to indicate to the core that they would like to use certain services, or be notified when certain interesting events occur. But the core takes care of all the hard work.

2.2. Writing a Tool

2.2.1. How tools work

Tool plug-ins must define various functions, called by Valgrind's core, for instrumenting programs. They are then linked against the core to form a complete Valgrind tool, which is selected with the --tool option.

2.2.2. Getting the code

To write your own tool, you'll need the Valgrind source code. You'll need a check-out of the Subversion repository for the automake/autoconf build instructions to work. See the Valgrind website for information on how to check out the repository.

2.2.3. Getting started

Valgrind uses GNU automake and autoconf for the creation of Makefiles and configuration. But don't worry, these instructions should be enough to get you started even if you know nothing about those tools.

In what follows, all filenames are relative to Valgrind's top-level directory valgrind/.

Choose a name for the tool, and a two-letter abbreviation that can be used as a short prefix. We'll use foobar and fb as an example.
Make three new directories foobar/, foobar/docs/ and foobar/tests/.
Create empty files foobar/docs/Makefile.am and foobar/tests/Makefile.am.
Copy none/Makefile.am into foobar/. Edit it by replacing all occurrences of the string "none" with "foobar", and all occurrences of the string "nl_" with "fb_".
Copy none/nl_main.c into foobar/, renaming it as fb_main.c. Edit it by changing the details lines in nl_pre_clo_init() to something appropriate for the tool. These fields are used in the startup message, except for bug_reports_to which is used if a tool assertion fails. Also replace the string "nl_" with "fb_" again.
Edit Makefile.am, adding the new directory foobar to the TOOLS or EXP_TOOLS variables.
Edit configure.in, adding foobar/Makefile, foobar/docs/Makefile and foobar/tests/Makefile to the AC_OUTPUT list.
Run:
```
  ./autogen.sh
  ./configure --prefix=`pwd`/inst
  make install
```
It should automake, configure and compile without errors, putting copies of the tool in foobar/ and inst/lib/valgrind/.
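
The directory and file steps above (creating foobar/ with its docs/ and tests/ subdirectories, the two empty Makefile.am files, and the renamed copies from none/) can be scripted. The following is only a sketch, assuming a POSIX shell and sed; new_tool_skeleton is a hypothetical helper name, not something shipped with Valgrind. It does not edit the details lines in fb_main.c, the top-level Makefile.am or configure.in; those still need doing by hand.

```shell
# Hypothetical helper (not part of Valgrind): scaffold a new tool from Nulgrind.
# Usage: new_tool_skeleton <valgrind-top-dir> <tool-name> <two-letter-prefix>
new_tool_skeleton() (
    top=$1 tool=$2 prefix=$3

    # New directories and the two empty Makefile.am files.
    mkdir -p "$top/$tool/docs" "$top/$tool/tests" || exit 1
    touch "$top/$tool/docs/Makefile.am" "$top/$tool/tests/Makefile.am" || exit 1

    # Copy none/Makefile.am, replacing "none" with the tool name
    # and "nl_" with the new prefix.
    sed -e "s/none/$tool/g" -e "s/nl_/${prefix}_/g" \
        "$top/none/Makefile.am" > "$top/$tool/Makefile.am" || exit 1

    # Copy none/nl_main.c to <prefix>_main.c, replacing "nl_" again.
    sed -e "s/nl_/${prefix}_/g" \
        "$top/none/nl_main.c" > "$top/$tool/${prefix}_main.c" || exit 1
)
```

From Valgrind's top-level directory this would be called as new_tool_skeleton . foobar fb. Because the replacement is purely textual, it is worth reading through the generated files afterwards.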

You can test it with a command like:

  inst/bin/valgrind --tool=foobar date

(almost any program should work; date is just an example). The output should be something like this:

  ==738== foobar-0.0.1, a foobarring tool for x86-linux.
  ==738== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.
  ==738== Using LibVEX rev 1791, a library for dynamic binary translation.
  ==738== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
  ==738== Using valgrind-3.3.0, a dynamic binary instrumentation framework.
  ==738== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.
  ==738== For more details, rerun with: -v
  ==738==
  Tue Nov 27 12:40:49 EST 2007
  ==738==

The tool does nothing except run the program uninstrumented.
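
Since the skeleton tool should not change what the program does, one quick sanity check is to compare the program's standard output with and without the tool (Valgrind's own messages go to stderr). A minimal sketch; same_stdout is a hypothetical helper name, and it only makes sense for programs whose output is the same on every run.

```shell
# Hypothetical helper (not part of Valgrind).
# Usage: same_stdout <valgrind-binary> <tool> <program> [args...]
# Succeeds if the program prints the same stdout natively and under the tool.
same_stdout() (
    vg=$1 tool=$2
    shift 2

    # Run the program directly and keep its stdout.
    native=$("$@") || exit 1

    # Run it again under the tool; -q keeps Valgrind quiet, and its
    # remaining messages go to stderr, which is discarded here.
    under=$("$vg" --tool="$tool" -q "$@" 2>/dev/null) || exit 1

    [ "$native" = "$under" ]
)
```

For example, same_stdout inst/bin/valgrind foobar echo hello should succeed. date is a poor choice here because its output can change between the two runs.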

These steps don't have to be followed exactly - you can choose different names for your source files, and use a different --prefix for ./configure.

Now that we've set up, built and tested the simplest possible tool, onto the interesting stuff...

2.2.4. Writing the code

A tool must define at least these four functions:

  pre_clo_init()
  post_clo_init()
  instrument()
  fini()

The names can be different to the above, but these are the usual names. The first one is registered using the macro VG_DETERMINE_INTERFACE_VERSION (which also checks that the core/tool interface of the tool matches that of the core). The last three are registered using the VG_(basic_tool_funcs) function.

In addition, if a tool wants to use some of the optional services provided by the core, it may have to define other functions and tell the core about them.

2.2.5. Initialisation

Most of the initialisation should be done in pre_clo_init(). Only use post_clo_init() if a tool provides command line options and must do some initialisation after option processing takes place ("clo" stands for "command line options").

First of all, various "details" need to be set for a tool, using the functions VG_(details_*)(). Some are compulsory, some aren't. Some are used when constructing the startup message; bug_reports_to is used if VG_(tool_panic)() is ever called or a tool assertion fails. Others have other uses.

Second, various "needs" can be set for a tool, using the functions VG_(needs_*)(). They are mostly booleans, and can be left untouched (they default to False). They determine whether a tool can do various things such as: record, report and suppress errors; process command line options; wrap system calls; record extra information about malloc'd blocks, etc.

For example, if a tool wants the core's help in recording and reporting errors, it must call VG_(needs_tool_errors) and provide definitions of eight functions for comparing errors, printing out errors, reading suppressions from a suppressions file, etc. While writing these functions requires some work, it's much less than doing error handling from scratch because the core is doing most of the work. See the function VG_(needs_tool_errors) in include/pub_tool_tooliface.h for full details of all the needs.

Third, the tool can indicate which events in the core it wants to be notified about, using the functions VG_(track_*)(). These include things such as blocks of memory being malloc'd, the stack pointer changing, a mutex being locked, etc. If a tool wants to know about one of these events, it should provide a pointer to a function, which will be called when that event happens.

For example, if the tool wants to be notified when a new block of memory is malloc'd, it should call VG_(track_new_mem_heap)() with an appropriate function pointer, and the assigned function will be called each time this happens.

More information about "details", "needs" and "trackable events" can be found in include/pub_tool_tooliface.h.
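
To get a quick overview of what is on offer, you can pull the VG_(details_*), VG_(needs_*) and VG_(track_*) names out of the header. A minimal sketch; list_tool_iface is a hypothetical helper name, and it relies on grep -o, which GNU and BSD grep provide but POSIX does not require. It assumes the names appear in the header spelled as VG_(name).

```shell
# Hypothetical helper (not part of Valgrind).
# Usage: list_tool_iface <path-to-pub_tool_tooliface.h> <details|needs|track>
list_tool_iface() {
    # Print each distinct VG_(<kind>_...) name that appears in the header.
    # In a basic regular expression the parentheses match literally.
    grep -o "VG_(${2}_[A-Za-z0-9_]*)" "$1" | sort -u
}
```

For example, list_tool_iface include/pub_tool_tooliface.h track lists the trackable events. The comments next to each declaration in the header remain the real documentation.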

2.2.6. Instrumentation

instrument() is the interesting one. It allows you to instrument VEX IR, which is Valgrind's RISC-like intermediate language. VEX IR is described fairly well in the comments of the header file VEX/pub/libvex_ir.h.

The easiest way to instrument VEX IR is to insert calls to C functions when interesting things happen. See the tool "Lackey" (lackey/lk_main.c) for a simple example of this, or Cachegrind (cachegrind/cg_main.c) for a more complex example.

2.2.7. Finalisation

fini() is where you can present the final results, such as a summary of the information collected. Any log files should be written out at this point.

2.2.8. Other Important Information

Please note that the core/tool split infrastructure is quite complex and not brilliantly documented. Here are some important points, but there are undoubtedly many others that I should note but haven't thought of.

The files include/pub_tool_*.h contain all the types, macros, functions, etc. that a tool should (hopefully) need, and are the only .h files a tool should need to #include. They have a reasonable amount of documentation in them that should hopefully be enough to get you going.

Note that you can't use anything from the C library (there are deep reasons for this, trust us). Valgrind provides an implementation of a reasonable subset of the C library, details of which are in pub_tool_libc*.h.

When writing a tool, you shouldn't need to look at any of the code in Valgrind's core, although doing so can sometimes help you understand something.

The include/pub_tool_basics.h and VEX/pub/libvex_basictypes.h files have some basic types that are widely used.

Ultimately, the tools distributed (Memcheck, Cachegrind, Lackey, etc.) are probably the best documentation of all, for the moment.

Note that the VG_ macro is used heavily. This just prepends a longer string in front of names to avoid potential namespace clashes. It is defined in include/pub_tool_basics_asm.h.

There are some assorted notes about various aspects of the implementation in docs/internals/. Much of it isn't that relevant to tool-writers, however.

2.2.9. Words of Advice

Writing and debugging tools is not trivial. Here are some suggestions for solving common problems.

2.2.9.1. Segmentation Faults

If you are getting segmentation faults in C functions used by your tool, the usual GDB command:

  gdb <prog> core

usually gives the location of the segmentation fault.

2.2.9.2. Debugging C functions

If you want to debug C functions used by your tool, you can achieve this by following these steps:

Set VALGRIND_LAUNCHER to <prefix>/bin/valgrind:

  export VALGRIND_LAUNCHER=/usr/local/bin/valgrind

Then run gdb <prefix>/lib/valgrind/<platform>/<tool>:
```
  gdb /usr/local/lib/valgrind/ppc32-linux/lackey
```
Use handle SIGILL SIGSEGV nostop noprint in GDB to prevent GDB from stopping on a SIGILL or SIGSEGV:
```
  (gdb) handle SIGILL SIGSEGV nostop noprint
```
Set any breakpoints you want and proceed as normal for GDB:
```
  (gdb) b vgPlain_do_exec
```
The macro VG_(FUNC) expands to vgPlain_FUNC, so to set a breakpoint on VG_(do_exec) you use the name vgPlain_do_exec, as in the example above.
Run the tool with required options:
```
  (gdb) run `pwd`
```

GDB may be able to give you useful information. Note that by default most of the system is built with -fomit-frame-pointer, and you'll need to get rid of this to extract useful tracebacks from GDB.
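
If you repeat these steps often, the GDB commands can be kept in a command file and loaded with gdb -x. A minimal sketch; vg_symbol and write_gdb_commands are hypothetical helper names, and the name translation relies only on the VG_(FUNC) to vgPlain_FUNC expansion described above.

```shell
# Hypothetical helpers (not part of Valgrind or GDB).

# Translate VG_(FUNC) into the symbol name vgPlain_FUNC.
# Any other name is printed unchanged.
vg_symbol() {
    printf '%s\n' "$1" | sed 's/^VG_(\(.*\))$/vgPlain_\1/'
}

# Usage: write_gdb_commands <output-file> <function>...
# Writes the signal handling line plus one breakpoint per function.
write_gdb_commands() (
    out=$1
    shift
    echo 'handle SIGILL SIGSEGV nostop noprint' > "$out" || exit 1
    for f in "$@"; do
        echo "break $(vg_symbol "$f")" >> "$out" || exit 1
    done
)
```

For example, after setting VALGRIND_LAUNCHER, write_gdb_commands fb.gdb 'VG_(do_exec)' followed by gdb -x fb.gdb <prefix>/lib/valgrind/<platform>/<tool> starts GDB with the signal handling and breakpoint already in place. The single quotes stop the shell from interpreting the parentheses.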

2.2.9.3. IR Instrumentation Problems

If you are having problems with your VEX IR instrumentation, it's likely that GDB won't be able to help at all. In this case, Valgrind's --trace-flags option is invaluable for observing the results of instrumentation.
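
The argument to --trace-flags is a string of eight 0/1 digits, one per translation stage. The mapping in this sketch is the one I believe valgrind --help-debug prints, so please check it against your own build before relying on it. trace_flags_for and its stage names are hypothetical, introduced here only for illustration.

```shell
# Hypothetical helper: map a stage name to a --trace-flags bit string.
# The bit meanings are assumed from the --help-debug output; verify locally.
trace_flags_for() {
    case $1 in
        ir)           echo 10000000 ;;  # conversion into IR
        opt1)         echo 01000000 ;;  # after initial optimisation
        instrumented) echo 00100000 ;;  # after instrumentation
        opt2)         echo 00010000 ;;  # after second optimisation
        tree)         echo 00001000 ;;  # after tree building
        isel)         echo 00000100 ;;  # instruction selection
        regalloc)     echo 00000010 ;;  # after register allocation
        asm)          echo 00000001 ;;  # final assembly
        *)            echo "unknown stage: $1" >&2; return 1 ;;
    esac
}
```

To see what your instrument() function produced, something like inst/bin/valgrind --tool=foobar --trace-flags=$(trace_flags_for instrumented) date should do. Some versions only print blocks numbered at or above --trace-notbelow, so you may also need --trace-notbelow=0; again, --help-debug is the authority.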

2.2.9.4. Miscellaneous

If you just want to know whether a program point has been reached, using the OINK macro (in include/pub_tool_libcprint.h) can be easier than using GDB.

The other debugging command line options can be useful too (run valgrind --help-debug for the list).

2.3. Advanced Topics

Once a tool becomes more complicated, there are some extra things you may want/need to do.

2.3.1. Suppressions

If your tool reports errors and you want to suppress some common ones, you can add suppressions to the suppression files. The relevant files are valgrind/*.supp; the final suppression file is assembled from the relevant .supp files, chosen according to the versions of Linux, X and glibc on the system.

Suppression types have the form tool_name:suppression_name. The tool_name here is the name you specify for the tool during initialisation with VG_(details_name)().
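
As an illustration, here is a sketch that appends one entry to a suppression file in the usual layout: an entry name, then the tool_name:suppression_name line, then the calling context. add_suppression is a hypothetical helper name, and Foobar, SomeError and _dl_start in the example are made up; the suppression names that really exist are whatever your tool's suppression-reading functions recognise.

```shell
# Hypothetical helper (not part of Valgrind).
# Usage: add_suppression <supp-file> <entry-name> <tool_name> <suppression_name> <function>
add_suppression() {
    {
        echo '{'
        echo "   $2"        # a name for this entry, for your own reference
        echo "   $3:$4"     # tool_name:suppression_name
        echo "   fun:$5"    # function in which the error is suppressed
        echo '}'
    } >> "$1"
}
```

For example, add_suppression foobar.supp libc-noise Foobar SomeError _dl_start adds an entry that can then be tried out with --suppressions=foobar.supp.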

2.3.2. Documentation

As of version 3.0.0, Valgrind documentation has been converted to XML. Why? See The XML FAQ.

2.3.2.1. The XML Toolchain

If you are feeling conscientious and want to write some documentation for your tool, please use XML. The Valgrind Docs use the following toolchain and versions:

 xmllint:   using libxml version 20607
 xsltproc:  using libxml 20607, libxslt 10102 and libexslt 802
 pdfxmltex: pdfTeX (Web2C 7.4.5) 3.14159-1.10b
 pdftops:   version 3.00
 DocBook:   version 4.2

Latency: DocBook is constantly being updated, but the tools tend to lag behind somewhat. The versions must work together, so if you decide to upgrade something, you need to check that things still work nicely - this *cannot* be assumed.

Stylesheets: The Valgrind docs use various custom stylesheet layers, all of which are in valgrind/docs/lib/. You shouldn't need to modify these in any way.

Catalogs: Catalogs provide a mapping from generic addresses to specific local directories on a given machine. Most recent Linux distributions have adopted a common place for storing catalogs (/etc/xml/). Assuming that you have the various tools listed above installed, you probably won't need to modify your catalogs. But if you do, then just add another group to this file, reflecting your local installation.

2.3.2.2. Writing the Documentation

Follow these steps (using foobar as the example tool name again):

The docs go in valgrind/foobar/docs/, which you will have created when you started writing the tool.
Write foobar/docs/Makefile.am. Use memcheck/docs/Makefile.am as an example.
Copy the XML documentation file for the tool Nulgrind from valgrind/none/docs/nl-manual.xml to foobar/docs/, and rename it to foobar/docs/fb-manual.xml.

Note: there is a *really stupid* tetex bug with underscores in filenames, so don't use '_'.
Write the documentation. There are some helpful bits and pieces on using xml markup in valgrind/docs/xml/xml_help.txt.
Include it in the User Manual by adding the relevant entry to valgrind/docs/xml/manual.xml. Copy and edit an existing entry.
Validate foobar/docs/fb-manual.xml using the following command from within valgrind/docs/:
```
% make valid
```
You will probably get errors that look like this:
```
./xml/index.xml:5: element chapter: validity error : No declaration for
attribute base of element chapter
```
Ignore (only) these -- they're not important.

Because the xml toolchain is fragile, it is important to ensure that fb-manual.xml won't break the documentation set build. Note that just because an xml file happily transforms to html does not necessarily mean the same holds true for pdf/ps.
You can (re-)generate the HTML docs while you are writing fb-manual.xml to help you see how it's looking. The generated files end up in valgrind/docs/html/. Use the following command, within valgrind/docs/:
```
% make html-docs
```
When you have finished, also generate pdf and ps output to check all is well, from within valgrind/docs/:
```
% make print-docs
```
Check the output .pdf and .ps files in valgrind/docs/print/.

2.3.3. Regression Tests

Valgrind has some support for regression tests. If you want to write regression tests for your tool:

The tests go in foobar/tests/, which you will have created when you started writing the tool.
Write foobar/tests/Makefile.am. Use memcheck/tests/Makefile.am as an example.
Write the tests, .vgtest test description files, .stdout.exp and .stderr.exp expected output files. (Note that Valgrind's output goes to stderr.) Some details on writing and running tests are given in the comments at the top of the testing script tests/vg_regtest.
Write a filter for stderr results foobar/tests/filter_stderr. It can call the existing filters in tests/. See memcheck/tests/filter_stderr for an example; in particular note the $dir trick that ensures the filter works correctly from any directory.
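
The files for a first test can be sketched as follows. new_regtest is a hypothetical helper name. The prog: line and the filter_stderr_basic filter are written from memory of the existing tests, so compare the result with memcheck/tests/ and with the comments in tests/vg_regtest before relying on it.

```shell
# Hypothetical helper (not part of Valgrind).
# Usage: new_regtest <tests-dir> <test-name>
new_regtest() (
    tdir=$1 name=$2
    mkdir -p "$tdir" || exit 1

    # Test description: run the program <test-name> under the tool.
    echo "prog: $name" > "$tdir/$name.vgtest" || exit 1

    # Expected-output files, to be filled in from a known-good run.
    touch "$tdir/$name.stdout.exp" "$tdir/$name.stderr.exp" || exit 1

    # stderr filter; setting dir from $0 lets it find the shared filters
    # in tests/ no matter which directory it is run from.
    if [ ! -f "$tdir/filter_stderr" ]; then
        cat > "$tdir/filter_stderr" <<'EOF'
#! /bin/sh
dir=`dirname $0`
$dir/../../tests/filter_stderr_basic
EOF
        chmod +x "$tdir/filter_stderr" || exit 1
    fi
)
```

For example, new_regtest foobar/tests hello creates hello.vgtest, the two empty .exp files and, if it is not already there, filter_stderr. You still need to write hello.c, add it to foobar/tests/Makefile.am, and paste the filtered output of a good run into the .exp files.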

2.3.4. Profiling

To profile a tool, use Cachegrind on it. Read README_DEVELOPERS for details on running Valgrind under Valgrind.

Alternatively, you can use OProfile. In most cases, it is better than Cachegrind because it's much faster, and gives real times, as opposed to instruction and cache hit/miss counts.

2.3.5. Other Makefile Hackery

If you add any directories under valgrind/foobar/, you will need to add an appropriate Makefile.am to it, and add a corresponding entry to the AC_OUTPUT list in valgrind/configure.in.

If you add any scripts to your tool (see Cachegrind for an example) you need to add them to the bin_SCRIPTS variable in valgrind/foobar/Makefile.am.

2.3.6. Core/tool Interface Versions

In order to allow for the core/tool interface to evolve over time, Valgrind uses a basic interface versioning system. All a tool has to do is use the VG_DETERMINE_INTERFACE_VERSION macro exactly once in its code. If it does not, a link error will occur when the tool is built.

The interface version number is changed when binary incompatible changes are made to the interface. If the core and tool have the same major version number X they should work together. If X doesn't match, Valgrind will abort execution with an explanation of the problem.

This approach was chosen so that if the interface changes in the future, old tools won't work and the reason will be clearly explained, instead of possibly crashing mysteriously. We have attempted to minimise the potential for binary incompatible changes by means such as minimising the use of naked structs in the interface.

2.4. Final Words

The core/tool interface is not fixed. It's pretty stable these days, but it does change. We deliberately do not provide backward compatibility with old interfaces, because it is too difficult and too restrictive. The interface checking should catch any incompatibilities. We view this as a good thing -- if we had to be backward compatible with earlier versions, many improvements now in the system could not have been added.

Happy programming.