Event Tracing of the Earth System Modeling Framework with VampirTrace

Coupled Earth System Models (ESMs) exhibit architectural diversity. Very little has been written about the software engineering advantages of the different architectures and, to my knowledge, no one has performed a comprehensive comparison of ESM architectures. Toward this end, I have recently had a very good experience with VampirTrace, a program for tracing and profiling high performance software, and with Vampir, a related product for visualizing and analyzing traces. In this post, I will describe how to automatically add tracing instrumentation to Earth System Modeling Framework (ESMF) applications and show some sample analyses using the Vampir tool.

VampirTrace automatically instruments C, C++, and Fortran programs so that important events are recorded as the program executes. VampirTrace is designed to work with high performance applications, such as distributed memory parallel codes based on MPI. This works well for us since the vast majority of ESMs are based on this paradigm. When a program compiled with the VampirTrace compiler wrappers is executed, certain events will be recorded and, after program execution, the trace data will be available in a standardized format called Open Trace Format (OTF).

Many kinds of events can be traced with fine-grained timing data:

  • MPI events such as message send, receive, and collective calls
  • Function entry and exit
  • System calls such as memory allocations and I/O calls
  • Other user-defined events (requires manual code instrumentation)

A major advantage of using ESMF is that it provides a number of numerical modeling abstractions that would otherwise have to be coded by hand. Many of these abstractions hide the details of rather complex interactions, such as repartitioning field data from one decomposition (processor layout) to another and interpolating field data between grids with different resolutions. Because we are interested in what is going on within ESMF (not just within the user code that calls ESMF functions), I have instrumented ESMF itself using VampirTrace.

Instrumenting the ESMF library using VampirTrace

These are the software packages I am using:

Although VampirTrace ships with the OpenMPI distribution, the version packaged with OpenMPI seems to lag behind the latest stable release. So, I use the latest version of VampirTrace and ignore the bundled one.

VampirTrace uses compiler wrappers for automatic code instrumentation, so the ESMF build environment must be set up to use the VampirTrace wrappers instead of the usual C++ and Fortran compilers. The Intel compilers, OpenMPI, NetCDF, and VampirTrace packages are all set up using the standard build/install instructions that come with those packages.

Here are my ESMF environment variables:

export ESMF_COMM="openmpi"
export ESMF_COMPILER="intel"
export ESMF_DIR="<some directory>/esmf"

export ESMF_CXX=vtcxx
export ESMF_F90=vtf90
export ESMF_CXXCOMPILEOPTS="-vt:cxx mpicxx -DVTRACE"
export ESMF_F90COMPILEOPTS="-vt:f90 mpif90 -DVTRACE"

With these environment variables, ESMF will be compiled using the VampirTrace wrappers vtcxx and vtf90 instead of the OpenMPI compiler wrappers mpicxx and mpif90. Make sure that VampirTrace, OpenMPI, and Intel compilers are all in the PATH or specify absolute locations for ESMF_CXX and ESMF_F90.
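Before starting the build, it can help to confirm that the wrappers really are on the PATH. The following is a minimal sketch, not part of ESMF or VampirTrace; the helper name check_tools is made up for illustration.

```shell
# check_tools TOOL...: report which of the named tools can be found on PATH.
# Hypothetical helper for illustration; returns non-zero if any tool is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool -> $(command -v "$tool")"
        else
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example call (commented out so that sourcing this file has no side effects):
#   check_tools vtcxx vtf90 mpicxx mpif90 || echo "fix PATH before building ESMF"
```

If a wrapper is reported as missing, either extend PATH or set ESMF_CXX and ESMF_F90 to absolute locations as described above.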

Now build ESMF and install it.

$ gmake
$ gmake install

I do not recommend executing gmake check with this build because all of the ESMF unit and system tests would be executed with tracing turned on. If you wish to check the build, set up your environment using the OpenMPI compiler wrappers, execute gmake to recompile, and then run gmake check. If everything checks, then rebuild with the VampirTrace compiler wrappers. The only difference will be that ESMF is linked against the VampirTrace library instead of the regular MPI library. So, we’ll assume that the unit and system tests would pass in this case. If they did not, the problem would most likely lie within VampirTrace itself.
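Switching between the two builds amounts to swapping a handful of environment variables. Here is a rough sketch of how that could be scripted; the function names are hypothetical, and the "plain" settings (mpicxx and mpif90 with no extra compile options) are my assumption of what using the OpenMPI compiler wrappers looks like in this setup.

```shell
# Select the plain OpenMPI wrappers, e.g. before running "gmake check".
# Assumption: no extra compile options are needed for the untraced build.
use_plain_wrappers() {
    export ESMF_CXX=mpicxx
    export ESMF_F90=mpif90
    unset ESMF_CXXCOMPILEOPTS ESMF_F90COMPILEOPTS
}

# Select the VampirTrace wrappers, using the same values listed earlier.
use_vt_wrappers() {
    export ESMF_CXX=vtcxx
    export ESMF_F90=vtf90
    export ESMF_CXXCOMPILEOPTS="-vt:cxx mpicxx -DVTRACE"
    export ESMF_F90COMPILEOPTS="-vt:f90 mpif90 -DVTRACE"
}

# Intended sequence (not executed here):
#   use_plain_wrappers && gmake && gmake check
#   use_vt_wrappers    && gmake && gmake install
# The build tree may need cleaning between the two builds so that
# everything is actually recompiled with the other set of wrappers.
```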

Okay, now that we have an ESMF build linked with the VampirTrace libraries, let’s run a small ESMF application and take a look at the resulting trace data. The easiest way to do this is to execute one or more of the system tests that ship with ESMF.  For example:

$ cd $ESMF_DIR/system_tests/ESMF_ArrayRedist
$ gmake
$ cd $ESMF_DIR/test/testO/Linux.intel.64.openmpi.default
$ mpirun -np 6 ./ESMF_ArrayRedistSTest

A couple of notes:

  • Line 2 (gmake) will compile the ESMF_ArrayRedist system test. It will probably fail with an error File not found: ‘opari.tab.c’. To get around this, I’ve had to execute opari -table opari.tab.c manually. I’m not sure why it is not working automatically.
  • The directory location in line 3 will change depending on the OS, compiler, and MPI implementation you are using.

Line 4 will execute the instrumented system test. When it finishes, you should see several files produced in the same directory as the ESMF_ArrayRedistSTest executable: a series of files ESMF_ArrayRedistSTest.X.events.z (one for each of the six MPI processes), a file named ESMF_ArrayRedistSTest.0.def.z, and a file named ESMF_ArrayRedistSTest.otf. This set of files contains the trace data for the run.
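A quick way to confirm that the run produced a complete trace is to look for exactly those files. The sketch below is a hypothetical helper (check_trace is not a VampirTrace command) and assumes the file naming described above.

```shell
# check_trace DIR NAME NPROCS
# Verify that DIR holds NAME.otf, NAME.0.def.z, and NPROCS files matching
# NAME.*.events.z. Hypothetical helper; returns non-zero if anything is off.
check_trace() {
    dir=$1
    name=$2
    nprocs=$3
    status=0

    # The anchor file and the definitions file must both exist.
    for f in "$dir/$name.otf" "$dir/$name.0.def.z"; do
        if [ ! -f "$f" ]; then
            echo "missing: $f" >&2
            status=1
        fi
    done

    # Count the per-process event files.
    count=0
    for f in "$dir/$name".*.events.z; do
        if [ -f "$f" ]; then
            count=$((count + 1))
        fi
    done
    if [ "$count" -ne "$nprocs" ]; then
        echo "expected $nprocs event files, found $count" >&2
        status=1
    fi

    return "$status"
}

# Example call for the six-process run (commented out):
#   check_trace . ESMF_ArrayRedistSTest 6
```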

Analysis with Vampir

Vampir is an analysis and visualization package capable of reading OTF trace files. It is not open source, but a demo license is available. Let’s examine the trace files from the array redistribution system test.

  1. Install Vampir
  2. Start Vampir
  3. Open the file ESMF_ArrayRedistSTest.otf that was generated from the system test run

Now you’ll be able to see a number of different analyses on the trace data. Here are a few of them:

The first view shows the timeline of events for all six processes. The colors represent groups of related functions. Red, which dominates all processes, represents MPI-related functions. You’ll notice that for this short system test, a large portion of the time is spent initializing MPI. Yellow represents the ESMF library, green represents the “user code” (i.e., the code in the system test itself), and blue represents overhead of the VampirTrace code instrumentation. Black lines connecting two processes represent message sends/receives and collective MPI calls, such as a barrier. The vertical dashed lines appear every 0.5 seconds of elapsed wall clock time. The burst of messages between 2.5 and 3.5 seconds is attributed to the ESMF_ArrayRedistStore function, which calculates the communication pattern required to repartition field data decomposed on processes 0-3 onto processes 4-5.

The function summary view shows the accumulated amount of time spent in each of the top N functions. Again, for this short run, the MPI_Init_thread call dominates. The first ESMF function to appear is on the third line. This is the function that calculates the repartitioning pattern. Keep in mind that this is a system test with no real science code, so you do not see much time spent in the green “user” functions because they do not do anything useful.

Another view shows the number of MPI messages grouped by message size. This particular plot is for a zoomed-in section of the timeline: it only covers messages passed during the ESMF_ArrayRedistStore function. The total amount of data transmitted can also be determined, as well as the data throughput. This shows us that ESMF requires 152 MPI messages to calculate a repartitioning for the particular scenario in the system test (4 processors to 2 processors, 4×1 decomposition to 1×2 decomposition, 100×150 grid resolution on both components).

The communication matrix view shows properties of messages sent between individual processes. Sending processes are on the vertical axis and receiving processes are on the horizontal axis. Again, this particular plot is only for the ESMF_ArrayRedistStore function.

In this example I have modified the system test to perform the same array redistribution 1000 times. This is a one-way repartitioning, so messages are sent from processes 0-3 and received by processes 4 and 5.
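To get a feel for how much data each process holds in this scenario, here is a back-of-the-envelope sketch. It assumes the 100×150 grid divides evenly and that an A×B decomposition splits the first dimension A ways and the second B ways; the helper name cells_per_process is made up for illustration.

```shell
# cells_per_process NX NY PX PY
# Grid cells owned by each process when an NX x NY grid is split evenly
# into PX x PY blocks (assumes exact divisibility; illustration only).
cells_per_process() {
    echo $(( ($1 / $3) * ($2 / $4) ))
}

cells_per_process 100 150 4 1   # sending side: one 25 x 150 block per process
cells_per_process 100 150 1 2   # receiving side: one 100 x 75 block per process
```

Under those assumptions each of the four sending processes holds 3750 cells and each of the two receiving processes ends up with 7500.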

For now, this just gives an idea of the kinds of analyses supported by VampirTrace + Vampir. I have been impressed by my initial experience with the software. Issues may arise as the number of processes and the overall length of the run increase (this run took 7 seconds of wall clock time), but so far so good.


About rsdunlapiv

Computer science PhD student at Georgia Tech

3 responses to “Event Tracing of the Earth System Modeling Framework with VampirTrace”

  1. Jeff Squyres says :

    FWIW, the VT that ships with the Open MPI v1.5 series should be much more up-to-date than the one shipped with the v1.4 series.

    • rsdunlapiv says :

      Thanks Jeff. Actually, there were no technical reasons why the VT bundled with OpenMPI 1.4.x would not have worked. The only issue was that the documentation on the VT site is for the latest version and at least one of the environment variables had changed… (for the location of an NM file).

      • Jeff Squyres says :

        Fair enough.

        Be aware that the v1.4 series is our “super stable” release series, so it doesn’t get updates or new features — it only gets bug fixes. Hence, things like VT get locked down to whatever release they were when 1.4.0 was created (VT still gets bug fixes, of course, and I think the VT guys bump the version numbers when they apply fixes, but it is intentionally way behind the most recent version).

        The v1.5 series is our “feature” series, meaning that new things appear / disappear throughout the life of the series. Contrib packages like VT stay updated. The big difference between the “stable” and “release” series is that the feature series usually aren’t as time-tested/mature as the stable series. The feature series gets all the same QA testing that the stable series does, but it hasn’t survived out “in the wild” where users do unexpected things and catch bugs that we don’t catch in QA.

        This page describes our version methodology: http://www.open-mpi.org/software/ompi/versions/
