Profiling Workshop

Our Profiling Workshop is a free, two-part interactive workshop that introduces attendees to TAU, a code profiling and tracing tool, and shows how to use it on the HPC.

TAU (Tuning and Analysis Utilities) is a powerful performance analysis and tracing tool for parallel applications written in Fortran, C++, C, Python, Java and many other languages. TAU is useful for tasks such as:

  • finding the amount of time spent in each function, subroutine, loop, or phase, and measuring speed (FLOPS),
  • measuring time spent on I/O and bandwidths,
  • finding memory utilization during different stages of execution of a code or a selected portion of a code,
  • finding memory leaks in C/C++ codes,
  • determining how a given application scales, and much more.

Workshop Format

This is an interactive, hands-on workshop. If you write parallel (MPI) code or use parallel packages developed by others, this workshop is for you. TAU can analyze parallel programs and help you pinpoint a variety of issues in them. We will illustrate this using a variety of examples (e.g., WRF). The workshop will take place in a classroom laboratory fully equipped with computers. Attendees may bring their own notebook computers, but this is not necessary.

Topics Covered

By the end of this workshop, attendees will be able to use TAU tools to improve the efficiency of their parallel codes and find hidden issues.

Specific topics covered include:

  • Introduction to profiling and tracing parallel codes
  • Basic usage of TAU: dynamic vs. scripted instrumentation, selecting the appropriate makefile, run-time collection of data from an application, reducing the performance overhead of TAU,
  • Finding which subroutines or parts of your package account for the most run time, and how much,
  • Visualizing and analyzing gathered data using various tools provided as part of the TAU package: text summary, ParaProf, Jumpshot,
  • Advanced topics: finding run-time memory usage of codes, detecting memory leaks, selective profiling, finding the efficiency of your code, determining how performance scales with core count,
  • Use of PAPI (Performance API, already configured to work with TAU) to read hardware counters for performance-related events such as cache misses.


This workshop will be led by Prasad Maddumage, an Applications Specialist at RCC. Prasad has over seven years of experience developing parallel scientific codes, mainly in Fortran.