Date Posted: October 2, 2008
Update: January 26, 2009. Updates to DataCollector.v1.1.tar.gz and DC4L.LKM.v1.1.tar.gz improve support for x86_64 platforms, including a pre-built binary library with 64-bit Sun JVM support and an installation script.
What is IBM Toolkit for Data Collection and Visual Analysis for Multi-Core Systems?
IBM® Toolkit for Data Collection and Visual Analysis for Multi-Core Systems targets many common performance problems pertaining to Java™ applications running on multi-core or multi-processing platforms. These problems can have a significant impact on the overall performance of Java middleware and applications. This toolkit has been designed for easy use by users with average skills and provides many features not found in performance tools from other vendors. With this toolkit, high performance is easily obtainable on IBM machines.
The toolkit currently comprises three components:
- Data Collector for Linux®: a Linux Kernel Module (LKM) that is used for collecting hardware performance data and relevant operating system (OS) kernel activities from the target machine
- Data Collection Java Agent: an agent module that is loaded into the address space of the Java process in order to monitor performance relevant events emitted by the Java Virtual Machine
- Visual Analyzer: a stand-alone Eclipse Rich Client Platform (RCP) application that is used to perform visual analysis on the trace data collected.
The Data Collector for Linux component is a loadable kernel module for Linux operating systems. Both x86 and POWER5 multi-core processors are supported. To use this component, the Linux kernel must have the DebugFS and/or Relay features enabled and kernel debug symbols available (most Linux distributions provide pre-built packages with debugging information online). If you are not interested in activities at the hardware or kernel layer, you can still use the other two components.
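Before loading the kernel module, it can help to confirm that DebugFS is actually mounted on the target machine. The following is a minimal sketch, not part of the toolkit: the class name `DebugFsCheck` and its method are hypothetical, and the check only inspects `/proc/mounts`. It does not verify that Relay is enabled or that kernel debug symbols are installed.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Optional;

/** Hypothetical helper (not part of the toolkit): looks for a mounted debugfs. */
final class DebugFsCheck {

    /**
     * Returns the mount point of the first debugfs entry, if any.
     * Each line of /proc/mounts has the form:
     *   device mountpoint fstype options dump pass
     */
    static Optional<String> findDebugFsMountPoint(List<String> mountLines) {
        for (String line : mountLines) {
            String[] fields = line.trim().split("\\s+");
            if (fields.length >= 3 && fields[2].equals("debugfs")) {
                return Optional.of(fields[1]);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) throws IOException {
        Path mounts = Paths.get("/proc/mounts");
        if (!Files.isReadable(mounts)) {
            System.out.println("/proc/mounts is not readable; is this a Linux system?");
            return;
        }
        Optional<String> mountPoint = findDebugFsMountPoint(Files.readAllLines(mounts));
        System.out.println(mountPoint
                .map(p -> "debugfs is mounted at " + p)
                .orElse("debugfs is not mounted (try: mount -t debugfs none /sys/kernel/debug)"));
    }
}
```

If the file system is reported as not mounted, mounting it by hand (as root) is usually enough, provided the kernel was built with DebugFS support.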
The Data Collection Java Agent component is a dynamically loadable library that can be loaded into the address space of the Java process. It features intelligent installation and uninstallation of handlers for relevant JVM (Java Virtual Machine) internal events. If the Data Collector for Linux component above is also installed, this component will poll performance data from the kernel module and write it into trace files.
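Native agent libraries are normally attached to a JVM at startup with the standard `-agentpath:<file>` or `-agentlib:<name>` options. The text above does not say which option this toolkit uses, so the sketch below assumes one of the two and simply reports which native agents the current JVM was started with. The class name `AgentArgs` and the library path used in the usage note are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: lists the native-agent options passed to a JVM. */
final class AgentArgs {

    /** Returns the arguments that load a native agent (-agentpath: or -agentlib:). */
    static List<String> nativeAgentArgs(List<String> jvmArgs) {
        List<String> result = new ArrayList<>();
        for (String arg : jvmArgs) {
            if (arg.startsWith("-agentpath:") || arg.startsWith("-agentlib:")) {
                result.add(arg);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Input arguments are the options given to the JVM itself, not to main().
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        List<String> agents = nativeAgentArgs(jvmArgs);
        if (agents.isEmpty()) {
            System.out.println("No native agent was requested on the command line.");
        } else {
            agents.forEach(a -> System.out.println("Native agent option: " + a));
        }
    }
}
```

For example, starting the JVM with a placeholder path such as `-agentpath:/opt/dc/libdcagent.so` would cause that option to be printed.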
The Visual Analyzer is an Eclipse-based RCP performance visualization and analysis tool. Its implementation is based on an existing framework from IBM TuningFork Visualization Tool, which was developed for profiling real-time Java virtual machines and applications and is also available here at alphaWorks. The main advantage of the Visual Analyzer is that it provides a rich set of predefined figures and views for performance analysis, as well as support for a variety of user interactions.
The main advantage of this toolkit is that it establishes a reasoning process that enables Java developers to identify and fix performance anomalies in Java applications when running on multi-core systems. The reasoning process can be performed in two directions with the help of this toolkit:
- horizontal reasoning for comparison of application activities with their impact on JVM activities
- vertical reasoning for comparison of Java layer activities with their associated native OS activities.
Collectively, the toolkit facilitates the analysis of common performance problems pertaining to Java applications on multi-core/multiprocessing platforms. Examples of such problems include:
- inferior processor core use, where some of the cores are not fully used or where a significant portion of the processor resources is used for activities other than executing application code
- ineffective parallelism level, where either too many application threads induce overwhelming garbage collection activity or context-switch overhead, or too few application threads are created to share the computation workload
- random thread migrations, where poorly chosen thread migration decisions by the operating system bounce threads among processors, which manifests as frequent cache invalidation and stalled execution
- ill-organized synchronization, where inappropriately chosen synchronization primitives or bad programming habits lead to other problems, such as under-use of processor cores, insufficient parallelism, and poor scalability
- ignorance of hardware-level resource constraints, where contention on the shared processor resources leads to performance interference among applications (threads) running side by side.
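To make the thread-migration problem in the list above concrete, the sketch below counts how often each thread changes processor across a sequence of scheduling samples. The `Sample` record format and the class name `MigrationCounter` are assumptions made for illustration; they do not describe the toolkit's actual trace format.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical analysis helper: counts CPU changes per thread. */
final class MigrationCounter {

    /** One scheduling sample: a thread observed on a CPU (assumed format). */
    static final class Sample {
        final long threadId;
        final int cpu;

        Sample(long threadId, int cpu) {
            this.threadId = threadId;
            this.cpu = cpu;
        }
    }

    /**
     * Counts, for each thread, how many times two consecutive samples of that
     * thread show different CPUs. Samples must be in time order.
     */
    static Map<Long, Integer> countMigrations(List<Sample> samplesInTimeOrder) {
        Map<Long, Integer> lastCpu = new HashMap<>();
        Map<Long, Integer> migrations = new TreeMap<>(); // sorted by thread id
        for (Sample s : samplesInTimeOrder) {
            Integer previous = lastCpu.put(s.threadId, s.cpu);
            migrations.putIfAbsent(s.threadId, 0);
            if (previous != null && previous != s.cpu) {
                migrations.merge(s.threadId, 1, Integer::sum);
            }
        }
        return migrations;
    }

    public static void main(String[] args) {
        List<Sample> samples = Arrays.asList(
                new Sample(1, 0), new Sample(1, 0), new Sample(1, 2),
                new Sample(2, 1), new Sample(1, 0), new Sample(2, 1));
        // Thread 1 moves 0 -> 2 -> 0 (two migrations); thread 2 stays on CPU 1.
        System.out.println(countMigrations(samples)); // prints {1=2, 2=0}
    }
}
```

A thread with a high migration count relative to its number of samples is a candidate for closer inspection, for instance by pinning it to a core and comparing the results.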
How does it work?
At present, the Visual Analyzer component borrows the framework established for real-time Java visualization contained in IBM TuningFork Visualization Tool. The component is a stand-alone RCP application. Features include capabilities for data stream generation, figure definition, and some of the basic views provided in TuningFork, such as the open framework for trace analysis. As a pure Java application, Visual Analyzer can be run on any platform.
The Data Collection Java Agent is provided as a dynamically linkable library whose event callbacks handle events fired by the JVM through JVMTI (the JVM Tool Interface). To enable an in-depth analysis of kernel/hardware layer activities, a loadable kernel module (LKM) is provided for Linux operating systems. This kernel module, after being loaded and started, can instrument a running Linux kernel so that it can capture events relevant to the running Java program, along with performance numbers obtained from the Hardware Performance Monitor registers found in many new multi-core processors.
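JVMTI agents are written in native code, so the agent itself cannot be shown in Java. As a rough Java-level analog of one kind of event the agent observes, the sketch below uses the standard `java.lang.management` beans (not JVMTI, and not the toolkit's mechanism) to estimate what share of JVM uptime has gone to garbage collection. The class name `GcOverhead` is hypothetical.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper: approximates garbage collection overhead via MXBeans. */
final class GcOverhead {

    /** Fraction of uptime spent in GC, clamped to [0, 1]; 0 if uptime is not positive. */
    static double gcFraction(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 0.0;
        }
        double fraction = (double) gcMillis / uptimeMillis;
        return Math.min(1.0, Math.max(0.0, fraction));
    }

    public static void main(String[] args) {
        long gcMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                gcMillis += t;
            }
        }
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        // Summed collector times are approximate, so the result is only an estimate.
        System.out.printf("GC time: %d ms of %d ms uptime (%.1f%%)%n",
                gcMillis, uptime, 100 * gcFraction(gcMillis, uptime));
    }
}
```

Unlike a JVMTI agent, this approach only sees cumulative totals from inside the application and cannot correlate individual events with kernel or hardware activity.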
About the technology author(s)
The design and development of IBM Toolkit for Data Collection and Visual Analysis for Multi-Core Systems was led by Qi Ming Teng, a staff researcher at IBM China Research Laboratory (CRL). Contributors to the toolkit include Xiao Zhong, Feng Wang, Hai Chuan Wang, and Wei Hong Qian, all from CRL, and Peter F. Sweeney from Watson Research Center.
The development team acknowledges the encouragement and guidance from their mentors Evelyn Duesterwald, Raja Das, and Bob Blainey. The team also thanks TuningFork team members Dave Bacon and Dave Grove for their help.
