OProfile FAQ

Updated March 14, 2006

Q: What versions of Red Hat Linux have OProfile RPMs?

A: There are OProfile RPMs available for ia32, amd64 (hammer), and ia64. In many cases you will need to use the SMP kernels because OProfile support is not enabled in the uniprocessor kernels.

* RH Linux 7.x: no RPMs available
* RH Linux 8.0: oprofile-0.3-0.20020806 (ia32 processors, uses perf. monitoring hardware on Pentium Pro, PII, PIII, Duron, and Athlon)
* RH Linux 9: oprofile-0.4-44 (ia32 processors, uses perf. monitoring hardware on Pentium Pro, PII, PIII, Pentium 4, Pentium 4 HT, Duron, and Athlon)
* RH Linux 9: oprofile-0.4-44 (x86_64 processors, uses perf. monitoring hardware on Hammer)
* RHEL 2.1: no RPMs available
* RHEL 3 (U2): oprofile-devel-0.5.4-22, RPMs for x86, hammer, ia64, s390/s390x, and ppc/ppc64.
* RHEL 3-U6: oprofile-0.5.4-22. OProfile support in the taroon kernel for all architectures. Note that only the SMP kernels have OProfile support enabled.
* RHEL 4-U2: oprofile-0.8.1-21. OProfile enabled on SMP and UP kernels for all architectures.
* FC1: oprofile-0.7cvs-0.20030829.6
* FC2: oprofile-0.8-0.20040121.3
* FC3: oprofile-0.8.1-11
* FC4: oprofile-0.8.2-4

Q: Do the OProfile RPMs work with every kernel?

A: No. The OProfile RPM does not include the device driver for the kernel; the RH kernel itself must have the OProfile driver. So far, only the RH 8.x kernel and newer distributions of RH Linux have the required OProfile device drivers in the kernel. The 2.5.46 Linux kernel has incorporated OProfile; that in-kernel mechanism is incompatible with the OProfile RPMs for the initial 8.0 distribution. The Linux 2.4.20 kernel used in RHL 9 has a backport of the 2.6 Linux kernel mechanism. Thus, do not mix the 8.0 and 9 RPMs for kernel and OProfile. The RHEL kernels also use the 2.6 mechanism. When the errata for RHL 8.0 comes out with a newer kernel, the 8.0 OProfile will move to the same version as the one used for RHL 9.

Q: Does OProfile support Pentium 4 processors?

A: There is support for the Pentium 4 processor's performance monitoring hardware in RHL 9, RHEL 3, Fedora Core 1, and Fedora Core 2. The initial RHL 8.0 RPMs use an older version of the software and only provide OProfile support through the RTC. To use OProfile on a Pentium 4 with RH Linux 8.0 you will need to include the "nortc" option as a kernel option in the grub.conf or lilo.conf file. Without this kernel option the RTC module takes over the RTC hardware and OProfile will be unable to use the hardware to collect measurements. The newer OProfile RPMs (oprofile-0.4-xxx and oprofile-0.5.1-xxx), Linux 2.4.20 in RH Linux 9, and the RHEL kernels support Pentium 4 processors (including processors with Hyper-Threading).

Q: Does OProfile work on SMP machines?

A: Yes, OProfile is designed to collect data on SMP machines. OProfile sets up the performance monitoring hardware on each processor with the same configuration: the events monitored (e.g. processor clock cycles) and the interval between samples. Each processor collects samples locally. The OProfile sample data does not include information about which processor the sample was taken on.

The RTC mechanism (used when the processor's performance monitoring hardware is unsupported) operates differently from the performance monitoring hardware driver. The RTC mechanism does not do a good job of collecting data on SMP machines because the RTC interrupt only interrupts one of the processors. On a multiprocessor machine only one processor gets sampled for each RTC interrupt, and there is no guarantee which processor is sampled. This RTC mechanism is only used on the initial RHL 8.0 distribution.

The TIMER_INT mechanism, which is a fall-back mechanism on newer kernels, works on SMP machines and will collect data on all the processors in the machine. The TIMER_INT mechanism is used on RHL 9, RHEL 3/4, and Fedora Core.

Q: How do I start OProfile?

A: You must have root privileges to start the OProfile data collection.
There is a GUI interface, oprof_start, and a command line interface: op_start (RH 8.0) or opcontrol (RH 9, RHEL, and FC1/2/3/4). oprof_start is a bit simpler to use. Note that you may also need to flush the data with "op_dump" to be able to analyze it.

Q: What should I measure?

A: As a first pass most people are interested in finding out where their program spends time. On the Pentium Pro, Pentium II, and Pentium III this would be the CPU_CLK_UNHALTED event. On the AMD processors measure CPU_CLK_UNHALTED. The RTC support is time based, so just set it for a power of two Hz, e.g. 512, to get time-based measurements. With Pentium 4 processor performance monitoring hardware support measure GLOBAL_POWER_EVENTS. On ia64 use CPU_CYCLES. With the TIMER_INT mechanism there is no choice; only time-based samples are taken.

If you want a better understanding of why the program is spending time in specific regions of code, you can look at other events such as instruction or data cache misses, conditional branch mispredictions, pipeline flushes, and instructions retired. A listing of the available events can be obtained from the op_help command.

Q: What if my code makes heavy use of libraries?

A: For the initial RHL 8.0, there is an option "--separate-samples" in op_start that attributes the samples in a library to the executable that called the library. Normally OProfile associates the samples with the executable file, and all the samples for the shared libraries from the different executables get lumped together. The data analysis program oprofpp should be passed the "--show-shared-libs" option when analyzing data collected with "--separate-samples". For RHL 9 and RHEL use the "--separate=library" option with opcontrol.

Q: How do I look at the collected data?

A: On RHL 8, RHL 9, and RHEL 3, use the "op_time" command for a summary of the executables run on the machine. For more detailed analysis of a specific executable use "oprofpp" or "op_to_source". On Fedora Core use the "opreport" command.
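
The event choices given under "What should I measure?" can be summarized as a small lookup table. The following is only a sketch in Python: the event names are the ones listed in that answer, but the family labels used as dictionary keys and the function name default_event are hypothetical, chosen for this illustration, and are not identifiers that OProfile itself uses.

```python
# First-pass (time-oriented) event per processor family, taken from the
# "What should I measure?" answer. The keys are hypothetical labels made
# up for this sketch; they are not names used by OProfile.
DEFAULT_EVENT = {
    "pentium_pro": "CPU_CLK_UNHALTED",
    "pentium_ii": "CPU_CLK_UNHALTED",
    "pentium_iii": "CPU_CLK_UNHALTED",
    "amd": "CPU_CLK_UNHALTED",
    "pentium_4": "GLOBAL_POWER_EVENTS",
    "ia64": "CPU_CYCLES",
}


def default_event(family):
    """Return the first-pass event for a family label.

    Returns None for labels not in the table, which covers the RTC and
    TIMER_INT mechanisms where no event can be chosen.
    """
    return DEFAULT_EVENT.get(family.lower())


if __name__ == "__main__":
    for label in ("pentium_iii", "pentium_4", "ia64", "timer_int"):
        print(label, "->", default_event(label))
```

The event name returned here would then be given to op_start or opcontrol when configuring the measurement; check the op_help listing on the machine for the exact events it supports.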

Q: What if there are no samples?

A: Flush the samples with "opcontrol --dump".

Q: Where can I get additional information about OProfile?

A: http://oprofile.sourceforge.net/

Q: Why am I getting the "timer" mechanism rather than the performance monitoring hardware on a processor?

A: On recent uniprocessor kernels only the "timer" mechanism is supported. This improvement for UP kernels is described in more detail in Red Hat Bugzilla 138832. The "timer" mechanism will also be used if OProfile cannot identify the particular processor in use. This can happen with very new versions of Intel Xeon processors; see Red Hat Bugzilla 176601.
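
One way to check which mechanism is in use is to look at the processor type reported by the kernel driver. The Python sketch below assumes the 2.6-style driver interface, where the OProfile filesystem is mounted at /dev/oprofile and the file cpu_type contains the string "timer" when the timer mechanism is active. Both the path and that value are assumptions about the installed kernel, so verify them on your system; the function name uses_timer_fallback is hypothetical.

```python
def uses_timer_fallback(cpu_type_path="/dev/oprofile/cpu_type"):
    """Report whether the driver appears to be in timer mode.

    Returns True if the file contains "timer", False if it contains
    anything else, and None if the file cannot be read (for example,
    because the OProfile driver is not loaded). The default path is an
    assumption about where the OProfile filesystem is mounted.
    """
    try:
        with open(cpu_type_path) as f:
            cpu_type = f.read().strip()
    except OSError:
        return None
    return cpu_type == "timer"


if __name__ == "__main__":
    result = uses_timer_fallback()
    if result is None:
        print("cpu_type not readable; is the OProfile driver loaded?")
    elif result:
        print("timer mechanism in use")
    else:
        print("performance monitoring hardware in use")
```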