* Ted Ts'o <tytso@xxxxxxx> wrote:

> I don't believe there's ever been any guarantee that "perf test"
> from version N of the kernel will always work on a version N+M of
> the kernel. Perhaps I am wrong, though. If that is a guarantee
> that the perf developers are willing to stand behind, or have
> already made, I would love to be corrected and would be delighted
> to hear that in fact there is a stable, backwards compatible perf
> ABI.

We do even more than that, the perf ABI is fully backwards *and*
forwards compatible: you can run older perf on newer ABIs and newer
perf on older ABIs.

To show you how it works in practice, here's a random
cross-compatibility experiment: going back to the perf ABI of 2
years ago. I used v2.6.32 which was just the second upstream kernel
with perf released in it.

So i took a fresh perf tool version and booted a vanilla v2.6.32
(x86, defconfig, PERF_COUNTERS=y) kernel:

  $ uname -a
  Linux mercury 2.6.32 #162137 SMP Tue Nov 8 10:55:37 CET 2011 x86_64 x86_64 x86_64 GNU/Linux

  $ perf --version
  perf version 3.1.1927.gceec2

  $ perf top

  Events: 2K cycles
   61.68%  [kernel]  [k] sha_transform
   16.09%  [kernel]  [k] mix_pool_bytes_extract
    4.70%  [kernel]  [k] extract_buf
    4.17%  [kernel]  [k] _spin_lock_irqsave
    1.44%  [kernel]  [k] copy_user_generic_string
    0.75%  [kernel]  [k] extract_entropy_user
    0.37%  [kernel]  [k] acpi_pm_read

  [the box is running a /dev/urandom stress-test as you can see.]

  $ perf stat sleep 1

   Performance counter stats for 'sleep 1':

          0.766698 task-clock                #    0.001 CPUs utilized
                 1 context-switches          #    0.001 M/sec
                 0 CPU-migrations            #    0.000 M/sec
               177 page-faults               #    0.231 M/sec
         1,513,332 cycles                    #    1.974 GHz
   <not supported> stalled-cycles-frontend
   <not supported> stalled-cycles-backend
           522,609 instructions              #    0.35  insns per cycle
            65,812 branches                  #   85.838 M/sec
             7,762 branch-misses             #   11.79% of all branches

       1.076211168 seconds time elapsed

The two <not supported> events are not supported by the old kernel -
but the other events were and the tool picked them up without
bailing out.

Regular profiling:

  $ perf record -a sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.075 MB perf.data (~3279 samples) ]

perf report output:

  $ perf report

  Events: 1K cycles
   64.45%         dd  [kernel.kallsyms]     [k] sha_transform
   19.39%         dd  [kernel.kallsyms]     [k] mix_pool_bytes_extract
    4.11%         dd  [kernel.kallsyms]     [k] _spin_lock_irqsave
    2.98%         dd  [kernel.kallsyms]     [k] extract_buf
    0.84%         dd  [kernel.kallsyms]     [k] copy_user_generic_string
    0.38%        ssh  libcrypto.so.0.9.8b   [.] lh_insert
    0.28%  flush-8:0  [kernel.kallsyms]     [k] block_write_full_page_endio
    0.28%  flush-8:0  [kernel.kallsyms]     [k] generic_make_request

These examples show *PICTURE PERFECT* backwards ABI compatibility,
when using the bleeding edge perf tool on an ancient perf kernel
(when it wasn't even called 'perf events' but 'perf counters').

[ Note, i didn't go back to v2.6.31, the oldest upstream perf kernel,
  because it's such a pain to build with recent binutils and recent
  GCC ... v2.6.32 already needed a workaround and a couple of
  .config tweaks to build and boot at all. ]

Then i built the ancient v2.6.32 perf tool from 2 years ago:

  $ perf --version
  perf version 0.0.2.PERF

and booted a fresh v3.1+ kernel:

  $ uname -a
  Linux mercury 3.1.0-tip+ #162138 SMP Tue Nov 8 11:14:26 CET 2011 x86_64 x86_64 x86_64 GNU/Linux

  $ perf stat ls

   Performance counter stats for 'ls':

         1.739193  task-clock-msecs         #      0.069 CPUs
                0  context-switches         #      0.000 M/sec
                0  CPU-migrations           #      0.000 M/sec
              250  page-faults              #      0.144 M/sec
          3477562  cycles                   #   1999.526 M/sec
          1661460  instructions             #      0.478 IPC
           839826  cache-references         #    482.883 M/sec
            15742  cache-misses             #      9.051 M/sec

      0.025231139  seconds time elapsed

  $ perf top

  ------------------------------------------------------------------------------
     PerfTop:   38916 irqs/sec  kernel:99.6% [100000 cycles],  (all, 2 CPUs)
  ------------------------------------------------------------------------------

               samples    pcnt   kernel function
               _______   _____   _______________

              41191.00 - 53.1% : sha_transform
              20818.00 - 26.8% : mix_pool_bytes_extract
               5481.00 -  7.1% : _raw_spin_lock_irqsave
               2132.00 -  2.7% : extract_buf
               1788.00 -  2.3% : copy_user_generic_string
                801.00 -  1.0% : acpi_pm_read
                446.00 -  0.6% : _raw_spin_unlock_irqrestore
                284.00 -  0.4% : __memset
                259.00 -  0.3% : extract_entropy_user

  $ perf record -a -f sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.034 MB perf.data (~1467 samples) ]

  $ perf report

  # Samples: 1023
  #
  # Overhead  Command  Shared Object                 Symbol
  # ........  .......  ............................  ......
  #
       4.50%  swapper  [kernel]                      [k] acpi_pm_read
       4.01%  swapper  [kernel]                      [k] delay_tsc
       2.05%  sudo     /lib64/libcrypto.so.0.9.8b    [.] 0x000000000a0549
       1.96%  perf     [kernel]                      [k] vsnprintf
       1.86%  swapper  [kernel]                      [k] test_clear_page_writeback
       1.66%  perf     [kernel]                      [k] format_decode
       1.56%  sudo     /lib64/ld-2.7.so              [.] do_lookup_x

These examples show *PICTURE PERFECT* forwards ABI compatibility,
using the ancient perf tool on a bleeding edge kernel.
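
To make the '<not supported>' behaviour in the v2.6.32 'perf stat sleep 1'
run concrete, here is a toy sketch of per-event degradation: each event is
tried on its own, and an event the kernel does not know is reported instead
of aborting the whole run. This is an illustration only, not the perf tool's
actual code. The names open_event(), stat(), EventNotSupported and
OLD_KERNEL are hypothetical, and the dictionary stands in for the kernel;
the counter values are copied from the run above.

```python
# Toy model only: nothing here calls the real perf_event_open() syscall.
# open_event(), stat(), EventNotSupported and OLD_KERNEL are hypothetical names.


class EventNotSupported(Exception):
    """Raised by the toy kernel when it does not know an event."""


def open_event(kernel_events, name):
    """Pretend to open one counter: return its value or raise."""
    if name not in kernel_events:
        raise EventNotSupported(name)
    return kernel_events[name]


def stat(kernel_events, requested):
    """Collect every requested event, degrading per event instead of aborting."""
    rows = []
    for name in requested:
        try:
            rows.append((name, str(open_event(kernel_events, name))))
        except EventNotSupported:
            # Keep going: only this one event is marked as missing.
            rows.append((name, "<not supported>"))
    return rows


# Counter values taken from the v2.6.32 'perf stat sleep 1' output.
OLD_KERNEL = {"cycles": 1513332, "instructions": 522609, "branches": 65812}

# A newer tool asks for events the old kernel has never heard of.
REQUESTED = [
    "cycles",
    "stalled-cycles-frontend",
    "stalled-cycles-backend",
    "instructions",
    "branches",
]

if __name__ == "__main__":
    for name, value in stat(OLD_KERNEL, REQUESTED):
        print(f"{value:>16}  {name}")
```

The design point is that the failure is scoped to a single event: the tool
still reports everything the older kernel can count.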

Over the years we migrated across various transformations of the
subsystem and added tons of features, while maintaining the perf
ABI.

I don't know where the whole ABI argument comes from - perf has
arguably one of the best and most compatible tooling ABIs within
Linux. I suspect back in the original perf flamewars people made up
their mind prematurely that it 'cannot' possibly work and never
changed their mind about it, regardless of reality proving them
wrong ;-)

And yes, the quality of the ABI and tooling cross-compatibility is
not accidental at all, it is fully intentional and we take great
care that it stays so.

More than that, we'll gladly take more 'perf test' testcases, for
obscure corner-cases that other tools might rely on. I.e. we are
willing to help external tooling to get their testcases built into
the kernel repo.

Note that such a level of ABI support is arguably overkill for
instrumentation, which by its very nature tends to migrate to the
newer versions - still we maintain it because in our opinion good,
usable tooling should have a good, extensible ABI.

Thanks,

	Ingo