On 03/08/2022 22:47, Adam Williamson wrote:
On Sun, 2022-07-24 at 10:28 +0100, Richard W.M. Jones wrote:
The current Fedora Rawhide kernels are too slow to run libguestfs
tests when doing Koji builds. These run in a qemu VM, running the
Rawhide kernel, emulated using software virtualization (i.e. TCG).
They now time out because these kernels are so slow. Until fairly
recently they were slow but working.
I wondered if particular debug options had a greater effect on
performance, so I compiled many kernels (v5.19-rc7 from upstream)
using the baseline "no debug" config, then adding each debug option
that we use in turn, and measuring the performance with [1] under
qemu software virtualization (TCG). The tests were run many times
with warmups discarded to get the mean and standard deviation, using
the hyperfine program[2].
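
The measurement approach described above (repeat the run, discard the
warmups, report mean and standard deviation) can be sketched in Python.
This is only an illustration of the idea, not how hyperfine is
implemented; the function names and the default run counts are
assumptions made for the example.

```python
import statistics
import subprocess
import time


def summarise(timings, warmup):
    """Drop the first `warmup` timings, return (mean, sample stdev) of the rest."""
    measured = timings[warmup:]
    if len(measured) < 2:
        raise ValueError("need at least two measured runs for a standard deviation")
    return statistics.mean(measured), statistics.stdev(measured)


def time_command(cmd, runs=10, warmup=3):
    """Run `cmd` (an argv list) warmup + runs times and summarise the wall-clock times."""
    timings = []
    for _ in range(warmup + runs):
        start = time.perf_counter()
        # Output is discarded; a non-zero exit status raises CalledProcessError.
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return summarise(timings, warmup)
```

Calling time_command() with the argv list of the boot test would give
one (mean, stdev) pair per kernel config. hyperfine provides the same
thing directly through its --warmup and --runs options.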
The results are below, and not very conclusive, but some options do
have a very large performance impact.
NO_DEBUG is the kernel compiled with no debug options enabled (i.e. the
baseline).
In the actual debug kernel I expect the slowdowns to be multiplied
together. To test that I did an extra run with all debug options
enabled (ALL_DEBUG).
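
To make the "multiplied together" expectation concrete, here is a
minimal sketch of that model: each option's slowdown factor is its time
divided by the baseline time, and the predicted time with all options
enabled is the baseline times the product of the factors. The option
names and timings in the usage note are made up for illustration and
are not measurements from this thread.

```python
import math


def predicted_combined_time(baseline, per_option_times):
    """Predict the all-options time assuming per-option slowdowns multiply.

    baseline: time for the no-debug kernel.
    per_option_times: mapping of option name -> time with only that option enabled.
    """
    if baseline <= 0:
        raise ValueError("baseline time must be positive")
    # Slowdown factor of each option relative to the baseline.
    factors = [t / baseline for t in per_option_times.values()]
    return baseline * math.prod(factors)
```

For example, with a hypothetical baseline of 10 s and two hypothetical
options that take 20 s and 15 s on their own, the factors are 2.0 and
1.5, so the model predicts 10 * 2.0 * 1.5 = 30 s. Comparing such a
prediction against the measured ALL_DEBUG time shows whether the
options really do compound this way.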
CONFIG_PROVE_LOCKING, CONFIG_LOCK_STAT and CONFIG_DEBUG_LOCK_ALLOC
were present and enabled in the kernel when it was imported into git
in 2010.
CONFIG_DEBUG_WW_MUTEX_SLOWPATH was turned off in the past
(RHBZ#1114160). It seems to have been switched on again in 2020.
CONFIG_DEBUG_KMEMLEAK seems like it was enabled in 2012.
It's also possible that an existing debug option has got slower in the
upstream kernel, that is, it's not that we've recently changed
something in Fedora.
Thanks a lot for this work, Richard! And thanks to Justin for looking
at it. I would be super appreciative of anything we can do to reduce
the performance hit here, as it is also an issue for openQA testing -
we get noticeably more test failures due to timeouts, things taking
longer than expected, or typing errors when Rawhide is on a debug
kernel.
In Cockpit we recently enabled Rawhide testing on the testing farm and
noticed similar performance issues. [1]
Compared to Fedora 36, one test scenario takes 5 minutes longer, so it
would be great to speed that up a bit!
[1] https://gitlab.com/testing-farm/general/-/issues/45