On Wed, Oct 30, 2024 at 1:22 AM Marc Zyngier <maz@xxxxxxxxxx> wrote:
>
> On Wed, 30 Oct 2024 00:16:48 +0000,
> Raghavendra Rao Ananta <rananta@xxxxxxxxxx> wrote:
> >
> > On Tue, Oct 29, 2024 at 11:47 AM Marc Zyngier <maz@xxxxxxxxxx> wrote:
> > >
> > > On Tue, 29 Oct 2024 17:06:09 +0000,
> > > Raghavendra Rao Ananta <rananta@xxxxxxxxxx> wrote:
> > > >
> > > > On Tue, Oct 29, 2024 at 9:27 AM Marc Zyngier <maz@xxxxxxxxxx> wrote:
> > > > >
> > > > > On Mon, 28 Oct 2024 23:45:33 +0000,
> > > > > Raghavendra Rao Ananta <rananta@xxxxxxxxxx> wrote:
> > > > > >
> > > > > Did you have a chance to check whether this had any negative impact on
> > > > > actual workloads? Since the entry/exit code is a bit of a hot spot,
> > > > > I'd like to make sure we're not penalising the common case (I only
> > > > > wrote this patch while waiting in an airport, and didn't test it at
> > > > > all).
> > > > >
> > > > I ran the kvm selftests, kvm-unit-tests and booted a linux guest to
> > > > test the change and noticed no failures.
> > > > Any specific test you want to try out?
> > > >
> > > My question is not about failures (I didn't expect any), but
> > > specifically about *performance*, and whether checking the flag
> > > without a static key can lead to any performance drop on the hot path.
> > >
> > > Can you please run an exit-heavy workload (such as hackbench, for
> > > example), and report any significant delta you could measure?
> >
> > Oh, I see. I ran hackbench and micro-bench from kvm-unit-tests (which
> > also causes a lot of entry/exits), on Ampere Altra with kernel at
> > v6.12-rc1, and see no significant difference in perf.
>
> Thanks for running this stuff.
>
> > timer_10ms    231040.0    902.0
> > timer_10ms    234120.0    914.0
>
> This seems to be the only case where we are adversely affected by this
> change.

Hmm, I'm not sure how much we want to trust this comparison.
For instance, I just ran micro-bench again a few more times and here
are the outcomes of timer_10ms for each try with the patch:

Tries                 total ns      avg ns
--------------------------------------------
1_timer_10ms          231840.0       905.0
2_timer_10ms          234560.0       916.0
3_timer_10ms          227440.0       888.0
4_timer_10ms          236640.0       924.0
5_timer_10ms          231200.0       903.0

Here are a few on the baseline:

Tries                 total ns      avg ns
--------------------------------------------
1_timer_10ms          231080.0       902.0
2_timer_10ms          238040.0       929.0
3_timer_10ms          231680.0       905.0
4_timer_10ms          229280.0       895.0
5_timer_10ms          228520.0       892.0

> In the grand scheme of things, that's noise. But this gives us
> a clear line of sight for the removal of the in-kernel interrupts back
> to userspace.

Sorry, I didn't follow you completely on this part.

Thank you.
Raghavendra
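
To put the run-to-run variation in the timer_10ms numbers in perspective, here is a minimal sketch that compares the mean and sample standard deviation of the two sets of avg ns values listed above. The variable and function names are illustrative only, and five runs per side is too few for a rigorous significance test; the sketch only shows how the spread within each set compares with the gap between them.

```python
from statistics import mean, stdev

# avg ns for timer_10ms, copied from the five runs on each side
patched = [905.0, 916.0, 888.0, 924.0, 903.0]
baseline = [902.0, 929.0, 905.0, 895.0, 892.0]


def summarize(samples):
    """Return (mean, sample standard deviation) for a list of timings."""
    return mean(samples), stdev(samples)


patched_mean, patched_sd = summarize(patched)
baseline_mean, baseline_sd = summarize(baseline)

# Positive delta means the patched kernel was slower on average.
delta = patched_mean - baseline_mean

print(f"patched : mean={patched_mean:.1f} ns, stdev={patched_sd:.1f} ns")
print(f"baseline: mean={baseline_mean:.1f} ns, stdev={baseline_sd:.1f} ns")
print(f"delta   : {delta:+.1f} ns")
```

With these samples the difference in means is about 2.6 ns, while the spread within either set is roughly 14 ns, which is consistent with treating the earlier 902 vs 914 comparison as noise.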