On Wed, 30 Oct 2024 16:36:19 +0000,
Raghavendra Rao Ananta <rananta@xxxxxxxxxx> wrote:
>
> On Wed, Oct 30, 2024 at 1:22 AM Marc Zyngier <maz@xxxxxxxxxx> wrote:
> >
> > On Wed, 30 Oct 2024 00:16:48 +0000,
> > Raghavendra Rao Ananta <rananta@xxxxxxxxxx> wrote:
> > >
> > > On Tue, Oct 29, 2024 at 11:47 AM Marc Zyngier <maz@xxxxxxxxxx> wrote:
> > > >
> > > > On Tue, 29 Oct 2024 17:06:09 +0000,
> > > > Raghavendra Rao Ananta <rananta@xxxxxxxxxx> wrote:
> > > > >
> > > > > On Tue, Oct 29, 2024 at 9:27 AM Marc Zyngier <maz@xxxxxxxxxx> wrote:
> > > > > >
> > > > > > On Mon, 28 Oct 2024 23:45:33 +0000,
> > > > > > Raghavendra Rao Ananta <rananta@xxxxxxxxxx> wrote:
> > > > > > >
> > > > > > Did you have a chance to check whether this had any negative impact on
> > > > > > actual workloads? Since the entry/exit code is a bit of a hot spot,
> > > > > > I'd like to make sure we're not penalising the common case (I only
> > > > > > wrote this patch while waiting in an airport, and didn't test it at
> > > > > > all).
> > > > > >
> > > > > I ran the kvm selftests, kvm-unit-tests and booted a linux guest to
> > > > > test the change and noticed no failures.
> > > > > Any specific test you want to try out?
> > > >
> > > > My question is not about failures (I didn't expect any), but
> > > > specifically about *performance*, and whether checking the flag
> > > > without a static key can lead to any performance drop on the hot path.
> > > >
> > > > Can you please run an exit-heavy workload (such as hackbench, for
> > > > example), and report any significant delta you could measure?
> > >
> > > Oh, I see. I ran hackbench and micro-bench from kvm-unit-tests (which
> > > also causes a lot of entry/exits), on Ampere Altra with kernel at
> > > v6.12-rc1, and see no significant difference in perf.
> >
> > Thanks for running this stuff.
> >
> > > timer_10ms 231040.0 902.0
> > > timer_10ms 234120.0 914.0
> >
> > This seems to be the only case where we are adversely affected by this
> > change.
> Hmm, I'm not sure how much we want to trust this comparison. For
> instance, I just ran micro-bench again a few more times and here are
> the outcomes of timer_10ms for each try with the patch:
>
> Tries             total ns    avg ns
> ------------------------------------
> 1_timer_10ms      231840.0     905.0
> 2_timer_10ms      234560.0     916.0
> 3_timer_10ms      227440.0     888.0
> 4_timer_10ms      236640.0     924.0
> 5_timer_10ms      231200.0     903.0
>
> Here's a few on the baseline:
>
> Tries             total ns    avg ns
> ------------------------------------
> 1_timer_10ms      231080.0     902.0
> 2_timer_10ms      238040.0     929.0
> 3_timer_10ms      231680.0     905.0
> 4_timer_10ms      229280.0     895.0
> 5_timer_10ms      228520.0     892.0

OK, so this benchmark is all over the place, and we can't derive much
from it.

> > In the grand scheme of things, that's noise. But this gives us
> > a clear line of sight for the removal of the in-kernel interrupts back
> > to userspace.
> Sorry, I didn't follow you completely on this part.

Just me moaning. The code that was gated by the static key that you
just removed is used to signal interrupts from the kernel back to
userspace, and I'm resisting the urge to remove it altogether now.

	M.

-- 
Without deviation from the norm, progress is not possible.
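
For anyone wanting to put a number on "all over the place", here is a minimal Python sketch (not part of the original thread) that compares the timer_10ms avg ns samples quoted above. The variable and function names are illustrative, and five runs per side is too few for a formal significance test; the sketch only checks whether the gap between the two means is small compared with the run-to-run spread.

```python
from statistics import mean, stdev

# avg ns for timer_10ms, copied from the five runs quoted in this thread
patched = [905.0, 916.0, 888.0, 924.0, 903.0]
baseline = [902.0, 929.0, 905.0, 895.0, 892.0]


def summarize(samples):
    """Return (mean, sample standard deviation) of a list of timings."""
    return mean(samples), stdev(samples)


patched_mean, patched_sd = summarize(patched)
baseline_mean, baseline_sd = summarize(baseline)

# Positive delta means the patched kernel was slower on average.
delta = patched_mean - baseline_mean

print(f"patched:  mean={patched_mean:.1f} ns  sd={patched_sd:.1f} ns")
print(f"baseline: mean={baseline_mean:.1f} ns  sd={baseline_sd:.1f} ns")
print(f"delta:    {delta:+.1f} ns")

# Crude noise check: is the gap smaller than either side's spread?
within_noise = abs(delta) < min(patched_sd, baseline_sd)
print(f"within run-to-run noise: {within_noise}")
```

On these samples the means differ by about 2.6 ns, while each side's standard deviation is around 14 ns, which is consistent with treating the difference as noise.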