On Fri, 2 Aug 2019 12:46:33 +0200
Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:

> On 02/08/19 12:37, Marc Zyngier wrote:
> > When a vcpu is about to block by calling kvm_vcpu_block, we call
> > back into the arch code to allow any form of synchronization that
> > may be required at this point (SVM stops the AVIC, ARM synchronises
> > the VMCR and enables GICv4 doorbells). But this synchronization
> > comes in quite late, as we've potentially waited for halt_poll_ns
> > to expire.
> >
> > Instead, let's move kvm_arch_vcpu_blocking() to the beginning of
> > kvm_vcpu_block(), which on ARM has several benefits:
> >
> >   - VMCR gets synchronised early, meaning that any interrupt delivered
> >     during the polling window will be evaluated with the correct guest
> >     PMR
> >   - GICv4 doorbells are enabled, which means that any guest interrupt
> >     directly injected during that window will be immediately recognised
> >
> > Tang Nianyao ran some tests on a GICv4 machine to evaluate such
> > a change, and reported up to a 10% improvement for netperf:
> >
> > <quote>
> > netperf result:
> > D06 as server, intel 8180 server as client
> > with change:
> > package 512 bytes - 5500 Mbits/s
> > package 64 bytes - 760 Mbits/s
> > without change:
> > package 512 bytes - 5000 Mbits/s
> > package 64 bytes - 710 Mbits/s
> > </quote>
> >
> > Signed-off-by: Marc Zyngier <maz@xxxxxxxxxx>
> > ---
> >  virt/kvm/kvm_main.c | 7 +++----
> >  1 file changed, 3 insertions(+), 4 deletions(-)
> >
> > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > index 887f3b0c2b60..90d429c703cb 100644
> > --- a/virt/kvm/kvm_main.c
> > +++ b/virt/kvm/kvm_main.c
> > @@ -2322,6 +2322,8 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
> >  	bool waited = false;
> >  	u64 block_ns;
> >
> > +	kvm_arch_vcpu_blocking(vcpu);
> > +
> >  	start = cur = ktime_get();
> >  	if (vcpu->halt_poll_ns && !kvm_arch_no_poll(vcpu)) {
> >  		ktime_t stop = ktime_add_ns(ktime_get(), vcpu->halt_poll_ns);
> > @@ -2342,8 +2344,6 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
> >  		} while (single_task_running() && ktime_before(cur, stop));
> >  	}
> >
> > -	kvm_arch_vcpu_blocking(vcpu);
> > -
> >  	for (;;) {
> >  		prepare_to_swait_exclusive(&vcpu->wq, &wait, TASK_INTERRUPTIBLE);
> >
> > @@ -2356,9 +2356,8 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
> >
> >  	finish_swait(&vcpu->wq, &wait);
> >  	cur = ktime_get();
> > -
> > -	kvm_arch_vcpu_unblocking(vcpu);
> >  out:
> > +	kvm_arch_vcpu_unblocking(vcpu);
> >  	block_ns = ktime_to_ns(cur) - ktime_to_ns(start);
> >
> >  	if (!vcpu_valid_wakeup(vcpu))
>
> Acked-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>

Thanks for that. I've pushed this patch into -next so that it gets a bit
of exposure (I haven't heard from the AMD folks, and I'd like to make
sure it doesn't regress their platforms).

	M.

-- 
Without deviation from the norm, progress is not possible.
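For reference, the arm/arm64 side of the two hooks whose call sites the
patch moves looked roughly like this at the time. This is a simplified
sketch rather than the verbatim virt/kvm/arm/arm.c code: the helpers
kvm_vgic_vmcr_sync() and kvm_vgic_v4_enable_doorbell() /
kvm_vgic_v4_disable_doorbell() are recalled from the v5.3-era tree and
should be treated as assumptions.

#include <linux/kvm_host.h>

/* Sketch only: helper names assumed from the v5.3-era arm/arm64 code. */
void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu)
{
	/*
	 * Sync the GIC CPU interface state (notably the PMR) back from
	 * the hardware, so that any interrupt arriving while the vcpu
	 * polls or sleeps is evaluated against the correct guest
	 * priority mask.
	 */
	kvm_vgic_vmcr_sync(vcpu);

	/*
	 * Arm the GICv4 doorbell so that directly-injected vLPIs wake
	 * the blocked vcpu instead of sitting pending, unnoticed.
	 */
	kvm_vgic_v4_enable_doorbell(vcpu);
}

void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu)
{
	/* The vcpu is about to run again; the doorbell is no longer needed. */
	kvm_vgic_v4_disable_doorbell(vcpu);
}

With the kvm_arch_vcpu_blocking() call hoisted to the top of
kvm_vcpu_block(), both effects cover the halt_poll_ns polling window as
well as the actual sleep, which is what the netperf numbers quoted above
are measuring.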