2017-05-16 18:58+0200, Paolo Bonzini:
> On 18/04/2017 12:41, Paolo Bonzini wrote:
>> In some fio benchmarks, halt_poll_ns=400000 caused CPU utilization to
>> increase heavily even in cases where the performance improvement was
>> small.  In particular, bandwidth divided by CPU usage was as much as
>> 60% lower.
>>
>> To some extent this is the expected effect of the patch, and the
>> additional CPU utilization is only visible when running the
>> benchmarks.  However, halving the threshold also halves the extra
>> CPU utilization (from +30-130% to +20-70%) and has no negative
>> effect on performance.
>>
>> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
>
> Ping?

I didn't see any regression in crude benchmarks either and 200 us seems
better anyway (just under 1/2 of Windows' timer frequency).

Queued for rc2 as it is simple enough, thanks.

---
Still, I think we have dynamic polling to mitigate this overhead;
how was it behaving?

I noticed a questionable decision in growing the window:
we know how long the polling should have been (block_ns), but we do not
use that information to set the next halt_poll_ns.

Has something like this been tried?

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index f0fe9d02f6bb..d8dbf50957fc 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2193,7 +2193,7 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 			/* we had a short halt and our poll time is too small */
 			else if (vcpu->halt_poll_ns < halt_poll_ns &&
 				block_ns < halt_poll_ns)
-				grow_halt_poll_ns(vcpu);
+				vcpu->halt_poll_ns = block_ns /* + x ? */;
 		} else
 			vcpu->halt_poll_ns = 0;

It would avoid a case where several halts in a row were interrupted
after 300 us, but on the first one we'd schedule out after 10 us, then
after 20, 40, 80, 160, and finally have the successful poll at 320 us,
but we have just wasted time if the window is reset at any point before
that.

(I really don't like benchmarking ...)

Thanks.
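
For illustration only, the 300 us scenario above can be put into a small
userspace model.  This is a rough sketch, not kernel code: every name in
it is hypothetical, it models only the grow branch of the quoted hunk
(no shrinking, no resets), it assumes the window starts at 10 us and
doubles, and it counts a poll as successful as soon as the window is at
least as long as the halt (so the "+ x" question is ignored).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names throughout; a userspace model, not kernel code. */
enum grow_policy {
	GROW_DOUBLE,      /* doubling as described above: 10, 20, 40, ... */
	GROW_TO_BLOCK_NS, /* models "vcpu->halt_poll_ns = block_ns" */
};

struct sim_result {
	unsigned int failed_polls; /* halts where we polled, then scheduled out */
	uint64_t wasted_us;        /* total time spent in those failed polls */
	uint64_t window_us;        /* poll window when the simulation stopped */
};

/* Compute the next poll window after a failed poll. */
static uint64_t next_window(enum grow_policy policy, uint64_t window_us,
			    uint64_t block_us, uint64_t cap_us)
{
	uint64_t next;

	/* Same condition as the quoted hunk; otherwise keep the window. */
	if (!(window_us < cap_us && block_us < cap_us))
		return window_us;

	if (policy == GROW_DOUBLE)
		next = window_us ? window_us * 2 : 10; /* assumed 10 us base */
	else
		next = block_us;

	/* Never exceed the global limit (halt_poll_ns in the kernel). */
	return next > cap_us ? cap_us : next;
}

/*
 * Simulate up to max_halts consecutive halts that are all interrupted
 * after block_us.  Stops at the first halt whose poll would succeed.
 */
static struct sim_result simulate(enum grow_policy policy, uint64_t start_us,
				  uint64_t block_us, uint64_t cap_us,
				  unsigned int max_halts)
{
	struct sim_result r = { 0, 0, start_us };
	unsigned int i;

	assert(cap_us > 0);

	for (i = 0; i < max_halts; i++) {
		if (r.window_us >= block_us)
			break; /* the wakeup arrives while we are still polling */

		/* Polled for the whole window, then scheduled out anyway. */
		r.failed_polls++;
		r.wasted_us += r.window_us;
		r.window_us = next_window(policy, r.window_us, block_us, cap_us);
	}
	return r;
}
```

With a 10 us start, 300 us halts and a 400 us cap,
simulate(GROW_DOUBLE, 10, 300, 400, 16) reports 5 failed polls and
310 us of polling before the 320 us window succeeds, while
simulate(GROW_TO_BLOCK_NS, 10, 300, 400, 16) reports a single failed
poll costing 10 us.  Real workloads will not have constant halt lengths,
so this says nothing about how either policy benchmarks.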