On 05/02/2015 21:39, David Matlack wrote:
>> This parameter helps a lot for latency-bound workloads [...]
>> KVM's performance here is usually around 30% of bare metal,
>> or 50% if you use cache=directsync or cache=writethrough.
>> With this patch performance reaches 60-65% of bare metal and, more
>> important, 99% of what you get if you use idle=poll in the guest.
>
> I used loopback TCP_RR and loopback memcache as benchmarks for halt
> polling. I saw very similar results as you (before: 40% bare metal,
> after: 60-65% bare metal and 95% of guest idle=poll).

Good that it also works for network!

> Reviewed-by: David Matlack <dmatlack@xxxxxxxxxx>
>
>>  arch/x86/include/asm/kvm_host.h |  1 +
>>  arch/x86/kvm/x86.c              | 28 ++++++++++++++++++++++++----
>>  include/linux/kvm_host.h        |  1 +
>>  virt/kvm/kvm_main.c             | 22 +++++++++++++++-------
>>  4 files changed, 41 insertions(+), 11 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index 848947ac6ade..a236e39cc385 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -655,6 +655,7 @@ struct kvm_vcpu_stat {
>>  	u32 irq_window_exits;
>>  	u32 nmi_window_exits;
>>  	u32 halt_exits;
>> +	u32 halt_successful_poll;
>>  	u32 halt_wakeup;
>>  	u32 request_irq_exits;
>>  	u32 irq_exits;
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 1373e04e1f19..b7b20828f01c 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -96,6 +96,9 @@ EXPORT_SYMBOL_GPL(kvm_x86_ops);
>>  static bool ignore_msrs = 0;
>>  module_param(ignore_msrs, bool, S_IRUGO | S_IWUSR);
>>
>> +unsigned int halt_poll = 0;
>> +module_param(halt_poll, uint, S_IRUGO | S_IWUSR);
>
> Suggest encoding the units in the name. "halt_poll_cycles" in this case.

I left it out because of the parallel with ple_window/ple_gap.  But I
will call it "halt_poll_ns" in the next version.
>> +
>>  unsigned int min_timer_period_us = 500;
>>  module_param(min_timer_period_us, uint, S_IRUGO | S_IWUSR);
>>
>> @@ -145,6 +148,7 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
>>  	{ "irq_window", VCPU_STAT(irq_window_exits) },
>>  	{ "nmi_window", VCPU_STAT(nmi_window_exits) },
>>  	{ "halt_exits", VCPU_STAT(halt_exits) },
>> +	{ "halt_successful_poll", VCPU_STAT(halt_successful_poll) },
>>  	{ "halt_wakeup", VCPU_STAT(halt_wakeup) },
>>  	{ "hypercalls", VCPU_STAT(hypercalls) },
>>  	{ "request_irq", VCPU_STAT(request_irq_exits) },
>> @@ -5819,13 +5823,29 @@ void kvm_arch_exit(void)
>>  int kvm_emulate_halt(struct kvm_vcpu *vcpu)
>>  {
>>  	++vcpu->stat.halt_exits;
>> -	if (irqchip_in_kernel(vcpu->kvm)) {
>> -		vcpu->arch.mp_state = KVM_MP_STATE_HALTED;
>> -		return 1;
>> -	} else {
>> +	if (!irqchip_in_kernel(vcpu->kvm)) {
>>  		vcpu->run->exit_reason = KVM_EXIT_HLT;
>>  		return 0;
>>  	}
>> +
>> +	vcpu->arch.mp_state = KVM_MP_STATE_HALTED;
>> +	if (halt_poll) {
>
> Would it be useful to poll in kvm_vcpu_block() for the benefit of all
> arch's?

Sure.  Especially if I use time instead of cycles.

>> +		u64 start, curr;
>> +		rdtscll(start);
>
> Why cycles instead of time?

People's love for rdtsc grows every time you tell them it's wrong...

>> +		do {
>> +			/*
>> +			 * This sets KVM_REQ_UNHALT if an interrupt
>> +			 * arrives.
>> +			 */
>> +			if (kvm_vcpu_check_block(vcpu) < 0) {
>> +				++vcpu->stat.halt_successful_poll;
>> +				break;
>> +			}
>> +			rdtscll(curr);
>> +		} while(!need_resched() && curr - start < halt_poll);
>
> I found that using need_resched() was not sufficient at preventing
> VCPUs from delaying their own progress. To test this try running with
> and without polling on a 2 VCPU VM, confined to 1 PCPU, that is running
> loopback TCP_RR in the VM. The problem goes away if you stop polling as
> soon as there are runnable threads on your cpu. (e.g. use
> "single_task_running()" instead of "!need_resched()"
> http://lxr.free-electrons.com/source/kernel/sched/core.c#L2398 ).
> This also guarantees polling only delays the idle thread.

Great, I'll include all of your suggestions in the next version of the
patch.

Paolo