On 13/06/2016 23:19, Yunhong Jiang wrote:
> The VMX-preemption timer is a feature of VMX; it counts down, from the
> value loaded by VM entry, in VMX non-root operation. When the timer
> counts down to zero, it stops counting down and a VM exit occurs.
>
> This patchset utilizes the VMX preemption timer for tsc deadline timer
> virtualization. The VMX preemption timer is armed before the vm-entry if the
> tsc deadline timer is enabled. A VMExit will happen if the virtual TSC
> deadline timer expires.
>
> When the vCPU thread is blocked because of HLT instruction, the tsc deadline
> timer virtualization will be switched to use the current solution, i.e. use
> the timer for it. It's switched back to VMX preemption timer when the vCPU
> thread is unblocked.
>
> This solution replaces the complex OS's hrtimer system, and also the host
> timer interrupt handling cost, with a preemption_timer VMexit. It fits well
> for some NFV usage scenarios, when the vCPU is bound to a pCPU and the pCPU
> is isolated, or some similar scenarios.
>
> The benefits offset the small extra work to do on each VM-entry to set up the
> preemption timer.

I made a similar test with tscdeadline_latency from kvm-unit-tests. With
your patches:

- we lose about 20 clock cycles in the worst case where we HLT just
after programming the TSC deadline timer

- we gain about 800 clock cycles (which is a 25% reduction in latency) in
the best case where the test busy waits.

Good job!

Paolo

> Signed-off-by: Yunhong Jiang <yunhong.jiang@xxxxxxxxx>
>
> Performance Evaluation:
> Host:
> [nfv@otcnfv02 ~]$ cat /proc/cpuinfo
> ....
> cpu family : 6
> model : 63
> model name : Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
>
> Guest:
> Two vCPUs with vCPUs pinned to isolated pCPUs, idle=poll on guest kernel.
>
> Test tools:
> cyclictest [1] running 10 minutes with 1ms interval, i.e. 600000 loops in
> total.
>
> 1. enable_hv_timer=Y.
>
> # Histogram
> ......
> 000003 000000
> 000004 000024
> 000005 049862
> 000006 470060
> 000007 066930
> 000008 009364
> 000009 002496
> 000010 001063
> 000011 000106
> 000012 000023
>
> ......
> # Total: 000599993
> # Min Latencies: 00004
> # Avg Latencies: 00006
>
> 2. enable_hv_timer=N.
>
> # Histogram
> ......
> 000003 000000
> 000004 000000
> 000005 000169
> 000006 000804
> 000007 017241
> 000008 207048
> 000009 271618
> 000010 100960
> 000011 000507
> 000012 000562
>
> ......
> # Total: 000599994
> # Min Latencies: 00005
> # Avg Latencies: 00008
>
> Changes since v3 [4]:
> * On kvm_arch_vcpu_load, do the set_hv_timer() again.
> * On vmx_arm_hv_timer(), check the vcpu->arch.hv_deadline_tsc, instead of
>   the apic->lapic_timer.hv_timer_in_use.
> * Change return of vmx_set_hv_timer() from -1 to ERANGE.
> * Initialize the kvm_x86_ops->set_hv_timer/cancel_hv_timer() only on
>   CONFIG_X86_64
>
> Changes since v2 [3]:
> * Switch on HLT instruction instead of sched_out/sched_in.
> * VMX preemption timer is broken on some CPUs, added the check.
> * Reduce the overhead to the vm-entry code path. We calculate the
>   host deadline tsc in advance and set the vmcs exec_control earlier.
> * Adding the TSC scaling support. This codepath is not tested yet because
>   I am still looking for a platform with TSC scaling capability.
> * Checking if the host delta TSC, after the multiplication, will be more
>   than 32 bits, which is the width of the vmcs field.
>
> Changes since v1 [2]:
> * Remove the vmx_sched_out and no changes to kvm_x86_ops for it.
> * Remove the two expired timer checks on each vm-entry.
> * Rename the hwemul_timer to hv_timer
> * Clear vmx_x86_ops's membership if preemption timer is not usable.
> * Cache cpu_preemption_timer_multi.
> * Keep the tracepoint with the function patch.
> * Other minor changes based on Paolo's review.
>
> [1] https://rt.wiki.kernel.org/index.php/Cyclictest
> [2] http://www.spinics.net/lists/kvm/msg132895.html
> [3] http://www.spinics.net/lists/kvm/msg133185.html
> [4] http://www.spinics.net/lists/kvm/msg133538.html
>
> Yunhong Jiang (4):
>   Rename the vmx_pre/post_block to pi_pre/post_block
>   Utilize the vmx preemption timer
>   Separate the start_sw_tscdeadline
>   Utilize the vmx preemption timer for tsc deadline timer
>
>  arch/x86/include/asm/kvm_host.h |   6 ++
>  arch/x86/kvm/lapic.c            | 122 ++++++++++++++++++++-----
>  arch/x86/kvm/lapic.h            |   5 ++
>  arch/x86/kvm/trace.h            |  16 ++++
>  arch/x86/kvm/vmx.c              | 192 +++++++++++++++++++++++++++++++++++++++-
>  arch/x86/kvm/x86.c              |   5 ++
>  6 files changed, 320 insertions(+), 26 deletions(-)
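
For readers following the thread: the range check described in the v2
changelog (the host TSC delta must fit the 32-bit VMCS field, and
vmx_set_hv_timer() reports ERANGE otherwise) can be sketched in plain
userspace C. This is only an illustration under assumptions, not the code
from the patch: the function name hv_timer_value() and its parameters are
hypothetical, TSC scaling is left out, and rate_shift stands in for the
cached cpu_preemption_timer_multi, on the understanding that the preemption
timer ticks once every 2^rate_shift TSC cycles.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the patch.
 *
 * Converts an absolute host TSC deadline into a preemption-timer count.
 * rate_shift is assumed to be the TSC-to-timer rate, i.e. the timer
 * decrements once every 2^rate_shift TSC cycles.
 *
 * Returns 0 and stores the count in *out, or -ERANGE when the count
 * does not fit in the 32-bit timer field.
 */
int hv_timer_value(uint64_t now_tsc, uint64_t deadline_tsc,
                   unsigned int rate_shift, uint32_t *out)
{
    uint64_t delta;

    /* Assumption: the rate is a 5-bit quantity, so it is below 32. */
    assert(rate_shift < 32);

    /* A deadline that has already passed should fire immediately. */
    delta = deadline_tsc > now_tsc ? deadline_tsc - now_tsc : 0;

    /* Scale TSC cycles down to preemption-timer ticks. */
    delta >>= rate_shift;

    /* The timer field is 32 bits wide; larger counts cannot be armed. */
    if (delta > UINT32_MAX)
        return -ERANGE;

    *out = (uint32_t)delta;
    return 0;
}
```

On -ERANGE a caller would presumably fall back to the existing hrtimer-based
path, which is how I read the cover letter; treat that as my reading rather
than a statement about the patch itself.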