[RFC PATCH V2 0/4] Utilizing VMX preemption for timer virtualization

The VMX-preemption timer is a VMX feature. In VMX non-root operation it
counts down from the value loaded at VM entry. When it reaches zero, it
stops counting and a VM exit occurs.

This patchset uses the VMX preemption timer for TSC deadline timer
virtualization. If the TSC deadline timer is enabled, the VMX preemption
timer is armed before VM entry, and a VM exit occurs when the virtual
TSC deadline timer expires.
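
As a rough illustration of the arming step, the sketch below converts a
TSC deadline into a value for the 32-bit preemption timer. It is not
taken from the patches: hv_timer_value() and its parameters are
hypothetical names, and it assumes tsc_now and tsc_deadline are already
in the same TSC time base (guest TSC scaling and offsetting are
ignored). rate_shift is the value reported in IA32_VMX_MISC bits 4:0;
the timer decrements once every 2^rate_shift TSC ticks.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Returns 0 and stores the timer value in *value on success, or -1 if
 * the deadline is too far away to fit in the 32-bit timer field, in
 * which case the caller would have to fall back to another mechanism.
 */
int hv_timer_value(uint64_t tsc_now, uint64_t tsc_deadline,
                   unsigned int rate_shift, uint32_t *value)
{
    uint64_t ticks;

    /* IA32_VMX_MISC[4:0] is a 5-bit field, so the shift is below 32. */
    assert(rate_shift < 32);

    /* Deadline already reached: a value of 0 expires on VM entry. */
    if (tsc_deadline <= tsc_now) {
        *value = 0;
        return 0;
    }

    /* Round down, so the exit never comes later than the deadline. */
    ticks = (tsc_deadline - tsc_now) >> rate_shift;

    if (ticks > UINT32_MAX)
        return -1;

    *value = (uint32_t)ticks;
    return 0;
}
```

Rounding down means the VM exit can arrive up to 2^rate_shift - 1 TSC
ticks early; a real implementation has to decide how to handle that.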

When the vCPU thread is scheduled out, TSC deadline timer
virtualization switches back to the current solution, i.e. hrtimer-based
emulation. It switches to the VMX preemption timer again when the vCPU
thread is scheduled in.

This solution replaces the cost of the host's complex hrtimer subsystem
and of host timer interrupt handling with a single preemption-timer
VM exit. It fits some NFV usage scenarios well, e.g. when the vCPU is
bound to a pCPU and the pCPU is isolated, and other similar scenarios.

However, it may hurt performance if the vCPU thread is scheduled in and
out very frequently, because every switch moves the timer to or from
hrtimer emulation. A module parameter is provided to turn the feature on
or off.
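
The switching between the two timer backends can be pictured with a
small state model. This is only a sketch of the behaviour described
above, not code from the patches: struct vcpu_timer, the enum and both
functions are hypothetical names, and hv_timer_enabled merely stands in
for the on/off module parameter.

```c
#include <assert.h>
#include <stdbool.h>

/* Which mechanism currently backs the virtual TSC deadline timer. */
enum timer_backend { TIMER_NONE, TIMER_HRTIMER, TIMER_HV };

/* Hypothetical per-vCPU state, for illustration only. */
struct vcpu_timer {
    bool deadline_armed;    /* guest has programmed a TSC deadline */
    bool hv_timer_enabled;  /* models the on/off module parameter */
    enum timer_backend backend;
};

/* vCPU thread is scheduled in: prefer the VMX preemption timer. */
void vcpu_sched_in(struct vcpu_timer *t)
{
    assert(t);
    if (!t->deadline_armed)
        t->backend = TIMER_NONE;
    else if (t->hv_timer_enabled)
        t->backend = TIMER_HV;
    else
        t->backend = TIMER_HRTIMER;
}

/*
 * vCPU thread is scheduled out: the preemption timer only counts in
 * VMX non-root operation, so a pending deadline goes back to the
 * hrtimer.
 */
void vcpu_sched_out(struct vcpu_timer *t)
{
    assert(t);
    t->backend = t->deadline_armed ? TIMER_HRTIMER : TIMER_NONE;
}
```

In this model every sched-out/sched-in pair with a pending deadline
costs two backend switches, which is the overhead the paragraph above
warns about.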

Signed-off-by: Yunhong Jiang <yunhong.jiang@xxxxxxxxx>

Performance Evaluation:
Host:
[nfv@otcnfv02 ~]$ cat /proc/cpuinfo
....
cpu family      : 6
model           : 63
model name      : Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz

Guest:
Two vCPUs, each pinned to an isolated pCPU, with idle=poll on the guest
kernel. When the vCPUs are not pinned, the benefit is smaller than in
the pinned case.

Test tools:
cyclictest [1], run for 10 minutes with a 1ms interval, i.e. 600000
loops in total.

1. enable_hv_timer=Y.

# Histogram
......
000003 000000
000004 000029
000005 023017
000006 357485
000007 192723
000008 026141
000009 000106
000010 000067
......
# Min Latencies: 00004
# Avg Latencies: 00006

2. enable_hv_timer=N.

# Histogram
......
000004 000000
000005 000074
000006 001943
000007 005820
000008 164729
000009 424401
000010 001964
000011 000252
000012 000190
......
# Min Latencies: 00005
# Avg Latencies: 00010
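
For readers who want to compare the two runs numerically, the sketch
below sums the histogram buckets quoted above and computes a
count-weighted mean. The helper names are hypothetical, and the arrays
contain only the buckets shown in this mail; the elided rows ("......")
are not included, so the result need not match the averages cyclictest
reports over the full distribution.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct bucket {
    unsigned int latency;   /* latency value of the bucket */
    uint64_t count;         /* number of samples in the bucket */
};

/* Buckets quoted above for enable_hv_timer=Y (tail elided). */
const struct bucket hv_on[] = {
    { 4, 29 }, { 5, 23017 }, { 6, 357485 }, { 7, 192723 },
    { 8, 26141 }, { 9, 106 }, { 10, 67 },
};

/* Buckets quoted above for enable_hv_timer=N (tail elided). */
const struct bucket hv_off[] = {
    { 5, 74 }, { 6, 1943 }, { 7, 5820 }, { 8, 164729 },
    { 9, 424401 }, { 10, 1964 }, { 11, 252 }, { 12, 190 },
};

/* Total number of samples across the given buckets. */
uint64_t hist_samples(const struct bucket *b, size_t n)
{
    uint64_t total = 0;

    for (size_t i = 0; i < n; i++)
        total += b[i].count;
    return total;
}

/* Count-weighted mean latency across the given buckets. */
double hist_mean(const struct bucket *b, size_t n)
{
    uint64_t total = hist_samples(b, n);
    uint64_t weighted = 0;

    assert(total > 0);
    for (size_t i = 0; i < n; i++)
        weighted += (uint64_t)b[i].latency * b[i].count;
    return (double)weighted / (double)total;
}
```

The quoted buckets cover 599568 of the 600000 samples with the hv timer
enabled and 599373 with it disabled, so a few hundred samples in each
run fall outside the rows shown here.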

Changes since v1 [2]:

* Remove vmx_sched_out; kvm_x86_ops is no longer changed for it.
* Remove the two expired-timer checks on each vm-entry.
* Rename the hwemul_timer to hv_timer.
* Clear the vmx_x86_ops members if the preemption timer is not usable.
* Cache cpu_preemption_timer_multi.
* Keep the tracepoint with the function patch.
* Other minor changes based on Paolo's review.

[1] https://rt.wiki.kernel.org/index.php/Cyclictest
[2] http://www.spinics.net/lists/kvm/msg132895.html

Yunhong Jiang (4):
  Add the kvm sched_out hook
  Utilize the vmx preemption timer
  Separate the start_sw_tscdeadline
  Utilize the vmx preemption timer for tsc deadline timer

 arch/arm/include/asm/kvm_host.h     |   1 +
 arch/mips/include/asm/kvm_host.h    |   1 +
 arch/powerpc/include/asm/kvm_host.h |   1 +
 arch/s390/include/asm/kvm_host.h    |   1 +
 arch/x86/include/asm/kvm_host.h     |   4 +
 arch/x86/kvm/lapic.c                | 144 ++++++++++++++++++++++++++++++------
 arch/x86/kvm/lapic.h                |  11 +++
 arch/x86/kvm/trace.h                |  22 ++++++
 arch/x86/kvm/vmx.c                  |  51 ++++++++++++-
 arch/x86/kvm/x86.c                  |   8 ++
 include/linux/kvm_host.h            |   1 +
 virt/kvm/kvm_main.c                 |   1 +
 12 files changed, 221 insertions(+), 25 deletions(-)

TODO:
	Identify the CPUs on which the VMX preemption timer is broken.
-- 
1.8.3.1
