Re: [PATCH v7 0/3] KVM: Dynamic Halt-Polling

Wanpeng Li <wanpeng.li@xxxxxxxxxxx> · Thu, 10 Sep 2015 09:55:18 +0800

On 9/9/15 9:39 PM, Christian Borntraeger wrote:
On 03.09.2015 at 16:07, Wanpeng Li wrote:
v6 -> v7:
  * explicit signal (set a bool)
  * fix the tracepoint

v5 -> v6:
  * fix wait_ns and poll_ns

v4 -> v5:
  * set base case 10us and max poll time 500us
  * handle short/long halt, idea from David, many thanks David

v3 -> v4:
  * bring back growing vcpu->halt_poll_ns when an interrupt arrives and
    shrinking it when an idle vCPU is detected

v2 -> v3:
  * grow/shrink vcpu->halt_poll_ns by *halt_poll_ns_grow or /halt_poll_ns_shrink
  * drop the macros and hard-code the numbers in the param definitions
  * update the comments "5-7 us"
  * remove halt_poll_ns_max and use halt_poll_ns as the max halt_poll_ns time,
    vcpu->halt_poll_ns start at zero
  * drop the wrappers
  * move the grow/shrink logic before "out:" w/ "if (waited)"

v1 -> v2:
  * change kvm_vcpu_block to read halt_poll_ns from the vcpu instead of
    the module parameter
  * use the shrink/grow matrix which is suggested by David
  * set halt_poll_ns_max to 2ms

The downside of always-poll is that polling still happens for idle
vCPUs, which wastes CPU time. This patchset adds the ability to adjust
halt_poll_ns dynamically: grow halt_poll_ns when a short halt is detected,
and shrink halt_poll_ns when a long halt is detected.

There are two new module parameters for tuning how halt_poll_ns changes:
halt_poll_ns_grow and halt_poll_ns_shrink.
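
A minimal userspace sketch of the grow/shrink idea described above, not the
patch itself. The 10us base and 500us max come from the v5 changelog; the
struct, the function names, the default values of halt_poll_ns_grow and
halt_poll_ns_shrink, the "shrink == 0 resets to zero" rule and the clamp to
the max are all assumptions made for illustration.

```c
#include <stdint.h>

/* Stand-ins for the module parameters; the values here are assumptions. */
static unsigned int halt_poll_ns = 500000;   /* max poll time, 500us */
static unsigned int halt_poll_ns_grow = 2;   /* multiply on short halt */
static unsigned int halt_poll_ns_shrink = 0; /* divide on long halt, 0 = reset */

/* Hypothetical per-vCPU state; only the field relevant to polling. */
struct vcpu_sketch {
	unsigned int halt_poll_ns;
};

static void grow_poll(struct vcpu_sketch *v)
{
	unsigned int val = v->halt_poll_ns;

	if (val == 0 && halt_poll_ns_grow)
		val = 10000;                 /* base case, 10us */
	else
		val *= halt_poll_ns_grow;

	if (val > halt_poll_ns)              /* assumed clamp to the max */
		val = halt_poll_ns;
	v->halt_poll_ns = val;
}

static void shrink_poll(struct vcpu_sketch *v)
{
	if (halt_poll_ns_shrink == 0)
		v->halt_poll_ns = 0;         /* assumed: stop polling entirely */
	else
		v->halt_poll_ns /= halt_poll_ns_shrink;
}

/* Call after a halt completes; block_ns is how long the vCPU was halted. */
static void update_poll(struct vcpu_sketch *v, uint64_t block_ns)
{
	if (halt_poll_ns == 0)
		return;                      /* polling disabled globally */
	if (block_ns <= v->halt_poll_ns)
		return;                      /* poll window already covered the halt */
	if (v->halt_poll_ns && block_ns > halt_poll_ns)
		shrink_poll(v);              /* long halt: polling was wasted */
	else if (v->halt_poll_ns < halt_poll_ns && block_ns < halt_poll_ns)
		grow_poll(v);                /* short halt: poll window too small */
}
```

With these assumed defaults, a vCPU that starts at zero moves to 10us after
its first short halt, doubles on later short halts that outlast the current
window, and drops back to zero after a halt longer than the 500us max.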

                         no-poll      always-poll    dynamic-poll
-----------------------------------------------------------------------
Idle (nohz) vCPU %c0     0.15%        0.3%            0.2%
Idle (250HZ) vCPU %c0    1.1%         4.6%~14%        1.2%
TCP_RR latency           34us         27us            26.7us

"Idle (X) vCPU %c0" is the percent of time the physical cpu spent in
c0 over 60 seconds (each vCPU is pinned to a pCPU). (nohz) means the
guest was tickless. (250HZ) means the guest was ticking at 250HZ.

The big win is with ticking operating systems. Running the Linux guest
with nohz=off (and HZ=250), we save 3.4%~12.8% of pCPU time and get close
to no-polling overhead levels by using dynamic-poll. The savings
should be even higher for higher frequency ticks.

Wanpeng Li (3):
   KVM: make halt_poll_ns per-vCPU
   KVM: dynamic halt-polling
   KVM: trace kvm_halt_poll_ns grow/shrink

  include/linux/kvm_host.h   |  1 +
  include/trace/events/kvm.h | 30 +++++++++++++++++++
  virt/kvm/kvm_main.c        | 72 ++++++++++++++++++++++++++++++++++++++++++----
  3 files changed, 97 insertions(+), 6 deletions(-)

I get some nice improvements for uperf between 2 guests,

Good to hear that.

but there is one "bug":
If there is already some polling ongoing, it's impossible to disable the polling,

Polling will stop once a long halt is detected, so there is no need for
manual tuning. This is similar to how the dynamic PLE window detects
false positives and adjusts the PLE window accordingly.

Regards,
Wanpeng Li