On 9/10/15 3:13 PM, Christian Borntraeger wrote:
On 10.09.2015 at 03:55, Wanpeng Li wrote:
On 9/9/15 9:39 PM, Christian Borntraeger wrote:
On 03.09.2015 at 16:07, Wanpeng Li wrote:
v6 -> v7:
* explicit signal (set a bool)
* fix the tracepoint
v5 -> v6:
* fix wait_ns and poll_ns
v4 -> v5:
* set base case 10us and max poll time 500us
* handle short/long halt, idea from David, many thanks David
v3 -> v4:
* bring back growing vcpu->halt_poll_ns when an interrupt arrives and
shrinking it when an idle vCPU is detected
v2 -> v3:
* grow/shrink vcpu->halt_poll_ns by *halt_poll_ns_grow or /halt_poll_ns_shrink
* drop the macros and hard-code the numbers in the param definitions
* update the comments "5-7 us"
* remove halt_poll_ns_max and use halt_poll_ns as the max halt_poll_ns time;
vcpu->halt_poll_ns starts at zero
* drop the wrappers
* move the grow/shrink logic before "out:" w/ "if (waited)"
v1 -> v2:
* change kvm_vcpu_block to read halt_poll_ns from the vcpu instead of
the module parameter
* use the shrink/grow matrix suggested by David
* set halt_poll_ns_max to 2ms
The downside of always-poll is that polling still happens for idle
vCPUs, which wastes CPU time. This patchset adds the ability to adjust
halt_poll_ns dynamically: it grows halt_poll_ns when a short halt is
detected and shrinks it when a long halt is detected.
There are two new kernel parameters for changing the halt_poll_ns:
halt_poll_ns_grow and halt_poll_ns_shrink.
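
The grow/shrink behaviour described above can be modelled in user space
roughly as follows. This is only a sketch, not the code from the patches:
the struct and function names are hypothetical, the 10us base comes from
the v4 -> v5 notes, treating a shrink value of 0 as "reset to zero" is an
assumption, and clamping the window to halt_poll_ns is a simplification
made here.

```c
#include <stdint.h>

/* Hypothetical user-space model; names follow the cover letter, not the patch. */
struct poll_params {
    uint64_t halt_poll_ns;        /* upper bound for the per-vCPU window, in ns */
    uint64_t halt_poll_ns_grow;   /* multiplier applied after a short halt */
    uint64_t halt_poll_ns_shrink; /* divisor applied after a long halt;
                                     0 is assumed to mean "reset to zero" */
};

struct vcpu_model {
    uint64_t halt_poll_ns;        /* current per-vCPU polling window, in ns */
};

#define BASE_POLL_NS 10000ULL     /* 10us base case (v4 -> v5 notes) */

static void grow_poll(struct vcpu_model *v, const struct poll_params *p)
{
    uint64_t val = v->halt_poll_ns;

    if (val == 0 && p->halt_poll_ns_grow)
        val = BASE_POLL_NS;            /* start from the base case */
    else
        val *= p->halt_poll_ns_grow;   /* multiplicative growth */

    if (val > p->halt_poll_ns)         /* clamp: simplification of this sketch */
        val = p->halt_poll_ns;
    v->halt_poll_ns = val;
}

static void shrink_poll(struct vcpu_model *v, const struct poll_params *p)
{
    if (p->halt_poll_ns_shrink == 0)
        v->halt_poll_ns = 0;
    else
        v->halt_poll_ns /= p->halt_poll_ns_shrink;
}

/* block_ns is how long the vCPU stayed halted before being woken up. */
static void adjust_poll(struct vcpu_model *v, const struct poll_params *p,
                        uint64_t block_ns)
{
    if (block_ns <= v->halt_poll_ns)
        return;                        /* polling already covered this halt */
    if (block_ns > p->halt_poll_ns)
        shrink_poll(v, p);             /* long halt: polling would be wasted */
    else if (v->halt_poll_ns < p->halt_poll_ns)
        grow_poll(v, p);               /* short halt, window was too small */
}
```

With an upper bound of 500us and a grow factor of 2 (values picked for
illustration), a vCPU that starts at zero moves to 10us after its first
short halt, then to 20us, and so on up to the bound; a single halt
longer than the bound shrinks the window again.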
                          no-poll      always-poll    dynamic-poll
-----------------------------------------------------------------------
Idle (nohz) vCPU %c0      0.15%        0.3%           0.2%
Idle (250HZ) vCPU %c0     1.1%         4.6%~14%       1.2%
TCP_RR latency            34us         27us           26.7us
"Idle (X) vCPU %c0" is the percent of time the physical cpu spent in
c0 over 60 seconds (each vCPU is pinned to a pCPU). (nohz) means the
guest was tickless. (250HZ) means the guest was ticking at 250HZ.
The big win is with ticking operating systems. Running the Linux guest
with nohz=off (and HZ=250), we save 3.4%~12.8% of CPU time compared with
always-poll and get close to no-polling overhead levels by using the
dynamic-poll. The savings should be even higher for higher frequency ticks.
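
The 3.4%~12.8% figure follows from the 250HZ row of the table
(always-poll minus dynamic-poll). A small sketch of that arithmetic,
with a hypothetical helper name and percentages stored in tenths of a
percentage point so the subtraction stays exact:

```c
/* Hypothetical helper: difference between two %c0 readings, both given in
 * tenths of a percentage point (46 means 4.6%). */
static int saved_tenths(int always_poll_tenths, int dynamic_poll_tenths)
{
    return always_poll_tenths - dynamic_poll_tenths;
}

/* 250HZ row: always-poll 4.6%~14%, dynamic-poll 1.2%
 *   saved_tenths(46, 12)  is 34,  i.e. 3.4 points
 *   saved_tenths(140, 12) is 128, i.e. 12.8 points
 */
```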
Wanpeng Li (3):
  KVM: make halt_poll_ns per-vCPU
  KVM: dynamic halt-polling
  KVM: trace kvm_halt_poll_ns grow/shrink

 include/linux/kvm_host.h   |  1 +
 include/trace/events/kvm.h | 30 +++++++++++++++++++
 virt/kvm/kvm_main.c        | 72 ++++++++++++++++++++++++++++++++++++++++++----
 3 files changed, 97 insertions(+), 6 deletions(-)
I get some nice improvements for uperf between 2 guests,
Good to hear that.
but there is one "bug":
If there is already some polling ongoing, it's impossible to disable the polling,
The polling will stop once a long halt is detected, so there is no need for manual tuning. This is similar to how the dynamic PLE window detects false positives and adjusts the PLE window accordingly.
Yes, but as soon as somebody sets halt_poll_ns to 0, polling will never stop,
as grow and shrink are only handled if halt_poll_ns is !=0.
Good catch! I will send a patch to fix it.
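
The problem reported above can be illustrated with a small user-space
model. Everything here is hypothetical: module_poll_ns stands in for the
halt_poll_ns module parameter, vcpu_poll_ns for vcpu->halt_poll_ns, and
adjust_with_reset shows just one possible remedy, not necessarily the
fix that was sent afterwards.

```c
#include <stdint.h>

/* Hypothetical model of the two values involved. */
struct poll_state {
    uint64_t module_poll_ns;  /* stands in for the halt_poll_ns parameter */
    uint64_t vcpu_poll_ns;    /* stands in for vcpu->halt_poll_ns */
};

/* Reported behaviour: adjustments run only while the parameter is non-zero. */
static void adjust_guarded(struct poll_state *s, uint64_t block_ns)
{
    if (s->module_poll_ns) {
        if (block_ns > s->module_poll_ns)
            s->vcpu_poll_ns = 0;  /* long halt: shrink (to zero, for brevity) */
    }
    /* parameter == 0: vcpu_poll_ns is left untouched, so polling goes on */
}

/* One possible remedy (an assumption): drop the per-vCPU window as soon as
 * the parameter is set to zero. */
static void adjust_with_reset(struct poll_state *s, uint64_t block_ns)
{
    if (s->module_poll_ns)
        adjust_guarded(s, block_ns);
    else
        s->vcpu_poll_ns = 0;
}

/* The vCPU polls whenever its own window is non-zero. */
static int will_poll(const struct poll_state *s)
{
    return s->vcpu_poll_ns != 0;
}
```

In this model, a vCPU whose window has already grown to 40us keeps
polling after the parameter is set to 0 when adjust_guarded is used,
while adjust_with_reset stops it on the next halt.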
Regards,
Wanpeng Li