On Tue, Sep 1, 2015 at 3:30 PM, Wanpeng Li <wanpeng.li@xxxxxxxxxxx> wrote:
> On 9/2/15 5:45 AM, David Matlack wrote:
>>
>> On Thu, Aug 27, 2015 at 2:47 AM, Wanpeng Li <wanpeng.li@xxxxxxxxxxx> wrote:
>>>
>>> v3 -> v4:
>>>  * bring back growing vcpu->halt_poll_ns when an interrupt arrives and
>>>    shrinking it when an idle VCPU is detected
>>>
>>> v2 -> v3:
>>>  * grow/shrink vcpu->halt_poll_ns by *halt_poll_ns_grow or
>>>    /halt_poll_ns_shrink
>>>  * drop the macros and hard-code the numbers in the param definitions
>>>  * update the comments to "5-7 us"
>>>  * remove halt_poll_ns_max and use halt_poll_ns as the max halt_poll_ns
>>>    time; vcpu->halt_poll_ns starts at zero
>>>  * drop the wrappers
>>>  * move the grow/shrink logic before "out:" w/ "if (waited)"
>>
>> I posted a patchset which adds dynamic poll toggling (an on/off switch).
>> I think this gives you a good place to build your dynamic growth patch on
>> top of. The toggling patch has close to zero overhead for idle VMs, and
>> for VMs doing message passing it performs as well as always-poll. It's a
>> patch that's been in my queue for a few weeks, but I just haven't had the
>> time to send it out. We can win even more with your patchset by only
>> polling as much as we need (via dynamic growth/shrink). It also gives us
>> a better place to stand for choosing a default for halt_poll_ns. (We can
>> run experiments and see how high vcpu->halt_poll_ns tends to grow.)
>>
>> The reason I posted a separate patch for toggling is that it adds timers
>> to kvm_vcpu_block and deals with a weird edge case (kvm_vcpu_block can
>> get called multiple times for one halt). To do dynamic poll adjustment
>> correctly, we have to time the length of each halt. Otherwise we hit some
>> bad edge cases:
>>
>>   v3: v3 had lots of idle overhead. That's because vcpu->halt_poll_ns
>>   grew every time we had a long halt. So idle VMs looked like:
>>   0 us -> 500 us -> 1 ms -> 2 ms -> 4 ms -> 0 us. Ideally
>>   vcpu->halt_poll_ns should just stay at 0 when the halts are long.
>>
>>   v4: v4 fixed the idle overhead problem but broke dynamic growth for
>>   message-passing VMs. Every time a VM did a short halt,
>>   vcpu->halt_poll_ns would grow. That means vcpu->halt_poll_ns will
>>   always be maxed out, even when the halt time is much less than the max.
>>
>> I think we can fix both edge cases if we make grow/shrink decisions based
>> on the length of kvm_vcpu_block rather than the arrival of a guest
>> interrupt during polling.
>>
>> Some thoughts for dynamic growth:
>>   * Given the Windows 10 timer tick (1 ms), let's set the maximum poll
>>     time to less than 1 ms. 200 us has been a good value for always-poll.
>>     We can probably go a bit higher once we have your patch. Maybe 500 us?
>>
>>   * The base case of dynamic growth (the first grow() after being at 0)
>>     should be small. 500 us is too big. When I run TCP_RR in my guest I
>>     see poll times of < 10 us. TCP_RR is on the lower end of
>>     message-passing workload latency, so 10 us would be a good base case.
>
> How to get your TCP_RR benchmark?
>
> Regards,
> Wanpeng Li

Install the netperf package, or build it from here:
http://www.netperf.org/netperf/DownloadNetperf.html

In the VM:
  # ./netserver
  # ./netperf -t TCP_RR

Be sure to use an SMP guest (we want TCP_RR to be a cross-core
message-passing workload in order to test halt-polling).
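[Editor's note: a rough sketch of the length-based decision David describes
above. This is not code from either patchset: grow_halt_poll_ns() and
shrink_halt_poll_ns() are hypothetical helpers, halt_poll_ns stands for the
module parameter used as the cap, and the surrounding poll/sleep code of
kvm_vcpu_block() is elided.]

	/*
	 * Sketch only: decide grow/shrink from how long the whole halt
	 * lasted, not from whether an interrupt arrived while polling.
	 */
	ktime_t start = ktime_get();
	u64 block_ns;

	/* ... existing poll loop and wait-queue sleep ... */

	block_ns = ktime_to_ns(ktime_sub(ktime_get(), start));

	if (block_ns <= vcpu->halt_poll_ns) {
		/* polling already covers a halt this short; leave the window alone */
	} else if (vcpu->halt_poll_ns && block_ns > halt_poll_ns) {
		/* halt was longer than we would ever poll: idle vCPU, back off */
		shrink_halt_poll_ns(vcpu);      /* hypothetical helper */
	} else if (vcpu->halt_poll_ns < halt_poll_ns) {
		/* short halt that polling missed: poll a bit longer next time */
		grow_halt_poll_ns(vcpu);        /* hypothetical helper */
	}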
>
>>> v1 -> v2:
>>>  * change kvm_vcpu_block to read halt_poll_ns from the vcpu instead of
>>>    the module parameter
>>>  * use the shrink/grow matrix which is suggested by David
>>>  * set halt_poll_ns_max to 2 ms
>>>
>>> There is a downside to halt_poll_ns: polling still happens for an idle
>>> VCPU, which can waste CPU cycles. This patchset adds the ability to
>>> adjust halt_poll_ns dynamically: it grows halt_poll_ns when an interrupt
>>> arrives and shrinks it when an idle VCPU is detected.
>>>
>>> There are two new kernel parameters for changing halt_poll_ns:
>>> halt_poll_ns_grow and halt_poll_ns_shrink.
>>>
>>> Tested with a high CPU overcommit ratio and pinned vCPUs; halt_poll_ns
>>> for always halt-poll is the default 500000 ns, and the max halt_poll_ns
>>> for dynamic halt-poll is 2 ms. Then watch %C0 in the output of the
>>> powertop tool. The test method is largely taken from David.
>>>
>>> +-----------------+----------------+-------------------+
>>> |                 |                |                   |
>>> |  w/o halt-poll  |  w/ halt-poll  | dynamic halt-poll |
>>> +-----------------+----------------+-------------------+
>>> |                 |                |                   |
>>> |      ~0.9%      |      ~1.8%     |       ~1.2%       |
>>> +-----------------+----------------+-------------------+
>>>
>>> Always halt-poll increases CPU usage for idle vCPUs by ~0.9%, and
>>> dynamic halt-poll drops that to ~0.3%, i.e. it reduces the overhead
>>> introduced by always halt-poll by ~67%.
>>>
>>> Wanpeng Li (3):
>>>   KVM: make halt_poll_ns per-VCPU
>>>   KVM: dynamic halt_poll_ns adjustment
>>>   KVM: trace kvm_halt_poll_ns grow/shrink
>>>
>>>  include/linux/kvm_host.h   |  1 +
>>>  include/trace/events/kvm.h | 30 ++++++++++++++++++++++++++++
>>>  virt/kvm/kvm_main.c        | 50 +++++++++++++++++++++++++++++++++++++++++++---
>>>  3 files changed, 78 insertions(+), 3 deletions(-)
>>>
>>> --
>>> 1.9.1
>>>
>
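[Editor's note: for illustration, a small standalone C sketch of the
multiply/divide adjustment the cover letter describes. The parameter names
mirror the series (halt_poll_ns, halt_poll_ns_grow, halt_poll_ns_shrink),
but the exact arithmetic, the treatment of a shrink divisor of 0, and the
10 us base case are assumptions; this is a userspace simulation, not the
kernel code.]

#include <stdio.h>

static unsigned int halt_poll_ns = 500000;   /* cap on the poll window, ns */
static unsigned int halt_poll_ns_grow = 2;   /* multiplier on short halts */
static unsigned int halt_poll_ns_shrink = 2; /* divisor on long halts */

static unsigned int grow(unsigned int val)
{
	if (val == 0)
		val = 10000;                 /* assumed base case: 10 us */
	else
		val *= halt_poll_ns_grow;
	if (val > halt_poll_ns)
		val = halt_poll_ns;          /* halt_poll_ns acts as the cap */
	return val;
}

static unsigned int shrink(unsigned int val)
{
	if (halt_poll_ns_shrink == 0)
		return 0;                    /* assume a divisor of 0 means "reset to 0" */
	return val / halt_poll_ns_shrink;
}

int main(void)
{
	unsigned int poll = 0;
	int i;

	/* A message-passing vCPU: a run of short halts grows the window... */
	for (i = 0; i < 8; i++) {
		poll = grow(poll);
		printf("short halt %d -> poll window %u ns\n", i, poll);
	}

	/* ...while an idle vCPU's long halts shrink it back toward 0. */
	for (i = 0; i < 8 && poll; i++) {
		poll = shrink(poll);
		printf("long halt %d -> poll window %u ns\n", i, poll);
	}
	return 0;
}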