On Tue, Sep 1, 2015 at 3:58 PM, Wanpeng Li <wanpeng.li@xxxxxxxxxxx> wrote:
> On 9/2/15 6:34 AM, David Matlack wrote:
>>
>> On Tue, Sep 1, 2015 at 3:30 PM, Wanpeng Li <wanpeng.li@xxxxxxxxxxx> wrote:
>>>
>>> On 9/2/15 5:45 AM, David Matlack wrote:
>>>>
>>>> On Thu, Aug 27, 2015 at 2:47 AM, Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
>>>> wrote:
>>>>>
>>>>> v3 -> v4:
>>>>>  * bring back growing vcpu->halt_poll_ns when an interrupt arrives and
>>>>>    shrinking it when an idle VCPU is detected
>>>>>
>>>>> v2 -> v3:
>>>>>  * grow/shrink vcpu->halt_poll_ns by *halt_poll_ns_grow or
>>>>>    /halt_poll_ns_shrink
>>>>>  * drop the macros and hard-code the numbers in the param definitions
>>>>>  * update the comments to "5-7 us"
>>>>>  * remove halt_poll_ns_max and use halt_poll_ns as the max
>>>>>    halt_poll_ns time; vcpu->halt_poll_ns starts at zero
>>>>>  * drop the wrappers
>>>>>  * move the grow/shrink logic before "out:" w/ "if (waited)"
>>>>
>>>> I posted a patchset which adds dynamic poll toggling (on/off switch). I
>>>> think this gives you a good place to build your dynamic growth patch on
>>>> top. The toggling patch has close to zero overhead for idle VMs and
>>>> equivalent performance to always-poll for VMs doing message passing.
>>>> It's a patch that's been in my queue for a few weeks, but I just
>>>> haven't had the time to send it out. We can win even more with your
>>>> patchset by only polling as much as we need (via dynamic
>>>> growth/shrink). It also gives us a better place to stand for choosing a
>>>> default for halt_poll_ns. (We can run experiments and see how high
>>>> vcpu->halt_poll_ns tends to grow.)
>>>>
>>>> The reason I posted a separate patch for toggling is that it adds
>>>> timers to kvm_vcpu_block and deals with a weird edge case
>>>> (kvm_vcpu_block can get called multiple times for one halt). To do
>>>> dynamic poll adjustment
>
> Why can this happen?

Ah, probably because I'm missing 9c8fd1ba220 (KVM: x86: optimize delivery
of TSC deadline timer interrupt). I don't think the edge case exists in
the latest kernel.

>
>>>> correctly, we have to time the length of each halt. Otherwise we hit
>>>> some bad edge cases:
>>>>
>>>>    v3: v3 had lots of idle overhead. It's because vcpu->halt_poll_ns
>>>>    grew every time we had a long halt. So idle VMs looked like:
>>>>    0 us -> 500 us -> 1 ms -> 2 ms -> 4 ms -> 0 us. Ideally
>>>>    vcpu->halt_poll_ns should just stay at 0 when the halts are long.
>>>>
>>>>    v4: v4 fixed the idle overhead problem but broke dynamic growth for
>>>>    message-passing VMs. Every time a VM did a short halt,
>>>>    vcpu->halt_poll_ns would grow. That means vcpu->halt_poll_ns will
>>>>    always be maxed out, even when the halt time is much less than the
>>>>    max.
>>>>
>>>> I think we can fix both edge cases if we make grow/shrink decisions
>>>> based on the length of kvm_vcpu_block rather than the arrival of a
>>>> guest interrupt during polling.
>>>>
>>>> Some thoughts for dynamic growth:
>>>>    * Given the Windows 10 timer tick (1 ms), let's set the maximum
>>>>      poll time to less than 1 ms. 200 us has been a good value for
>>>>      always-poll. We can probably go a bit higher once we have your
>>>>      patch. Maybe 500 us?
>
> Did you test your patch against a Windows guest?

I have not. I tested against a 250 Hz Linux guest to check how it performs
against a ticking guest. Presumably, Windows should behave the same, just
at a higher tick rate. Do you have a test for Windows?
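For what it's worth, here is a rough, untested sketch of the
block-length-based grow/shrink I described above. It is only meant to
illustrate the idea: grow_halt_poll_ns()/shrink_halt_poll_ns() are made-up
helper names (not code from either patchset), halt_poll_ns is the existing
module param reused as the max, and halt_poll_ns_grow/halt_poll_ns_shrink
are the params from your series.

static void grow_halt_poll_ns(struct kvm_vcpu *vcpu)
{
	unsigned int val = vcpu->halt_poll_ns;

	/* base case: start small (~10 us), then multiply */
	if (val == 0)
		val = 10000;
	else
		val *= halt_poll_ns_grow;

	/* never poll longer than the halt_poll_ns module param */
	vcpu->halt_poll_ns = min(val, halt_poll_ns);
}

static void shrink_halt_poll_ns(struct kvm_vcpu *vcpu)
{
	/* a shrink factor of 0 means "reset to 0 immediately" */
	if (halt_poll_ns_shrink == 0)
		vcpu->halt_poll_ns = 0;
	else
		vcpu->halt_poll_ns /= halt_poll_ns_shrink;
}

/* at the end of kvm_vcpu_block(), with 'start' sampled before polling
 * and s64 block_ns holding the total length of this halt: */
	block_ns = ktime_to_ns(ktime_get()) - ktime_to_ns(start);

	if (block_ns > halt_poll_ns)
		/* long halt: polling can't help, back off */
		shrink_halt_poll_ns(vcpu);
	else if (vcpu->halt_poll_ns < block_ns)
		/* short halt that the current poll window missed: grow */
		grow_halt_poll_ns(vcpu);

The point is that both decisions key off how long the halt actually was,
so an idle VM never grows its poll window and a message-passing VM only
grows it as far as its actual wakeup latency.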
>
>>>>
>>>>    * The base case of dynamic growth (the first grow() after being at
>>>>      0) should be small. 500 us is too big. When I run TCP_RR in my
>>>>      guest I see poll times of < 10 us. TCP_RR is on the lower end of
>>>>      message-passing workload latency, so 10 us would be a good base
>>>>      case.
>>>
>>> How do I get your TCP_RR benchmark?
>>>
>>> Regards,
>>> Wanpeng Li
>>
>> Install the netperf package, or build from here:
>> http://www.netperf.org/netperf/DownloadNetperf.html
>>
>> In the vm:
>>
>> # ./netserver
>> # ./netperf -t TCP_RR
>>
>> Be sure to use an SMP guest (we want TCP_RR to be a cross-core
>> message-passing workload in order to test halt-polling).
>
> Ah, ok, I will use the same benchmark as yours.
>
> Regards,
> Wanpeng Li
>

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html