On Thu, Aug 27, 2015 at 2:59 AM, Wanpeng Li <wanpeng.li@xxxxxxxxxxx> wrote: > Hi David, > On 8/26/15 1:19 AM, David Matlack wrote: >> >> Thanks for writing v2, Wanpeng. >> >> On Mon, Aug 24, 2015 at 11:35 PM, Wanpeng Li <wanpeng.li@xxxxxxxxxxx> >> wrote: >>> >>> There is a downside of halt_poll_ns since poll is still happen for idle >>> VCPU which can waste cpu usage. This patch adds the ability to adjust >>> halt_poll_ns dynamically. >> >> What testing have you done with these patches? Do you know if this removes >> the overhead of polling in idle VCPUs? Do we lose any of the performance >> from always polling? >> >>> There are two new kernel parameters for changing the halt_poll_ns: >>> halt_poll_ns_grow and halt_poll_ns_shrink. A third new parameter, >>> halt_poll_ns_max, controls the maximal halt_poll_ns; it is internally >>> rounded down to a closest multiple of halt_poll_ns_grow. The shrink/grow >>> matrix is suggested by David: >>> >>> if (poll successfully for interrupt): stay the same >>> else if (length of kvm_vcpu_block is longer than halt_poll_ns_max): >>> shrink >>> else if (length of kvm_vcpu_block is less than halt_poll_ns_max): grow >> >> The way you implemented this wasn't what I expected. I thought you would >> time >> the whole function (kvm_vcpu_block). But I like your approach better. It's >> simpler and [by inspection] does what we want. > > > I see there is more idle vCPUs overhead w/ this method even more than always > halt-poll, so I bring back grow vcpu->halt_poll_ns when interrupt arrives > and shrinks when idle VCPU is detected. The perfomance looks good in v4. Why did this patch have a worse idle overhead than always poll? > > Regards, > Wanpeng Li > > >> >>> halt_poll_ns_shrink/ | >>> halt_poll_ns_grow | grow halt_poll_ns | shrink halt_poll_ns >>> ---------------------+----------------------+------------------- >>> < 1 | = halt_poll_ns | = 0 >>> < halt_poll_ns | *= halt_poll_ns_grow | /= halt_poll_ns_shrink >>> otherwise | += halt_poll_ns_grow | -= halt_poll_ns_shrink >> >> I was curious why you went with this approach rather than just the >> middle row, or just the last row. Do you think we'll want the extra >> flexibility? >> >>> Signed-off-by: Wanpeng Li <wanpeng.li@xxxxxxxxxxx> >>> --- >>> virt/kvm/kvm_main.c | 65 >>> ++++++++++++++++++++++++++++++++++++++++++++++++++++- >>> 1 file changed, 64 insertions(+), 1 deletion(-) >>> >>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c >>> index 93db833..2a4962b 100644 >>> --- a/virt/kvm/kvm_main.c >>> +++ b/virt/kvm/kvm_main.c >>> @@ -66,9 +66,26 @@ >>> MODULE_AUTHOR("Qumranet"); >>> MODULE_LICENSE("GPL"); >>> >>> -static unsigned int halt_poll_ns; >>> +#define KVM_HALT_POLL_NS 500000 >>> +#define KVM_HALT_POLL_NS_GROW 2 >>> +#define KVM_HALT_POLL_NS_SHRINK 0 >>> +#define KVM_HALT_POLL_NS_MAX 2000000 >> >> The macros are not necessary. Also, hard coding the numbers in the param >> definitions will make reading the comments above them easier. >> >>> + >>> +static unsigned int halt_poll_ns = KVM_HALT_POLL_NS; >>> module_param(halt_poll_ns, uint, S_IRUGO | S_IWUSR); >>> >>> +/* Default doubles per-vcpu halt_poll_ns. */ >>> +static unsigned int halt_poll_ns_grow = KVM_HALT_POLL_NS_GROW; >>> +module_param(halt_poll_ns_grow, int, S_IRUGO); >>> + >>> +/* Default resets per-vcpu halt_poll_ns . */ >>> +static unsigned int halt_poll_ns_shrink = KVM_HALT_POLL_NS_SHRINK; >>> +module_param(halt_poll_ns_shrink, int, S_IRUGO); >>> + >>> +/* halt polling only reduces halt latency by 10-15 us, 2ms is enough */ >> >> Ah, I misspoke before. I was thinking about round-trip latency. The >> latency >> of a single halt is reduced by about 5-7 us. >> >>> +static unsigned int halt_poll_ns_max = KVM_HALT_POLL_NS_MAX; >>> +module_param(halt_poll_ns_max, int, S_IRUGO); >> >> We can remove halt_poll_ns_max. vcpu->halt_poll_ns can always start at >> zero >> and grow from there. Then we just need one module param to keep >> vcpu->halt_poll_ns from growing too large. >> >> [ It would make more sense to remove halt_poll_ns and keep >> halt_poll_ns_max, >> but since halt_poll_ns already exists in upstream kernels, we probably >> can't >> remove it. ] >> >>> + >>> /* >>> * Ordering of locks: >>> * >>> @@ -1907,6 +1924,48 @@ void kvm_vcpu_mark_page_dirty(struct kvm_vcpu >>> *vcpu, gfn_t gfn) >>> } >>> EXPORT_SYMBOL_GPL(kvm_vcpu_mark_page_dirty); >>> >>> +static unsigned int __grow_halt_poll_ns(unsigned int val) >>> +{ >>> + if (halt_poll_ns_grow < 1) >>> + return halt_poll_ns; >>> + >>> + val = min(val, halt_poll_ns_max); >>> + >>> + if (val == 0) >>> + return halt_poll_ns; >>> + >>> + if (halt_poll_ns_grow < halt_poll_ns) >>> + val *= halt_poll_ns_grow; >>> + else >>> + val += halt_poll_ns_grow; >>> + >>> + return val; >>> +} >>> + >>> +static unsigned int __shrink_halt_poll_ns(int val, int modifier, int >>> minimum) >> >> minimum never gets used. >> >>> +{ >>> + if (modifier < 1) >>> + return 0; >>> + >>> + if (modifier < halt_poll_ns) >>> + val /= modifier; >>> + else >>> + val -= modifier; >>> + >>> + return val; >>> +} >>> + >>> +static void grow_halt_poll_ns(struct kvm_vcpu *vcpu) >> >> These wrappers aren't necessary. >> >>> +{ >>> + vcpu->halt_poll_ns = __grow_halt_poll_ns(vcpu->halt_poll_ns); >>> +} >>> + >>> +static void shrink_halt_poll_ns(struct kvm_vcpu *vcpu) >>> +{ >>> + vcpu->halt_poll_ns = __shrink_halt_poll_ns(vcpu->halt_poll_ns, >>> + halt_poll_ns_shrink, halt_poll_ns); >>> +} >>> + >>> static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu) >>> { >>> if (kvm_arch_vcpu_runnable(vcpu)) { >>> @@ -1954,6 +2013,10 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu) >>> break; >>> >>> waited = true; >>> + if (vcpu->halt_poll_ns > halt_poll_ns_max) >>> + shrink_halt_poll_ns(vcpu); >>> + else >>> + grow_halt_poll_ns(vcpu); >> >> Shouldn't this go after the loop, and before "out:", in case we schedule >> more than once? You can gate it on "if (waited)" so it only runs if we >> actually scheduled. >> >>> schedule(); >>> } >>> >>> -- >>> 1.9.1 >>> > -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html