Re: [PATCH] cpuidle-haltpoll: Enable kvm guest polling when dedicated physical CPUs are available

Marcelo Tosatti <mtosatti@xxxxxxxxxx> · Mon, 26 Aug 2019 17:40:50 -0300

On Tue, Aug 13, 2019 at 08:55:29AM +0800, Wanpeng Li wrote:
> On Sun, 4 Aug 2019 at 04:21, Marcelo Tosatti <mtosatti@xxxxxxxxxx> wrote:
> >
> > On Thu, Aug 01, 2019 at 06:54:49PM +0200, Paolo Bonzini wrote:
> > > On 01/08/19 18:51, Rafael J. Wysocki wrote:
> > > > On 8/1/2019 9:06 AM, Wanpeng Li wrote:
> > > >> From: Wanpeng Li <wanpengli@xxxxxxxxxxx>
> > > >>
> > > > >> The downside of guest side polling is that polling is performed even
> > > > >> with other runnable tasks in the host. However, even if polling in KVM
> > > > >> can be aware of whether other runnable tasks exist on the same pCPU, it
> > > > >> can still incur extra overhead in an over-subscribed scenario. Now we can
> > > > >> just enable guest polling when dedicated pCPUs are available.
> > > >>
> > > >> Cc: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> > > >> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> > > >> Cc: Radim Krčmář <rkrcmar@xxxxxxxxxx>
> > > >> Cc: Marcelo Tosatti <mtosatti@xxxxxxxxxx>
> > > >> Signed-off-by: Wanpeng Li <wanpengli@xxxxxxxxxxx>
> > > >
> > > > Paolo, Marcelo, any comments?
> > >
> > > Yes, it's a good idea.
> > >
> > > Acked-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> > >
> > > Paolo
> >
> 
> Hi Marcelo,
> 
> Sorry for the late response.
> 
> > I think KVM_HINTS_REALTIME is being abused somewhat.
> > It has no clear meaning and is used in different locations
> > for different purposes.
> 
> ================== ============ =================================
> KVM_HINTS_REALTIME 0            guest checks this feature bit to
>                                 determine that vCPUs are never
>                                 preempted for an unlimited time

Does unlimited time mean infinite time, or does it mean
10s? 1s?

The previous definition was much better IMO: HINTS_DEDICATED.

>                                 allowing optimizations
> ================== ============ =================================
> 
> Now it disables pv queued spinlock, 

OK. 

> pv tlb shootdown, 

OK.

> pv sched yield

"The idea is from Xen: when sending a call-function IPI-many to vCPUs,
yield if any of the IPI target vCPUs was preempted. A 17% performance
increase in the ebizzy benchmark can be observed in an over-subscribed
environment. (w/ kvm-pv-tlb disabled, testing TLB flush call-function
IPI-many, since call-function is not easy to trigger from a userspace
workload)."

This can probably hurt if vcpus are rarely preempted. 

> which are not expected to be present in the scenario where vCPUs are
> never preempted for an unlimited time.
> 
> >
> > For example, I think that using pv queued spinlocks and
> > haltpoll is a desired scenario, which the patch below disallows.
> 
> So even if a dedicated pCPU is available, pv queued spinlocks should
> still be chosen if something like vhost-kthreads is used instead of
> DPDK/vhost-user.

Can't you enable the individual features you need for optimizing 
the overcommitted case? This is how things have been done historically:
If a new feature is available, you enable it to get the desired
performance. x2apic, invariant-tsc, cpuidle haltpoll...

So in your case: enable pv schedyield, enable pv tlb shootdown.
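
To make the contrast concrete, here is a minimal sketch of the two policies being discussed: one where a single hint vetoes every PV path, and one where each feature bit is decided on its own. It is not the kernel's code. `struct guest_cpuid` and the `pv_enabled_*` helpers are hypothetical names, the bit numbers are my recollection of the uapi kvm_para.h of this era and should be treated as assumptions, and the real guest checks may include further conditions that are omitted here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit numbers believed to match uapi kvm_para.h at the time (assumption). */
#define KVM_FEATURE_PV_UNHALT       7
#define KVM_FEATURE_PV_TLB_FLUSH    9
#define KVM_FEATURE_PV_SCHED_YIELD 13
#define KVM_HINTS_REALTIME          0

/* Hypothetical stand-in for what the guest reads from the KVM CPUID leaf. */
struct guest_cpuid {
	uint32_t features;
	uint32_t hints;
};

static bool has_feature(const struct guest_cpuid *c, unsigned int bit)
{
	return (c->features >> bit) & 1u;
}

static bool has_hint(const struct guest_cpuid *c, unsigned int bit)
{
	return (c->hints >> bit) & 1u;
}

/* Policy A: the hint disables the PV path even if the feature is exposed. */
static bool pv_enabled_hint_gated(const struct guest_cpuid *c,
				  unsigned int feature_bit)
{
	return has_feature(c, feature_bit) && !has_hint(c, KVM_HINTS_REALTIME);
}

/* Policy B: each feature stands alone; the host exposes only what it wants. */
static bool pv_enabled_per_feature(const struct guest_cpuid *c,
				   unsigned int feature_bit)
{
	return has_feature(c, feature_bit);
}
```

Under policy A, a guest that sees both PV_UNHALT and the realtime hint never uses pv spinlocks, so the pv-spinlocks-plus-haltpoll combination is unreachable. Under policy B the host picks that combination simply by choosing which feature bits to expose.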

> KVM adaptive halt-polling will compete with vhost-kthreads; however,
> polling in the guest is unaware of other runnable tasks in the host,
> which will defeat vhost-kthreads.

It depends on how much work vhost-kthreads needs to do, how successful 
halt-poll in the guest is, and what improvement halt-polling brings.
The amount of polling will be reduced to zero if polling 
is not successful.
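
The self-limiting behaviour mentioned above can be illustrated with a simplified model of an adaptive poll window. This is a sketch under stated assumptions, not the haltpoll governor's implementation: `struct poll_state`, `poll_update()` and the `POLL_*` constants are hypothetical, and the grow/shrink factors are arbitrary values chosen for the example.

```c
#include <stdint.h>

/* Hypothetical tunables, chosen only for illustration. */
#define POLL_MAX_NS   200000u	/* upper bound on the poll window */
#define POLL_START_NS  50000u	/* first non-zero window when growing */
#define POLL_GROW          2u	/* multiply by this on a near miss */
#define POLL_SHRINK        2u	/* divide by this when polling is futile */

struct poll_state {
	uint64_t window_ns;	/* how long to poll before halting */
};

/*
 * Update the window after an idle period of block_ns.
 * - Wakeup arrived after the window but within the maximum: polling a bit
 *   longer would have caught it, so grow the window.
 * - Wakeup arrived later than the maximum: polling cannot help, so shrink.
 *   Integer division drives the window to zero after repeated misses.
 */
static void poll_update(struct poll_state *s, uint64_t block_ns)
{
	if (block_ns > s->window_ns && block_ns <= POLL_MAX_NS) {
		uint64_t w = s->window_ns * POLL_GROW;

		if (w < POLL_START_NS)
			w = POLL_START_NS;
		if (w > POLL_MAX_NS)
			w = POLL_MAX_NS;
		s->window_ns = w;
	} else if (block_ns > POLL_MAX_NS) {
		s->window_ns /= POLL_SHRINK;
	}
}
```

In this model a guest whose wakeups consistently arrive later than the maximum window stops polling altogether after a handful of idle periods, which limits how long it can compete with host-side work such as vhost-kthreads.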