On Sun, Apr 28, 2019 at 08:54:30AM +0800, Wanpeng Li wrote:
> Hi Sean,
> On Thu, 18 Apr 2019 at 01:18, Sean Christopherson
> <sean.j.christopherson@xxxxxxxxx> wrote:
> >
> > KVM's recently introduced adaptive tuning of lapic_timer_advance_ns has
> > several critical flaws:
>
> [.../...]
>
> >
> >   - TSC scaling is done on a per-vCPU basis, while the advancement value
> >     is global.  This issue is also present without adaptive tuning, but
> >     is now more pronounced.
>
> Did you test this against overcommit scenario? Your per-vCPU variable
> can be a large number(yeah, below your 5000ns) when neighbour VMs on
> the same host consume cpu heavily, however, kvm will wast a lot of
> time to wait when the neighbour VMs are idle. My original patch
> evaluate the conservative hypervisor overhead when the first VM is
> deployed on the host. It doesn't matter whether or not the VMs on this
> host alter their workload behaviors later. Unless you tune the
> per-vCPU variable always, however, I think it will introduce more
> overhead. So Liran's patch "Consider LAPIC TSC-Deadline Timer expired
> if deadline too short" also can't depend on this.

I didn't test it in overcommit scenarios, and I wasn't aware of how the
automatic adjustments were being used in real deployments.

The best option I can think of is to expose a vCPU's advance time to
userspace (not sure what mechanism would be best). This would allow
userspace to run a single-vCPU VM with auto-tuning enabled, snapshot the
final adjusted advancement, and then update KVM's parameter to set an
explicit advancement and effectively disable auto-tuning.
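
For illustration only, a rough sketch of what exposing the per-vCPU value
could look like as a read-only per-vCPU debugfs entry. This assumes the
adjusted value lives in vcpu->arch.apic->lapic_timer.timer_advance_ns and
that the helper below would be wired into the existing per-vCPU debugfs
setup path; the helper name is made up and this isn't a concrete patch
proposal.

  /* Hypothetical sketch: read-only per-vCPU lapic_timer_advance_ns. */
  #include <linux/debugfs.h>
  #include <linux/kvm_host.h>
  #include "lapic.h"

  static int vcpu_get_timer_advance_ns(void *data, u64 *val)
  {
          struct kvm_vcpu *vcpu = data;

          /* Report the current (possibly auto-tuned) advancement. */
          *val = vcpu->arch.apic->lapic_timer.timer_advance_ns;
          return 0;
  }

  DEFINE_SIMPLE_ATTRIBUTE(vcpu_timer_advance_ns_fops,
                          vcpu_get_timer_advance_ns, NULL, "%llu\n");

  /* Hypothetical helper, called from per-vCPU debugfs creation. */
  static void vcpu_create_timer_advance_debugfs(struct kvm_vcpu *vcpu)
  {
          /* Only meaningful with an in-kernel local APIC. */
          if (lapic_in_kernel(vcpu))
                  debugfs_create_file("lapic_timer_advance_ns", 0444,
                                      vcpu->debugfs_dentry, vcpu,
                                      &vcpu_timer_advance_ns_fops);
  }

Userspace could then do a calibration run on an otherwise idle host, read
the per-vCPU debugfs file, and feed the observed value back through the
lapic_timer_advance_ns module parameter to pin an explicit advancement.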