Re: KVM: arm64: pmu: Reset sample period on overflow handling

On Wed, 2021-06-16 at 11:31 +0100, Marc Zyngier wrote:
> 
> Hi Aman,
> 
> On Wed, 16 Jun 2021 10:17:28 +0100,
> Aman Priyadarshi <apeureka@xxxxxxxxx> wrote:
> > 
> > Hi Marc,
> > 
> > On Tue, 2021-06-15 at 18:05 +0100, Marc Zyngier wrote:
> > > 
> > > Can you reproduce the issue with vanilla guest kernels? It'd be
> > > interesting to understand what makes it work on the guest side. Can
> > > you please bisect it?
> > > 
> > 
> > yes, I was able to narrow it down to the commit 0cbb058be904 ("arm64:
> > perf: Disable PMU while processing counter overflows"), which fixes
> > the problem on the guest side.
> 
> Which is 3cce50dfec4a5b0414c974190940f47dd32c6dee in mainline. This
> doesn't seem to have ever been backported before 4.18. So I don't know
> why your 4.15 kernel was behaving correctly, but it could be that the
> distro had randomly picked up the correct patch!

Yes. https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1836117

> 
> You may want to backport it to 4.14.y and let Greg know about that.
> 

Ack.

> > 
> > I _think_, I understand the problem now. Please correct me if I am
> > wrong.
> > 
> > commit 30d97754b2d1 ("KVM: arm/arm64: Re-create event when setting
> > counter value") adds a new code path for perf events when the counter
> > value is set, therefore KVM would generate more events than before.
> > Without this change, we have a lot fewer events, thus reducing the
> > chances of the guest messing things up.
> 
> Without this fix, we don't communicate the new guest sample period to
> the host's perf counter, and depending on what the guest wrote (and
> the previous value), it can go one way or the other.
> 
> > On the other side, commit 8c3252c06516 ("KVM: arm64: pmu: Reset sample
> > period on overflow handling") resets the sample period to the max
> > value, thus reducing the number of overflow events delivered to the
> > guest to an optimal value (note, the number of interrupts actually
> > handled by the guest would remain the same in either case). Fewer
> > overflow interrupts to the guest reduce the chance of the guest making
> > up for any leftover overflow event that it did not see earlier.
> 
> This fix is the natural complement of the previous one. We need to
> emulate the actual overflow, and prevent perf from doing its thing on
> the host (reloading from the previously provided value). So we reset
> the period to the value that perf did observe on taking the physical
> interrupt.
> 
> Together, these two patches provide a more correct PMU emulation.
> 
> The guest patch prevents additional overflows from being observed
> while the guest is reprogramming its counters and observing a moving
> target. Note that the host itself needs that initial fix to correctly
> emulate the PMU! ;-)
> 
> It is pretty hard to picture exactly *what* happens when you are
> missing any of these 3 patches. Both the kernel and KVM were buggy at
> some point, and you need all three patches to ensure something
> correct.
> 

Thanks for the explanation!
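
For the archive, here is a rough sketch of the arithmetic behind "reset the
period to the architectural limit", as far as it can be read from the
explanation above. It is a standalone model, not the KVM code: the helper
name period_until_overflow() and its is_64bit flag are made up for
illustration, and it only assumes that the emulated counter is either 32 or
64 bits wide and wraps when it passes its maximum value.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): number of events left
 * before a counter currently holding 'count' wraps around.
 *
 * The distance to the wrap point is the two's complement negation of
 * the current value, truncated to the counter width.
 */
static uint64_t period_until_overflow(uint64_t count, int is_64bit)
{
	uint64_t period = (uint64_t)0 - count;	/* modular negation */

	if (!is_64bit)
		period &= 0xffffffffULL;	/* keep the low 32 bits only */

	/*
	 * Note: a counter whose relevant bits are all zero yields 0 here;
	 * this sketch does not model how that case should be handled.
	 */
	return period;
}

/* Small self-check showing the intended behaviour of the model. */
static void period_model_selfcheck(void)
{
	/* 32-bit counter, 0x10 events away from wrapping */
	assert(period_until_overflow(0xfffffff0ULL, 0) == 0x10);

	/* 64-bit counter, 0x10 events away from wrapping */
	assert(period_until_overflow(0xfffffffffffffff0ULL, 1) == 0x10);
}
```

The point of the model is only that the period handed back to perf is
derived from where the counter currently is, rather than reused from
whatever value was programmed before the overflow.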

Regards,
Aman Priyadarshi






_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


