Re: [PATCH] KVM: x86/mmu: Update number of zapped pages even if page list is stable

On Mon, Nov 15, 2021, David Matlack wrote:
> On Mon, Nov 15, 2021 at 11:23 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> >
> > On Mon, Nov 15, 2021, David Matlack wrote:
> > > On Thu, Nov 11, 2021 at 2:14 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> > > >
> > > > When zapping obsolete pages, update the running count of zapped pages
> > > > regardless of whether or not the list has become unstable due to zapping
> > > > a shadow page with its own child shadow pages.  If the VM is backed by
> > > > mostly 4kb pages, KVM can zap an absurd number of SPTEs without bumping
> > > > the batch count and thus without yielding.  In the worst case scenario,
> > > > this can cause an RCU stall.
> > > >
> > > >   rcu: INFO: rcu_sched self-detected stall on CPU
> > > >   rcu:     52-....: (20999 ticks this GP) idle=7be/1/0x4000000000000000
> > > >                                           softirq=15759/15759 fqs=5058
> > > >    (t=21016 jiffies g=66453 q=238577)
> > > >   NMI backtrace for cpu 52
> > > >   Call Trace:
> > > >    ...
> > > >    mark_page_accessed+0x266/0x2f0
> > > >    kvm_set_pfn_accessed+0x31/0x40
> > > >    handle_removed_tdp_mmu_page+0x259/0x2e0
> > > >    __handle_changed_spte+0x223/0x2c0
> > > >    handle_removed_tdp_mmu_page+0x1c1/0x2e0
> > > >    __handle_changed_spte+0x223/0x2c0
> > > >    handle_removed_tdp_mmu_page+0x1c1/0x2e0
> > > >    __handle_changed_spte+0x223/0x2c0
> > > >    zap_gfn_range+0x141/0x3b0
> > > >    kvm_tdp_mmu_zap_invalidated_roots+0xc8/0x130
> > >
> > > This is a useful patch but I don't see the connection with this stall.
> > > The stall is detected in kvm_tdp_mmu_zap_invalidated_roots, which runs
> > > after kvm_zap_obsolete_pages. How would rescheduling during
> > > kvm_zap_obsolete_pages help?
> >
> > Ah shoot, I copy+pasted the wrong splat.  The correct, relevant backtrace is:
> 
> Ok that makes more sense :). Also that was a soft lockup rather than
> an RCU stall.

*sigh*  I'm not sure which blatant "this is the wrong splat" goof is worse, the
explicit tdp_mmu in the backtrace, or the fact that the legacy MMU doesn't rely
on RCU...

I'll get v2 posted.


