On Mon, Nov 29, 2021 at 11:52:33PM +0000, Sean Christopherson wrote: > When zapping obsolete pages, update the running count of zapped pages > regardless of whether or not the list has become unstable due to zapping > a shadow page with its own child shadow pages. If the VM is backed by > mostly 4kb pages, KVM can zap an absurd number of SPTEs without bumping > the batch count and thus without yielding. In the worst case scenario, > this can cause a soft lokcup. > > watchdog: BUG: soft lockup - CPU#12 stuck for 22s! [dirty_log_perf_:13020] > RIP: 0010:workingset_activation+0x19/0x130 > mark_page_accessed+0x266/0x2e0 > kvm_set_pfn_accessed+0x31/0x40 > mmu_spte_clear_track_bits+0x136/0x1c0 > drop_spte+0x1a/0xc0 > mmu_page_zap_pte+0xef/0x120 > __kvm_mmu_prepare_zap_page+0x205/0x5e0 > kvm_mmu_zap_all_fast+0xd7/0x190 > kvm_mmu_invalidate_zap_pages_in_memslot+0xe/0x10 > kvm_page_track_flush_slot+0x5c/0x80 > kvm_arch_flush_shadow_memslot+0xe/0x10 > kvm_set_memslot+0x1a8/0x5d0 > __kvm_set_memory_region+0x337/0x590 > kvm_vm_ioctl+0xb08/0x1040 > > Fixes: fbb158cb88b6 ("KVM: x86/mmu: Revert "Revert "KVM: MMU: zap pages in batch""") > Reported-by: David Matlack <dmatlack@xxxxxxxxxx> > Reviewed-by: Ben Gardon <bgardon@xxxxxxxxxx> > Cc: stable@xxxxxxxxxxxxxxx > Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx> Reviewed-by: David Matlack <dmatlack@xxxxxxxxxx> > --- > > v2: > - Rebase to kvm/master, commit 30d7c5d60a88 ("KVM: SEV: expose...") > - Collect Ben's review, modulo bad splat. > - Copy+paste the correct splat and symptom. [David]. > > @David, I kept the unstable declaration out of the loop, mostly because I > really don't like putting declarations in loops, but also because > nr_zapped is declared out of the loop and I didn't want to change that > unnecessarily or make the code inconsistent. Sounds good. For stable patches I definitely agree with not changing variable declarations needlessly. And as a general rule, I get your point on the other thread about shadowing variables and therefore preferring to declare all variables at function scope. I have some other ideas on this topic but it's out of scope for this patch so I'll post it on the other thread. > > arch/x86/kvm/mmu/mmu.c | 10 ++++++---- > 1 file changed, 6 insertions(+), 4 deletions(-) > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c > index 0c839ee1282c..208c892136bf 100644 > --- a/arch/x86/kvm/mmu/mmu.c > +++ b/arch/x86/kvm/mmu/mmu.c > @@ -5576,6 +5576,7 @@ static void kvm_zap_obsolete_pages(struct kvm *kvm) > { > struct kvm_mmu_page *sp, *node; > int nr_zapped, batch = 0; > + bool unstable; > > restart: > list_for_each_entry_safe_reverse(sp, node, > @@ -5607,11 +5608,12 @@ static void kvm_zap_obsolete_pages(struct kvm *kvm) > goto restart; > } > > - if (__kvm_mmu_prepare_zap_page(kvm, sp, > - &kvm->arch.zapped_obsolete_pages, &nr_zapped)) { > - batch += nr_zapped; > + unstable = __kvm_mmu_prepare_zap_page(kvm, sp, > + &kvm->arch.zapped_obsolete_pages, &nr_zapped); > + batch += nr_zapped; > + > + if (unstable) > goto restart; > - } > } > > /* > -- > 2.34.0.rc2.393.gf8c9666880-goog >