On 14/11/19 16:10, Sean Christopherson wrote:
> On Thu, Nov 14, 2019 at 01:16:21PM +0100, Paolo Bonzini wrote:
>> On 13/11/19 20:30, Sean Christopherson wrote:
>>> Failing to take slots_lock when toggling nx_huge_pages allows
>>> multiple instances of kvm_mmu_zap_all_fast() to run concurrently, as
>>> the other user, KVM_SET_USER_MEMORY_REGION, does not take the global
>>> kvm_lock.  Concurrent fast zap instances cause obsolete shadow pages
>>> to be incorrectly identified as valid due to the single-bit
>>> generation number wrapping, which results in stale shadow pages
>>> being left in KVM's MMU and leads to all sorts of undesirable
>>> behavior.
>>
>> Indeed the current code fails lockdep miserably, but isn't the whole
>> body of kvm_mmu_zap_all_fast() covered by kvm->mmu_lock?  What kind
>> of badness can happen if kvm->slots_lock isn't taken?
>
> kvm_zap_obsolete_pages() temporarily drops mmu_lock and reschedules so
> that it doesn't block other vCPUs from inserting shadow pages into the
> new generation of the MMU.

Oh, of course.  I've been working on all of this with the pre-5.4 MMU,
which does not have commit ca333add693 ("KVM: x86/mmu: Explicitly track
only a single invalid mmu generation").

I queued this patch with a small tweak to the commit message, to
explain why it doesn't need a stable backport.

Paolo
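
(As an illustration of the generation wrap Sean describes, here is a
hypothetical, self-contained sketch; the names mmu_sketch,
fast_zap_sketch and sp_is_obsolete are invented for the example and
only mirror the single-bit mmu_valid_gen scheme, not KVM's actual
code.)

/*
 * Hypothetical sketch: why a single-bit generation number cannot
 * tolerate two overlapping fast zaps.  Not KVM's real implementation.
 */
#include <stdbool.h>

struct mmu_sketch {
	bool valid_gen;		/* single-bit generation number */
	/* ... shadow pages, each tagged with the gen at creation ... */
};

/* A shadow page is obsolete iff its recorded gen no longer matches. */
static bool sp_is_obsolete(const struct mmu_sketch *mmu, bool sp_gen)
{
	return sp_gen != mmu->valid_gen;
}

static void fast_zap_sketch(struct mmu_sketch *mmu)
{
	/*
	 * Flip the generation (done under mmu_lock in KVM): every
	 * existing shadow page now carries the stale value and is
	 * considered obsolete.  The subsequent zap of obsolete pages
	 * periodically drops mmu_lock and reschedules so vCPUs can
	 * fault in pages of the new generation; without slots_lock
	 * serializing callers, a second fast zap can run in that
	 * window and flip the bit back.
	 */
	mmu->valid_gen = !mmu->valid_gen;
}

int main(void)
{
	struct mmu_sketch mmu = { .valid_gen = false };
	bool old_sp_gen = mmu.valid_gen;	/* page created in gen 0 */

	fast_zap_sketch(&mmu);	/* first zap: gen -> 1, page obsolete */
	fast_zap_sketch(&mmu);	/* second zap during the mmu_lock drop:
				 * gen -> 0, the wrap "revalidates" it */

	/* Non-zero exit: the stale page no longer looks obsolete. */
	return sp_is_obsolete(&mmu, old_sp_gen) ? 0 : 1;
}

Run sequentially, the two calls stand in for the overlap that occurs
when the first zap drops mmu_lock mid-way; the program exits non-zero
because the stale page is no longer flagged as obsolete, which is the
failure mode the slots_lock fix prevents.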