On Thu, Nov 14, 2019 at 01:16:21PM +0100, Paolo Bonzini wrote: > On 13/11/19 20:30, Sean Christopherson wrote: > > Failing to take slots_lock when toggling nx_huge_pages allows multiple > > instances of kvm_mmu_zap_all_fast() to run concurrently, as the other > > user, KVM_SET_USER_MEMORY_REGION, does not take the global kvm_lock. > > Concurrent fast zap instances causes obsolete shadow pages to be > > incorrectly identified as valid due to the single bit generation number > > wrapping, which results in stale shadow pages being left in KVM's MMU > > and leads to all sorts of undesirable behavior. > > Indeed the current code fails lockdep miserably, but isn't the whole > body of kvm_mmu_zap_all_fast() covered by kvm->mmu_lock? What kind of > badness can happen if kvm->slots_lock isn't taken? kvm_zap_obsolete_pages() temporarily drops mmu_lock and reschedules so that it doesn't block other vCPUS from inserting shadow pages into the new generation of the mmu.