On 13/11/19 20:30, Sean Christopherson wrote:
> Acquire the per-VM slots_lock when zapping all shadow pages as part of
> toggling nx_huge_pages.  The fast zap algorithm relies on exclusivity
> (via slots_lock) to identify obsolete vs. valid shadow pages, e.g. it
> uses a single bit for its generation number.  Holding slots_lock also
> obviates the need to acquire a read lock on the VM's srcu.
>
> Failing to take slots_lock when toggling nx_huge_pages allows multiple
> instances of kvm_mmu_zap_all_fast() to run concurrently, as the other
> user, KVM_SET_USER_MEMORY_REGION, does not take the global kvm_lock.
> Concurrent fast zap instances cause obsolete shadow pages to be
> incorrectly identified as valid due to the single-bit generation number
> wrapping, which results in stale shadow pages being left in KVM's MMU
> and leads to all sorts of undesirable behavior.
>
> The bug is easily confirmed by running with CONFIG_PROVE_LOCKING and
> toggling nx_huge_pages via its module param.
>
> Note, the fast zap algorithm could use a 64-bit generation instead of
> relying on exclusivity for correctness, but all callers except the
> recently added set_nx_huge_pages() need to hold slots_lock anyways.
> Given that toggling nx_huge_pages is by no means a fast path, force it
> to conform to the current approach instead of reworking the algorithm
> to support concurrent calls.
>
> Fixes: b8e8c8303ff28 ("kvm: mmu: ITLB_MULTIHIT mitigation")
> Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
> ---
>  arch/x86/kvm/mmu.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index cf718fa23dff..2ce9da58611e 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -6285,14 +6285,13 @@ static int set_nx_huge_pages(const char *val, const struct kernel_param *kp)
>
>  	if (new_val != old_val) {
>  		struct kvm *kvm;
> -		int idx;
>
>  		mutex_lock(&kvm_lock);
>
>  		list_for_each_entry(kvm, &vm_list, vm_list) {
> -			idx = srcu_read_lock(&kvm->srcu);
> +			mutex_lock(&kvm->slots_lock);
>  			kvm_mmu_zap_all_fast(kvm);
> -			srcu_read_unlock(&kvm->srcu, idx);
> +			mutex_unlock(&kvm->slots_lock);
>
>  			wake_up_process(kvm->arch.nx_lpage_recovery_thread);
>  		}
>

Queued, thanks.

Paolo