On Wed, Mar 31, 2021, Paolo Bonzini wrote: > On 31/03/21 18:41, Sean Christopherson wrote: > > > That said, the easiest way to avoid this would be to always update > > > mmu_notifier_count. > > Updating mmu_notifier_count requires taking mmu_lock, which would defeat the > > purpose of these shenanigans. > > Okay; I wasn't sure if the problem was contention with page faults in > general, or just the long critical sections from the MMU notifier callbacks. > Still updating mmu_notifier_count unconditionally is a good way to break up > the patch in two and keep one commit just for the rwsem nastiness. Rereading things, a small chunk of the rwsem nastiness can go away. I don't see any reason to use rw_semaphore instead of rwlock_t. install_new_memslots() only holds the lock for a handful of instructions. Readers could get queued up behind a writer, but since install_new_memslots() is serialized by slots_lock (the existing mutex), there is no chance of multiple writers, i.e. the worst case wait duration is bounded at the length of an in-flight notification. And that's _already_ the worst case since notifications are currently serialized by mmu_lock. In practice, the existing worst case is probably far worse since there can be far more writers trying to acquire mmu_lock. In other words, there's no strong argument for sleeping instead of busy waiting in the notifiers. By switching to rwlock_t, taking mmu_notifier_slots_lock doesn't have to depend on mmu_notifier_range_blockable(), and the must_lock path also goes away. > > > > +#if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER) > > > > + down_write(&kvm->mmu_notifier_slots_lock); > > > > +#endif > > > > rcu_assign_pointer(kvm->memslots[as_id], slots); > > > > +#if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER) > > > > + up_write(&kvm->mmu_notifier_slots_lock); > > > > +#endif > > > Please do this unconditionally, the cost is minimal if the rwsem is not > > > contended (as is the case if the architecture doesn't use MMU notifiers at > > > all). > > It's not the cost, it's that mmu_notifier_slots_lock doesn't exist. That's an > > easily solved problem, but then the lock wouldn't be initialized since > > kvm_init_mmu_notifier() is a nop. That's again easy to solve, but IMO would > > look rather weird. I guess the counter argument is that __kvm_memslots() > > wouldn't need #ifdeffery. > > Yep. Less #ifdefs usually wins. :) > > > These are the to ideas I've come up with: > > > > Option 1: > > static int kvm_init_mmu_notifier(struct kvm *kvm) > > { > > init_rwsem(&kvm->mmu_notifier_slots_lock); > > > > #if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER) > > kvm->mmu_notifier.ops = &kvm_mmu_notifier_ops; > > return mmu_notifier_register(&kvm->mmu_notifier, current->mm); > > #else > > return 0; > > #endif > > } > > Option 2 is also okay I guess, but the simplest is option 1 + just init it > in kvm_create_vm. Arr. I'll play around with it to try and purge the #ifdefs.