On Wed, Dec 28, 2022, Paolo Bonzini wrote: > Currently only the locking order of SRCU vs kvm->slots_arch_lock > and kvm->slots_lock is documented. Extend this to kvm->lock > since Xen emulation got it terribly wrong. > > Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx> > --- > Documentation/virt/kvm/locking.rst | 19 ++++++++++++++----- > 1 file changed, 14 insertions(+), 5 deletions(-) > > diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst > index 845a561629f1..a3ca76f9be75 100644 > --- a/Documentation/virt/kvm/locking.rst > +++ b/Documentation/virt/kvm/locking.rst > @@ -16,17 +16,26 @@ The acquisition orders for mutexes are as follows: > - kvm->slots_lock is taken outside kvm->irq_lock, though acquiring > them together is quite rare. > > -- Unlike kvm->slots_lock, kvm->slots_arch_lock is released before > - synchronize_srcu(&kvm->srcu). Therefore kvm->slots_arch_lock > - can be taken inside a kvm->srcu read-side critical section, > - while kvm->slots_lock cannot. > - > - kvm->mn_active_invalidate_count ensures that pairs of > invalidate_range_start() and invalidate_range_end() callbacks > use the same memslots array. kvm->slots_lock and kvm->slots_arch_lock > are taken on the waiting side in install_new_memslots, so MMU notifiers > must not take either kvm->slots_lock or kvm->slots_arch_lock. > > +For SRCU: > + > +- ``synchronize_srcu(&kvm->srcu)`` is called _inside_ > + the kvm->slots_lock critical section, therefore kvm->slots_lock > + cannot be taken inside a kvm->srcu read-side critical section. > + Instead, kvm->slots_arch_lock is released before the call > + to ``synchronize_srcu()`` and _can_ be taken inside a > + kvm->srcu read-side critical section. > + > +- kvm->lock is taken inside kvm->srcu, therefore Prior to the recent Xen change, is this actually true? There are many instances where kvm->srcu is taken inside kvm->lock, but I can't find any existing cases where the reverse is true. Logically, it makes sense to take kvm->lock first since kvm->srcu can be taken deep in helpers, e.g. for accessing guest memory. It's also more consistent to take kvm->lock first since kvm->srcu is taken inside vcpu->mutex, and vcpu->mutex is taken inside kvm->lock. Disallowing synchronize_srcu(kvm->srcu) inside kvm->lock isn't probelmatic per se, but it's going to result in a weird set of rules because synchronize_scru() can, and is, called while holding a variety of other locks. In other words, IMO taking kvm->srcu outside of kvm->lock in the Xen code is the real bug. > + ``synchronize_srcu(&kvm->srcu)`` cannot be called inside > + a kvm->lock critical section. If you cannot delay the > + call until after kvm->lock is released, use ``call_srcu``. > + > On x86: > > - vcpu->mutex is taken outside kvm->arch.hyperv.hv_lock > -- > 2.31.1 >