On Fri, 2024-06-07 at 17:06 -0700, Sean Christopherson wrote:
> Use a dedicated mutex to guard kvm_usage_count to fix a potential deadlock
> on x86 due to a chain of locks and SRCU synchronizations.  Translating the
> below lockdep splat, CPU1 #6 will wait on CPU0 #1, CPU0 #8 will wait on
> CPU2 #3, and CPU2 #7 will wait on CPU1 #4 (if there's a writer, due to the
> fairness of r/w semaphores).
>
>      CPU0                     CPU1                     CPU2
> 1    lock(&kvm->slots_lock);
> 2                                                      lock(&vcpu->mutex);
> 3                                                      lock(&kvm->srcu);
> 4                             lock(cpu_hotplug_lock);
> 5                             lock(kvm_lock);
> 6                             lock(&kvm->slots_lock);
> 7                                                      lock(cpu_hotplug_lock);
> 8    sync(&kvm->srcu);
>
> [...]
>
> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>

Reviewed-by: Kai Huang <kai.huang@xxxxxxxxx>

Nitpickings below:

> ---
>  Documentation/virt/kvm/locking.rst | 19 ++++++++++++------
>  virt/kvm/kvm_main.c                | 31 +++++++++++++++---------------
>  2 files changed, 29 insertions(+), 21 deletions(-)
>
> diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst
> index 02880d5552d5..5e102fe5b396 100644
> --- a/Documentation/virt/kvm/locking.rst
> +++ b/Documentation/virt/kvm/locking.rst
> @@ -227,7 +227,13 @@ time it will be set using the Dirty tracking mechanism described above.
>  :Type:		mutex
>  :Arch:		any
>  :Protects:	- vm_list
> -		- kvm_usage_count
> +
> +``kvm_usage_count``
> +^^^^^^^^^^^^^^^^^^^

``kvm_usage_lock``?

> +
> +:Type:		mutex
> +:Arch:		any
> +:Protects:	- kvm_usage_count
>  		- hardware virtualization enable/disable
>  :Comment:	KVM also disables CPU hotplug via cpus_read_lock() during
>  		enable/disable.

I think this sentence should be improved to at least mention "Exists because
using kvm_lock leads to deadlock", just like the comment for
vendor_module_lock below.

> @@ -290,11 +296,12 @@ time it will be set using the Dirty tracking mechanism described above.
>  	wakeup.
>
>  ``vendor_module_lock``
> -^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +^^^^^^^^^^^^^^^^^^^^^^
>  :Type:		mutex
>  :Arch:		x86
>  :Protects:	loading a vendor module (kvm_amd or kvm_intel)
> -:Comment:	Exists because using kvm_lock leads to deadlock.  cpu_hotplug_lock is
> -	taken outside of kvm_lock, e.g. in KVM's CPU online/offline callbacks, and
> -	many operations need to take cpu_hotplug_lock when loading a vendor module,
> -	e.g. updating static calls.
> +:Comment:	Exists because using kvm_lock leads to deadlock.  kvm_lock is taken
> +	in notifiers, e.g. __kvmclock_cpufreq_notifier(), that may be invoked while
> +	cpu_hotplug_lock is held, e.g. from cpufreq_boost_trigger_state(), and many
> +	operations need to take cpu_hotplug_lock when loading a vendor module, e.g.
> +	updating static calls.
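
To make the resulting lock ordering concrete, here is a rough sketch of what
the enable path looks like under the rules the documentation above describes.
The kvm_main.c hunk isn't quoted in this reply, so the function body and the
enable_virtualization_on_all_cpus() helper below are illustrative assumptions,
not the actual patch:

/*
 * Illustrative sketch only -- not the actual kvm_main.c change.  It shows
 * the ordering the documentation describes: CPU hotplug is disabled via
 * cpus_read_lock() outside of the dedicated kvm_usage_lock, which guards
 * kvm_usage_count and hardware virtualization enable/disable instead of
 * kvm_lock.
 */
#include <linux/cpu.h>
#include <linux/mutex.h>

static DEFINE_MUTEX(kvm_usage_lock);
static int kvm_usage_count;

/* Hypothetical stand-in for the real per-CPU hardware enable path. */
static int enable_virtualization_on_all_cpus(void)
{
	return 0;
}

static int hardware_enable_all(void)
{
	int r = 0;

	/* cpu_hotplug_lock is always taken outside of kvm_usage_lock. */
	cpus_read_lock();
	mutex_lock(&kvm_usage_lock);

	/* Enable hardware virtualization only for the first user. */
	if (++kvm_usage_count == 1) {
		r = enable_virtualization_on_all_cpus();
		if (r)
			--kvm_usage_count;
	}

	mutex_unlock(&kvm_usage_lock);
	cpus_read_unlock();

	return r;
}

Because kvm_lock no longer appears anywhere in this path, the chain through
kvm_lock in the lockdep splat above can no longer form.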