On Tue, Sep 06, 2022 at 02:44:34PM -0700, Isaku Yamahata <isaku.yamahata@xxxxxxxxx> wrote:
> On Tue, Sep 06, 2022 at 07:32:22AM +0100,
> Marc Zyngier <maz@xxxxxxxxxx> wrote:
>
> > On Tue, 06 Sep 2022 03:46:43 +0100,
> > Yuan Yao <yuan.yao@xxxxxxxxxxxxxxx> wrote:
> > >
> > > On Thu, Sep 01, 2022 at 07:17:45PM -0700, isaku.yamahata@xxxxxxxxx wrote:
> > > > From: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
> > > >
> > > > Because kvm_count_lock unnecessarily complicates the KVM locking convention,
> > > > drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock for
> > > > simplicity.
> > > >
> > > > Opportunistically add some comments on locking.
> > > >
> > > > Suggested-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> > > > Signed-off-by: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
> > > > ---
> > > >  Documentation/virt/kvm/locking.rst | 14 +++++-------
> > > >  virt/kvm/kvm_main.c                | 34 ++++++++++++++++++++----------
> > > >  2 files changed, 28 insertions(+), 20 deletions(-)
> > > >
> > > > diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst
> > > > index 845a561629f1..8957e32aa724 100644
> > > > --- a/Documentation/virt/kvm/locking.rst
> > > > +++ b/Documentation/virt/kvm/locking.rst
> > > > @@ -216,15 +216,11 @@ time it will be set using the Dirty tracking mechanism described above.
> > > >  :Type: mutex
> > > >  :Arch: any
> > > >  :Protects: - vm_list
> > > > -
> > > > -``kvm_count_lock``
> > > > -^^^^^^^^^^^^^^^^^^
> > > > -
> > > > -:Type: raw_spinlock_t
> > > > -:Arch: any
> > > > -:Protects: - hardware virtualization enable/disable
> > > > -:Comment: 'raw' because hardware enabling/disabling must be atomic /wrt
> > > > -          migration.
> > > > +           - kvm_usage_count
> > > > +           - hardware virtualization enable/disable
> > > > +:Comment: Use cpus_read_lock() for hardware virtualization enable/disable
> > > > +          because hardware enabling/disabling must be atomic /wrt
> > > > +          migration.  The lock order is cpus lock => kvm_lock.
> > > >
> > > >  ``kvm->mn_invalidate_lock``
> > > >  ^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > > > index fc55447c4dba..082d5dbc8d7f 100644
> > > > --- a/virt/kvm/kvm_main.c
> > > > +++ b/virt/kvm/kvm_main.c
> > > > @@ -100,7 +100,6 @@ EXPORT_SYMBOL_GPL(halt_poll_ns_shrink);
> > > >   */
> > > >
> > > >  DEFINE_MUTEX(kvm_lock);
> > > > -static DEFINE_RAW_SPINLOCK(kvm_count_lock);
> > > >  LIST_HEAD(vm_list);
> > > >
> > > >  static cpumask_var_t cpus_hardware_enabled;
> > > > @@ -4996,6 +4995,8 @@ static void hardware_enable_nolock(void *caller_name)
> > > >  	int cpu = raw_smp_processor_id();
> > > >  	int r;
> > > >
> > > > +	WARN_ON_ONCE(preemptible());
> > >
> > > This looks incorrect; it may trigger every time a CPU is brought online.
> > > Because patch 7 moved CPUHP_AP_KVM_STARTING *AFTER*
> > > CPUHP_AP_ONLINE_IDLE as CPUHP_AP_KVM_ONLINE, cpuhp_thread_fun()
> > > runs the new CPUHP_AP_KVM_ONLINE callback in *non-atomic* context:
> > >
> > > cpuhp_thread_fun(unsigned int cpu) {
> > > 	...
> > > 	if (cpuhp_is_atomic_state(state)) {
> > > 		local_irq_disable();
> > > 		st->result = cpuhp_invoke_callback(cpu, state, bringup, st->node, &st->last);
> > > 		local_irq_enable();
> > >
> > > 		WARN_ON_ONCE(st->result);
> > > 	} else {
> > > 		st->result = cpuhp_invoke_callback(cpu, state, bringup, st->node, &st->last);
> > > 	}
> > > 	...
> > > }
> > >
> > > static bool cpuhp_is_atomic_state(enum cpuhp_state state)
> > > {
> > > 	return CPUHP_AP_IDLE_DEAD <= state && state < CPUHP_AP_ONLINE;
> > > }
> > >
> > > hardware_enable_nolock() is now called in 2 cases:
> > > 1. In atomic context, by on_each_cpu().
> > > 2. In non-atomic context, by the CPU hotplug thread.
> > >
> > > So how about "WARN_ONCE(preemptible() && cpu_active(cpu))"?
> >
> > I suspect similar changes must be applied to the arm64 side (though
> > I'm still looking for a good definition of cpu_active()).
>
> It seems plausible. I tested cpu online/offline on x86.
> Let me update the arm64 code too.

On second thought, I decided to add preempt_disable/enable() instead of
fixing up each possible arch callback, and let each arch handle it.

--
Isaku Yamahata <isaku.yamahata@xxxxxxxxx>