On Wed, 09 Feb 2022 19:59:45 +0000, Sean Christopherson <seanjc@xxxxxxxxxx> wrote: > > +Marc Thanks for the heads up. > > On Wed, Feb 09, 2022, Chao Gao wrote: > > The CPU STARTING section doesn't allow callbacks to fail. Move KVM's > > hotplug callback to ONLINE section so that it can abort onlining a CPU in > > certain cases to avoid potentially breaking VMs running on existing CPUs. > > For example, when kvm fails to enable hardware virtualization on the > > hotplugged CPU. > > > > Place KVM's hotplug state before CPUHP_AP_SCHED_WAIT_EMPTY as it ensures > > when offlining a CPU, all user tasks and non-pinned kernel tasks have left > > the CPU, i.e. there cannot be a vCPU task around. So, it is safe for KVM's > > CPU offline callback to disable hardware virtualization at that point. > > Likewise, KVM's online callback can enable hardware virtualization before > > any vCPU task gets a chance to run on hotplugged CPUs. > > > > KVM's CPU hotplug callbacks are renamed as well. > > > > Suggested-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx> > > Signed-off-by: Chao Gao <chao.gao@xxxxxxxxx> > > --- > > include/linux/cpuhotplug.h | 2 +- > > virt/kvm/kvm_main.c | 30 ++++++++++++++++++++++-------- > > 2 files changed, 23 insertions(+), 9 deletions(-) > > > > diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h > > index 773c83730906..14d354c8ce35 100644 > > --- a/include/linux/cpuhotplug.h > > +++ b/include/linux/cpuhotplug.h > > @@ -182,7 +182,6 @@ enum cpuhp_state { > > CPUHP_AP_CSKY_TIMER_STARTING, > > CPUHP_AP_TI_GP_TIMER_STARTING, > > CPUHP_AP_HYPERV_TIMER_STARTING, > > - CPUHP_AP_KVM_STARTING, > > CPUHP_AP_KVM_ARM_VGIC_INIT_STARTING, > > CPUHP_AP_KVM_ARM_VGIC_STARTING, > > CPUHP_AP_KVM_ARM_TIMER_STARTING, > > This probably needs an ack from Marc. IIUC, it changes the ordering > between generic KVM enabling hardware and KVM ARM doing its vGIC and > timer stuff. Indeed, that's not great. Specially the part that enable interrupts before things are up and running on the CPU. But TBH, this area really deserves a good scrubbing, and I don't see why we need to keep these individual CPUHP notifiers. I wrote the patch below, thrown it at a test box, and nothing caught fire as I was fiddling with CPUs going up and down. It is thus obviously perfect. Feel free to take it as part of your series. Thanks, M.