On Mon, 10 Jul 2023 19:04:08 +0100,
Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Mon, Jul 03, 2023, Marc Zyngier wrote:
> > Since 0bf50497f03b ("KVM: Drop kvm_count_lock and instead protect
> > kvm_usage_count with kvm_lock"), hotplugging back a CPU whilst
> > a guest is running results in a number of ugly splats as most
> > of this code expects to run with preemption disabled, which isn't
> > the case anymore.
> >
> > While the context is preemptable, it isn't migratable, which should
> > be enough. But we have plenty of preemptible() checks all over
> > the place, and our per-CPU accessors also disable preemption.
> >
> > Since this affects released versions, let's do the easy fix first,
> > disabling preemption in kvm_arch_hardware_enable(). We can always
> > revisit this with a more invasive fix in the future.
> >
> > Fixes: 0bf50497f03b ("KVM: Drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock")
> > Reported-by: Kristina Martsenko <kristina.martsenko@xxxxxxx>
> > Tested-by: Kristina Martsenko <kristina.martsenko@xxxxxxx>
> > Signed-off-by: Marc Zyngier <maz@xxxxxxxxxx>
> > Link: https://lore.kernel.org/r/aeab7562-2d39-e78e-93b1-4711f8cc3fa5@xxxxxxx
> > Cc: stable@xxxxxxxxxxxxxxx # v6.3, v6.4
> > ---
> >  arch/arm64/kvm/arm.c | 13 ++++++++++++-
> >  1 file changed, 12 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index aaeae1145359..a28c4ffe4932 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -1894,8 +1894,17 @@ static void _kvm_arch_hardware_enable(void *discard)
> >
> >  int kvm_arch_hardware_enable(void)
> >  {
> > -       int was_enabled = __this_cpu_read(kvm_arm_hardware_enabled);
> > +       int was_enabled;
> >
> > +       /*
> > +        * Most calls to this function are made with migration
> > +        * disabled, but not with preemption disabled. The former is
> > +        * enough to ensure correctness, but most of the helpers
> > +        * expect the later and will throw a tantrum otherwise.
> > +        */
> > +       preempt_disable();
> > +
> > +       was_enabled = __this_cpu_read(kvm_arm_hardware_enabled);
>
> IMO, this_cpu_has_cap() is at fault.  E.g. why not do this?
>
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 7d7128c65161..b862477de2ce 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -3193,7 +3193,9 @@ static void __init setup_boot_cpu_capabilities(void)
>
>  bool this_cpu_has_cap(unsigned int n)
>  {
> -       if (!WARN_ON(preemptible()) && n < ARM64_NCAPS) {
> +       __this_cpu_preempt_check("has_cap");
> +
> +       if (n < ARM64_NCAPS) {
>                 const struct arm64_cpu_capabilities *cap = cpu_hwcaps_ptrs[n];
>
>                 if (cap)
>

Because this check is not on at all times (it relies on
DEBUG_PREEMPT), and we really want it to be there.

        M.

--
Without deviation from the norm, progress is not possible.
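
To illustrate the distinction drawn in the reply, here is a minimal
userspace sketch, not kernel code. It contrasts a check that is always
compiled in with one that only exists when a debug option is set. Every
name in it (SIM_DEBUG_PREEMPT, sim_preemptible_state, sim_caps and the two
has_cap_* helpers) is hypothetical and merely stands in for the kernel
facilities discussed above.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for a debug-only config option. It is left
 * undefined to mimic a production build with the debug check compiled out.
 */
/* #define SIM_DEBUG_PREEMPT 1 */

#define SIM_NCAPS 4u

/* Simulated context: true when the caller could be preempted. */
static bool sim_preemptible_state;

/* Number of warnings emitted so far by either helper. */
static unsigned int sim_warnings;

/* Made-up capability table, for illustration only. */
static const bool sim_caps[SIM_NCAPS] = { true, false, true, false };

/*
 * Always-on variant: warns and refuses to answer whenever it is called
 * from a preemptible context, whatever the build configuration.
 */
static bool has_cap_always_checked(unsigned int n)
{
	if (sim_preemptible_state) {
		sim_warnings++;
		fprintf(stderr, "warning: has_cap called while preemptible\n");
		return false;
	}
	return n < SIM_NCAPS && sim_caps[n];
}

/*
 * Debug-only variant: the warning is present only when SIM_DEBUG_PREEMPT
 * is defined, so a build without it never reports the misuse.
 */
static bool has_cap_debug_checked(unsigned int n)
{
#ifdef SIM_DEBUG_PREEMPT
	if (sim_preemptible_state) {
		sim_warnings++;
		fprintf(stderr, "warning: has_cap called while preemptible\n");
	}
#endif
	return n < SIM_NCAPS && sim_caps[n];
}
```

With SIM_DEBUG_PREEMPT undefined, has_cap_debug_checked() answers from a
preemptible context without any warning, while has_cap_always_checked()
flags the call. That is the trade-off under discussion: a check gated on a
debug option goes unnoticed on builds that do not enable it.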