On 03/07/2023 10:45, Marc Zyngier wrote:
> On Sat, 01 Jul 2023 18:42:28 +0100,
> Oliver Upton <oliver.upton@xxxxxxxxx> wrote:
>>
>> Hi Kristina,
>>
>> Thanks for the bug report.
>>
>> On Sat, Jul 01, 2023 at 01:50:52PM +0100, Kristina Martsenko wrote:
>>> Hi,
>>>
>>> When I try to online a CPU on arm64 while a KVM guest is running, I hit a
>>> BUG_ON(preemptible()) (as well as a WARN_ON). See below for the full log.
>>>
>>> This is on kvmarm/next, but seems to have been broken since 6.3. Bisecting it
>>> points at commit:
>>>
>>>   0bf50497f03b ("KVM: Drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock")
>>
>> Makes sense. We were using a spinlock before, which implicitly disables
>> preemption.
>>
>> Well, one way to hack around the problem would be to just cram
>> preempt_{disable,enable}() into kvm_arch_hardware_disable(), but that's
>> kinda gross in the context of cpuhp which isn't migratable in the first
>> place. Let me have a look...
>
> An alternative would be to replace the preemptible() checks with one
> that looks at the migration state, but I'm not sure that's much better
> (it certainly looks more costly).
>
> There is also the fact that most of our per-CPU accessors are already
> using preemption disabling, and this code has a bunch of them. So I'm
> not sure there is a lot to be gained from not disabling preemption
> upfront.
>
> Anyway, as I was able to reproduce the issue under NV, I tested the
> hack below. If anything, I expect it to be a reasonable fix for
> 6.3/6.4, and until we come up with a better approach.
>
> Thanks,
>
> 	M.
>
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index aaeae1145359..a28c4ffe4932 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -1894,8 +1894,17 @@ static void _kvm_arch_hardware_enable(void *discard)
>
>  int kvm_arch_hardware_enable(void)
>  {
> -	int was_enabled = __this_cpu_read(kvm_arm_hardware_enabled);
> +	int was_enabled;
>
> +	/*
> +	 * Most calls to this function are made with migration
> +	 * disabled, but not with preemption disabled. The former is
> +	 * enough to ensure correctness, but most of the helpers
> +	 * expect the latter and will throw a tantrum otherwise.
> +	 */
> +	preempt_disable();
> +
> +	was_enabled = __this_cpu_read(kvm_arm_hardware_enabled);
>  	_kvm_arch_hardware_enable(NULL);
>
>  	if (!was_enabled) {
> @@ -1903,6 +1912,8 @@ int kvm_arch_hardware_enable(void)
>  		kvm_timer_cpu_up();
>  	}
>
> +	preempt_enable();
> +
>  	return 0;
>  }

This fixes the issue for me.

Tested-by: Kristina Martsenko <kristina.martsenko@xxxxxxx>

Thanks,
Kristina