On Sat, 01 Jul 2023 18:42:28 +0100,
Oliver Upton <oliver.upton@xxxxxxxxx> wrote:
> 
> Hi Kristina,
> 
> Thanks for the bug report.
> 
> On Sat, Jul 01, 2023 at 01:50:52PM +0100, Kristina Martsenko wrote:
> > Hi,
> > 
> > When I try to online a CPU on arm64 while a KVM guest is running, I hit a
> > BUG_ON(preemptible()) (as well as a WARN_ON). See below for the full log.
> > 
> > This is on kvmarm/next, but seems to have been broken since 6.3. Bisecting it
> > points at commit:
> > 
> > 0bf50497f03b ("KVM: Drop kvm_count_lock and instead protect kvm_usage_count with kvm_lock")
> 
> Makes sense. We were using a spinlock before, which implicitly disables
> preemption.
> 
> Well, one way to hack around the problem would be to just cram
> preempt_{disable,enable}() into kvm_arch_hardware_disable(), but that's
> kinda gross in the context of cpuhp which isn't migratable in the first
> place. Let me have a look...

An alternative would be to replace the preemptible() checks with one
that looks at the migration state, but I'm not sure that's much better
(it certainly looks more costly).

There is also the fact that most of our per-CPU accessors already
disable preemption, and this code has a bunch of them. So I'm not sure
there is a lot to be gained from not disabling preemption upfront.

Anyway, as I was able to reproduce the issue under NV, I tested the
hack below. If anything, I expect it to be a reasonable fix for
6.3/6.4, until we come up with a better approach.

Thanks,

	M.

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index aaeae1145359..a28c4ffe4932 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1894,8 +1894,17 @@ static void _kvm_arch_hardware_enable(void *discard)
 
 int kvm_arch_hardware_enable(void)
 {
-	int was_enabled = __this_cpu_read(kvm_arm_hardware_enabled);
+	int was_enabled;
 
+	/*
+	 * Most calls to this function are made with migration
+	 * disabled, but not with preemption disabled. The former is
+	 * enough to ensure correctness, but most of the helpers
+	 * expect the latter and will throw a tantrum otherwise.
+	 */
+	preempt_disable();
+
+	was_enabled = __this_cpu_read(kvm_arm_hardware_enabled);
 	_kvm_arch_hardware_enable(NULL);
 
 	if (!was_enabled) {
@@ -1903,6 +1912,8 @@ int kvm_arch_hardware_enable(void)
 		kvm_timer_cpu_up();
 	}
 
+	preempt_enable();
+
 	return 0;
 }
 
-- 
Without deviation from the norm, progress is not possible.