Hi Sean,

On 11/01/2024 17:00, Sean Christopherson wrote:
> This is a known issue.  It's mostly a KVM bug[...] (fix posted[...]), but I suspect
> that a bug in the dynamic preemption model logic[...] is also contributing to the
> behavior by causing KVM to yield on preempt models where it really shouldn't.

I tried the following variants now, each applied on top of 6.7
(0dd3ee31):

* [1], the initial patch series mentioned in the bugreport
  ("[PATCH 0/2] KVM: Pre-check mmu_notifier retry on x86")
* [2], its v2 that you linked above
  ("[PATCH v2] KVM: x86/mmu: Retry fault before acquiring mmu_lock if
  mapping is changing")
* [3], the scheduler patch you linked above
  ("[PATCH] sched/core: Drop spinlocks on contention iff kernel is
  preemptible")
* both [2] & [3]

My kernel is PREEMPT_DYNAMIC and, according to
/sys/kernel/debug/sched/preempt, defaults to preempt=voluntary. For case
[3], I additionally tried manually switching to preempt=full.

Provided I did not mess up, I get the following results for the
reproducer I posted:

* [1] (the initial patch series): no hangs
* [2] (its v2): hangs
* [3] (the scheduler patch) with preempt=voluntary: no hangs
* [3] (the scheduler patch) with preempt=full: hangs
* [2] & [3]: no hangs

So it seems like:

* [1] (the initial patch series) fixes the hangs, which is consistent
  with the feedback in the bugreport [4].
* But weirdly, its v2 [2] does not fix the hangs.
* As long as I stay with preempt=voluntary, [3] (the scheduler patch)
  alone is already enough to fix the hangs in my case -- this I did not
  expect :)

Does this make sense to you? Happy to double-check or run more tests if
anything seems off.

Best wishes,

Friedrich

[1] https://lore.kernel.org/all/20230825020733.2849862-1-seanjc@xxxxxxxxxx/
[2] https://lore.kernel.org/all/20240110012045.505046-1-seanjc@xxxxxxxxxx/
[3] https://lore.kernel.org/all/20240110214723.695930-1-seanjc@xxxxxxxxxx/
[4] https://bugzilla.kernel.org/show_bug.cgi?id=218259#c6
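
P.S. In case it helps anyone repeating the preempt=voluntary vs.
preempt=full comparison, here is a minimal Python sketch for reading and
switching the model via /sys/kernel/debug/sched/preempt. It rests on a
few assumptions: the file lists all models on one line with the active
one in parentheses (e.g. "none (voluntary) full"), debugfs is mounted,
and the script runs as root. The helper names are made up for
illustration and are not part of any kernel tooling.

```python
import re
from pathlib import Path

# Path as quoted in the mail; needs debugfs mounted and root access.
PREEMPT_PATH = Path("/sys/kernel/debug/sched/preempt")


def parse_preempt(text):
    """Return (active_model, all_models) from the file contents.

    Assumption: the active model is the token wrapped in parentheses,
    e.g. "none (voluntary) full". Returns None as the active model if
    no token is parenthesized.
    """
    active = None
    models = []
    for tok in text.split():
        m = re.fullmatch(r"\((\w+)\)", tok)
        if m:
            active = m.group(1)
            models.append(active)
        else:
            models.append(tok)
    return active, models


def read_preempt(path=PREEMPT_PATH):
    """Read and parse the current preemption model (hypothetical helper)."""
    return parse_preempt(path.read_text())


def set_preempt(model, path=PREEMPT_PATH):
    """Switch the model by writing its name, e.g. "full" (needs root)."""
    path.write_text(model)
```

Recording the output of read_preempt() before and after each run should
make it easier to tell which model a given "hangs" / "no hangs" result
belongs to.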