On Fri, Jan 24, 2025, Michal Koutný wrote:
> On Fri, Jan 17, 2025 at 09:51:41AM -0800, Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> > That's not the problematic commit.  This popped because commit 8722903cbb8f
> > ("sched: Define sched_clock_irqtime as static key") in the tip tree turned
> > sched_clock_irqtime into a static key (it was a simple "int").
> > 
> > https://lore.kernel.org/all/20250103022409.2544-2-laoar.shao@xxxxxxxxx
> 
> Thanks for the analysis, it's spot on. What a bad luck.
> 
> Is there a precedent for static key switching in non-preemptible
> contexts? Abuse static_key_deferred to push the patching to a workqueue?
> 
> More generally, why does KVM do this tsc check in vcpu_load?

The logic triggers when a vCPU migrates to a different pCPU.  The code
detects the case where the TSC is inconsistent between pCPUs, which would
cause time to go backwards from the guest's perspective.  E.g. TSC = X
on CPU0, migrate to CPU1 where TSC = X - Y.

> Shouldn't possible unstability for that cpu be already checked and decided at
> boot (regardless of KVM)? (Unless unstability itself is not stable property.
> Which means any previously measured IRQ times are skewed.)

This isn't a problem that's unique to KVM.  The clocksource watchdog also
marks the TSC unstable from non-preemptible context (presumably from IRQ
context?):

  clocksource_watchdog()
  |
  -> spin_lock(&watchdog_lock);
  |
  -> __clocksource_unstable()
     |
     -> clocksource.mark_unstable() == tsc_cs_mark_unstable()
        |
        -> disable_sched_clock_irqtime()

Uh, and sched_clock_register() toggles the static key on with IRQs
disabled...

	/* Cannot register a sched_clock with interrupts on */
	local_irq_save(flags);

	...
	/* Enable IRQ time accounting if we have a fast enough sched_clock() */
	if (irqtime > 0 || (irqtime == -1 && rate >= 1000000))
		enable_sched_clock_irqtime();

	local_irq_restore(flags);

> (Or a third option to revert the static-keyness if Yafang doesn't have

Given there are issues all over the place, either a revert or a generic fix.