On Fri, Jan 24, 2025 at 11:36 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote: > > On Fri, Jan 24, 2025, Michal Koutný wrote: > > On Fri, Jan 17, 2025 at 09:51:41AM -0800, Sean Christopherson <seanjc@xxxxxxxxxx> wrote: > > > That's not the problematic commit. This popped because commit 8722903cbb8f > > > ("sched: Define sched_clock_irqtime as static key") in the tip tree turned > > > sched_clock_irqtime into a static key (it was a simple "int"). > > > > > > https://lore.kernel.org/all/20250103022409.2544-2-laoar.shao@xxxxxxxxx > > > > Thanks for the analysis, it's spot on. What a bad luck. > > > > Is there a precedent for static key switching in non-preemptible > > contexts? > > Abuse static_key_deferred to push the patching to a workqueue? > > > More generally, why does KVM do this tsc check in vcpu_load? > > The logic triggers when a vCPU migrates to a different pCPU. The code detects > that the case where TSC is inconsistent between pCPUs, and would cause time to go > backwards from the guest's perspective. E.g. TSC = X on CPU0, migrate to CPU1 > where TSC = X - Y. > > > Shouldn't possible unstability for that cpu be already checked and decided at > > boot (regardless of KVM)? (Unless unstability itself is not stable property. > > Which means any previously measured IRQ times are skewed.) > > This isn't a problem that's unique to KVM. The clocksource watchdog also marks > TSC unstable from non-preemptible context (presumably from IRQ context?) > > clocksource_watchdog() > | > -> spin_lock(&watchdog_lock); > | > -> __clocksource_unstable() > | > -> clocksource.mark_unstable() == tsc_cs_mark_unstable() > | > -> disable_sched_clock_irqtime() > > Uh, and sched_clock_register() toggles the static key on with IRQs disabled... > > /* Cannot register a sched_clock with interrupts on */ > local_irq_save(flags); > > ... > > /* Enable IRQ time accounting if we have a fast enough sched_clock() */ > if (irqtime > 0 || (irqtime == -1 && rate >= 1000000)) > enable_sched_clock_irqtime(); > > local_irq_restore(flags); > > > (Or a third option to revert the static-keyness if Yafang doesn't have > > Given there are issues all over the place, either a revert or a generic fix. It seems that introducing a generic fix for this static key change might not be worth it. I’ll proceed with sending a revert. -- Regards Yafang