On Wed, Oct 09, 2024, Nikunj A Dadhania wrote:
> Although the kernel switches over to stable TSC clocksource instead of
> kvmclock, the scheduler still keeps on using kvmclock as the sched clock.
> This is due to kvm_sched_clock_init() updating the pv_sched_clock()
> unconditionally.

All PV clocks are affected by this, no?  This seems like something that should
be handled in common code, which is the point I was trying to make in v11.

> Use the clock source enable/disable callbacks to initialize
> kvm_sched_clock_init() and update the pv_sched_clock().
>
> As the clock selection happens in the stop machine context, schedule
> delayed work to update the static_call()
>
> Signed-off-by: Nikunj A Dadhania <nikunj@xxxxxxx>
> ---
>  arch/x86/kernel/kvmclock.c | 34 +++++++++++++++++++++++++++++-----
>  1 file changed, 29 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
> index 5b2c15214a6b..5cd3717e103b 100644
> --- a/arch/x86/kernel/kvmclock.c
> +++ b/arch/x86/kernel/kvmclock.c
> @@ -21,6 +21,7 @@
>  #include <asm/hypervisor.h>
>  #include <asm/x86_init.h>
>  #include <asm/kvmclock.h>
> +#include <asm/timer.h>
>
>  static int kvmclock __initdata = 1;
>  static int kvmclock_vsyscall __initdata = 1;
> @@ -148,12 +149,39 @@ bool kvm_check_and_clear_guest_paused(void)
>  	return ret;
>  }
>
> +static u64 (*old_pv_sched_clock)(void);
> +
> +static void enable_kvm_sc_work(struct work_struct *work)
> +{
> +	u8 flags;
> +
> +	old_pv_sched_clock = static_call_query(pv_sched_clock);
> +	flags = pvclock_read_flags(&hv_clock_boot[0].pvti);
> +	kvm_sched_clock_init(flags & PVCLOCK_TSC_STABLE_BIT);
> +}
> +
> +static DECLARE_DELAYED_WORK(enable_kvm_sc, enable_kvm_sc_work);
> +
> +static void disable_kvm_sc_work(struct work_struct *work)
> +{
> +	if (old_pv_sched_clock)

This feels like it should be a WARN condition, as IIUC, pv_sched_clock() should
never be null.  And it _looks_ wrong too, as it means kvm_clock will remain the
sched clock if there was no old clock, which should be impossible.

> +		paravirt_set_sched_clock(old_pv_sched_clock);
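
To make the WARN point concrete, here is a rough userspace model of the
save/restore pattern, not kernel code and not a proposed patch.  Every name in
it (cur_sched_clock, old_sched_clock, warn_count, native_clock, kvm_clock,
enable_kvm_sc, disable_kvm_sc) is a hypothetical stand-in: a plain function
pointer models the pv_sched_clock static call, and a counter plus fprintf()
models what WARN_ON_ONCE() would flag in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t (*sched_clock_fn)(void);

/* Hypothetical stand-ins for the native and KVM sched clock reads. */
static uint64_t native_clock(void) { return 1; }
static uint64_t kvm_clock(void) { return 2; }

/* Models the pv_sched_clock static call target (assumption: never NULL). */
static sched_clock_fn cur_sched_clock = native_clock;

/* Models old_pv_sched_clock from the patch. */
static sched_clock_fn old_sched_clock;

/* Counts how often the "impossible" condition was hit. */
static unsigned int warn_count;

/* Save the current target, then switch to the KVM clock. */
static void enable_kvm_sc(void)
{
	old_sched_clock = cur_sched_clock;
	cur_sched_clock = kvm_clock;
}

/* Restore the saved target; complain loudly if nothing was saved. */
static void disable_kvm_sc(void)
{
	if (!old_sched_clock) {
		/* Userspace stand-in for a WARN: flag it instead of skipping silently. */
		warn_count++;
		fprintf(stderr, "WARN: no saved sched clock to restore\n");
		return;
	}

	cur_sched_clock = old_sched_clock;
	old_sched_clock = NULL;
}
```

In this model, a disable without a prior enable leaves the current clock
untouched but bumps warn_count, so the condition is reported rather than
silently ignored; an enable followed by a disable restores the saved clock.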