On Mon, Oct 02, 2023 at 11:18:50AM -0700, Sean Christopherson wrote:
> +PeterZ
>
> Thomas and Peter,
>
> We're trying to address an issue where KVM's paravirt kvmclock drifts from the
> host's TSC-based monotonic raw clock because of historical reasons (at least, AFAICT),
> even when the TSC is constant. Due to some dubious KVM behavior, KVM may sometimes
> re-sync kvmclock against the host's monotonic raw clock, which causes non-trivial
> jumps in time from the guest's perspective.
>
> Linux-as-a-guest demotes all paravirt clock sources when the TSC is constant and
> nonstop, and so the goofy KVM behavior isn't likely to affect the guest's clocksource,
> but the guest's sched_clock() implementation keeps using the paravirt clock.
>
> Irrespective of if/how we fix the KVM host-side mess, using a paravirt clock for
> the scheduler when using a constant, nonstop TSC for the clocksource seems at best
> inefficient, and at worst unnecessarily complex and risky.
>
> Is there any reason not to prefer native_sched_clock() over whatever paravirt
> clock is present when the TSC is the preferred clocksource?

I see none, that whole pv_clock thing is horrible crap.

> Assuming the desirable
> thing to do is to use native_sched_clock() in this scenario, do we need a separate
> rating system, or can we simply tie the sched clock selection to the clocksource
> selection, e.g. override the paravirt stuff if the TSC clock has higher priority
> and is chosen?

Yeah, I see no point of another rating system. Just force the thing back
to native (or don't set it to that other thing).
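
To make the "tie the sched clock selection to the clocksource selection"
idea concrete, here is a minimal user-space sketch. It is not kernel code
and not a proposed patch: every name in it (model_cpu,
model_pick_sched_clock() and the two model_*_sched_clock() stand-ins) is
hypothetical, and "TSC is the preferred clocksource" is assumed to mean
"constant and nonstop", as described in the quoted text.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
typedef uint64_t (*sched_clock_fn)(void);

/* Stand-in for a TSC-based native sched clock (dummy value). */
static uint64_t model_native_sched_clock(void)
{
	return 1;
}

/* Stand-in for a paravirt sched clock such as kvmclock (dummy value). */
static uint64_t model_pv_sched_clock(void)
{
	return 2;
}

/* Assumed inputs to the decision; a real kernel derives these elsewhere. */
struct model_cpu {
	bool tsc_constant;     /* TSC ticks at a constant rate */
	bool tsc_nonstop;      /* TSC keeps counting in deep idle states */
	bool pv_clock_present; /* hypervisor offers a paravirt clock */
};

/*
 * One decision, no second rating system: if the TSC would win the
 * clocksource selection, stay on (or go back to) the native sched clock.
 * The paravirt clock is used only when it exists and the TSC is not
 * the preferred clocksource.
 */
static sched_clock_fn model_pick_sched_clock(const struct model_cpu *cpu)
{
	bool tsc_preferred = cpu->tsc_constant && cpu->tsc_nonstop;

	if (cpu->pv_clock_present && !tsc_preferred)
		return model_pv_sched_clock;

	return model_native_sched_clock;
}
```

With this shape a guest that has a constant, nonstop TSC never ends up on
the paravirt sched clock, whether that is done by not installing it in the
first place or by overriding it afterwards.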