Peter Zijlstra <peterz@xxxxxxxxxxxxx> writes:

> On Mon, Jul 29, 2019 at 12:59:26PM +0200, Vitaly Kuznetsov wrote:
>> lantianyu1986@xxxxxxxxx writes:
>>
>> > From: Tianyu Lan <Tianyu.Lan@xxxxxxxxxxxxx>
>> >
>> > Hyper-V guests use the default native_sched_clock() in pv_ops.time.sched_clock
>> > on x86. But native_sched_clock() directly uses the raw TSC value, which
>> > can be discontinuous in a Hyper-V VM. Add the generic hv_setup_sched_clock()
>> > to set the sched clock function appropriately. On x86, this sets
>> > pv_ops.time.sched_clock to read the Hyper-V reference TSC value that is
>> > scaled and adjusted to be continuous.
>>
>> Hypervisor can, in theory, disable TSC page and then we're forced to use
>> MSR-based clocksource but using it as sched_clock() can be very slow,
>> I'm afraid.
>>
>> On the other hand, what we have now is probably worse: TSC can,
>> actually, jump backwards (e.g. on migration) and we're breaking the
>> requirements for sched_clock().
>
> That (obviously) also breaks the requirements for using TSC as
> clocksource.
>
> IOW, it breaks the entire purpose of having TSC in the first place.

Currently, we mark raw TSC as unstable when running on Hyper-V (see
88c9281a9fba6), 'TSC page' (which is TSC * scale + offset) is being used
instead. The problem is that 'TSC page' can be disabled by the
hypervisor and in that case the only remaining clocksource is MSR-based
(slow).

-- 
Vitaly
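
For illustration only, the 'TSC page' arithmetic mentioned above (TSC * scale + offset) could look roughly like the sketch below. This is not the kernel's code: the struct layout, the names tsc_page_sketch and tsc_page_read_sketch, and the "sequence == 0 means disabled" convention are assumptions made for the example, and the retry loop a real reader needs to guard against concurrent page updates is left out. The multiplication is assumed to be fixed-point, keeping the high 64 bits of the 128-bit product.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the hypervisor's reference TSC page. */
struct tsc_page_sketch {
	uint32_t sequence;	/* assumed: 0 means the page is disabled/invalid */
	uint64_t scale;		/* fixed-point multiplier; high 64 bits of product kept */
	int64_t offset;		/* added after scaling */
};

/*
 * Compute ((tsc * scale) >> 64) + offset.
 * Returns false when the page is disabled, in which case the caller has to
 * fall back to the slow MSR-based read.
 */
static inline bool tsc_page_read_sketch(const struct tsc_page_sketch *pg,
					uint64_t tsc, uint64_t *now)
{
	assert(pg != NULL && now != NULL);

	if (pg->sequence == 0)
		return false;

	/* unsigned __int128 is a GCC/Clang extension. */
	unsigned __int128 product = (unsigned __int128)tsc * pg->scale;

	*now = (uint64_t)(product >> 64) + (uint64_t)pg->offset;
	return true;
}
```

A sched_clock() built on this would call tsc_page_read_sketch() first and only take the MSR path when it returns false, which is exactly the slow case discussed in this thread.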