Hi Vitaly, On Mon, Mar 29, 2021 at 01:47:59PM +0200, Vitaly Kuznetsov wrote: > When guest time is reset with KVM_SET_CLOCK(0), it is possible for > hv_clock->system_time to become a small negative number. This happens > because in KVM_SET_CLOCK handling we set kvm->arch.kvmclock_offset based > on get_kvmclock_ns(kvm) but when KVM_REQ_CLOCK_UPDATE is handled, > kvm_guest_time_update() does > > hv_clock.system_time = ka->master_kernel_ns + v->kvm->arch.kvmclock_offset; Hum, masterclock update should always precede KVM_SET_CLOCK() call. So when KVM_SET_CLOCK(0) is called, we'd like the guest (or the following): static __always_inline u64 __pvclock_read_cycles(const struct pvclock_vcpu_time_info *src, u64 tsc) { u64 delta = tsc - src->tsc_timestamp; u64 offset = pvclock_scale_delta(delta, src->tsc_to_system_mul, src->tsc_shift); return src->system_time + offset; } To return 0 (which won't happen if pvclock_data->system_time is being initialized to a negative value). KVM_SET_CLOCK(0) kvm_gen_update_masterclock(kvm); now_ns = get_kvmclock_ns(kvm); kvm->arch.kvmclock_offset += user_ns.clock - now_ns = -now_ns; hv_clock.tsc_timestamp = ka->master_cycle_now; hv_clock.system_time = ka->master_kernel_ns + ka->kvmclock_offset; Wonder if the negative value happens due to the switch from masterclock=off -> on (get_kvmclock_ns reading kernel time directly). Perhaps this error is being added to clock when migration is performed. But just thinking out loud... > And 'master_kernel_ns' represents the last time when masterclock > got updated, it can precede KVM_SET_CLOCK() call. Normally, this is not a > problem, the difference is very small, e.g. I'm observing > hv_clock.system_time = -70 ns. The issue comes from the fact that > 'hv_clock.system_time' is stored as unsigned and 'system_time / 100' in > compute_tsc_page_parameters() becomes a very big number. Which it is (a very big number). > Use div_s64() to get the proper result when dividing maybe-negative > 'hv_clock.system_time' by 100. Well hv_clock.system_time is not negative. It is positive. > Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx> > --- > arch/x86/kvm/hyperv.c | 9 ++++++--- > 1 file changed, 6 insertions(+), 3 deletions(-) > > diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c > index f98370a39936..0529b892f634 100644 > --- a/arch/x86/kvm/hyperv.c > +++ b/arch/x86/kvm/hyperv.c > @@ -1070,10 +1070,13 @@ static bool compute_tsc_page_parameters(struct pvclock_vcpu_time_info *hv_clock, > hv_clock->tsc_to_system_mul, > 100); > > - tsc_ref->tsc_offset = hv_clock->system_time; > - do_div(tsc_ref->tsc_offset, 100); > - tsc_ref->tsc_offset -= > + /* > + * Note: 'hv_clock->system_time' despite being 'u64' can hold a negative > + * value here, thus div_s64(). > + */ > + tsc_ref->tsc_offset = div_s64(hv_clock->system_time, 100) - > mul_u64_u64_shr(hv_clock->tsc_timestamp, tsc_ref->tsc_scale, 64); > + > return true; > } Won't the guest see: 0.1, 0.005, 0.0025, 0.0001, ..., 0, 0.0001, 0.0025, 0.005, 0.1, ... As system_time progresses from negative value to positive value with the above fix? While the fix seems OK in practice, perhaps the negative system_time could be fixed instead.