Il 06/01/2014 15:18, Marcelo Tosatti ha scritto: > > To fix a problem related to different resolution of TSC and system clock, > the offset in TSC units is approximated by > > delta = vcpu->hv_clock.tsc_timestamp - vcpu->last_guest_tsc > > (Guest TSC value at (Guest TSC value at last VM-exit) > the last kvm_guest_time_update > call) > > Delta is then later scaled using mult,shift pair found in hv_clock > structure (which is correct against tsc_timestamp in that > structure). > > However, if a frequency change is performed between these two points, > this delta is measured using different TSC frequencies, but scaled using > mult,shift pair for one frequency only. > > The end result is an incorrect delta. > > The bug which this code works around is not the only cause for > clock backwards events. The global accumulator is still > necessary, so remove the max_kernel_ns fix and rely on the > global accumulator for no clock backwards events. This is basically reverting commit 1d5f066 (KVM: x86: Fix a possible backwards warp of kvmclock, 2010-08-19). Your commit message basically says that the guest-side 489fb49 (x86, paravirt: Add a global synchronization point for pvclock, 2010-05-11) is the real solution to the problem that the host-side commit 1d5f066 was trying to fix. Right? This patch makes vcpu->hv_clock.tsc_timestamp write only. Please provide a follow up that drops the field entirely, then I'll apply both. In the meanwhile: Reviewed-by: Paolo Bonzini <pbonzini@xxxxxxxxxx> Thanks, Paolo > Signed-off-by: Marcelo Tosatti <mtosatti@xxxxxxxxxx> > > Index: linux-2.6.git/arch/x86/kvm/x86.c > =================================================================== > --- linux-2.6.git.orig/arch/x86/kvm/x86.c > +++ linux-2.6.git/arch/x86/kvm/x86.c > @@ -1484,7 +1484,7 @@ static int kvm_guest_time_update(struct > unsigned long flags, this_tsc_khz; > struct kvm_vcpu_arch *vcpu = &v->arch; > struct kvm_arch *ka = &v->kvm->arch; > - s64 kernel_ns, max_kernel_ns; > + s64 kernel_ns; > u64 tsc_timestamp, host_tsc; > struct pvclock_vcpu_time_info guest_hv_clock; > u8 pvclock_flags; > @@ -1543,37 +1543,6 @@ static int kvm_guest_time_update(struct > if (!vcpu->pv_time_enabled) > return 0; > > - /* > - * Time as measured by the TSC may go backwards when resetting the base > - * tsc_timestamp. The reason for this is that the TSC resolution is > - * higher than the resolution of the other clock scales. Thus, many > - * possible measurments of the TSC correspond to one measurement of any > - * other clock, and so a spread of values is possible. This is not a > - * problem for the computation of the nanosecond clock; with TSC rates > - * around 1GHZ, there can only be a few cycles which correspond to one > - * nanosecond value, and any path through this code will inevitably > - * take longer than that. However, with the kernel_ns value itself, > - * the precision may be much lower, down to HZ granularity. If the > - * first sampling of TSC against kernel_ns ends in the low part of the > - * range, and the second in the high end of the range, we can get: > - * > - * (TSC - offset_low) * S + kns_old > (TSC - offset_high) * S + kns_new > - * > - * As the sampling errors potentially range in the thousands of cycles, > - * it is possible such a time value has already been observed by the > - * guest. To protect against this, we must compute the system time as > - * observed by the guest and ensure the new system time is greater. > - */ > - max_kernel_ns = 0; > - if (vcpu->hv_clock.tsc_timestamp) { > - max_kernel_ns = vcpu->last_guest_tsc - > - vcpu->hv_clock.tsc_timestamp; > - max_kernel_ns = pvclock_scale_delta(max_kernel_ns, > - vcpu->hv_clock.tsc_to_system_mul, > - vcpu->hv_clock.tsc_shift); > - max_kernel_ns += vcpu->last_kernel_ns; > - } > - > if (unlikely(vcpu->hw_tsc_khz != this_tsc_khz)) { > kvm_get_time_scale(NSEC_PER_SEC / 1000, this_tsc_khz, > &vcpu->hv_clock.tsc_shift, > @@ -1581,14 +1550,6 @@ static int kvm_guest_time_update(struct > vcpu->hw_tsc_khz = this_tsc_khz; > } > > - /* with a master <monotonic time, tsc value> tuple, > - * pvclock clock reads always increase at the (scaled) rate > - * of guest TSC - no need to deal with sampling errors. > - */ > - if (!use_master_clock) { > - if (max_kernel_ns > kernel_ns) > - kernel_ns = max_kernel_ns; > - } > /* With all the info we got, fill in the values */ > vcpu->hv_clock.tsc_timestamp = tsc_timestamp; > vcpu->hv_clock.system_time = kernel_ns + v->kvm->arch.kvmclock_offset; > -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html