On Mon, Dec 14, 2015 at 10:07:21AM -0800, Andy Lutomirski wrote:
> On Fri, Dec 11, 2015 at 3:48 PM, Marcelo Tosatti <mtosatti@xxxxxxxxxx> wrote:
> > On Fri, Dec 11, 2015 at 01:57:23PM -0800, Andy Lutomirski wrote:
> >> On Thu, Dec 10, 2015 at 1:32 PM, Marcelo Tosatti <mtosatti@xxxxxxxxxx> wrote:
> >> > On Wed, Dec 09, 2015 at 01:10:59PM -0800, Andy Lutomirski wrote:
> >> >> I'm trying to clean up kvmclock and I can't get it to work at all. My
> >> >> host is 4.4.0-rc3-ish on a Skylake laptop that has a working TSC.
> >> >>
> >> >> If I boot an SMP (2 vcpus) guest, tracing says:
> >> >>
> >> >> qemu-system-x86-2517 [001] 102242.610654: kvm_update_master_clock:
> >> >> masterclock 0 hostclock tsc offsetmatched 0
> >> >> qemu-system-x86-2521 [000] 102242.613742: kvm_track_tsc:
> >> >> vcpu_id 0 masterclock 0 offsetmatched 0 nr_online 1 hostclock tsc
> >> >> qemu-system-x86-2522 [000] 102242.622959: kvm_track_tsc:
> >> >> vcpu_id 1 masterclock 0 offsetmatched 1 nr_online 2 hostclock tsc
> >> >> qemu-system-x86-2521 [000] 102242.645123: kvm_track_tsc:
> >> >> vcpu_id 0 masterclock 0 offsetmatched 1 nr_online 2 hostclock tsc
> >> >> qemu-system-x86-2522 [000] 102242.647291: kvm_track_tsc:
> >> >> vcpu_id 1 masterclock 0 offsetmatched 1 nr_online 2 hostclock tsc
> >> >> qemu-system-x86-2521 [000] 102242.653369: kvm_track_tsc:
> >> >> vcpu_id 0 masterclock 0 offsetmatched 1 nr_online 2 hostclock tsc
> >> >> qemu-system-x86-2522 [000] 102242.653429: kvm_track_tsc:
> >> >> vcpu_id 1 masterclock 0 offsetmatched 1 nr_online 2 hostclock tsc
> >> >> qemu-system-x86-2517 [001] 102242.653447: kvm_update_master_clock:
> >> >> masterclock 0 hostclock tsc offsetmatched 1
> >> >> qemu-system-x86-2521 [000] 102242.653657: kvm_update_master_clock:
> >> >> masterclock 0 hostclock tsc offsetmatched 1
> >> >> qemu-system-x86-2522 [002] 102242.664448: kvm_update_master_clock:
> >> >> masterclock 0 hostclock tsc offsetmatched 1
> >> >>
> >> >>
> >> >> If I boot a UP guest, tracing says:
> >> >>
> >> >> qemu-system-x86-2567 [001] 102370.447484: kvm_update_master_clock:
> >> >> masterclock 0 hostclock tsc offsetmatched 1
> >> >> qemu-system-x86-2571 [002] 102370.447688: kvm_update_master_clock:
> >> >> masterclock 0 hostclock tsc offsetmatched 1
> >> >>
> >> >> I suspect, but I haven't verified, that this is fallout from:
> >> >>
> >> >> commit 16a9602158861687c78b6de6dc6a79e6e8a9136f
> >> >> Author: Marcelo Tosatti <mtosatti@xxxxxxxxxx>
> >> >> Date: Wed May 14 12:43:24 2014 -0300
> >> >>
> >> >> KVM: x86: disable master clock if TSC is reset during suspend
> >> >>
> >> >> Updating system_time from the kernel clock once master clock
> >> >> has been enabled can result in time backwards event, in case
> >> >> kernel clock frequency is lower than TSC frequency.
> >> >>
> >> >> Disable master clock in case it is necessary to update it
> >> >> from the resume path.
> >> >>
> >> >> Signed-off-by: Marcelo Tosatti <mtosatti@xxxxxxxxxx>
> >> >> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> >> >>
> >> >>
> >> >> Can we please stop making kvmclock more complex? It's a beast right
> >> >> now, and not in a good way. It's far too tangled with the vclock
> >> >> machinery on both the host and guest sides, the pvclock stuff is not
> >> >> well thought out (even in principle in an ABI sense), and it's never
> >> >> been clear to my what problem exactly the kvmclock stuff is supposed
> >> >> to solve.
> >> >>
> >> >> I'm somewhat tempted to suggest that we delete kvmclock entirely and
> >> >> start over. A correctly functioning KVM guest using TSC (i.e.
> >> >> ignoring kvmclock entirely)
> >> >> seems to work rather more reliably and
> >> >> considerably faster than a kvmclock guest.
> >> >>
> >> >> --Andy
> >> >>
> >> >> --
> >> >> Andy Lutomirski
> >> >> AMA Capital Management, LLC
> >> >
> >> > Andy,
> >> >
> >> > I am all for solving practical problems rather than pleasing aesthetic
> >> > pleasure.
> >> >
> >> >> Updating system_time from the kernel clock once master clock
> >> >> has been enabled can result in time backwards event, in case
> >> >> kernel clock frequency is lower than TSC frequency.
> >> >>
> >> >> Disable master clock in case it is necessary to update it
> >> >> from the resume path.
> >> >
> >> >> once master clock
> >> >> has been enabled can result in time backwards event, in case
> >> >> kernel clock frequency is lower than TSC frequency.
> >> >
> >> > guest visible clock = tsc_timestamp (updated at time 0) + scaled tsc reads.
> >> >
> >> > If the effective frequency of the kernel clock is lower (for example
> >> > due to NTP correcting the TSC frequency of the system), and you resume
> >> > and update the system, the following happens:
> >> >
> >> > guest visible clock = tsc_timestamp (updated at time 0) + scaled tsc reads=LARGE VALUE.
> >
> > guest reads clock to memory at location A = scaled tsc read.
> >
> > (note TSC is counting at frequency higher than advertised by
> > processor, thats why NTP has to "slow down" the kernel clock
> > which is maintained by successive reads of the TSC).
> >
> >> > suspend/resume event.
> >> > guest visible clock = tsc_timestamp (updated at time N) + scaled tsc reads=0.
                             ^^^^^^^^^^^^^
Err, this was meant to be system_time.

> > Now the guest visible clock contains a tsc_timestamp that has been
> > corrected by NTP, over say 5 days. So the tiny NTP correction has
> > been added up to something significant.
> >
> > guest reads clock to memory at location B = reads tsc_timestamp.
> >
> > Clock value in B (NTP corrected TSC) < clock value in A (RAW TSC)
> >
> > Yes?
>
> Sure, but I still don't see why this is a problem.

Time as seen by the guest goes backwards:

clock_gettime() = 1000
followed by
clock_gettime() = 999

We can't allow that.

> Why would the
> guest compare raw TSC to NTP corrected TSC?

It's "raw TSC" because that's what KVM exports to the guest,
via the tsc_timestamp field.
It's "corrected TSC" because that's what KVM exports to the guest,
via the system_time field (because the host is using the TSC clocksource,
and the host TSC clocksource is corrected by NTP).

>
> >
> >>
> >> I'm still not seeing the issue.
> >
> > I'll add two items to the three snapshots above, hopefully will make it
> > clearer.
>
> Maybe that'll help.
>
> --Andy
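To put numbers on the snapshots above, here is a minimal Python sketch of the scenario. It is an illustration only, not KVM code. The helper names (guest_clock_ns, simulate_resume) are hypothetical, the 3 GHz advertised frequency and the 10 ppm error are assumed values, and a plain division stands in for the tsc_to_system_mul/tsc_shift scaling of the real pvclock ABI. Only the "5 days" comes from the discussion.

```python
NSEC_PER_SEC = 1_000_000_000


def guest_clock_ns(system_time_ns, tsc_timestamp, tsc_now, advertised_tsc_hz):
    """Guest-visible clock: system_time plus the scaled TSC delta.

    Simplification (assumption): the real pvclock ABI scales with
    tsc_to_system_mul/tsc_shift; a plain division stands in for it here.
    """
    delta_ticks = tsc_now - tsc_timestamp
    return system_time_ns + delta_ticks * NSEC_PER_SEC // advertised_tsc_hz


def simulate_resume(elapsed_s, advertised_tsc_hz, actual_tsc_hz):
    """Return (A, B): guest reads just before and just after the host
    refreshes system_time/tsc_timestamp from its NTP-corrected clock."""
    # Time 0: the host publishes system_time = 0 and tsc_timestamp = 0.
    # The raw TSC then ticks at its actual rate, faster than advertised.
    tsc_now = elapsed_s * actual_tsc_hz

    # Read A: old pair, raw TSC delta scaled with the advertised frequency.
    read_a = guest_clock_ns(0, 0, tsc_now, advertised_tsc_hz)

    # Resume at time N: the host rewrites the pair from the kernel clock,
    # which NTP has slowed down so that it tracks real elapsed time.
    system_time_ns = elapsed_s * NSEC_PER_SEC
    read_b = guest_clock_ns(system_time_ns, tsc_now, tsc_now, advertised_tsc_hz)
    return read_a, read_b


if __name__ == "__main__":
    # Assumed numbers: 3 GHz advertised, actually 10 ppm fast, 5 days elapsed.
    a, b = simulate_resume(5 * 24 * 3600, 3_000_000_000, 3_000_030_000)
    print(f"A = {a} ns, B = {b} ns, backwards step = {a - b} ns")
```

With these assumed numbers, read B comes out 4.32 seconds earlier than read A (10 ppm accumulated over 5 days), which is the backwards step the guest would observe across the update.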