On Fri, Dec 11, 2015 at 3:48 PM, Marcelo Tosatti <mtosatti@xxxxxxxxxx> wrote: > On Fri, Dec 11, 2015 at 01:57:23PM -0800, Andy Lutomirski wrote: >> On Thu, Dec 10, 2015 at 1:32 PM, Marcelo Tosatti <mtosatti@xxxxxxxxxx> wrote: >> > On Wed, Dec 09, 2015 at 01:10:59PM -0800, Andy Lutomirski wrote: >> >> I'm trying to clean up kvmclock and I can't get it to work at all. My >> >> host is 4.4.0-rc3-ish on a Skylake laptop that has a working TSC. >> >> >> >> If I boot an SMP (2 vcpus) guest, tracing says: >> >> >> >> qemu-system-x86-2517 [001] 102242.610654: kvm_update_master_clock: >> >> masterclock 0 hostclock tsc offsetmatched 0 >> >> qemu-system-x86-2521 [000] 102242.613742: kvm_track_tsc: >> >> vcpu_id 0 masterclock 0 offsetmatched 0 nr_online 1 hostclock tsc >> >> qemu-system-x86-2522 [000] 102242.622959: kvm_track_tsc: >> >> vcpu_id 1 masterclock 0 offsetmatched 1 nr_online 2 hostclock tsc >> >> qemu-system-x86-2521 [000] 102242.645123: kvm_track_tsc: >> >> vcpu_id 0 masterclock 0 offsetmatched 1 nr_online 2 hostclock tsc >> >> qemu-system-x86-2522 [000] 102242.647291: kvm_track_tsc: >> >> vcpu_id 1 masterclock 0 offsetmatched 1 nr_online 2 hostclock tsc >> >> qemu-system-x86-2521 [000] 102242.653369: kvm_track_tsc: >> >> vcpu_id 0 masterclock 0 offsetmatched 1 nr_online 2 hostclock tsc >> >> qemu-system-x86-2522 [000] 102242.653429: kvm_track_tsc: >> >> vcpu_id 1 masterclock 0 offsetmatched 1 nr_online 2 hostclock tsc >> >> qemu-system-x86-2517 [001] 102242.653447: kvm_update_master_clock: >> >> masterclock 0 hostclock tsc offsetmatched 1 >> >> qemu-system-x86-2521 [000] 102242.653657: kvm_update_master_clock: >> >> masterclock 0 hostclock tsc offsetmatched 1 >> >> qemu-system-x86-2522 [002] 102242.664448: kvm_update_master_clock: >> >> masterclock 0 hostclock tsc offsetmatched 1 >> >> >> >> >> >> If I boot a UP guest, tracing says: >> >> >> >> qemu-system-x86-2567 [001] 102370.447484: kvm_update_master_clock: >> >> masterclock 0 hostclock tsc offsetmatched 1 >> >> qemu-system-x86-2571 [002] 102370.447688: kvm_update_master_clock: >> >> masterclock 0 hostclock tsc offsetmatched 1 >> >> >> >> I suspect, but I haven't verified, that this is fallout from: >> >> >> >> commit 16a9602158861687c78b6de6dc6a79e6e8a9136f >> >> Author: Marcelo Tosatti <mtosatti@xxxxxxxxxx> >> >> Date: Wed May 14 12:43:24 2014 -0300 >> >> >> >> KVM: x86: disable master clock if TSC is reset during suspend >> >> >> >> Updating system_time from the kernel clock once master clock >> >> has been enabled can result in time backwards event, in case >> >> kernel clock frequency is lower than TSC frequency. >> >> >> >> Disable master clock in case it is necessary to update it >> >> from the resume path. >> >> >> >> Signed-off-by: Marcelo Tosatti <mtosatti@xxxxxxxxxx> >> >> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx> >> >> >> >> >> >> Can we please stop making kvmclock more complex? It's a beast right >> >> now, and not in a good way. It's far too tangled with the vclock >> >> machinery on both the host and guest sides, the pvclock stuff is not >> >> well thought out (even in principle in an ABI sense), and it's never >> >> been clear to my what problem exactly the kvmclock stuff is supposed >> >> to solve. >> >> >> >> I'm somewhat tempted to suggest that we delete kvmclock entirely and >> >> start over. A correctly functioning KVM guest using TSC (i.e. >> >> ignoring kvmclock entirely) >> >> seems to work rather more reliably and >> >> considerably faster than a kvmclock guest. >> >> >> >> --Andy >> >> >> >> -- >> >> Andy Lutomirski >> >> AMA Capital Management, LLC >> > >> > Andy, >> > >> > I am all for solving practical problems rather than pleasing aesthetic >> > pleasure. >> > >> >> Updating system_time from the kernel clock once master clock >> >> has been enabled can result in time backwards event, in case >> >> kernel clock frequency is lower than TSC frequency. >> >> >> >> Disable master clock in case it is necessary to update it >> >> from the resume path. >> > >> >> once master clock >> >> has been enabled can result in time backwards event, in case >> >> kernel clock frequency is lower than TSC frequency. >> > >> > guest visible clock = tsc_timestamp (updated at time 0) + scaled tsc reads. >> > >> > If the effective frequency of the kernel clock is lower (for example >> > due to NTP correcting the TSC frequency of the system), and you resume >> > and update the system, the following happens: >> > >> > guest visible clock = tsc_timestamp (updated at time 0) + scaled tsc reads=LARGE VALUE. > > guest reads clock to memory at location A = scaled tsc read. > > (note TSC is counting at frequency higher than advertised by > processor, thats why NTP has to "slow down" the kernel clock > which is maintained by successive reads of the TSC). > >> > suspend/resume event. >> > guest visible clock = tsc_timestamp (updated at time N) + scaled tsc reads=0. > > Now the guest visible clock contains a tsc_timestamp that has been > corrected by NTP, over say 5 days. So the tiny NTP correction has > been added up to something significant. > > guest reads clock to memory at location B = reads tsc_timestamp. > > Clock value in B (NTP corrected TSC) < clock value in A (RAW TSC) > > Yes? Sure, but I still don't see why this is a problem. Why would the guest compare raw TSC to NTP corrected TSC? > >> >> I'm still not seeing the issue. > > I'll add two items to the three snapshots above, hopefully will make it > clearer. Maybe that'll help. --Andy -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html