On Fri, 2023-10-13 at 12:02 -0700, Sean Christopherson wrote: > On Fri, Oct 13, 2023, David Woodhouse wrote: > > On Fri, 2023-10-13 at 11:07 -0700, Sean Christopherson wrote: > > > I generally support the idea, but I think it needs to an opt-in from userspace. > > > Essentially a "I pinky swear to give all vCPUs the same TSC frequency, to not > > > suspend the host, and to not run software/firmware that writes IA32_TSC_ADJUST". > > > AFAICT, there are too many edge cases and assumptions about userspace for KVM to > > > safely couple kvmclock to guest TSC by default. > > > > I think IA32_TSC_ADJUST is OK, isn't it? There is a "real" TSC value > > and if vCPUs adjust themselves forward and backwards from that, it's > > just handled as a delta. > > I meant the host writing IA32_TSC_ADJUST. E.g. if a host SMM handler mucks with > TSC offsets to try and hide the time spent in the SMM handler, then the platform > owner gets to keep the pieces. Oh $DEITY yes, absolutely. > > And we solved 'give all vCPUS the same TSC frequency' by making that > > KVM-wide. > > > > Maybe suspending and resuming the host can be treated like live > > migration, where you know the host TSC is different so you have to make > > do with a delta based on CLOCK_TAI. > > > > But while I'm picking on the edge cases and suggesting that we *can* > > cope with some of them, I do agree with your suggestion that "let > > kvmclock run by itself without being clamped back to > > CLOCK_MONOTONIC_RAW" should be an opt *in* feature. > > Yeah, I'm of the mind that just because we can cope with some edge cases, doesn't > mean we should. At this point, kvmclock really should be considered deprecated > on modern hardware. I.e. needs to be supported for older VMs, but shouldn't be > advertised/used when creating entirely new VMs. > > Hence my desire to go with a low effort solution for getting kvmclock to play nice > with modern hardware. Yeah... although the kvmclock is also the *Xen* clock (and the clock on which Xen timers are based). So while I'm perfectly prepared to call those Xen guests "older VMs", I do still have to launch quite a lot of new ones the same... :) > > > > [1] Yes, I believe "back" does happen. I have test failures in my queue > > > > to look at, where guests see the "Xen" clock going backwards. > > > > > > Yeah, I assume "back" can happen based purely on the wierdness of the pvclock math.o > > > > > > What if we add a module param to disable KVM's TSC synchronization craziness > > > entirely? If we first clean up the peroidic sync mess, then it seems like it'd > > > be relatively straightforward to let kill off all of the synchronization, including > > > the synchronization of kvmclock to the host's TSC-based CLOCK_MONOTONIC_RAW. > > > > > > Not intended to be a functional patch... > > > > Will stare harder at the actual patch when it isn't Friday night. > > > > In the meantime, I do think a KVM cap that the VMM opts into is better > > than a module param? > > Hmm, yeah, I think a capability would be cleaner overall. Then KVM could return > -EINVAL instead of silently forcing synchronization if the platform conditions > aren't meant, e.g. if the TSC isn't constant or if the host timekeeping isn't > using TSC. Right. > The interaction with kvmclock_periodic_sync might be a bit awkward, but that's > easy enough to solve with a wrapper. At least that's all per-KVM already. We do also still need to deal with the mess of having a single system-wide kvm_guest_has_master_clock and different KVMs explicitly setting that to 1 or 0, don't we?
Attachment:
smime.p7s
Description: S/MIME cryptographic signature