On Thu, 2023-09-14 at 11:50 +0800, Like Xu wrote: > On 13/9/2023 10:47 pm, Sean Christopherson wrote: > > On Wed, Sep 13, 2023, Like Xu wrote: > > > I'll wait for a cooling off period to see if the maintainers need me to post v7. > > > > You should have waiting to post v5, let alone v6. Resurrecting a thread after a > > month and not waiting even 7 hours for others to respond is extremely frustrating. > > You are right. I don't seem to be keeping up with many of other issues. Sorry > for that. > Wish there were 48 hours in a day. > > Back to this issue: for commit message, I'd be more inclined to David's > understanding, The discussion that Sean and I had should probably be reflected in the commit message too. To the end of the commit log you used for v6, after the final 'To that end:…' paragraph, let's add: Note that userspace can explicitly request a *synchronization* of the TSC by writing zero. For the purpose of this patch, this counts as "setting" the TSC. If userspace then subsequently writes an explicit non-zero value which happens to be within 1 second of the previous value, it will be 'corrected'. For that case, this preserves the prior behaviour of KVM (which always applied the 1-second 'correction' regardless of user vs. kernel). > @@ -2728,27 +2729,45 @@ static void kvm_synchronize_tsc(struct kvm_vcpu *vcpu, > u64 data) > elapsed = ns - kvm->arch.last_tsc_nsec; > > if (vcpu->arch.virtual_tsc_khz) { > + /* > + * Force synchronization when creating or hotplugging a vCPU, > + * i.e. when the TSC value is '0', to help keep clocks stable. > + * If this is NOT a hotplug/creation case, skip synchronization > + * on the first write from userspace so as not to misconstrue > + * state restoration after live migration as an attempt from > + * userspace to synchronize. > + */ You cannot *misconstrue* an attempt from userspace to synchronize. If userspace writes a zero, it's a sync attempt. If it's non-zero it's a TSC value to be set. It's not very subtle :) I think the 1-second slop thing is sufficiently documented in the 'else if' clause below, so I started writing an alternative 'overall' comment to go here and found it a bit redundant. So maybe let's just drop this comment and add one back in the if (data == 0) case... > if (data == 0) { > - /* > - * detection of vcpu initialization -- need to sync > - * with other vCPUs. This particularly helps to keep > - * kvm_clock stable after CPU hotplug > - */ /* * Force synchronization when creating a vCPU, or when * userspace explicitly writes a zero value. */ > synchronizing = true; > - } else { > + } else if (kvm->arch.user_set_tsc) { > u64 tsc_exp = kvm->arch.last_tsc_write + > nsec_to_cycles(vcpu, elapsed); > u64 tsc_hz = vcpu->arch.virtual_tsc_khz * 1000LL; > /* > - * Special case: TSC write with a small delta (1 second) > - * of virtual cycle time against real time is > - * interpreted as an attempt to synchronize the CPU. > + * Here lies UAPI baggage: when a user-initiated TSC write has > + * a small delta (1 second) of virtual cycle time against the > + * previously set vCPU, we assume that they were intended to be > + * in sync and the delta was only due to the racy nature of the > + * legacy API. > + * > + * This trick falls down when restoring a guest which genuinely > + * has been running for less time than the 1 second of imprecision > + * which we allow for in the legacy API. In this case, the first > + * value written by userspace (on any vCPU) should not be subject > + * to this 'correction' to make it sync up with values that only Missing the word 'come' here too, in '…that only *come* from…', > + * from the kernel's default vCPU creation. Make the 1-second slop > + * hack only trigger if the user_set_tsc flag is already set. > + * > + * The correct answer is for the VMM not to use the legacy API. Maybe we should drop this line, as we don't actually have a sane API yet that VMMs can use instead.
Attachment:
smime.p7s
Description: S/MIME cryptographic signature