On 13 September 2023 16:08:22 CEST, David Woodhouse <dwmw2@xxxxxxxxxxxxx> wrote:
>From: David Woodhouse <dwmw@xxxxxxxxxxxx>
>
>The documentation on TSC migration using KVM_VCPU_TSC_OFFSET is woefully
>inadequate. It ignores TSC scaling, and ignores the fact that the host
>TSC may differ from one host to the next (and in fact because of the way
>the kernel calibrates it, it generally differs from one boot to the next
>even on the same hardware).
>
>Add KVM_VCPU_TSC_SCALE to extract the actual scale ratio and frac_bits,
>and attempt to document the *awful* process that we're requiring userspace
>to follow to merely preserve the TSC across migration.
>
>I may have thrown up in my mouth a little when writing that documentation.
>It's an awful API. If we do this, we should be ashamed of ourselves.
>(I also haven't tested the documented process yet).

Ah, I think I missed a step in that documentation of the existing
requirements.

Because KVM_GET_CLOCK (with KVM_CLOCK_REALTIME) reports the KVM clock's
relationship to `CLOCK_REALTIME`, you end up with bugs if you migrate
during a leap second. It should actually use `CLOCK_TAI`, shouldn't it?

Is that even fixable by userspace? There are literally two different
points in time at which the kernel could return the same value of
`CLOCK_REALTIME`, aren't there? I think the only way we can document that
for userspace is to say that if the real-time clock value you get from
KVM_GET_CLOCK is ambiguous, you should try again...
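
To make the ambiguity concrete, here is a rough sketch of the mapping around
the leap second inserted at the end of 2016-12-31 UTC. It assumes the kernel
handles the insertion by stepping `CLOCK_REALTIME` back one second at
midnight while `CLOCK_TAI` keeps counting. The helper name and the constants
are only for illustration; nothing here is part of the KVM API.

```c
#include <stdint.h>

/* Illustrative constants for the leap second at the end of 2016-12-31 UTC. */
#define LEAP_UNIX_MIDNIGHT 1483228800LL /* 2017-01-01 00:00:00 UTC, Unix time */
#define TAI_OFFSET_BEFORE  36LL         /* TAI - UTC before the leap second */
#define TAI_OFFSET_AFTER   37LL         /* TAI - UTC after the leap second */

/*
 * Hypothetical model: map a CLOCK_TAI second to the CLOCK_REALTIME second
 * the kernel would report, assuming REALTIME is stepped back by one second
 * at the moment it would otherwise have reached midnight.
 */
static int64_t realtime_from_tai(int64_t tai_sec)
{
	if (tai_sec < LEAP_UNIX_MIDNIGHT + TAI_OFFSET_BEFORE)
		return tai_sec - TAI_OFFSET_BEFORE;
	return tai_sec - TAI_OFFSET_AFTER;
}
```

Under that model, the TAI seconds 1483228835 and 1483228836 both map to
`CLOCK_REALTIME` second 1483228799, so a realtime value sampled in that
window cannot be mapped back to a single instant, whereas the TAI value can.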