On Mon, 2022-03-21 at 14:10 +0100, Paolo Bonzini wrote:
> On 3/21/22 13:16, David Woodhouse wrote:
> > Hm, I still don't really understand why it's a 5-tuple and why we care
> > about the *host* TSC at all.
> >
> > Fundamentally, guest state is *guest* state. There is a 3-tuple of
> > { NTP-synchronized time of day, Guest KVM clock, Guest TSC value }
> >
> > (plus a TSC offset from vCPU0, for each other vCPU perhaps.)
> >
> > All else is redundant, and including host values is just wrong.
[...]
> > I don't understand why the actual *value* of the host TSC is something
> > that userspace needs to see.
>
> Ok, I understand now.
>
> You're right, you don't need the value of the host TSC; you only need a
> {hostTOD, guestNS, guestTSC} tuple for every vCPU recorded on the source.

Actually, isn't that still redundant? All we really need is a *single*
{ hostTOD, guestNS } for the KVM clock, and then each vCPU has its own
{ guestNS, guestTSC } tuple.

> If all the frequencies are the same, that can be "packed" as {hostTOD,
> guestNS, anyTSC} plus N offsets. The N offsets in turn can be
> KVM_VCPU_TSC_OFFSET if you use the hostTSC, or the offsets between vCPU0
> and the others if you use the vCPU0 guestTSC.
>
> I think reasoning in terms of the host TSC is nicer in general, because
> it doesn't make vCPU0 special. But apart from the aesthetics of having
> a "special" vCPU, making vCPU0 special is actually harder, because the
> TSC frequencies need not be the same for all vCPUs. I think that is a
> mistake in the KVM API, but it was done long before I was involved (and
> long before I actually understood this stuff).

If each vCPU has its own { guestNS, guestTSC } tuple that actually
works out just fine even if they have different frequencies. The
*common* case would be that they are all at the same frequency and have
the same value at the same time. But other cases can be accommodated.

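As a rough illustration of the representation argued for above (one
{ hostTOD, guestNS } pair for the KVM clock, plus a { guestNS, guestTSC }
pair per vCPU), here is a minimal Python sketch of how a destination
could recompute guest values from guest state alone. Everything in it is
hypothetical: the class and function names are made up for illustration
and are not any existing KVM ABI, the per-vCPU guest TSC frequency is
assumed to be carried alongside each tuple, and the guest KVM clock is
assumed to advance at the same rate as the host time of day.

```python
from dataclasses import dataclass

NSEC_PER_SEC = 1_000_000_000


@dataclass(frozen=True)
class ClockRef:
    """Hypothetical single {hostTOD, guestNS} pair for the KVM clock."""
    host_tod_ns: int  # NTP-synchronized time of day on the source, in ns
    guest_ns: int     # guest KVM clock reading at that same instant, in ns


@dataclass(frozen=True)
class VcpuTsc:
    """Hypothetical per-vCPU {guestNS, guestTSC} pair.

    tsc_hz is an assumption of this sketch: the guest TSC frequency is
    treated as part of the guest state saved with each vCPU.
    """
    guest_ns: int
    guest_tsc: int
    tsc_hz: int


def guest_ns_at(ref: ClockRef, host_tod_ns: int) -> int:
    """Guest KVM clock at a given time of day (assumes a 1:1 clock rate)."""
    return ref.guest_ns + (host_tod_ns - ref.host_tod_ns)


def guest_tsc_at(vcpu: VcpuTsc, guest_ns: int) -> int:
    """Guest TSC for one vCPU at a given guest KVM clock value."""
    delta_ns = guest_ns - vcpu.guest_ns
    return vcpu.guest_tsc + delta_ns * vcpu.tsc_hz // NSEC_PER_SEC


# State recorded on the source: no host TSC value appears anywhere.
ref = ClockRef(host_tod_ns=1_000_000_000, guest_ns=5_000_000_000)
vcpus = [
    VcpuTsc(guest_ns=5_000_000_000, guest_tsc=10_000_000_000, tsc_hz=2_000_000_000),
    VcpuTsc(guest_ns=5_000_000_000, guest_tsc=15_000_000_000, tsc_hz=3_000_000_000),
]

# On the destination, half a second of wall-clock time later.
now_ns = guest_ns_at(ref, host_tod_ns=1_500_000_000)
tscs = [guest_tsc_at(v, now_ns) for v in vcpus]
```

The two vCPUs deliberately run at different frequencies, to show that
nothing here depends on a "special" vCPU0 or on a shared frequency. How
the destination host then programs its own TSC offsets to produce those
guest values is left out, since that is a host implementation detail.
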
I'm not averse to *reasoning* in terms of the host TSC; I just don't
like exposing the actual numbers to userspace and forcing userspace to
access it through some new ABI just in order to translate some other
fundamental property (either the time of day, or the guest data) into
that domain. And I especially don't like considering it part of 'guest
state'.

Right now when I'm not frowning at TSC synchronisation issues, I'm
working on guest transparent live migration from actual Xen, to
KVM-pretending-to-be-Xen. That kind of insanity is only really possible
with a strict adherence to the design principle that "guest state is
guest state", without conflating it with host/implementation details :)