Re: [PATCH 00/10] KVM: Add idempotent controls for migrating system counter state

On 09/06/21 17:11, Oliver Upton wrote:
> Perhaps this will clarify the motivation for my approach: what if the
> kernel wasn't the authoritative source for wall time in a system?
> Furthermore, VMMs may wish to define their own heuristics for counter
> migration (e.g. we only allow the counter to 'jump' by X seconds
> during migration blackout). If a VMM tried to assert its whims on the
> TSC state before handing it down to the kernel, we would inadvertently
> be sampling the host counter twice again. And, anything can happen
> between the time we assert elapsed time is within SLO and KVM
> computing the TSC offset (scheduling, L0 hypervisor preemption).
>
> So, Maxim's changes would address my concerns in the general case, but
> maybe not as much in edge cases where an operator may make decisions
> about how much time can elapse while the guest hasn't had CPU time.

I think I understand. We still need a way to get a consistent (host_TSC, nanosecond) pair on the source; the TSC offset alone is not enough. This is arguably not a KVM issue, but we're still the ones who have to provide a solution, so we would need a slightly more complicated interface.

1) In the kernel:

* KVM_GET_CLOCK should also return kvmclock_ns - realtime_ns and host_TSC. It should set two flags in struct kvm_clock_data saying that the respective fields are valid.

* KVM_SET_CLOCK checks the flag for kvmclock_ns - realtime_ns. If set, it looks at the kvmclock_ns - realtime_ns field and disregards the kvmclock_ns field.
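To make the proposed flag semantics concrete, here is a minimal model in Python. It is only a sketch of the behaviour described above, not kernel code: the flag names, the `ClockData` fields and the `get_clock`/`set_clock` helpers are all hypothetical, and the real struct kvm_clock_data layout would be decided by the patch.

```python
from dataclasses import dataclass

# Hypothetical flag bits; names and values are illustrative only.
FLAG_REALTIME_DELTA_VALID = 1 << 0
FLAG_HOST_TSC_VALID = 1 << 1


@dataclass
class ClockData:
    clock_ns: int               # kvmclock_ns
    flags: int = 0
    realtime_delta_ns: int = 0  # kvmclock_ns - realtime_ns
    host_tsc: int = 0


def get_clock(kvmclock_ns, realtime_ns, host_tsc):
    """Model of the proposed KVM_GET_CLOCK.

    The three inputs stand for values sampled at the same instant, so the
    returned fields form a consistent set.
    """
    return ClockData(
        clock_ns=kvmclock_ns,
        flags=FLAG_REALTIME_DELTA_VALID | FLAG_HOST_TSC_VALID,
        realtime_delta_ns=kvmclock_ns - realtime_ns,
        host_tsc=host_tsc,
    )


def set_clock(data, realtime_now_ns):
    """Model of the proposed KVM_SET_CLOCK: return the kvmclock value to install."""
    if data.flags & FLAG_REALTIME_DELTA_VALID:
        # The delta takes precedence and clock_ns is disregarded.
        return realtime_now_ns + data.realtime_delta_ns
    return data.clock_ns
```

With the flag set, the restored kvmclock advances by however much realtime advanced between the two calls; with the flag clear, the old behaviour (use kvmclock_ns as given) is kept.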

2) On the source, userspace will:

* per-VM: invoke KVM_GET_CLOCK. Migrate kvmclock_ns - realtime_ns and kvmclock_ns. Stash host_TSC for subsequent use.

* per-vCPU: retrieve guest_TSC - host_TSC with your new ioctl. Add it to the stashed host_TSC value; migrate the resulting value (a guest TSC).
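The per-vCPU step on the source is just a 64-bit addition. A small sketch, assuming the offset is handed back as an unsigned 64-bit value that wraps around when the guest TSC is behind the host TSC (the function name is made up for illustration):

```python
MASK64 = (1 << 64) - 1


def source_guest_tsc(vcpu_tsc_offset, stashed_host_tsc):
    """Guest TSC at the instant KVM_GET_CLOCK sampled host_TSC.

    vcpu_tsc_offset is guest_TSC - host_TSC for this vCPU; the sum is
    truncated to 64 bits, like the hardware counter.
    """
    return (vcpu_tsc_offset + stashed_host_tsc) & MASK64
```

Because every vCPU uses the same stashed host_TSC, all migrated guest TSC values refer to the same instant as the migrated kvmclock_ns.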

3) On the destination:

* per-VM: Pass the migrated kvmclock_ns - realtime_ns to KVM_SET_CLOCK. Use KVM_GET_CLOCK to get a consistent pair of kvmclock_ns ("newNS" below) and host TSC ("newHostTSC"). Stash them for subsequent use, together with the migrated kvmclock_ns value ("sourceNS") that you haven't used yet.

* per-vCPU: using the data from the previous step and the sourceGuestTSC in the migration stream, compute sourceGuestTSC + (newNS - sourceNS) * freq - newHostTSC, where freq is the guest TSC frequency in ticks per nanosecond. This is the TSC offset to be passed to your new ioctl.
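A worked version of the destination formula, again only as a sketch. It assumes the frequency is available in kHz (so ticks per nanosecond is tsc_khz / 10^6), uses integer arithmetic that rounds down, and truncates the result to 64 bits; the function and parameter names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def destination_tsc_offset(source_guest_tsc, source_ns, new_ns,
                           new_host_tsc, tsc_khz):
    """sourceGuestTSC + (newNS - sourceNS) * freq - newHostTSC, modulo 2**64."""
    elapsed_ns = new_ns - source_ns
    # kHz -> ticks per ns: tsc_khz * 1000 / 1e9 == tsc_khz / 1e6
    elapsed_ticks = elapsed_ns * tsc_khz // 1_000_000
    return (source_guest_tsc + elapsed_ticks - new_host_tsc) & MASK64


# Example: 2 GHz guest TSC, 2 ms elapsed on kvmclock between source and destination.
offset = destination_tsc_offset(
    source_guest_tsc=400_000,
    source_ns=1_000_000,
    new_ns=3_000_000,
    new_host_tsc=10_000_000,
    tsc_khz=2_000_000,
)
# At the instant newHostTSC was sampled, the guest sees host TSC + offset.
guest_tsc_now = (10_000_000 + offset) & MASK64
```

In this example the guest TSC has advanced by exactly the number of ticks that correspond to the 2 ms that kvmclock advanced, so the two guest-visible clocks stay in step across the migration.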

Your approach still needs the "quirky" handling of host-initiated MSR_IA32_TSC_ADJUST writes, which write the MSR without affecting the VMCS offset. This is just a documentation issue.

Does this make sense?

Paolo



