On Wed, Aug 5, 2020 at 2:33 PM Oliver Upton <oupton@xxxxxxxxxx> wrote:
>
> On Wed, Aug 5, 2020 at 1:46 PM Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
> >
> > On 05/08/20 18:06, Oliver Upton wrote:
> > > On Tue, Jul 28, 2020 at 11:33 AM Oliver Upton <oupton@xxxxxxxxxx> wrote:
> > >>
> > >> On Tue, Jul 21, 2020 at 8:26 PM Oliver Upton <oupton@xxxxxxxxxx> wrote:
> > >>>
> > >>> To date, VMMs have typically restored the guest's TSCs by value using
> > >>> the KVM_SET_MSRS ioctl for each vCPU. However, restoring the TSCs by
> > >>> value introduces some challenges with synchronization as the TSCs
> > >>> continue to tick throughout the restoration process. As such, KVM has
> > >>> some heuristics around TSC writes to infer whether or not the guest or
> > >>> host is attempting to synchronize the TSCs.
> > >>>
> > >>> Instead of guessing at the intentions of a VMM, it'd be better to
> > >>> provide an interface that allows for explicit synchronization of the
> > >>> guest's TSCs. To that end, this series introduces the
> > >>> KVM_{GET,SET}_TSC_OFFSET ioctls, yielding control of the TSC offset to
> > >>> userspace.
> > >>>
> > >>> v2 => v3:
> > >>> - Mark kvm_write_tsc_offset() as static (whoops)
> > >>>
> > >>> v1 => v2:
> > >>> - Added clarification to the documentation of KVM_SET_TSC_OFFSET to
> > >>>   indicate that it can be used instead of an IA32_TSC MSR restore
> > >>>   through KVM_SET_MSRS
> > >>> - Fixed KVM_SET_TSC_OFFSET to participate in the existing TSC
> > >>>   synchronization heuristics, thereby enabling the KVM masterclock when
> > >>>   all vCPUs are in phase.
> > >>>
> > >>> Oliver Upton (4):
> > >>>   kvm: x86: refactor masterclock sync heuristics out of kvm_write_tsc
> > >>>   kvm: vmx: check tsc offsetting with nested_cpu_has()
> > >>>   selftests: kvm: use a helper function for reading cpuid
> > >>>   selftests: kvm: introduce tsc_offset_test
> > >>>
> > >>> Peter Hornyack (1):
> > >>>   kvm: x86: add KVM_{GET,SET}_TSC_OFFSET ioctls
> > >>>
> > >>>  Documentation/virt/kvm/api.rst                |  31 ++
> > >>>  arch/x86/include/asm/kvm_host.h               |   1 +
> > >>>  arch/x86/kvm/vmx/vmx.c                        |   2 +-
> > >>>  arch/x86/kvm/x86.c                            | 147 ++++---
> > >>>  include/uapi/linux/kvm.h                      |   5 +
> > >>>  tools/testing/selftests/kvm/.gitignore        |   1 +
> > >>>  tools/testing/selftests/kvm/Makefile          |   1 +
> > >>>  .../testing/selftests/kvm/include/test_util.h |   3 +
> > >>>  .../selftests/kvm/include/x86_64/processor.h  |  15 +
> > >>>  .../selftests/kvm/include/x86_64/svm_util.h   |  10 +-
> > >>>  .../selftests/kvm/include/x86_64/vmx.h        |   9 +
> > >>>  tools/testing/selftests/kvm/lib/kvm_util.c    |   1 +
> > >>>  tools/testing/selftests/kvm/lib/x86_64/vmx.c  |  11 +
> > >>>  .../selftests/kvm/x86_64/tsc_offset_test.c    | 362 ++++++++++++++++++
> > >>>  14 files changed, 550 insertions(+), 49 deletions(-)
> > >>>  create mode 100644 tools/testing/selftests/kvm/x86_64/tsc_offset_test.c
> > >>>
> > >>> --
> > >>> 2.28.0.rc0.142.g3c755180ce-goog
> > >>>
> > >>
> > >> Ping :)
> > >
> > > Ping
> >
> > Hi Oliver,
> >
> > I saw these on vacation and decided I would delay them to 5.10. However
> > they are definitely on my list.
> >
>
> Hope you enjoyed vacation!
>
> > I have one possibly very stupid question just by looking at the cover
> > letter: now that you've "fixed KVM_SET_TSC_OFFSET to participate in the
> > existing TSC synchronization heuristics" what makes it still not
> > "guessing the intentions of a VMM"? (No snark intended, just quoting
> > the parts that puzzled me a bit).
>
> Great point.
>
> I'd still posit that this series disambiguates userspace
> control/synchronization of the TSCs.
> If a VMM wants the TSCs to be in
> sync, it can write identical offsets to all vCPUs
>
> That said, participation in TSC synchronization is presently necessary
> due to issues migrating a guest that was in the middle of a TSC sync.
> In doing so, we still accomplish synchronization on the other end of
> migration with a well-timed mix of host and guest writes.
>
> >
> > My immediate reaction was that we should just migrate the heuristics
> > state somehow
>
> Yeah, I completely agree. I believe this series fixes the
> userspace-facing issues and your suggestion would address the
> guest-facing issues.
>
> > but perhaps I'm missing something obvious.
>
> Not necessarily obvious, but I can think of a rather contrived example
> where the sync heuristics break down. If we're running nested and get
> migrated in the middle of a VMM setting up TSCs, it's possible that
> enough time will pass that we believe subsequent writes to not be of
> the same TSC generation.

An example that has been biting us frequently in self-tests: migrate a
VM with less than a second accumulated in its TSC. At the destination,
the TSCs are zeroed.
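
To make that concrete, here is a rough model of the failure, offered as
a sketch rather than the actual KVM code. It assumes the heuristic
treats any TSC write landing within one second's worth of cycles of the
expected value as a synchronization attempt, and then keeps the existing
offset (the one set up when the vCPU was created with a TSC of zero).
The struct, the function names, the 1 GHz frequency and the 100 ms
restore delay are all made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC  1000000000ULL
#define MODEL_TSC_HZ  1000000000ULL  /* assumed 1 GHz TSC: one cycle per ns */

/* Hypothetical model state; these are not KVM's structures. */
struct tsc_model {
	uint64_t tsc_hz;         /* guest TSC frequency, cycles per second */
	uint64_t last_write;     /* value of the most recent TSC write */
	uint64_t last_write_ns;  /* when that write happened */
	int64_t  offset;         /* guest TSC = host TSC + offset */
};

static uint64_t ns_to_cycles(const struct tsc_model *m, uint64_t ns)
{
	return ns * (m->tsc_hz / 1000) / 1000000;
}

/* The model host TSC simply counts from zero at the same frequency. */
static uint64_t host_tsc(const struct tsc_model *m, uint64_t now_ns)
{
	return ns_to_cycles(m, now_ns);
}

static uint64_t guest_tsc(const struct tsc_model *m, uint64_t now_ns)
{
	return (uint64_t)((int64_t)host_tsc(m, now_ns) + m->offset);
}

/* Returns true if the write was taken to be a synchronization attempt. */
static bool model_tsc_write(struct tsc_model *m, uint64_t now_ns, uint64_t data)
{
	assert(now_ns >= m->last_write_ns);

	uint64_t elapsed  = ns_to_cycles(m, now_ns - m->last_write_ns);
	uint64_t expected = m->last_write + elapsed;

	/* Assumed heuristic: within one second of where the TSC "should" be. */
	bool sync = data < expected + m->tsc_hz && data + m->tsc_hz > expected;

	/* A sync keeps the old offset; only a non-sync write honors the value. */
	if (!sync)
		m->offset = (int64_t)data - (int64_t)host_tsc(m, now_ns);

	m->last_write = data;
	m->last_write_ns = now_ns;
	return sync;
}

/*
 * Create a vCPU on a destination host that has been up for 1000 s (its
 * TSC starts at 0), restore saved_tsc by value 100 ms later, and return
 * what the guest reads immediately afterwards.
 */
uint64_t restore_by_value(uint64_t saved_tsc)
{
	const uint64_t create_ns  = 1000ULL * NSEC_PER_SEC;
	const uint64_t restore_ns = create_ns + 100000000ULL;
	struct tsc_model m = {
		.tsc_hz        = MODEL_TSC_HZ,
		.last_write    = 0,
		.last_write_ns = create_ns,
	};

	m.offset = -(int64_t)host_tsc(&m, create_ns);  /* creation-time write of 0 */
	model_tsc_write(&m, restore_ns, saved_tsc);
	return guest_tsc(&m, restore_ns);
}
```

In this model restore_by_value(500000000), i.e. half a second of guest
time, returns 100000000: the write is mistaken for a sync and the guest
resumes from the creation-time zero point. A guest with 60 seconds
accumulated, restore_by_value(60000000000), gets its value back intact.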