[Bug 217423] TSC synchronization issue in VM restore

https://bugzilla.kernel.org/show_bug.cgi?id=217423

--- Comment #2 from robert.hoo.linux@xxxxxxxxx ---
On 5/9/2023 10:01 PM, bugzilla-daemon@xxxxxxxxxx wrote:
> Hi
> 
> We are using a lightweight VM with the snapshot feature. The VM is saved at
> 100ms+ of guest time, and we found that restoring such a VM does not get the
> correct TSC, which makes the VM world stop for about 100ms+ after restore
> (the stop time is the same as the time at which the VM was saved).
> 
> After investigation, we found the issue is caused by TSC synchronization
> when setting MSR_IA32_TSC. On VM save, the VMM (cloud-hypervisor) records the
> TSC of each vCPU, then restores the TSC of each vCPU on VM restore (about
> 100ms+ in guest time). But in KVM, setting a TSC within 1 second is
> identified as TSC synchronization, and the TSC offset is not updated in a
> stable TSC environment. This causes the lapic to set up an hrtimer that
> expires after 100ms+,

Can you elaborate more on this hrtimer issue/code path?

> so the restored VM stays in a stopped state for about 100ms+ if no other
> event wakes the guest kernel in NO_HZ mode.
> 
> More investigation shows that TSC synchronization on MSR_IA32_TSC writes
> from the guest side was disabled in commit 0c899c25d754 ("KVM: x86: do not
> attempt TSC synchronization on guest writes"); now the host side does TSC
> synchronization when setting MSR_IA32_TSC.
> 
> I think setting MSR_IA32_TSC within 1 second from the host side should not
> be identified as TSC synchronization. As in the case above, a TSC set by the
> VMM from the host side should always be applied as the user wants.

This is a heuristic, I think; at the very beginning, the window was 5 seconds.
Perhaps nowadays we could have a deterministic approach to identify a
synchronization, e.g. by adding a new VM ioctl?
> 
> The MSR_IA32_TSC set code is complicated and has a long history, so I came
> here to get help on whether my thinking is correct. Here is my fix for the
> issue; any comments are welcome:
> 
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index ceb7c5e9cf9e..9380a88b9c1f 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2722,17 +2722,6 @@ static void kvm_synchronize_tsc(struct kvm_vcpu *vcpu, u64 data)
>                           * kvm_clock stable after CPU hotplug
>                           */
>                          synchronizing = true;
> -               } else {
> -                       u64 tsc_exp = kvm->arch.last_tsc_write +
> -                                               nsec_to_cycles(vcpu, elapsed);
> -                       u64 tsc_hz = vcpu->arch.virtual_tsc_khz * 1000LL;
> -                       /*
> -                        * Special case: TSC write with a small delta (1 second)
> -                        * of virtual cycle time against real time is
> -                        * interpreted as an attempt to synchronize the CPU.
> -                        */
> -                       synchronizing = data < tsc_exp + tsc_hz &&
> -                                       data + tsc_hz > tsc_exp;
>                  }
>          }
> 
This hunk of code is indeed historic and heuristic, but simply removing it
isn't the way to go.
Is the interval between your "save VM" and "restore VM" less than 1s?
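
For reference, the window check removed by the hunk above can be sketched as
standalone C, to show why a restore that lands close to the expected TSC is
treated as synchronization. This is only an illustration: ns_to_cycles() and
looks_like_sync() are hypothetical helpers (the kernel's nsec_to_cycles() uses
a precomputed multiplier and shift instead of a division), and the numbers in
the usage note are made up.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the kernel's nsec_to_cycles(): convert elapsed
 * nanoseconds to TSC cycles at tsc_khz. cycles = ns * kHz / 1e6.
 * The multiplication can overflow for very large inputs; that is acceptable
 * for a sketch dealing with a few seconds at a few GHz.
 */
static uint64_t ns_to_cycles(uint64_t ns, uint64_t tsc_khz)
{
    return ns * tsc_khz / 1000000ULL;
}

/*
 * Mirror of the removed condition: the written value "data" is taken as a
 * synchronization attempt when it is within one second's worth of cycles
 * (tsc_hz) of where the TSC is expected to be (tsc_exp).
 */
static bool looks_like_sync(uint64_t data, uint64_t last_tsc_write,
                            uint64_t elapsed_ns, uint64_t tsc_khz)
{
    uint64_t tsc_exp = last_tsc_write + ns_to_cycles(elapsed_ns, tsc_khz);
    uint64_t tsc_hz = tsc_khz * 1000ULL;

    return data < tsc_exp + tsc_hz && data + tsc_hz > tsc_exp;
}
```

With an assumed 2.5 GHz guest TSC, a previous write of 0 and 10 ms elapsed,
writing 375000000 (150 ms of cycles) satisfies the check, while writing
12500000000 (5 s of cycles) does not.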

An alternative, I think, is to stop setting/syncing TSC offsets by writing
MSR_IA32_TSC directly, and instead have your VMM follow the new approach
introduced by

commit 828ca89628bfcb1b8f27535025f69dd00eb55207
Author: Oliver Upton <oliver.upton@xxxxxxxxx>
Date:   Thu Sep 16 18:15:38 2021 +0000

     KVM: x86: Expose TSC offset controls to userspace

...

Documentation/virt/kvm/devices/vcpu.rst:

4.1 ATTRIBUTE: KVM_VCPU_TSC_OFFSET

:Parameters: 64-bit unsigned TSC offset

...

Specifies the guest's TSC offset relative to the host's TSC. The guest's
TSC is then derived by the following equation:

   guest_tsc = host_tsc + KVM_VCPU_TSC_OFFSET

The following describes a possible algorithm to use for this purpose
...
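
The equation quoted above can be exercised in plain C. This is a minimal
sketch of the arithmetic only, assuming the guest TSC runs at the host
frequency (no TSC scaling); tsc_offset_for() and guest_tsc_from() are
hypothetical helper names, not KVM functions. Actually applying the offset
would go through the vCPU device attribute interface described in vcpu.rst,
which needs a live VM and is not shown here.

```c
#include <stdint.h>

/*
 * Offset that makes the guest read wanted_guest_tsc at the moment the host
 * TSC reads host_tsc_now. The attribute is a 64-bit unsigned value, so the
 * subtraction deliberately wraps modulo 2^64 when the guest TSC is behind
 * the host TSC.
 */
static uint64_t tsc_offset_for(uint64_t wanted_guest_tsc, uint64_t host_tsc_now)
{
    return wanted_guest_tsc - host_tsc_now;
}

/* guest_tsc = host_tsc + KVM_VCPU_TSC_OFFSET, again modulo 2^64. */
static uint64_t guest_tsc_from(uint64_t host_tsc, uint64_t offset)
{
    return host_tsc + offset;
}
```

For example, if a vCPU was saved with a TSC of 375000000 and the host TSC
reads 1000000000000 at restore time, tsc_offset_for() yields a large wrapped
value, and guest_tsc_from() with that offset returns 375000000 again. No
write-distance heuristic is involved in this calculation.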
