https://bugzilla.kernel.org/show_bug.cgi?id=217423

            Bug ID: 217423
           Summary: TSC synchronization issue in VM restore
           Product: Virtualization
           Version: unspecified
          Hardware: All
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P3
         Component: kvm
          Assignee: virtualization_kvm@xxxxxxxxxxxxxxxxxxxx
          Reporter: zhuangel570@xxxxxxxxx
        Regression: No

Hi,

We are using lightweight VMs with a snapshot feature. Saving the VM takes 100ms+, and we found that restoring such a VM does not give it the correct TSC, which makes the guest appear to stop for about 100ms+ after restore (the stall lasts as long as the VM save took).

After investigation, we found the issue is caused by the TSC synchronization performed when setting MSR_IA32_TSC. On VM save, the VMM (cloud-hypervisor) records the TSC of each vCPU, and on VM restore it writes that TSC back to each vCPU (about 100ms+ later in guest time). But in KVM, a TSC write within 1 second of the expected value is identified as TSC synchronization, and in a stable-TSC environment the TSC offset is then not updated. This causes the LAPIC to set up an hrtimer that expires after 100ms+, so the restored VM stays stopped for about 100ms+ if no other event wakes the guest kernel in NO_HZ mode.

Further investigation shows that TSC synchronization for MSR_IA32_TSC writes from the guest side was disabled in commit 0c899c25d754 ("KVM: x86: do not attempt TSC synchronization on guest writes"), so now only host-side writes of MSR_IA32_TSC go through TSC synchronization.

I think a host-side write of MSR_IA32_TSC within 1 second should not be identified as TSC synchronization. As in the case above, a TSC value set by the VMM from the host side should always be applied as the user requested.

The MSR_IA32_TSC write code is complicated and has a long history, so I am asking here whether my understanding is correct.
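To make the 1-second window concrete, here is a minimal user-space sketch in C, modelled on the expression in kvm_synchronize_tsc(). It is not kernel code: the function name looks_like_sync() and the plain multiply/divide used to convert nanoseconds to cycles are assumptions made for illustration (the kernel uses its own scaled nsec_to_cycles() helper), and the example values are made up.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical user-space model of the "small delta" check.
 *
 * data           - TSC value being written
 * last_tsc_write - TSC value of the previous write seen by KVM
 * elapsed_ns     - host time elapsed since that previous write
 * tsc_khz        - guest (virtual) TSC frequency in kHz
 *
 * Returns true when the write would be treated as an attempt to
 * synchronize, i.e. it is within one second of cycles of the value
 * the TSC is expected to have reached by now.
 */
static bool looks_like_sync(uint64_t data, uint64_t last_tsc_write,
			    uint64_t elapsed_ns, uint64_t tsc_khz)
{
	/* Simplified ns -> cycles conversion (assumption, see lead-in). */
	uint64_t elapsed_cycles = elapsed_ns * tsc_khz / 1000000ULL;
	uint64_t tsc_exp = last_tsc_write + elapsed_cycles;
	uint64_t tsc_hz = tsc_khz * 1000ULL;	/* cycles in one second */

	return data < tsc_exp + tsc_hz && data + tsc_hz > tsc_exp;
}
```

With an assumed 2 GHz guest TSC (tsc_khz = 2000000), writing the same value again 150 ms after the previous write, as in looks_like_sync(10000000000ULL, 10000000000ULL, 150000000ULL, 2000000ULL), falls inside the window and returns true. The same write after 2 seconds (elapsed_ns = 2000000000ULL) returns false. A restore that happens 100ms+ after the save therefore lands in the "synchronizing" case.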
Here is my fix for the issue; any comments are welcome:

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ceb7c5e9cf9e..9380a88b9c1f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2722,17 +2722,6 @@ static void kvm_synchronize_tsc(struct kvm_vcpu *vcpu, u64 data)
 			 * kvm_clock stable after CPU hotplug
 			 */
 			synchronizing = true;
-		} else {
-			u64 tsc_exp = kvm->arch.last_tsc_write +
-						nsec_to_cycles(vcpu, elapsed);
-			u64 tsc_hz = vcpu->arch.virtual_tsc_khz * 1000LL;
-			/*
-			 * Special case: TSC write with a small delta (1 second)
-			 * of virtual cycle time against real time is
-			 * interpreted as an attempt to synchronize the CPU.
-			 */
-			synchronizing = data < tsc_exp + tsc_hz &&
-					data + tsc_hz > tsc_exp;
 		}
 	}
 