On 7/24/23 09:35, Like Xu wrote:
From: Like Xu <likexu@xxxxxxxxxxx> Avoid using kvm->arch.cur_tsc_offset until tsc sync is actually needed. When the vcpu is created for the first time, its tsc is 0, and KVM calculates its tsc_offset according to kvm_compute_l1_tsc_offset(). The userspace will then set the first non-zero tsc expected value for the first guest vcpu, at this time there is no need to play the tsc synchronize mechanism, the KVM should continue to write tsc_offset based on previously used kvm_compute_l1_tsc_offset(). If the tsc synchronization mechanism is incorrectly applied at this point, KVM will use the rewritten offset of the kvm->arch.cur_tsc_offset (on tsc_stable machines) to write tsc_offset, which is not in line with the expected tsc of the user space. Based on the current code, the vcpu's tsc_offset is not configured as expected, resulting in significant guest service response latency, which is observed in our production environment. Applying the tsc synchronization logic after the vcpu's tsc_generation and KVM's cur_tsc_generation have completed their first match and started keeping tracked helps to fix this issue, which also does not break any existing guest tsc test cases. Reported-by: Yong He <alexyonghe@xxxxxxxxxxx> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217423 Suggested-by: Sean Christopherson <seanjc@xxxxxxxxxx> Suggested-by: Oliver Upton <oliver.upton@xxxxxxxxx> Signed-off-by: Like Xu <likexu@xxxxxxxxxxx> --- V1 -> V2 Changelog: - Test the 'kvm_vcpu_has_run(vcpu)' proposal; (Sean) - Test the kvm->arch.user_changed_tsc proposal; (Oliver) V1: https://lore.kernel.org/kvm/20230629164838.66847-1-likexu@xxxxxxxxxxx/ arch/x86/kvm/x86.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index a6b9bea62fb8..4724dacea2df 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -2721,7 +2721,7 @@ static void kvm_synchronize_tsc(struct kvm_vcpu *vcpu, u64 data) * kvm_clock stable after CPU hotplug */ synchronizing = true; - } else { + } else if (kvm->arch.nr_vcpus_matched_tsc) { u64 tsc_exp = kvm->arch.last_tsc_write + nsec_to_cycles(vcpu, elapsed); u64 tsc_hz = vcpu->arch.virtual_tsc_khz * 1000LL;
I am not sure, when will kvm->arch.nr_vcpus_matched_tsc ever be nonzero again once a new generation starts? If this patch ever causes synchronizing to be false, matched will also be false when calling __kvm_synchronize_tsc.
What was wrong with Oliver's proposed patch? Paolo