Re: KVM: x86: reset lapic_timer.expired_tscdeadline at SET_LAPIC time

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 18/06/2016, Marcelo Tosatti <mtosatti@xxxxxxxxxx> wrote:

Alan Jenkins reports hang at
https://bugzilla.redhat.com/show_bug.cgi?id=1337667.

* APIC write: expiration = 1000.
* LAPIC tsc deadline code sets timer to 1000-30.
* Timer fires at 970.
* Guest writes TSC=w.

Guest fails to VM-entry to process signal to perform
"vmload" in userspace.

Case 1: w > 970:
Guest entry can be performed.

Case 2: w < 970:
Guest entry should not be performed because "An interrupt is generated
when the logical processor’s time-stamp counter equals or exceeds the
target value in the IA32_TSC_DEADLINE MSR."

Setting of APIC test resets all APIC state, including
any pending interrupts.

Signed-off-by: Marcelo Tosatti <mtosatti@xxxxxxxxxx>

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ea306ad..89be6e9 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2991,6 +2991,7 @@ static int kvm_vcpu_ioctl_set_lapic(struct kvm_vcpu
*vcpu,
 {
 	kvm_apic_post_state_restore(vcpu, s);
 	update_cr8_intercept(vcpu);
+	vcpu->arch.apic->lapic_timer.expired_tscdeadline = 0;

 	return 0;
 }

Hi

I'm happy to check my own test if this is considered a solution. (And apologies for how my own efforts go... flailing around and not always communicating effectively...)


1) I think the commit message could be improved. Specifically it says the problem is due to TSC write, then that we need to fix LAPIC reset. It doesn't say what the connection is between the two.

(I.e. you would have to read the bug report, & then realize that it _happens_ to involve a LAPIC state restore as well, because it's triggered by reloading the entire VM state from a snapshot)

Doing so might make the intention of this commit clearer. I've been trying to work out a high quality fix, which prevents the lockups revealed by this case. I think this commit would fix my bug nicely, but it doesn't help e.g. if the LAPIC restore was omitted from the sequence.

(Both TSC and the value of TSC deadline are set by MSRs, not by writing to the LAPIC. So I don't think the LAPIC state restore is necessary to trigger the issue).


2) If this approach is considered expedient, I have two comments -

i) isn't it better to set lapic_timer.expired_tscdeadline in lapic.c, i.e. in apic_post_state_restore() ?

ii) then, wouldn't it be more correct to set it between hrtimer_cancel() and start_apic_timer()? It doesn't sound right to clear expired_tscdeadline, in case it was set immediately by start_apic_timer()

Alan
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux