Re: [PATCH v2] KVM: x86: reset lapic_timer.expired_tscdeadline at SET_LAPIC time

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jun 20, 2016 at 04:20:56PM +0100, Alan Jenkins wrote:
> On 20/06/16 14:05, Marcelo Tosatti wrote:
> >Alan Jenkins reports hang at
> >https://bugzilla.redhat.com/show_bug.cgi?id=1337667,
> >due to guest TSC being set far behind than
> >lapic_timer.expired_tscdeadline, when restoring VM state
> >on top of currently active VM.
> >
> >It is not possible to disable LAPIC timer advancement
> >(by setting lapic_timer.expired_tscdeadline = 0), at
> >guest TSC write
> 
> I like that it acknowledges (though only implicitly) the guest can
> trigger arbitrary lockups of host CPUs.
> 
> >because:
> >
> >* APIC write: expiration = 1000.
> >* LAPIC tsc deadline code sets timer to 1000-30.
> >* Timer fires at 970.
> >* Guest writes TSC=w.
> >
> >Guest fails to VM-entry to process signal to perform
> >"vmload" in userspace.
> >
> >Case 1: w > 970:
> >Guest entry can be performed.
> >
> >Case 2: w < 970:
> >Guest entry should not be performed because "An interrupt is generated
> >when the logical processor’s time-stamp counter equals or exceeds the
> >target value in the IA32_TSC_DEADLINE MSR."
> >
> >In case 2, hardware would not fire an interrupt.
> >
> >To fix the problem, disable timer advancement when
> >userspace sets the LAPIC state. Setting of APIC
> >resets all APIC state, including
> >any pending interrupt.
> >
> >Signed-off-by: Marcelo Tosatti <mtosatti@xxxxxxxxxx>
> >Reported-by: Alan Jenkins <alan.christopher.jenkins@xxxxxxxxx>
> 
> However I feel this doesn't admit (even implicitly) that host
> software (not necessarily root) can still hard-lockup the CPU. It
> depends on the sequence of operations, and the message doesn't show
> that sequence explicitly. I now understand what the sequence that
> _is_ in the message shows, but it's unfortunately distracting.
> 
> I.e. if you restore the LAPIC first (or omit to do so at all), then
> restore the TSC deadline MSR, then the TSC MSR.
> 
> The patch assumes that the LAPIC is restored after the MSRs so it
> can clear the incorrect value of expired_tscdeadline, right?

Yes.

> I didn't know whether this patch would work until I tested it,
> because I didn't try to nail down the exact sequence QEMU is using.

You are right about the host lockups -- sent another patch to fix that
one.

Thanks for the reports.


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux