Re: [PATCH v3 11/11] KVM: nVMX: Wake L2 from HLT when nested posted-interrupt pending

On Tue, Nov 24, 2020 at 01:55:15AM +0000, Sean Christopherson wrote:
> On Mon, Nov 23, 2020 at 04:13:49PM -0800, Oliver Upton wrote:
> > On Mon, Nov 23, 2020 at 4:10 PM Oliver Upton <oupton@xxxxxxxxxx> wrote:
> > >
> > > On Mon, Nov 23, 2020 at 2:42 PM Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
> > > >
> > > > On 23/11/20 20:22, Oliver Upton wrote:
> > > > > The pi_pending bit works rather well as it is only a hint to KVM that it
> > > > > may owe the guest a posted-interrupt completion. However, if we were to
> > > > > set the guest's nested PINV as pending in the L1 IRR, it'd be
> > > > > challenging to infer whether it should actually be injected into L1
> > > > > or instead trigger posted-interrupt processing for L2.
> > > >
> > > > Stupid question: why does it matter?  The behavior when the PINV is
> > > > delivered does not depend on the time it enters the IRR, only on the
> > > > time that it enters ISR.  If that happens while the vCPU is in L2, it
> > > > would trigger posted interrupt processing; if PINV moves to ISR while in
> > > > L1, it would be delivered normally as an interrupt.
> > > >
> > > > There are various special cases but they should fall in place.  For
> > > > example, if PINV is delivered during L1 vmentry (with IF=0), it would be
> > > > delivered at the next inject_pending_event when the VMLAUNCH/VMRESUME
> > > > vmexit is processed and interrupts are unmasked.
> > > >
> > > > The tricky case is when L0 tries to deliver the PINV to L1 as a posted
> > > > interrupt, i.e. in vmx_deliver_nested_posted_interrupt.  Then the
> > > >
> > > >                  if (!kvm_vcpu_trigger_posted_interrupt(vcpu, true))
> > > >                          kvm_vcpu_kick(vcpu);
> > > >
> > > > needs a tweak to fall back to setting the PINV in L1's IRR:
> > > >
> > > >                  if (!kvm_vcpu_trigger_posted_interrupt(vcpu, true)) {
> > > >                          /* set PINV in L1's IRR */
> > > >                          kvm_vcpu_kick(vcpu);
> > > >                  }
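
For the archives, a rough, untested sketch of what that fallback could look
like, assuming the existing kvm_lapic_set_irr() helper and the
vmx->nested.posted_intr_nv field are the right tools for the job (treat this
as pseudo-patch, not the actual fix):

		if (!kvm_vcpu_trigger_posted_interrupt(vcpu, true)) {
			/* IPI not sent; inject the PINV through L1's IRR. */
			kvm_lapic_set_irr(vmx->nested.posted_intr_nv,
					  vcpu->arch.apic);
			kvm_make_request(KVM_REQ_EVENT, vcpu);
			kvm_vcpu_kick(vcpu);
		}
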
> > >
> > > Yeah, I think that's fair. Regardless, the pi_pending bit should've
> > > only been set if the IPI was actually sent. Though I suppose
> > 
> > Didn't finish my thought :-/
> > 
> > Though I suppose pi_pending was set unconditionally (and skipped the
> > IRR) since, until recently, KVM completely bungled handling of a PINV
> > left pending in the L1 IRR.
> > 
> > >
> > > > but you also have to do the same *in the PINV handler*
> > > > sysvec_kvm_posted_intr_nested_ipi, to handle the case where the
> > > > L2->L0 vmexit races against sending the IPI.
> > >
> > > Indeed, there is a race, but are we assured that the target vCPU thread
> > > is scheduled on the target CPU when that IPI arrives?
> > >
> > > I believe there is a 1-to-many relationship here, which is why I said
> > > each CPU would need to maintain a linked list of possible vCPUs to
> > > iterate and find the intended recipient.
> 
> Ya, the concern is that it's theoretically possible for the PINV to arrive in L0
> after a different vCPU has been loaded (or even multiple different vCPUs).      
> E.g. if the sending pCPU is hit with an NMI after checking vcpu->mode, and the  
> NMI runs for some absurd amount of time.  If that happens, the PINV handler     
> won't know which vCPU(s) should get an IRQ injected into L1 without additional  
> tracking.  KVM would need to set something like nested.pi_pending before doing  
> kvm_vcpu_trigger_posted_interrupt(), i.e. nothing really changes, it just gets  
> more complex.
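
For reference, the existing delivery path already sets the hint before trying
to send the IPI, roughly along these lines (paraphrasing
vmx_deliver_nested_posted_interrupt(), details elided):

	/* Hint that a nested posted-interrupt completion may be owed. */
	vmx->nested.pi_pending = true;
	kvm_make_request(KVM_REQ_EVENT, vcpu);
	/* If the PINV can't be posted into the guest, fall back to a kick. */
	if (!kvm_vcpu_trigger_posted_interrupt(vcpu, true))
		kvm_vcpu_kick(vcpu);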

If we do want to do something fancy with the L0 PINV handler, I think we could
avoid a list by having the dest vCPU busy-wait before exiting guest_mode if
there are one or more PINV IPIs in the process of being sent, and then wait
again in vcpu_put() for the PINV to be handled.  The latter in particular
would rarely come into play, i.e. shouldn't impact performance.  That would
prevent having a somewhat unbounded set of vCPUs that may or may not have an
outstanding PINV.
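
Something like a per-vCPU count of in-flight PINVs could do it, e.g. (the
pinv_in_flight field is invented here purely to illustrate the idea):

	/* Sender: declare the in-flight PINV before checking vcpu->mode. */
	atomic_inc(&vmx->nested.pinv_in_flight);
	if (!kvm_vcpu_trigger_posted_interrupt(vcpu, true))
		kvm_vcpu_kick(vcpu);
	atomic_dec(&vmx->nested.pinv_in_flight);

	/*
	 * Target: spin until all senders have finished, both before leaving
	 * guest_mode and again in vcpu_put().
	 */
	while (atomic_read(&vmx->nested.pinv_in_flight))
		cpu_relax();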


