RE: Add vfio-platform support for ONESHOT irq forwarding?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Eric,

> -----Original Message-----
> From: Eric Auger <eric.auger@xxxxxxxxxx>
> Sent: Thursday, August 4, 2022 7:45 AM
> To: Liu, Rong L <rong.l.liu@xxxxxxxxx>; Dmytro Maluka
> <dmy@xxxxxxxxxxxx>; Christopherson,, Sean <seanjc@xxxxxxxxxx>
> Cc: Micah Morton <mortonm@xxxxxxxxxxxx>; Alex Williamson
> <alex.williamson@xxxxxxxxxx>; kvm@xxxxxxxxxxxxxxx; Paolo Bonzini
> <pbonzini@xxxxxxxxxx>; Tomasz Nowicki <tn@xxxxxxxxxxxx>; Grzegorz
> Jaszczyk <jaz@xxxxxxxxxxxx>; Dmitry Torokhov <dtor@xxxxxxxxxx>
> Subject: Re: Add vfio-platform support for ONESHOT irq forwarding?
> 
> Hi,
> On 7/12/22 18:02, Liu, Rong L wrote:
> > Hi Sean and Dmytro,
> >
> >> -----Original Message-----
> >> From: Dmytro Maluka <dmy@xxxxxxxxxxxx>
> >> Sent: Thursday, July 7, 2022 10:39 AM
> >> To: Christopherson,, Sean <seanjc@xxxxxxxxxx>
> >> Cc: Auger Eric <eric.auger@xxxxxxxxxx>; Micah Morton
> >> <mortonm@xxxxxxxxxxxx>; Alex Williamson
> >> <alex.williamson@xxxxxxxxxx>; kvm@xxxxxxxxxxxxxxx; Paolo Bonzini
> >> <pbonzini@xxxxxxxxxx>; Liu, Rong L <rong.l.liu@xxxxxxxxx>; Tomasz
> >> Nowicki <tn@xxxxxxxxxxxx>; Grzegorz Jaszczyk <jaz@xxxxxxxxxxxx>;
> >> Dmitry Torokhov <dtor@xxxxxxxxxx>
> >> Subject: Re: Add vfio-platform support for ONESHOT irq forwarding?
> >>
> >> On 7/7/22 17:00, Sean Christopherson wrote:
> >>> On Thu, Jul 07, 2022, Dmytro Maluka wrote:
> >>>> Hi Sean,
> >>>>
> >>>> On 7/6/22 10:39 PM, Sean Christopherson wrote:
> >>>>> On Wed, Jul 06, 2022, Dmytro Maluka wrote:
> >>>>>> This is not a problem on native, since for oneshot irq we keep the
> >>>>>> interrupt masked until the thread exits, so that the EOI at the end
> >>>>>> of hardirq doesn't result in immediate re-assert. In vfio + KVM
> >>>>>> case, however, the host doesn't check that the interrupt is still
> >>>>>> masked in the guest, so
> >>>>>> vfio_platform_unmask() is called regardless.
> >>>>> Isn't not checking that an interrupt is unmasked the real bug?
> >>>>> Fudging around vfio (or whatever is doing the premature
> unmasking)
> >>>>> bugs by delaying an ack notification in KVM is a hack, no?
> >>>> Yes, not checking that an interrupt is unmasked is IMO a bug, and
> my
> >>>> patch actually adds this missing checking, only that it adds it in
> >>>> KVM, not in VFIO. :)
> >>>>
> >>>> Arguably it's not a bug that VFIO is not checking the guest interrupt
> >>>> state on its own, provided that the resample notification it receives
> >>>> is always a notification that the interrupt has been actually acked.
> >>>> That is the motivation behind postponing ack notification in KVM in
> >>>> my patch: it is to ensure that KVM "ack notifications" are always
> >>>> actual ack notifications (as the name suggests), not just "eoi
> >> notifications".
> >>> But EOI is an ACK.  It's software saying "this interrupt has been
> >> consumed".
> >>
> >> Ok, I see we've had some mutual misunderstanding of the term "ack"
> >> here.
> >> EOI is an ACK in the interrupt controller sense, while I was talking
> about
> >> an ACK in the device sense, i.e. a device-specific action, done in a
> device
> >> driver's IRQ handler, which makes the device de-assert the IRQ line (so
> >> that the IRQ will not get re-asserted after an EOI or unmask).
> >>
> >> So the problem I'm trying to fix stems from the peculiarity of
> "oneshot"
> >> interrupts: an ACK in the device sense is done in a threaded handler,
> i.e.
> >> after an ACK in the interrupt controller sense, not before it.
> >>
> >> In the meantime I've realized one more reason why my patch is wrong.
> >> kvm_notify_acked_irq() is an internal KVM thing which is supposed to
> >> notify interested parts of KVM about an ACK rather in the interrupt
> >> controller sense, i.e. about an EOI as it is doing already.
> >>
> >> VFIO, on the other hand, rather expects a notification about an ACK in
> the
> >> device sense. So it still seems a good idea to me to postpone sending
> >> notifications to VFIO until an IRQ gets unmasked, but this postponing
> >> should be done not for the entire kvm_notify_acked_irq() but only for
> >> eventfd_signal() on resamplefd in irqfd_resampler_ack().
> >>
> >> Thanks for making me think about that.
> >>
> >>>> That said, your idea of checking the guest interrupt status in VFIO
> >>>> (or whatever is listening on the resample eventfd) makes sense to
> me
> >>>> too. The problem, though, is that it's KVM that knows the guest
> >>>> interrupt status, so KVM would need to let VFIO/whatever know it
> >>>> somehow. (I'm assuming we are focusing on the case of KVM kernel
> >>>> irqchip, not userspace or split irqchip.) So do you have in mind
> >>>> adding something like "maskfd" and "unmaskfd" to KVM IRQFD
> >> interface,
> >>>> in addition to resamplefd? If so, I'm actually in favor of such an
> >>>> idea, as I think it would be also useful for other purposes, regardless
> >> of oneshot interrupts.
> >>> Unless I'm misreading things, KVM already provides a mask notifier,
> >>> irqfd just needs to be wired up to use
> >> kvm_(un)register_irq_mask_notifier().
> >>
> > Interesting...  I initially thought that kvm doesn't "trap" on ioapic's
> mmio
> > write.  However, I just traced kvm/ioapic.c and it turns out
> > ioapic_write_indirect() was called many times.   Does trapping on
> ioapic's mmio
> > write cause vmexit on edge-triggered interrupt exit?  It seems the case
> because
> > IOREGSEL and IOWIN of IOAPIC are memory mapped but not the
> redirection entry
> > register for each IRQ (that is why the name indirect_write), in order to
> unmask
> > redirection entry register on the exit of each interrupt (edge-triggered
> or
> > level-triggered), kernel needs to write to IORESEL, which means vmexit
> if kvm
> > traps on ioapic's mmio write.  However, for pass-thru device which
> uses
> > edge-triggered interrupt (handled by vfio or something similar),
> interrupt
> > (pIRQ) is enabled by vfio and it seems unnecessary to cause a vmexit
> when guest
> > updates virtual ioapic.  I think the situation is similar for level-triggered
> > interrupt.  So 2 vmexits for each level-triggered interrupt completion,
> one for
> > EOI on lapic and another for unmask of IOAPIC register.  Does this
> sound right?
> > I thought with vfio (or similar architecture), there is no vmexit
> necessary on
> > edge-triggered interrupt completion and only one vmexit for level
> triggered
> > interrupt completion, except the caveats of oneshot interrupt.  Maybe I
> > misunderstand something?
> Currently, no vmexit for edge-sensitive and 1 vmexit for level-sensitive
> is what happens on ARM shared peripheral interrupts at least.
> Note there is one setup that could remove the need for the vmexit on
> vEOI: irq_bypass mode
> (https://www.linux-kvm.org/images/a/a8/01x04-ARMdevice.pdf slide
> 12-14):
> on GIC you have a mode that allows automatic completion of the
> physical
> IRQ when the corresponding vIRQ is completed. This mode would not be
> compatible with oneshort_irq.
> At some point we worked on this enablement but given the lack of
> vfio-platform customers this work was paused so we still have the
> mask/unmask vfio dance.
> 
> Thanks
> 
> Eric
> 

Thanks for the info.  I am not familiar with ARM but it is interesting to know
the difference between 2 architectures.

> >
> >> Thanks for the tip. I'll take a look into implementing this idea.
> >>
> >> It seems you agree that fixing this issue via a change in KVM (in irqfd,
> not
> >> in ioapic) seems to be the way to go.
> >>
> >> An immediate problem I see with
> kvm_(un)register_irq_mask_notifier()
> >> is that it is currently available for x86 only. Also, mask notifiers are
> called
> >> under ioapic->lock (I'm not sure yet if that is good or bad news for us).
> >>
> >> Thanks,
> >> Dmytro





[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux