Hi Eric,

> -----Original Message-----
> From: Eric Auger <eric.auger@xxxxxxxxxx>
> Sent: Thursday, August 4, 2022 7:39 AM
> To: Liu, Rong L <rong.l.liu@xxxxxxxxx>; Dmytro Maluka
> <dmy@xxxxxxxxxxxx>; Micah Morton <mortonm@xxxxxxxxxxxx>
> Cc: Alex Williamson <alex.williamson@xxxxxxxxxx>;
> kvm@xxxxxxxxxxxxxxx; Christopherson, Sean <seanjc@xxxxxxxxxx>;
> Paolo Bonzini <pbonzini@xxxxxxxxxx>; Tomasz Nowicki
> <tn@xxxxxxxxxxxx>; Grzegorz Jaszczyk <jaz@xxxxxxxxxxxx>; Dmitry
> Torokhov <dtor@xxxxxxxxxx>
> Subject: Re: Add vfio-platform support for ONESHOT irq forwarding?
>
> Hi Liu,
>
> On 7/26/22 00:03, Liu, Rong L wrote:
> > Hi Eric,
> >
> >> -----Original Message-----
> >> From: Dmytro Maluka <dmy@xxxxxxxxxxxx>
> >> Sent: Thursday, July 7, 2022 2:16 AM
> >> To: eric.auger@xxxxxxxxxx; Micah Morton <mortonm@xxxxxxxxxxxx>
> >> Cc: Alex Williamson <alex.williamson@xxxxxxxxxx>;
> >> kvm@xxxxxxxxxxxxxxx; Christopherson, Sean <seanjc@xxxxxxxxxx>;
> >> Paolo Bonzini <pbonzini@xxxxxxxxxx>; Liu, Rong L
> >> <rong.l.liu@xxxxxxxxx>; Tomasz Nowicki <tn@xxxxxxxxxxxx>; Grzegorz
> >> Jaszczyk <jaz@xxxxxxxxxxxx>; Dmitry Torokhov <dtor@xxxxxxxxxx>
> >> Subject: Re: Add vfio-platform support for ONESHOT irq forwarding?
> >>
> >> Hi Eric,
> >>
> >> On 7/7/22 10:25 AM, Eric Auger wrote:
> >>>> Again, this doesn't seem to be true. Just as explained in my above
> >>>> reply to Alex, the guest deactivates (EOI) the vIRQ already after the
> >>>> completion of the vIRQ hardirq handler, not the vIRQ thread.
> >>>>
> >>>> So VFIO unmask handler gets called too early, before the interrupt
> >>>> gets serviced and acked in the vIRQ thread.
> >>> Fair enough, on vIRQ hardirq handler the physical IRQ gets unmasked.
> >>> This event occurs on guest EOI, which triggers the resamplefd. But what
> >>> is the state of the vIRQ? Isn't it still masked until the vIRQ thread
> >>> completes, preventing the physical IRQ from being propagated to the
> >>> guest?
> >>
> >> Even if vIRQ is still masked by the time when
> >> vfio_automasked_irq_handler() signals the eventfd (which in itself is
> >> not guaranteed, I guess), I believe KVM is buffering this event, so
> >> after the vIRQ is unmasked, this new IRQ will be injected to the guest
> >> anyway.
> >>
> >>>> It seems the obvious fix is to postpone sending irq ack notifications
> >>>> in KVM from EOI to unmask (for oneshot interrupts only). Luckily, we
> >>>> don't need to provide KVM with the info that the given interrupt is
> >>>> oneshot. KVM can just find it out from the fact that the interrupt is
> >>>> masked at the time of EOI.
> >>> you mean the vIRQ right?
> >> Right.
> >>
> >>> Before going further and we invest more time in that thread, please
> >>> could you give us additional context info and confidence
> >>> in/understanding of the stakes. This thread is from Jan 2021 and was
> >>> discontinued for a while. vfio-platform currently only is enabled on
> >>> ARM and maintained for very few devices which properly implement reset
> >>> callbacks and duly use an underlying IOMMU.
> > Do you have more questions about this issue after the following info
> > and POC from Dmytro?
> > I agree that we tried to extend the vfio infrastructure to x86 and a few
> > more devices which may not be "traditionally" supported by the current
> > vfio implementation.
> > However if we view vfio as a general infrastructure to be used for
> > pass-thru devices (this is what we intend to do, implementation may
> > vary), Oneshot interrupt is not properly handled.
>
> sorry for the delay, I was out of office and it took me some time to
> catch up.
>
> Yes the problem and context is clearer now after the last emails. I now
> understand the vEOI (inducing the VFIO pIRQ unmask) is done before the
> device interrupt line is deasserted by the threaded handler and the vIRQ
> unmask is done, causing spurious hits of the same oneshot IRQ.
>
> Thanks
>
> Eric

No problem.
Good summary of the problem :-) And thanks for confirming that you agree
oneshot interrupt handling is an issue in a virtualized environment.

Thanks,
Rong

> >
> > From this discussion when oneshot interrupt was first upstreamed:
> > https://lkml.iu.edu/hypermail/linux/kernel/0908.1/02114.html it says:
> > "... we need to keep the interrupt line masked until the threaded
> > handler has executed. ... The interrupt line is unmasked after the
> > thread handler function has been executed." Using today's vfio
> > architecture, the (physical) interrupt line is unmasked by vfio after
> > the EOI-induced vmexit, instead of after the threaded function has been
> > executed (or in x86 world, when the virtual interrupt is unmasked):
> > this totally violates how oneshot irq should be used. We had a few
> > internal discussions and we couldn't find a solution which is both
> > correct and efficient. But at least we can target a "correct" solution
> > first and that will help us resolve bugs we have on our products now.
> >
> >> Sure. We are not really using vfio-platform for the devices we are
> >> concerned with, since those are not DMA capable devices, and some of
> >> them are not really platform devices but I2C or SPI devices. Instead we
> >> are using (hopefully temporarily) Micah's module for forwarding
> >> arbitrary IRQs [1][2] which mostly reimplements the VFIO irq forwarding
> >> mechanism.
> >>
> >> Also with a few simple hacks I managed to use vfio-platform for the same
> >> thing (just as a PoC) and confirmed, unsurprisingly, that the problems
> >> with oneshot interrupts are observed with vfio-platform as well.
> >>
> >> [1]
> >> https://chromium.googlesource.com/chromiumos/third_party/kernel/+/refs/heads/chromeos-5.10-manatee/virt/lib/platirqforward.c
> >>
> >> [2]
> >> https://lkml.kernel.org/kvm/CAJ-EccPU8KpU96PM2PtroLjdNVDbvnxwKwWJr2B+RBKuXEr7Vw@mail.gmail.com/T/
> >>
> >> Thanks,
> >> Dmytro
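
For anyone skimming the thread, the ordering problem can be illustrated
with a small toy model. This is only a sketch of the event sequence as
described in the mails above, not KVM or VFIO code: the class ToyModel,
the helper run() and the flag defer_ack_while_masked are all made-up
names, and the model assumes a level-triggered line that stays asserted
until the guest's threaded handler services the device. With the flag
off, the ack notification goes out at EOI; with it on, the notification
is postponed until the vIRQ is unmasked, which is the fix proposed
earlier in the thread.

```python
from dataclasses import dataclass


@dataclass
class ToyModel:
    """Toy event model of a forwarded oneshot IRQ (hypothetical, not kernel code)."""
    defer_ack_while_masked: bool   # True models the fix proposed in the thread
    line_asserted: bool = False    # device interrupt line (level-triggered)
    pirq_masked: bool = False      # physical IRQ masked on the host side
    virq_masked: bool = False      # virtual IRQ masked by the guest (oneshot)
    virq_pending: bool = False     # event buffered until the vIRQ is unmasked
    ack_deferred: bool = False     # ack notification postponed to vIRQ unmask
    injections: int = 0            # how often the guest hardirq handler ran

    def assert_line(self):
        self.line_asserted = True
        self._try_fire_pirq()

    def _try_fire_pirq(self):
        # Host side: mask the pIRQ and signal the guest (automasked handling).
        if self.line_asserted and not self.pirq_masked:
            self.pirq_masked = True
            self.virq_pending = True
            self._try_inject()

    def _try_inject(self):
        # A pending vIRQ is delivered only while the vIRQ is unmasked.
        if self.virq_pending and not self.virq_masked:
            self.virq_pending = False
            self.injections += 1
            self._guest_hardirq()

    def _guest_hardirq(self):
        # Guest oneshot flow: mask the vIRQ, wake the thread, then EOI.
        self.virq_masked = True
        self._guest_eoi()

    def _guest_eoi(self):
        if self.defer_ack_while_masked and self.virq_masked:
            self.ack_deferred = True   # masked at EOI: treat as oneshot
        else:
            self._ack_notify()

    def _ack_notify(self):
        # Resample path: unmask the pIRQ; it re-fires if the line is still high.
        self.pirq_masked = False
        self._try_fire_pirq()

    def guest_thread_done(self):
        # Threaded handler services the device, then the vIRQ is unmasked.
        self.line_asserted = False
        self.virq_masked = False
        if self.ack_deferred:
            self.ack_deferred = False
            self._ack_notify()
        self._try_inject()


def run(defer: bool) -> int:
    """Return how many times the guest handler ran for one device interrupt."""
    m = ToyModel(defer_ack_while_masked=defer)
    m.assert_line()
    m.guest_thread_done()
    return m.injections
```

In this model run(False) counts two handler invocations for a single
device interrupt (the second one is the spurious hit), while run(True)
counts one. It says nothing about the cost or the right place for such
a change in real code; it only shows why acking at EOI is too early for
a oneshot interrupt.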