Re: [RFC PATCH] kvm: Extend irqfd to support level interrupts

Alex Williamson <alex.williamson@xxxxxxxxxx> · Mon, 18 Jun 2012 08:00:36 -0600

On Mon, 2012-06-18 at 09:00 +0300, Michael S. Tsirkin wrote:
> On Sun, Jun 17, 2012 at 04:15:44PM -0600, Alex Williamson wrote:
> > On Sun, Jun 17, 2012 at 3:38 PM, Alex Williamson
> > <alex.williamson@xxxxxxxxxx> wrote:
> > > On Sun, 2012-06-17 at 21:44 +0300, Michael S. Tsirkin wrote:
> > >> On Sat, Jun 16, 2012 at 10:34:39AM -0600, Alex Williamson wrote:
> > >> > I'm looking for opinions on this approach.  For vfio device assignment
> > >> > we minimally need a way to get EOIs from the in-kernel irqchip out to
> > >> > userspace.  Getting that out via an eventfd would allow us to bounce
> > >> > all level interrupts out to userspace, where we would de-assert the
> > >> > device interrupt in qemu and unmask the physical device.  Ideally we
> > >> > could deassert the interrupt in KVM, which allows us to send the EOI
> > >> > directly to vfio.  To do that, we need to use a new IRQ source ID so
> > >> > the guest sees the logical OR of qemu requested state and external
> > >> > device state.
> > >>
> > >> Given that you want to involve userspace anyway, why insist on irqfd
> > >> for this?  You can simply use KVM_IRQ_LINE_STATUS from qemu, no?
> > >
> > > Well, actually I'd like to have a way to bypass userspace, which the
> > > combination of an irqfd + eventfd w/ deassert does.
> 
> 
> Hmm but above you say
> 	> >> > Getting that out via an eventfd would allow us to bounce
> 	> >> > all level interrupts out to userspace, where we would de-assert the
> 	> >> > device interrupt in qemu and unmask the physical device.
> so what is the plan?

Sorry if this wasn't clear; I was attempting to contrast the minimal
required support with a more ideal scenario.  The patch here skips the
minimal support and implements the higher-performance route.
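
To make the "logical OR" idea concrete, here is a toy userspace model of
the higher-performance route.  It is only a sketch and not the patch or
any KVM code: struct irq_line, SRC_USERSPACE, SRC_ASSIGNED_DEV,
device_fires() and guest_eoi() are names invented for illustration, and
masking the physical device when it fires is an assumption about the
host side.  Each source ID owns one bit, and the guest sees the line as
high while any bit is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical source IDs, for illustration only. */
enum { SRC_USERSPACE = 0, SRC_ASSIGNED_DEV = 1, SRC_MAX = 32 };

struct irq_line {
    uint32_t asserted;   /* one bit per source ID */
    bool     dev_masked; /* assumed: host masks the device while pending */
};

/* The guest-visible level is the OR of all per-source bits. */
static bool line_level(const struct irq_line *l)
{
    return l->asserted != 0;
}

/* Set or clear one source's bit; returns the resulting line level. */
static bool line_set(struct irq_line *l, unsigned int src, bool level)
{
    assert(src < SRC_MAX);
    if (level)
        l->asserted |= UINT32_C(1) << src;
    else
        l->asserted &= ~(UINT32_C(1) << src);
    return line_level(l);
}

/* Physical device fires: mask it and assert on the device's source ID. */
static bool device_fires(struct irq_line *l)
{
    l->dev_masked = true;
    return line_set(l, SRC_ASSIGNED_DEV, true);
}

/* Guest EOI: deassert only the device's source, then unmask the device. */
static bool guest_eoi(struct irq_line *l)
{
    bool level = line_set(l, SRC_ASSIGNED_DEV, false);

    l->dev_masked = false;
    return level;
}
```

In this model, if qemu also holds the line high on SRC_USERSPACE, a
guest_eoi() clears the device's bit and unmasks the device but the line
stays asserted, which is the reason for giving the external device its
own source ID.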

> >  I'm not quite sure
> > > I understand how KVM_IRQ_LINE_STATUS would work for this.  AIUI, that
> > > effectively gives us a way to post an interrupt AND let us know whether
> > > it was masked, coalesced, or delivered.  So I'd have to poll by posting
> > > a potentially spurious interrupt and if it was spurious unmask the
> > > physical device and wait for a real interrupt?  What am I missing,
> > > because that seems barely functional?  Thanks,
> > 
> > Just to clarify, setting the interrupt from qemu isn't a problem.  We
> > can do that just like any other device.  The unique aspect is that we
> > need to know when the guest has issued an EOI so that we can unmask
> > the physical device interrupt and wait for it to fire again.  This is
> > where I don't understand how KVM_IRQ_LINE_STATUS helps us.
> > The minimal support I mention above just requires informing userspace
> > about the EOI, then we can deassert and unmask from qemu.  That means
> > we issue two more ioctls before we're enabled for the next interrupt.
> 
> Exactly.
> 
> > Rather than invent a new interface for a sub-optimal implementation,
> > fixing irqfd to support level triggered interrupts is potentially more
> > useful and I think this implementation is not specific to device
> > assignment.  BTW, what happens with vhost use of irqfd when the guest
> > runs out of MSI vectors?  Could it use this interface for that?
> > Thanks,
> > 
> > Alex
> 
> 
> Sure. OTOH this never was a real issue - if it were,
> we could teach Linux to share MSI interrupts.

Or optimize the route without changing the guest.  Thanks,

Alex
