Re: [RFC][PATCH] KVM: Introduce direct MSI message injection for in-kernel irqchips

On Mon, Oct 24, 2011 at 03:11:25PM +0200, Jan Kiszka wrote:
> On 2011-10-24 14:43, Michael S. Tsirkin wrote:
> > On Mon, Oct 24, 2011 at 02:06:08PM +0200, Jan Kiszka wrote:
> >> On 2011-10-24 13:09, Avi Kivity wrote:
> >>> On 10/24/2011 12:19 PM, Jan Kiszka wrote:
> >>>>>
> >>>>> With the new feature it may be worthwhile, but I'd like to see the whole
> >>>>> thing, with numbers attached.
> >>>>
> >>>> It's not a performance issue, it's a resource limitation issue: With the
> >>>> new API we can stop worrying about user space device models consuming
> >>>> limited IRQ routes of the KVM subsystem.
> >>>>
> >>>
> >>> Only if those devices are in the same process (or have access to the
> >>> vmfd).  Interrupt routing together with irqfd allows you to disaggregate
> >>> the device model.  Instead of providing a competing implementation with
> >>> new limitations, we need to remove the limitations of the old
> >>> implementation.
> >>
> >> That depends on where we do the cut. Currently we let the IRQ source
> >> signal an abstract edge on a pre-allocated pseudo IRQ line. But we
> >> cannot build correct MSI-X on top of the current irqfd model as we lack
> >> the level information (for PBA emulation). *)
> > 
> > 
> > I don't agree here. IMO PBA emulation would need to
> > clear pending bits on interrupt status register read.
> > So clearing pending bits could be done by ioctl from qemu
> > while setting them would be done from irqfd.
> 
> How should QEMU know if the reason for "pending" has been cleared at
> device level if the device is outside the scope of QEMU? This model only
> works for PV devices when you agree that spurious IRQs are OK.

A read of irq status clears pending in the same way it clears the
irq line for level.  I don't think this generates spurious irqs. Yes, it
only works for PV.
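
To make the PV case concrete, here is a toy user-space model of what I
have in mind. It is only a sketch: none of this is KVM or QEMU code, all
toy_* names are made up for illustration, and it assumes at most 64
vectors. The irqfd side can only set pending (it carries an edge, not a
level), and the status read is what clears it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NVEC 64

/* Hypothetical toy device state; not a KVM or QEMU structure. */
struct toy_msix {
    uint64_t mask;      /* per-vector mask bits */
    uint64_t pba;       /* pending bit array */
    uint64_t injected;  /* count of messages delivered to the guest */
};

/* Event from the irqfd side: a bare rising edge, no level information. */
static void toy_irqfd_signal(struct toy_msix *d, unsigned vec)
{
    assert(vec < TOY_NVEC);
    if (d->mask & (UINT64_C(1) << vec))
        d->pba |= UINT64_C(1) << vec;   /* masked: latch as pending */
    else
        d->injected++;                  /* unmasked: deliver the message */
}

/* Guest read of the device's irq status: the reason for "pending" is
 * gone, so the pending bit is cleared (done by ioctl in the real thing). */
static void toy_status_read(struct toy_msix *d, unsigned vec)
{
    assert(vec < TOY_NVEC);
    d->pba &= ~(UINT64_C(1) << vec);
}

/* Guest writes the per-vector mask bit; unmasking delivers a message
 * only if the vector is still pending. */
static void toy_set_mask(struct toy_msix *d, unsigned vec, bool masked)
{
    uint64_t bit = UINT64_C(1) << vec;

    assert(vec < TOY_NVEC);
    if (masked) {
        d->mask |= bit;
        return;
    }
    d->mask &= ~bit;
    if (d->pba & bit) {
        d->pba &= ~bit;
        d->injected++;
    }
}
```

In this model, mask + signal + status read + unmask delivers nothing,
which is why I don't expect spurious irqs for PV devices.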

For assigned devices, the only way I see to implement PBA
correctly is by masking the vector in the device
and looking at the actual pending bit.

> > 
> >> So we either need to
> >> extend the existing model anyway -- or push per-vector masking back to
> >> the IRQ source. In the latter case, it would be a very good chance to
> >> give up on limited pseudo GSIs with static routes and do MSI messaging
> >> from external IRQ sources to KVM directly.
> >> But all those considerations affect different APIs than what I'm
> >> proposing here. We will always need a way to inject MSIs in the context
> >> of the VM as there will always be scenarios where devices are better run
> >> in that very same context, for performance or simplicity or whatever
> >> reasons. E.g., I could imagine that one would like to execute an
> >> emulated IRQ remapper rather in the hypervisor context than
> >> "over-microkernelized" in a separate process.
> >>
> >> Jan
> >>
> >> *) Realized this while trying to generalize the proposed MSI-X MMIO
> >> acceleration for assigned devices to arbitrary device models, vhost-net,
> > 
> > I'm actually working on a qemu patch to get pba emulation working correctly.
> > I think it's doable with existing irqfd.
> 
> irqfd has no notion of level. You can only communicate a rising edge and
> then need a side channel for the state of the edge reason.

True. But we only need that for PBA read which is unused ATM.
So kvm can just send the read to userspace, have qemu query
vfio or whatever.
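
Roughly like the sketch below, assuming the PBA read exits to userspace
and the device backend can report its pending bits through some callback.
The callback type and all toy_* names are hypothetical, not an existing
vfio or QEMU interface, and the example handles a single 64-bit PBA word.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical backend hook: returns pending bits for vectors 0..63. */
typedef uint64_t (*toy_pending_fn)(void *opaque);

struct toy_pba_region {
    toy_pending_fn query;   /* would ask vfio, vhost, ... in practice */
    void *opaque;           /* backend-private data */
    unsigned nvec;          /* number of vectors, 1..64 */
};

/* Handle a guest PBA read that was forwarded to userspace: ask the
 * backend for the current state and hide bits beyond the last vector. */
static uint64_t toy_pba_read(const struct toy_pba_region *r)
{
    uint64_t valid;

    assert(r->nvec >= 1 && r->nvec <= 64);
    valid = r->nvec == 64 ? UINT64_MAX : (UINT64_C(1) << r->nvec) - 1;
    return r->query(r->opaque) & valid;
}

/* Example backend that reports whatever the caller stored. */
static uint64_t toy_fixed_backend(void *opaque)
{
    return *(uint64_t *)opaque;
}
```

The point is that the slow path only runs on PBA read, so irqfd itself
stays edge-only.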

> > 
> >> and specifically vfio.
> > 
> > Interesting. How would you clear the pseudo interrupt level?
> 
> Ideally: not at all (for MSI). If we manage the mask at device level, we
> only need to send the message if there is actually something to deliver
> to the interrupt controller and masked input events would be lost on
> real HW as well.

Not sure I understand. We certainly shouldn't send masked
interrupts to the APIC, if for no other reason than that
the message value is invalid while masked.

> That said, we still need to address the irqfd level topic for the finite
> number of legacy interrupt lines. If a line is masked at an IRQ
> controller, the device needs to keep the controller up to date wrt
> the line state, or the controller has to poll the current state on
> unmask to avoid spurious injections.
> 
> Jan

Yes, level interrupts are tricky.

> -- 
> Siemens AG, Corporate Technology, CT T DE IT 1
> Corporate Competence Center Embedded Linux