Re: [RFC PATCH 0/4] ARM: KVM: Enable the ioeventfd capability of KVM on ARM

On Tue, May 06, 2014 at 02:57:33PM +0200, Antonios Motakis wrote:
> On Tue, May 6, 2014 at 2:35 PM, Christoffer Dall
> <christoffer.dall@xxxxxxxxxx> wrote:
> > On Tue, May 06, 2014 at 01:11:37PM +0200, Antonios Motakis wrote:
> >> On Mon, May 5, 2014 at 4:52 PM, Eric Auger <eric.auger@xxxxxxxxxx> wrote:
> >> >
> >> > Dear All,
> >> >
> >> > I will try to sketch a global view of how to assign a physical IRQ to a
> >> > KVM guest using the following subsystems:
> >> >
> >> > - on kernel side: VFIO driver, KVM IRQFD and GSI routing
> >>
> >> On the kernel side there are two parts:
> >>
> >> 1) The VFIO eventfd support which is underway for v6, as mentioned in
> >> the VFIO patches.
> >> 2) KVM IRQFD support, which depends on GSI routing as we discussed here.
> >>
> >> For (1) it is not a different story than what VFIO already does on
> >> x86. We are going to support the same API already documented in
> >> include/uapi/linux/vfio.h and will be part of the next version of the
> >> VFIO_PLATFORM patches.
> >>
> >> For (2) we would like to offer an RFC patch as soon as we have some
> >> kind of agreement that what we have started is the right direction
> >> and desirable, which is why we started this thread.
> >>
> >> > - on user side: QEMU system image featuring VFIO QEMU device.
> >> >
> >> > It aims at sharing knowledge and checking that the understanding of the
> >> > legacy is correct (MSI routing is out of scope).
> >> >
> >> > GSI routing table:
> >> >
> >> > Each VM has its own routing table. The latter stores how a
> >> > physical IRQ (the gsi) is connected to a guest.
> >> >
> >> > GSI routing entries contain the following fields (not exhaustive):
> >> > - gsi (the physical IRQ)
> >> > - irqchip (the virtual interrupt controller)
> >> > - irqchip.pin (the interrupt controller input the gsi is routed to).
> >> > - set() is the method that triggers the virtual interrupt for this
> >> > entry (it basically depends on the irqchip: IOAPIC, PIC, GIC, ...).
> >> > The complete definition can be found in include/linux/kvm_host.h.
> >>
> >> set() is internal to the kernel and will not be exposed to userspace
> >> (see discussions on the API previously in the thread)
> >>
> >> >
> >> > IRQFD:
> >> >
> >> > irqfd framework makes it possible to assign physical IRQs (the gsi
> >> > above) to KVM guests. An eventfd is associated to a physical IRQ (the
> >> > gsi). When the eventfd is signaled (typically by a VFIO driver ISR), the
> >> > irqfd framework is responsible for injecting the virtual IRQ associated with
> >> > this physical IRQ. This is done on kernel side. The injection is made
> >> > through the virtual interrupt controller and made visible on next VM
> >> > runtime window.
> >> >
> >> > the _irqfd struct (eventfd.c) itself stores the VM it applies to and the
> >> > gsi it is linked with.
> >> >
> >> > The irqfd framework depends on the KVM routing table to find the
> >> > remaining information needed to perform the guest injection:
> >> > - the virtual interrupt controller used for injection (irqchip)
> >> > - the input pin of the interrupt controller (irqchip.pin)
> >> > - the set() function irqfd must call to trigger the virtual IRQ
> >> >
> >> > Although the _irqfd struct has a routing entry field (irq_entry), this
> >> > one is not used for GSI (it is for MSI). Thus when the eventfd is
> >> > signaled, the routing table is searched for the associated GSI and
> >> > eventually set() is called.
> >> >
> >> > Note that irqfd allows you to define a second (optional) eventfd (called
> >> > the resampler), which can be signaled by KVM when the virtual IRQ is
> >> > completed (EOI).
> >>
> >> The resamplerfd part is actually very important to properly support
> >> level sensitive (automasked) interrupts. For edge triggered interrupts
> >> IRQFD is just a matter of efficiency, but level-sensitive interrupts
> >> absolutely need an IRQFD and a RESAMPLEFD to be implemented in a sane
> >> way.
> >
> > Are you saying you cannot support passthrough level-triggered interrupts
> > without IRQFD?  Can you be more concrete?
> 
> This has to do with the way VFIO treats level sensitive interrupts.
> Instead of firing the interrupt continuously, it is automatically
> masked as soon as it fires. It is up to userspace to unmask it when it
> wishes to receive new interrupts from the device.
> 
> Leaving the interrupt always unmasked is not desirable, since then we
> would have the guest stalling due to an infinite number of interrupts
> being injected.

The entire system would more or less stall, I assume, at least on UP.

> 
> With a userspace VFIO driver, we can know the device semantics and
> unmask accordingly, but with QEMU we want to be more generic (and we
> want to take advantage of MMAPing the device regions directly to the
> guest intermediate physical address space). There are two workarounds:
> one is unmasking periodically with a timer, the other is unmasking
> every time the guest accesses a device region. Each has its own
> disadvantages. Needless to say, this is a hack.

Not so sure, I think x86 actually has setups in production that do
stuff like that, and the generic solution in QEMU may not end up being
so generic after all, but yes, sure, we want to support irqfd for
performance, just want to be clear about whether we are doing things as
the result of a performance or functional requirement.

> 
> The proper way to do this is with RESAMPLEFD support, which is an
> extra capability of IRQFD. With this feature we can set an eventfd to
> fire whenever the guest does an EOI on the VGIC. QEMU can pass this
> eventfd to VFIO, and unmask the interrupt in a safe and device
> independent fashion.

I think you'll find that trapping on the EOI is going to be too costly
and we need to introduce priorities for this to make sense and set the
hw bit in the link registers.  Eric is working on this already.

> 
> As far as I know KVM does not include a way to notify userspace for an
> EOI or equivalent by the guest. The current hacks work, but I don't
> consider them permanent solutions.
> 
> Also, on top of those constraints, IRQFD is still very desirable
> since it allows us to inject interrupts from VFIO to KVM by skipping
> userspace completely. QEMU will just pass the right eventfds around on
> setup. Both IRQFD and RESAMPLEFD implementations are pretty generic in
> the KVM codebase, but we need IRQ routing to use them.
> 
> To be more precise, we need to expose the VGIC via the irqchip.c
> interface for KVM IRQ chips, which however is tightly coupled with IRQ
> routing. In fact, IRQ routing is implemented as a feature of that
> interface - but we still need to provide some glue with it and the
> default routing for our platform. The complication stems from the fact
> that we have to worry about PPIs and their target CPU in addition to
> plain SPIs - which other platforms don't.
> 
> Hope that makes things more clear.
> 
Yes, indeed, we are on the same page and I agree that we should support
IRQ routing and IRQFDs, just wanted to be clear about the rationale.

-Christoffer
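
For readers following the thread, the automask and resample flow discussed
above can be sketched as a small toy model. This is not kernel or QEMU code:
every class and method name below (ToyGuest, ToyIrqfd, ToyVfioLevelIrq) is
hypothetical, the gsi and pin numbers are made up, and the model only mirrors
the sequence described in the thread: a gsi is looked up in a routing table
to find the irqchip and pin, a level-sensitive interrupt is masked as soon as
it fires, and the guest's EOI triggers the resample path, which unmasks it.

```python
class ToyGuest:
    """Hypothetical guest side: records every injected virtual IRQ."""

    def __init__(self):
        self.injected = []  # list of (irqchip, pin) tuples


class ToyIrqfd:
    """Toy irqfd: resolves a gsi through a routing table and injects."""

    def __init__(self, routing, guest, gsi):
        self.routing = routing    # gsi -> (irqchip, pin), assumed layout
        self.guest = guest
        self.gsi = gsi
        self.resample_cb = None   # stands in for the resampler eventfd

    def signal(self):
        # Models the eventfd being signaled: look up the route, then inject.
        irqchip, pin = self.routing[self.gsi]
        self.guest.injected.append((irqchip, pin))

    def guest_eoi(self):
        # Models the guest completing the virtual IRQ (EOI).
        if self.resample_cb is not None:
            self.resample_cb()


class ToyVfioLevelIrq:
    """Toy VFIO-style level-sensitive interrupt with automasking."""

    def __init__(self, irqfd):
        self.irqfd = irqfd
        self.masked = False
        self.line_asserted = False

    def device_asserts(self):
        self.line_asserted = True
        self._deliver()

    def device_deasserts(self):
        self.line_asserted = False

    def unmask(self):
        self.masked = False
        self._deliver()  # a line that is still high fires again

    def _deliver(self):
        if self.line_asserted and not self.masked:
            self.masked = True  # automask as soon as the interrupt fires
            self.irqfd.signal()


routing = {42: ("vgic", 10)}       # made-up gsi and pin
guest = ToyGuest()
irqfd = ToyIrqfd(routing, guest, gsi=42)
irq = ToyVfioLevelIrq(irqfd)
irqfd.resample_cb = irq.unmask     # "pass the resample eventfd to VFIO"

irq.device_asserts()    # first injection, interrupt is now masked
irqfd.guest_eoi()       # EOI with the line still high: unmask, inject again
irq.device_deasserts()  # guest driver has quiesced the device
irqfd.guest_eoi()       # EOI with the line low: unmask, nothing to inject
```

In this model the number of injections is bounded by the number of guest
EOIs, which is the property the resample mechanism is meant to provide
without QEMU knowing anything about the device.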
_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm



