Re: [RFC PATCH 0/4] ARM: KVM: Enable the ioeventfd capability of KVM on ARM

On Tue, May 6, 2014 at 2:35 PM, Christoffer Dall
<christoffer.dall@xxxxxxxxxx> wrote:
> On Tue, May 06, 2014 at 01:11:37PM +0200, Antonios Motakis wrote:
>> On Mon, May 5, 2014 at 4:52 PM, Eric Auger <eric.auger@xxxxxxxxxx> wrote:
>> >
>> > Dear All,
>> >
>> > I will try to sketch a global view of how to assign a physical IRQ to a
>> > KVM guest using the following subsystems:
>> >
>> > - on kernel side: VFIO driver, KVM IRQFD and GSI routing
>>
>> On the kernel side there are two sides:
>>
>> 1) The VFIO eventfd support which is underway for v6, as mentioned in
>> the VFIO patches.
>> 2) KVM IRQFD support, which depends on GSI routing as we discussed here.
>>
>> For (1) it is not a different story than what VFIO already does on
>> x86. We are going to support the same API already documented in
>> include/uapi/linux/vfio.h and will be part of the next version of the
>> VFIO_PLATFORM patches.
>>
>> For (2) we would like to offer an RFC patch as soon as we have some
>> kind of agreement that what we have started is the right direction
>> and desirable, which is why we started this thread.
>>
>> > - on user side: QEMU system image featuring VFIO QEMU device.
>> >
>> > It aims at sharing knowledge and checking that the understanding of the
>> > legacy is correct (MSI routing is out of scope).
>> >
>> > GSI routing table:
>> >
>> > Each VM has its own routing table. This later aims at storing how a
>> > physical IRQ (the gsi) is connected to a guest.
>> >
>> > GSI routing entries contain the following fields (not exhaustive):
>> > - gsi (the physical IRQ)
>> > - irqchip (the virtual interrupt controller)
>> > - irqchip.pin (the interrupt controller input the gsi is routed to).
>> > - set() is the method that enables to trigger the virtual interrupt for
>> > this entry (basically depends on the irqchip, IOAPIC, PIC, GIC, ...).
>> > The complete definition can be found in include/linux/kvm_host.h.
>>
>> set() is internal to the kernel and will not be exposed to userspace
>> (see discussions on the API previously in the thread)
>>
>> >
>> > IRQFD:
>> >
>> > irqfd framework makes it possible to assign physical IRQs (the gsi
>> > above) to KVM guests. An eventfd is associated to a physical IRQ (the
>> > gsi). When the eventfd is signaled (typically by a VFIO driver ISR), the
>> > irqfd framework has the role to inject the virtual IRQ associated to
>> > this physical IRQ. This is done on kernel side. The injection is made
>> > through the virtual interrupt controller and made visible on next VM
>> > runtime window.
>> >
>> > the _irqfd struct (eventfd.c) itself stores the VM it applies to and the
>> > gsi it is linked with.
>> >
>> > The irqfd framework depends on the KVM routing table to find the
>> > remaining information needed to perform the guest injection:
>> > - the virtual interrupt controller used for injection (irqchip)
>> > - the input pin of the interrupt controller (irqchip.pin)
>> > - the set() function irqfd must call to trigger the virtual IRQ
>> >
>> > Although the _irqfd struct has a routing entry field (irq_entry), this
>> > one is not used for GSI (it is for MSI). Thus when the eventfd is
>> > signaled, the routing table is searched for the associated GSI and
>> > eventually set() is called.
>> >
>> > Note that irqfd allows you to define a second (optional) eventfd (called
>> > the resampler), which can be signaled by KVM when the virtual IRQ is
>> > completed (EOI).
>>
>> The resamplerfd part is actually very important to properly support
>> level sensitive (automasked) interrupts. For edge triggered interrupts
>> IRQFD is just a matter of efficiency, but level-sensitive interrupts
>> absolutely need an IRQFD and a RESAMPLEFD to be implemented in a sane
>> way.
>
> Are you saying you cannot support passthrough level-triggered interrupts
> without IRQFD?  Can you be more concrete?

This has to do with the way VFIO treats level sensitive interrupts.
Instead of firing the interrupt continuously, it is automatically
masked as soon as it fires. It is up to userspace to unmask it when it
wishes to receive new interrupts from the device.

Leaving the interrupt always unmasked is not desirable, since then we
would have the guest stalling due to an infinite number of interrupts
being injected.
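
To make the stall concrete, here is a toy model in plain C. It is not
VFIO or kernel code, and every name in it (level_irq, host_poll,
count_injections) is made up for illustration. It only assumes that a
level-sensitive line stays asserted until the guest services the
device, and compares delivery counts with and without automasking.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one level-sensitive line. Hypothetical names, not VFIO code. */
struct level_irq {
    bool asserted;      /* device holds the line high until it is serviced */
    bool masked;        /* set by the host handler when automasking */
    unsigned injected;  /* interrupts delivered to the guest so far */
};

/* One pass of the host handler: deliver if the line is high and unmasked. */
static void host_poll(struct level_irq *irq, bool automask)
{
    if (irq->asserted && !irq->masked) {
        irq->injected++;
        if (automask)
            irq->masked = true;   /* stay quiet until someone unmasks */
    }
}

/* Count deliveries over `polls` passes while the guest never services the device. */
static unsigned count_injections(unsigned polls, bool automask)
{
    struct level_irq irq = { .asserted = true, .masked = false, .injected = 0 };

    assert(polls > 0);
    for (unsigned i = 0; i < polls; i++)
        host_poll(&irq, automask);
    return irq.injected;
}
```

In this model count_injections(1000, true) delivers a single interrupt,
while count_injections(1000, false) delivers one per pass, which is the
flood described above.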

With a userspace VFIO driver, we can know the device semantics and
unmask accordingly, but with QEMU we want to be more generic (and we
want to take advantage of MMAPing the device regions directly to the
guest intermediate physical address space). There are two workarounds:
one is unmasking periodically with a timer, the other is unmasking
every time the guest accesses a device region. Each has its own
disadvantages. Needless to say, both are hacks.

The proper way to do this is with RESAMPLEFD support, which is an
extra capability of IRQFD. With this feature we can set an eventfd to
fire whenever the guest does an EOI on the VGIC. QEMU can pass this
eventfd to VFIO, and unmask the interrupt in a safe and device
independent fashion.
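
As a rough sketch of what that setup looks like from userspace, the
helper below fills in the struct kvm_irqfd request from <linux/kvm.h>
with the resample flag set. The helper name make_resample_irqfd and its
parameters are hypothetical; the struct, its fields and
KVM_IRQFD_FLAG_RESAMPLE come from the KVM uapi header. It assumes the
architecture's KVM actually implements IRQFD, which is the open
question in this thread.

```c
#include <assert.h>
#include <string.h>
#include <linux/kvm.h>

/*
 * Build a KVM_IRQFD request that ties `trigger_fd` to `gsi` and asks KVM
 * to signal `resample_fd` when the guest EOIs the interrupt.
 * Hypothetical helper; only the struct and the flag come from <linux/kvm.h>.
 */
static struct kvm_irqfd make_resample_irqfd(int trigger_fd, int resample_fd,
                                            unsigned gsi)
{
    struct kvm_irqfd req;

    assert(trigger_fd >= 0 && resample_fd >= 0);
    memset(&req, 0, sizeof(req));           /* padding must be zeroed */
    req.fd = (__u32)trigger_fd;             /* signaled by VFIO on IRQ */
    req.gsi = gsi;                          /* routing entry to inject */
    req.flags = KVM_IRQFD_FLAG_RESAMPLE;    /* enable the resample eventfd */
    req.resamplefd = (__u32)resample_fd;    /* signaled by KVM on guest EOI */
    return req;
}
```

The request would then be submitted with ioctl(vm_fd, KVM_IRQFD, &req)
on the VM file descriptor. On the VFIO side, the same two eventfds
would be handed over with VFIO_DEVICE_SET_IRQS, using
VFIO_IRQ_SET_ACTION_TRIGGER for the trigger eventfd and
VFIO_IRQ_SET_ACTION_UNMASK for the resample eventfd.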

As far as I know, KVM does not include a way to notify userspace of an
EOI or equivalent by the guest. The current hacks work, but I don't
consider them permanent solutions.

On top of those constraints, IRQFD is still very desirable
since it allows us to inject interrupts from VFIO to KVM by skipping
userspace completely. QEMU will just pass the right eventfds around on
setup. Both IRQFD and RESAMPLEFD implementations are pretty generic in
the KVM codebase, but we need IRQ routing to use them.
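
For readers less familiar with the primitive being passed around, here
is a minimal userspace sketch of eventfd signalling. In the real setup
the writer is the VFIO interrupt handler and the consumer is KVM's
irqfd code, both inside the kernel; here a single process plays both
ends purely for illustration, and the function name signal_and_drain is
made up.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/eventfd.h>

/*
 * Signal a fresh eventfd `n` times, then read it once.
 * Returns the counter value read, or 0 if any system call fails.
 */
static uint64_t signal_and_drain(unsigned n)
{
    uint64_t one = 1, total = 0;
    int fd = eventfd(0, EFD_NONBLOCK);

    assert(n > 0);
    if (fd < 0)
        return 0;

    /* Each write adds its 8-byte value to the kernel-side counter. */
    for (unsigned i = 0; i < n; i++) {
        if (write(fd, &one, sizeof(one)) != (ssize_t)sizeof(one)) {
            close(fd);
            return 0;
        }
    }

    /* A read returns the accumulated counter and resets it to zero. */
    if (read(fd, &total, sizeof(total)) != (ssize_t)sizeof(total))
        total = 0;

    close(fd);
    return total;
}
```

The point of the sketch is that an eventfd is just a kernel counter
with a file descriptor attached, so any two parties holding the
descriptor can signal each other without a userspace round trip.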

To be more precise, we need to expose the VGIC via the irqchip.c
interface for KVM IRQ chips, which, however, is tightly coupled with
IRQ routing. In fact, IRQ routing is implemented as a feature of that
interface - but we still need to provide some glue between it and the
default routing for our platform. The complication stems from the fact
that we have to worry about PPIs and their target CPU in addition to
plain SPIs - which other platforms don't.

Hope that makes things clearer.

>
> -Christoffer



-- 
Antonios Motakis
Virtual Open Systems
_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm



