Re: Add vfio-platform support for ONESHOT irq forwarding?

Auger Eric <eric.auger@xxxxxxxxxx> · Tue, 26 Jan 2021 09:48:17 +0100

Hi Micah,

On 1/25/21 7:32 PM, Micah Morton wrote:
> On Mon, Jan 25, 2021 at 12:31 PM Auger Eric <eric.auger@xxxxxxxxxx> wrote:
>>
>> Hi Micah,
>>
>> On 1/25/21 4:46 PM, Micah Morton wrote:
>>> Hi Eric,
>>>
>>> I was recently looking into some vfio-platform passthrough stuff and
>>> came across a device I wanted to assign to a guest that uses a ONESHOT
>>> type interrupt (these type of interrupts seem to be quite common, on
>>> ARM at least). The semantics for ONESHOT interrupts are a bit
>>> different from regular level triggered interrupts as I'll describe
>>> here:
>>>
>>> The normal generic code flow for level-triggered interrupts is as follows:
>>>
>>> - regular type[1]: mask[2] the irq, then run the handler, then
>>> unmask[3] the irq and done
>>
>> VFIO level sensitive interrupts are "automasked". See slide 10 of
>> https://www.linux-kvm.org/images/a/a8/01x04-ARMdevice.pdf if this can help.
> 
> When you say "automasked" / "mask and deactivate" are you referring to
> the disable_irq_nosync() call in vfio_platform
> (https://elixir.bootlin.com/linux/v5.10.7/source/drivers/vfio/platform/vfio_platform_irq.c#L154)?
Yes that's what I meant
> 
>>
>> When the guest deactivates the virtual IRQ, this causes a maintenance
>> interrupt on host. This occurence causes kvm_notify_acked_irq() to be
>> called and this latter unmasks the physical IRQ.
> 
> Are you saying kvm_notify_acked_irq() causes
> vfio_platform_unmask_handler()
> (https://elixir.bootlin.com/linux/v5.10.7/source/drivers/vfio/platform/vfio_platform_irq.c#L94)
> to be called? Makes sense if so.
Yes this ends up calling irqfd_resampler_ack which toggles the virtual
irq line down and calls the unmask eventfd
> 
>>>
>>> - fasteoi type[4]: run the handler, then eoi[5] the irq and done
>>>
>>> Note: IIUC the fasteoi type doesn't do any irq masking/unmasking
>>> because that is assumed to be handled transparently by "modern forms
>>> of interrupt handlers, which handle the flow details in hardware"
>>>
>>> ONESHOT type interrupts are a special case of the fasteoi type
>>> described above. They rely on the driver registering a threaded
>>> handler for the interrupt and assume the irq line will remain masked
>>> until the threaded handler completes, at which time the line will be
>>> unmasked. TL;DR:
>>>
>>> - mask[6] the irq, run the handler, and potentially eoi[7] the irq,
>>> then unmask[8] later when the threaded handler has finished running.
>>
>> could you point me to the exact linux handler (is it
>> handle_fasteoi_irq?)  or Why do you say "potentially". Given the details
>> above, the guest EOI would unmask the IRQ at physical level. We do not
>> have any hook KVM/VFIO on the guest unmap.
> 
> Yep. handle_fasteoi_irq(). By "potentially" I just meant the eoi is
> only called for ONESHOT irqs if this check passes: !(chip->flags &
> IRQCHIP_EOI_THREADED). Not actually sure what this check is about
> since I didn't look into it.
I checked IRQCHIP_EOI_THREADED and it does not seem to be supported by
GIC irqchip. so this would mean that the EOI is not called in our case?

> 
> So far I was just talking about irq operations that happen in the
> host. Not sure I quite understand your last two sentences about guest
> EOI and guest unmap.
sorry I meant unmask instead of unmap :-( The closure of the physical
IRQ is triggered by the deactivation of the guest virtual IRQ.
Let's first assume you keep the existing "auto-masked" level sensitive
IRQ on host side (vfio platform driver), on the guest driver side you
are going to have a oneshot IRQ. What needs to be understood is when
does the deactivation happen on guest side.
> 
>>>
>>> For vfio-platform irq forwarding, there is no existing function in
>>> drivers/vfio/platform/vfio_platform_irq.c[9] that is a good candidate
>>> for registering as the threaded handler for a ONESHOT interrupt in the
>>> case we want to request the ONESHOT irq with
>>> request_threaded_irq()[10]. Moreover, we can't just register a
>>> threaded function that simply immediately returns IRQ_HANDLED (as is
>>> done in vfio_irq_handler()[11] and vfio_automasked_irq_handler()[12]),
>>> since that would cause the IRQ to be unmasked[13] immediately, before
>> sorry I know you reworked that several times for style issue but [13]
>> does not match unmask().
> 
> That's actually the link I intended, irq_finalize_oneshot() will be
> called after the threaded interrupt handler returns, and that's when
> the IRQ will be unmasked.
Hum this points to something within irq_thread_check_affinity(). Anyway.
> 
>>> the userspace/guest driver has had any chance to service the
>>> interrupt.
>>>
>>> The most obvious way I see to handle this is to add a threaded handler
>>> to vfio_platform_irq.c that waits until the userspace/guest driver has
>>> serviced the interrupt and the unmask_handler[14] has been called, at
>>> which point it returns IRQ_HANDLED so the generic IRQ code in the host
>>> can finally unmask the interrupt.
>> this is done on guest EOI at the moment.
> 
> By "this", I think you mean enable_irq() in vfio_platform_irq.c?
right
> 
>>>
>>> Does this sound like a reasonable approach and something you would be
>>> fine with adding to vfio-platform?
>> Well I think it is interesting to do a pre-study and make sure we agree
>> on the terminology, IRQ flow and problems that would need to be solved.
>> Then we need to determine if it is worth the candle, if this support
>> would speed up the vfio-platform usage (I am not sure at this point as I
>> think there are more important drags like the lack of specification/dt
>> integration for instance).
>>
>> Besides we need to make sure we are not launching into a huge effort for
>> attempting to assign a device that does not fit well into the vfio
>> platform scope (dma capable, reset, device tree).
> 
> Yes, agreed.
> 
>>
>>  If so I could get started looking
>>> at the implementation for how to sleep in the threaded handler in
>>> vfio-platform until the unmask_handler is called. The most tricky/ugly
>>> part of this is that DT has no knowledge of irq ONESHOT-ness, as it
>>> only contains info regarding active-low vs active-high and edge vs
>>> level trigger. That means that vfio-platform can't figure out that a
>>> device uses a ONESHOT irq in a similar way to how it queries[15] the
>>> trigger type, and by extension QEMU can't learn this information
>>> through the VFIO_DEVICE_GET_IRQ_INFO ioctl, but must have another way
>>> of knowing (i.e. command line option to QEMU).
>> Indeed that's not really appealing.
>>>
>>> I guess potentially another option would be to treat ONESHOT
>>> interrupts like regular level triggered interrupts from the
>>> perspective of vfio-platform, but somehow ensure the interrupt stays
>>> masked during injection to the guest, rather than just disabled.
>> I need this to be clarified actually. I am confused by the automasked
>> terminology now.
> 
> My note below tries to explain a bit but AFAICT there's some confusion
> in the VFIO code (both platform and PCI) mixing the IRQ terms
> mask/unmask with disable/enable, which I think are not quite
> interchangeable concepts. IIUC irq enable/disable is a high(er) level
> concept that means that even if the irqchip HW sees interrupts coming
> in, the kernel refrains from calling the interrupt handler for the IRQ
> -- but the kernel keeps track of interrupts that come in while an IRQ
> is disabled, in order to know (once the IRQ gets enabled again by the
> driver) that there are pending interrupts from the HW. On the other
> hand, IRQ masking is a lower level irqchip operation that truly masks
> the IRQs at the hardware level, so the kernel doesn't even know or
> care if the line is being activated/deactivated while the IRQ is
> masked.

Looking again at the genirq code, irq/chip.c __irq_disable, actual
implementation at HW level depends on the irqchip caps. GIC does not
implement the irq_enable() cb so this ends up calling mask_irq which
eventually calls gic_mask_irq and toggles the enable state for the given
IRQ.

Thanks

Eric
> 
> I don't have a great explanation for why ONESHOT interrupts are even a
> thing in the first place, maybe the HW design means you're going to
> get a bunch of garbage/meaningless ups and downs on the interrupt line
> while the interrupt is being serviced and they should never be taken
> into consideration in any way by the kernel until the IRQ is unmasked
> again?
> 
>>
>> Thanks
>>
>> Eric
>>
>>  I'm
>>> not sure whether this could cause legitimate interrupts coming from
>>> devices to be missed while the injection for an existing interrupt is
>>> underway, but maybe this is a rare enough scenario that we wouldn't
>>> care. The main issue with this approach is that handle_level_irq()[16]
>>> will try to unmask the irq out from under us after we start the
>>> injection (as it is already masked before
>>> vfio_automasked_irq_handler[17] runs anyway). Not sure if masking at
>>> the irqchip level supports nesting or not.
>>>
>>> Let me know if you think either of these are viable options for adding
>>> ONESHOT interrupt forwarding support to vfio-platform?
>>>
>>> Thanks,
>>> Micah
>>>
>>>
>>>
>>>
>>> Additional note about level triggered vs ONESHOT irq forwarding:
>>> For the regular type of level triggered interrupt described above, the
>>> vfio handler will call disable_irq_nosync()[18] before the
>>> handle_level_irq() function unmasks the irq and returns. This ensures
>>> if new interrupts come in on the line while the existing one is being
>>> handled by the guest (and the irq is therefore disabled), that the
>>> vfio_automasked_irq_handler() isn’t triggered again until the
>>> vfio_platform_unmask_handler() function has been triggered by the
>>> guest (causing the irq to be re-enabled[19]). In other words, the
>>> purpose of the irq enable/disable that already exists in vfio-platform
>>> is a higher level concept that delays handling of additional
>>> level-triggered interrupts in the host until the current one has been
>>> handled in the guest.
>>>
>>> This means that the existing level triggered interrupt forwarding
>>> logic in vfio/vfio-platform is not sufficient for handling ONESHOT
>>> interrupts (i.e. we can’t just treat a ONESHOT interrupt like a
>>> regular level triggered interrupt in the host and use the existing
>>> vfio forwarding code). The masking that needs to happen for ONESHOT
>>> interrupts is at the lower level of the irqchip mask/unmask in that
>>> the ONESHOT irq needs to remain masked (not just disabled) until the
>>> driver’s threaded handler has completed.
>>>
>>>
>>>
>>>
>>> [1] https://elixir.bootlin.com/linux/v5.10.7/source/kernel/irq/chip.c#L642
>>> [2] https://elixir.bootlin.com/linux/v5.10.7/source/kernel/irq/chip.c#L414
>>> [3] https://elixir.bootlin.com/linux/v5.10.7/source/kernel/irq/chip.c#L619
>>> [4] https://elixir.bootlin.com/linux/v5.10.7/source/kernel/irq/chip.c#L702
>>> [5] https://elixir.bootlin.com/linux/v5.10.7/source/kernel/irq/chip.c#L688
>>> [6] https://elixir.bootlin.com/linux/v5.10.7/source/kernel/irq/chip.c#L724
>>> [7] https://elixir.bootlin.com/linux/v5.10.7/source/kernel/irq/chip.c#L688
>>> [8] https://elixir.bootlin.com/linux/v5.10.7/source/kernel/irq/manage.c#L1028
>>> [9] https://elixir.bootlin.com/linux/v5.10.7/source/drivers/vfio/platform/vfio_platform_irq.c
>>> [10] https://elixir.bootlin.com/linux/v5.10.7/source/kernel/irq/manage.c#L2038
>>> [11] https://elixir.bootlin.com/linux/v5.10.7/source/drivers/vfio/platform/vfio_platform_irq.c#L167
>>> [12] https://elixir.bootlin.com/linux/v5.10.7/source/drivers/vfio/platform/vfio_platform_irq.c#L142
>>> [13] https://elixir.bootlin.com/linux/v5.10.7/source/kernel/irq/manage.c#L1028
>>> [14] https://elixir.bootlin.com/linux/v5.10.7/source/drivers/vfio/platform/vfio_platform_irq.c#L94
>>> [15] https://elixir.bootlin.com/linux/v5.10.7/source/drivers/vfio/platform/vfio_platform_irq.c#L310
>>> [16] https://elixir.bootlin.com/linux/v5.10.7/source/kernel/irq/chip.c#L642
>>> [17] https://elixir.bootlin.com/linux/v5.10.7/source/drivers/vfio/platform/vfio_platform_irq.c#L142
>>> [18] https://elixir.bootlin.com/linux/v5.10.7/source/drivers/vfio/platform/vfio_platform_irq.c#L154
>>> [19] https://elixir.bootlin.com/linux/v5.10.7/source/drivers/vfio/platform/vfio_platform_irq.c#L87
>>>
>>
>