Hi Alex,

On 2/6/2024 3:19 PM, Alex Williamson wrote:
> On Tue, 6 Feb 2024 14:22:04 -0800
> Reinette Chatre <reinette.chatre@xxxxxxxxx> wrote:
>> On 2/6/2024 2:03 PM, Alex Williamson wrote:
>>> On Tue, 6 Feb 2024 13:46:37 -0800
>>> Reinette Chatre <reinette.chatre@xxxxxxxxx> wrote:
>>>> On 2/5/2024 2:35 PM, Alex Williamson wrote:
>>>>> On Thu, 1 Feb 2024 20:57:09 -0800
>>>>> Reinette Chatre <reinette.chatre@xxxxxxxxx> wrote:
>>>>
>>>> ..
>>>>
>>>>>> @@ -715,13 +724,13 @@ static int vfio_pci_set_intx_trigger(struct vfio_pci_core_device *vdev,
>>>>>>  	if (is_intx(vdev))
>>>>>>  		return vfio_irq_set_block(vdev, start, count, fds, index);
>>>>>>
>>>>>> -	ret = vfio_intx_enable(vdev);
>>>>>> +	ret = vfio_intx_enable(vdev, start, count, index);
>>>>>
>>>>> Please trace what happens when a user calls SET_IRQS to setup a trigger
>>>>> eventfd with start = 0, count = 1, followed by any other combination of
>>>>> start and count values once is_intx() is true. vfio_intx_enable()
>>>>> cannot be the only place we bounds check the user, all of the INTx
>>>>> callbacks should be an error or nop if vector != 0. Thanks,
>>>>>
>>>>
>>>> Thank you very much for catching this. I plan to add the vector
>>>> check to the device_name() and request_interrupt() callbacks. I do
>>>> not think it is necessary to add the vector check to disable() since
>>>> it does not operate on a range and from what I can tell it depends on
>>>> a successful enable() that already contains the vector check. Similar,
>>>> free_interrupt() requires a successful request_interrupt() (that will
>>>> have vector check in next version).
>>>> send_eventfd() requires a valid interrupt context that is only
>>>> possible if enable() or request_interrupt() succeeded.
>>>
>>> Sounds reasonable.
>>>
>>>> If user space creates an eventfd with start = 0 and count = 1
>>>> and then attempts to trigger the eventfd using another combination then
>>>> the changes in this series will result in a nop while the current
>>>> implementation will result in -EINVAL. Is this acceptable?
>>>
>>> I think by nop, you mean the ioctl returns success. Was the call a
>>> success? Thanks,
>>
>> Yes, I mean the ioctl returns success without taking any
>> action (nop).
>>
>> It is not obvious to me how to interpret "success" because from what I
>> understand current INTx and MSI/MSI-x are behaving differently when
>> considering this flow. If I understand correctly, INTx will return
>> an error if user space attempts to trigger an eventfd that has not
>> been set up while MSI and MSI-x will return 0.
>>
>> I can restore existing INTx behavior by adding more logic and a return
>> code to the send_eventfd() callback so that the different interrupt types
>> can maintain their existing behavior.
>
> Ah yes, I see the dilemma now. INTx always checked start/count were
> valid but MSI/X plowed through regardless, and with this series we've
> standardized the loop around the MSI/X flow.
>
> Tricky, but probably doesn't really matter. Unless we break someone.
>
> I can ignore that INTx can be masked and signaling a masked vector
> doesn't do anything, but signaling an unconfigured vector feels like an
> error condition and trying to create verbiage in the uAPI header to
> weasel out of that error and unconditionally return success makes me
> cringe.
>
> What if we did this:
>
> 	uint8_t *bools = data;
> 	...
> 	for (i = start; i < start + count; i++) {
> 		if ((flags & VFIO_IRQ_SET_DATA_NONE) ||
> 		    ((flags & VFIO_IRQ_SET_DATA_BOOL) && bools[i - start])) {
> 			ctx = vfio_irq_ctx_get(vdev, i);
> 			if (!ctx || !ctx->trigger)
> 				return -EINVAL;
> 			intr_ops[index].send_eventfd(vdev, ctx);
> 		}
> 	}
>

This looks good. Thank you very much. Will do.
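
For illustration only, here is a minimal user-space sketch of the loop
proposed above. It is not kernel code: the mock_* types and helpers, the
MOCK_* flag values and the signal counter are hypothetical stand-ins made up
so that the control flow compiles and can be exercised outside the kernel.
It only shows the behavior under discussion, namely that triggering a vector
with no context or no trigger returns -EINVAL.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel definitions; values are illustrative. */
#define MOCK_DATA_NONE   (1u << 0)
#define MOCK_DATA_BOOL   (1u << 1)
#define MOCK_NUM_VECTORS 4

struct mock_irq_ctx {
	int *trigger;	/* stands in for the eventfd context, NULL if not set up */
};

struct mock_device {
	struct mock_irq_ctx *ctx[MOCK_NUM_VECTORS];	/* NULL if not allocated */
	unsigned int signalled;				/* number of signals sent */
};

/* Return the context for vector i, or NULL if out of range or unallocated. */
static struct mock_irq_ctx *mock_irq_ctx_get(struct mock_device *dev,
					     unsigned int i)
{
	return i < MOCK_NUM_VECTORS ? dev->ctx[i] : NULL;
}

/* Stand-in for the per-type send_eventfd() callback: just count the signal. */
static void mock_send_eventfd(struct mock_device *dev,
			      struct mock_irq_ctx *ctx)
{
	(*ctx->trigger)++;
	dev->signalled++;
}

/*
 * Same shape as the quoted loop: every selected vector must have both a
 * context and a trigger, otherwise the call fails with -EINVAL.
 * Like the quoted loop, this does not guard against start + count overflow.
 */
static int mock_trigger(struct mock_device *dev, unsigned int start,
			unsigned int count, uint32_t flags,
			const uint8_t *bools)
{
	unsigned int i;

	for (i = start; i < start + count; i++) {
		if ((flags & MOCK_DATA_NONE) ||
		    ((flags & MOCK_DATA_BOOL) && bools[i - start])) {
			struct mock_irq_ctx *ctx = mock_irq_ctx_get(dev, i);

			if (!ctx || !ctx->trigger)
				return -EINVAL;
			mock_send_eventfd(dev, ctx);
		}
	}
	return 0;
}
```

With only vector 0 configured, start = 0, count = 1 returns 0, while
start = 0, count = 2 signals vector 0 and then returns -EINVAL at vector 1.
That signal-then-error ordering is a property of this sketch and would need
to be confirmed against the real flow.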
I studied the code more and have one more observation related to this portion
of the flow:

From what I can tell this change makes the INTx code more robust. If I
understand the current implementation correctly, it seems possible to enable
INTx but not have an interrupt allocated. In this case the interrupt context
(ctx) will exist but ctx->trigger will be NULL. Current
vfio_pci_set_intx_trigger()->vfio_send_intx_eventfd() only checks if
ctx is valid. It looks like it may call eventfd_signal(NULL), where the
pointer is dereferenced.

If this is correct then I think a separate fix that can easily be backported
may be needed. Something like:

diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index 237beac83809..17ec46d8ab29 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -92,7 +92,7 @@ static void vfio_send_intx_eventfd(void *opaque, void *unused)
 		struct vfio_pci_irq_ctx *ctx;
 
 		ctx = vfio_irq_ctx_get(vdev, 0);
-		if (WARN_ON_ONCE(!ctx))
+		if (WARN_ON_ONCE(!ctx || !ctx->trigger))
 			return;
 		eventfd_signal(ctx->trigger);
 	}

> And we note the behavior change for MSI/X in the commit log and if
> someone shouts that we broke them, we can make that an -errno or
> continue based on is_intx(). Sound ok? Thanks,

I'll be sure to highlight the impact on MSI/MSI-X. Please do expect this in
the final patch "vfio/pci: Remove duplicate interrupt management flow" though,
since that is where the different flows are merged.

I am not familiar with how all user space interacts with this flow and if/how
this may break things. I did look at the QEMU code and was not able to find
where it intentionally triggers MSI/MSI-X interrupts; I could only find it
for INTx.

If this does break things I would like to also consider moving the
different behavior into the interrupt type's respective send_eventfd()
callback instead of adding interrupt type specific code (like
is_intx()) into the shared flow.

Thank you.

Reinette