Hi Alex, On 2/5/2024 2:34 PM, Alex Williamson wrote: > On Thu, 1 Feb 2024 20:56:57 -0800 > Reinette Chatre <reinette.chatre@xxxxxxxxx> wrote: > >> vfio_pci_set_irqs_ioctl() is the entrypoint for interrupt management >> via the VFIO_DEVICE_SET_IRQS ioctl(). The igate mutex is obtained >> before calling vfio_pci_set_irqs_ioctl() for management of all interrupt >> types to protect against concurrent changes to the eventfds associated >> with device request notification and error interrupts. >> >> The igate mutex is not acquired consistently. The mutex is always >> (for all interrupt types) acquired from within vfio_pci_ioctl_set_irqs() >> before calling vfio_pci_set_irqs_ioctl(), but vfio_pci_set_irqs_ioctl() is >> called via vfio_pci_core_disable() without the mutex held. The latter >> is expected to be correct if the code flow can be guaranteed that >> the provided interrupt type is not a device request notification or error >> interrupt. > > The latter is correct because it's always a physical interrupt type > (INTx/MSI/MSIX), vdev->irq_type dictates this, and the interrupt code > prevents the handler from being called after the interrupt is disabled. Thank you for confirming. > It's intentional that we don't acquire igate here since we only need to > prevent a race with concurrent user access, which cannot occur in the > fd release path. The igate mutex is acquired consistently, where it's > required. Thank you. I do think it will be helpful to document some of this in the code to help newcomers distinguish the scenarios (more below). > It would be more forthcoming to describe that potential future emulated > device interrupts don't make the same guarantees, but if that's true, > why can't they? As I understand an emulated interrupt will be triggered by VFIO PCI driver as a result from, for example, a mmio write from user space. I thus expect similar locking to existing device request notification and error interrupts. I would like to focus this series on existing flows though. >> Move igate mutex acquire and release into vfio_pci_set_irqs_ioctl() >> to make the locking consistent irrespective of interrupt type. >> This is one step closer to contain the interrupt management locking >> internals within the interrupt management code so that the VFIO PCI >> core can trigger management of the eventfds associated with device >> request notification and error interrupts without needing to access >> and manipulate VFIO interrupt management locks and data. > > If all we want to do is move the mutex into vfio_pci_intr.c then we > could rename to __vfio_pci_set_irqs_ioctl() and create a wrapper around > it that grabs the mutex. The disable path could use the lockless > version and we wouldn't need to clutter the exit path unlocking the > mutex as done below. Thanks, Will do. This creates an opportunity to document the flows involving the mutex (essentially adding comments that includes your description above). Reinette