Hi Alex, On 3/16/2023 2:56 PM, Alex Williamson wrote: > On Wed, 15 Mar 2023 13:59:20 -0700 Reinette Chatre > <reinette.chatre@xxxxxxxxx> wrote: > ... >> == Why is this an RFC ? == >> >> vfio support for dynamic MSI-X needs to work with existing user >> space as well as upcoming user space that takes advantage of this >> feature. I would appreciate guidance on the expectations and >> requirements surrounding error handling when considering existing >> user space. >> >> For example, consider the following scenario: Start: Consider a >> passthrough device that supports 10 MSI-X interrupts (0 to 9) and >> existing Qemu allocated interrupts 0 to 4. >> >> Scenario: Qemu (hypothetically) attempts to associate a new action >> to interrupts 0 to 7. Early checking of this range in >> vfio_set_irqs_validate_and_prepare() will pass since it is a valid >> range for the device. What happens after the early checking is >> considered next: >> >> Current behavior (before this series): Since the provided range, 0 >> - 7, exceeds the allocated range, no action will be taken on >> existing allocated interrupts 0 - 4 and the Qemu request will fail >> without making any state changes. > > I must be missing something, it was described correctly earlier that > QEMU will disable MSI-X and re-enable with a new vector count. Not > only does QEMU not really have a way to fail this change, pretty > much nothing would work if we did. Thank you very much for confirming Qemu behavior. One of my goals is to ensure that these kernel changes do not break existing user space in any way. The only area I could find where existing user space may encounter new behavior is if user space makes the (pre dynamic MSI-X) mistake of attempting to associate triggers to interrupts that have not been allocated. Thank you for confirming that this is not a valid scenario for Qemu. > What happens in this case is that the QEMU vfio-pci driver gets a > vector_use callback for one of these new vectors {5,6,7}, > vfio_msix_vector_do_use() checks that's greater than we have > enabled, disables MSI-X, and re-enables it for the new vector count. > Worst case we'll do that 3 times if the vectors are presented in > ascending order. Indeed. While testing this work I modified Qemu's vfio_msix_vector_do_use() to consider VFIO_IRQ_INFO_NORESIZE. I saw the vector_use callback arriving for each new vector and having Qemu just associate a new action with the new vector (call vfio_set_irq_signaling()) instead of disabling MSI-X followed by enabling all vectors worked well with the changes I present here. A single interrupt was dynamically allocated and enabled without impacting others. > Based on above, there really can never be an error if we expect the > device to work, so I think there's a misread of the current status. > Dynamic MSI-X support should simply reduce the disruption and chance > of lost interrupts at the device, but the points where we risk that > the host cannot provide the configuration we need are the same. Thank you very much Alex. In this case, please do consider this submission as a submission for inclusion. I'd be happy to resubmit without the "RFC" prefix if that is preferred. Reinette