On Sat, 1 Apr 2023 09:15:33 +0000 "Xu, Terrence" <terrence.xu@xxxxxxxxx> wrote: > > -----Original Message----- > > From: intel-gvt-dev <intel-gvt-dev-bounces@xxxxxxxxxxxxxxxxxxxxx> On > > Behalf Of Alex Williamson > > Sent: Saturday, April 1, 2023 1:50 AM > > > > On Fri, 31 Mar 2023 17:27:27 +0000 > > "Xu, Terrence" <terrence.xu@xxxxxxxxx> wrote: > > > > > > -----Original Message----- > > > > From: Liu, Yi L <yi.l.liu@xxxxxxxxx> > > > > Sent: Monday, March 27, 2023 5:35 PM > > > > > > > > VFIO_DEVICE_PCI_HOT_RESET requires user to pass an array of group > > > > fds to prove that it owns all devices affected by resetting the > > > > calling device. This series introduces several extensions to allow > > > > the ownership check better aligned with iommufd and coming vfio device > > cdev support. > > > > > > > > First, resetting an unopened device is always safe given nobody is > > > > using it. So relax the check to allow such devices not covered by > > > > group fd array. [1] > > > > > > > > When iommufd is used we can simply verify that all affected devices > > > > are bound to a same iommufd then no need for the user to provide > > > > extra fd information. This is enabled by the user passing a > > > > zero-length fd array and moving forward this should be the preferred > > > > way for hot reset. [2] > > > > > > > > However the iommufd method has difficulty working with noiommu > > > > devices since those devices don't have a valid iommufd, unless the > > > > noiommu device is in a singleton dev_set hence no ownership check is > > > > required. [3] > > > > > > > > For noiommu backward compatibility a 3rd method is introduced by > > > > allowing the user to pass an array of device fds to prove ownership. > > > > [4] > > > > > > > > As suggested by Jason [5], we have this series to introduce the > > > > above stuffs to the vfio PCI hot reset. Per the dicussion in [6], > > > > this series also adds a new _INFO ioctl to get hot reset scope for given > > device. > > > > > > > > [1] https://lore.kernel.org/kvm/Y%2FdobS6gdSkxnPH7@xxxxxxxxxx/ > > > > [2] https://lore.kernel.org/kvm/Y%2FZOOClu8nXy2toX@xxxxxxxxxx/#t > > > > [3] https://lore.kernel.org/kvm/ZACX+Np%2FIY7ygqL5@xxxxxxxxxx/ > > > > [4] > > > > > > https://lore.kernel.org/kvm/DS0PR11MB7529BE88460582BD599DC1F7C3B19 > > > > @DS0PR11MB7529.namprd11.prod.outlook.com/#t > > > > [5] https://lore.kernel.org/kvm/ZAcvzvhkt9QhCmdi@xxxxxxxxxx/ > > > > [6] https://lore.kernel.org/kvm/ZBoYgNq60eDpV9Un@xxxxxxxxxx/ > > > > > > > > Change log: > > > > > > > > v2: > > > > - Split the patch 03 of v1 to be 03, 04 and 05 of v2 (Jaon) > > > > - Add r-b from Kevin and Jason > > > > - Add patch 10 to introduce a new _INFO ioctl for the usage of device > > > > fd passing usage in cdev path (Jason, Alex) > > > > > > > > v1: > > > > https://lore.kernel.org/kvm/20230316124156.12064-1-yi.l.liu@xxxxxxxx > > > > m/ > > > > > > > > Regards, > > > > Yi Liu > > > > > > > > Yi Liu (10): > > > > vfio/pci: Update comment around group_fd get in > > > > vfio_pci_ioctl_pci_hot_reset() > > > > vfio/pci: Only check ownership of opened devices in hot reset > > > > vfio/pci: Move the existing hot reset logic to be a helper > > > > vfio-iommufd: Add helper to retrieve iommufd_ctx and devid for > > > > vfio_device > > > > vfio/pci: Allow passing zero-length fd array in > > > > VFIO_DEVICE_PCI_HOT_RESET > > > > vfio: Refine vfio file kAPIs for vfio PCI hot reset > > > > vfio: Accpet device file from vfio PCI hot reset path > > > > vfio/pci: Renaming for accepting device fd in hot reset path > > > > vfio/pci: Accept device fd in VFIO_DEVICE_PCI_HOT_RESET ioctl > > > > vfio/pci: Add VFIO_DEVICE_GET_PCI_HOT_RESET_GROUP_INFO > > > > > > > > drivers/iommu/iommufd/device.c | 12 ++ > > > > drivers/vfio/group.c | 32 ++-- > > > > drivers/vfio/iommufd.c | 16 ++ > > > > drivers/vfio/pci/vfio_pci_core.c | 244 ++++++++++++++++++++++++---- > > --- > > > > drivers/vfio/vfio.h | 2 + > > > > drivers/vfio/vfio_main.c | 44 ++++++ > > > > include/linux/iommufd.h | 3 + > > > > include/linux/vfio.h | 14 ++ > > > > include/uapi/linux/vfio.h | 65 +++++++- > > > > 9 files changed, 364 insertions(+), 68 deletions(-) > > > > > > > > -- > > > > 2.34.1 > > > > > > Verified this series by "Intel GVT-g GPU device mediated passthrough". > > > Passed VFIO legacy mode / compat mode / cdev mode basic functionality > > and GPU force reset test. > > > > > > Tested-by: Terrence Xu <terrence.xu@xxxxxxxxx> > > > > Seems like only this "GPU force reset test" is relevant to the new > > functionality of this series, GVT-g does not and has no reason to support the > > HOT_RESET ioctls used here. Can you provide more details of the force-reset > > test? What userspace driver is being used? Thanks, > > > > Alex > Hi Alex, about the "GPU force reset test", I used the "i915_hangman" > test from intel-gpu-tools, it is for GPU force hang / reset. It is an > important regression test scenario for this patch series. To test the > HOT_RESET ioctls itself, need to wait the corresponding Qemu changes > from Yi. But i915 exists on the host root bus, we fundamentally cannot perform a bus reset of the root bus. So how exactly is testing with GVT-g, which doesn't use the vfio-pci-core hot-reset ioctl, or GVT-d, which can't do a bus reset because it exists on the root bus, relevant to this series? Is this some novel use of a dGPU i915 with out-of-tree drivers? Obviously any regression testing is fine and appreciated, but if this is intended to express some validation of the new interface, I'm failing to see how. Thanks, Alex