On Fri, Jul 23, 2021 at 10:05:43AM +0200, Christoph Hellwig wrote: > On Tue, Jul 20, 2021 at 02:42:55PM -0300, Jason Gunthorpe wrote: > > Keep track of all the vfio_devices that have been added to the device set > > and use this list in vfio_pci_try_bus_reset() instead of trying to work > > backwards from the pci_device. > > > > The dev_set->lock directly prevents devices from joining/leaving the set, > > which further implies the pci_device cannot change drivers or that the > > vfio_device be freed, eliminating the need for get/put's. > > > > Completeness of the device set can be directly measured by checking if > > every PCI device in the reset group is also in the device set - which > > proves that VFIO drivers are attached to everything. > > > > This restructuring corrects a call to pci_dev_driver() without holding the > > device_lock() and removes a hard wiring to &vfio_pci_driver. > > > > Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxx> > > I think the addition of the list to the dev_set should be a different > patch. Or maybe even go into the one adding the dev_set concept. OK > > +static int vfio_pci_check_all_devices_bound(struct pci_dev *pdev, void *data) > > { > > + struct vfio_device_set *dev_set = data; > > + struct vfio_device *cur; > > > > + lockdep_assert_held(&dev_set->lock); > > > > + list_for_each_entry(cur, &dev_set->device_list, dev_set_list) > > + if (cur->dev == &pdev->dev) > > + return 0; > > + return -EBUSY; > > I don't understand this logic. If there is any device in the set that > does now have the same struct device we're in trouble? Please clearly > document what this is trying to do. If the bound in the name makes sense > you probably want to check the driver instead. The PCI reset this code is tring to do effects a set of PCI devices, due to how the HW works. Along with the vfio_pci_for_each_slot_or_bus() this is computing a set wise 'is superset' between the list of pci_dev's the reset will affect (the reset group) and the list of vfio_devices that we have locking control over to sequence the reset (the dev_set). If every PCI device we will reset is under the dev_set then we directly know it is safe to trigger the reset. If any PCI device is not in this dev_set then we cannot use the reset as we can't know what will happen to the device that we don't control. Let's use a different word than bound? vfio_pci_check_device_in_set()? > > static void vfio_pci_try_bus_reset(struct vfio_pci_device *vdev) > > { > > + /* All VFIO devices have a closed FD */ > > + list_for_each_entry(cur, &dev_set->device_list, vdev.dev_set_list) > > + if (cur->vdev.open_count) > > + return; > > + > > + /* All devices in the group to be reset need VFIO devices */ > > + if (vfio_pci_for_each_slot_or_bus( > > + vdev->pdev, vfio_pci_check_all_devices_bound, dev_set, > > + !pci_probe_reset_slot(vdev->pdev->slot))) > > + return; > > > > /* Does at least one need a reset? */ > > These checks look a little strange, and the comments don't make much > sense. What about an incremental patch like this? Sure, it can go in a function > diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c > index fbc20f6d2dd412..d8375a5e77e07c 100644 > +++ b/drivers/vfio/pci/vfio_pci.c > @@ -2188,10 +2188,34 @@ static int vfio_pci_try_zap_and_vma_lock_cb(struct pci_dev *pdev, void *data) > return 0; > } > > +static struct pci_dev *vfio_pci_reset_target(struct vfio_pci_device *vdev) > +{ > + struct vfio_device_set *dev_set = vdev->vdev.dev_set; > + struct vfio_pci_device *cur; > + > + /* none of the device is allowed to be currently open: */ > + list_for_each_entry(cur, &dev_set->device_list, vdev.dev_set_list) > + if (cur->vdev.open_count) > + return NULL; > + > + /* all devices in the group to be reset need to be VFIO devices: */ It is not "need to be VFIO devices" it is "need to be in our dev_set". Have the PCI dev bound to, say, a mdev VFIO device isn't good enough. Thanks, Jason