On 9/8/22 2:44 PM, Jason Gunthorpe wrote:
> The basic issue is that the iommu_group is being used by VFIO after all
> the device drivers have been removed.
>
> In part this is caused by bad logic inside the iommu core that doesn't
> sequence removing the device from the group properly, and in another part
> this is bad logic in VFIO continuing to use device->iommu_group after all
> VFIO device drivers have been removed.
>
> Fix both situations. Either fix alone should fix the bug reported, but
> both together bring a nice robust design to this area.
>
> This is a followup from this thread:
>
> https://lore.kernel.org/kvm/20220831201236.77595-1-mjrosato@xxxxxxxxxxxxx/
>
> Matthew confirmed an earlier version of the series solved the issue, it
> would be best if he would test this as well to confirm the various changes
> are still OK.

FYI I've been running this series (+ the incremental to patch 4 you
mentioned) against my original repro scenario in a loop overnight, looks
good.

>
> The iommu patch is independent of the other patches, it can go through the
> iommu rc tree.
>
> Jason Gunthorpe (4):
>   vfio: Simplify vfio_create_group()
>   vfio: Move the sanity check of the group to vfio_create_group()
>   vfio: Follow a strict lifetime for struct iommu_group *
>   iommu: Fix ordering of iommu_release_device()
>
>  drivers/iommu/iommu.c    |  36 ++++++--
>  drivers/vfio/vfio_main.c | 172 +++++++++++++++++++++------------------
>  2 files changed, 120 insertions(+), 88 deletions(-)
>
>
> base-commit: 245898eb9275ce31942cff95d0bdc7412ad3d589