On 10/7/22 9:37 AM, Jason Gunthorpe wrote:
> On Thu, Oct 06, 2022 at 07:28:53PM -0400, Matthew Rosato wrote:
>
>>> Oh, I'm surprised the s390 testing didn't hit this!!
>>
>> Huh, me too, at least eventually - I think it's because we aren't
>> pinning everything upfront but rather on-demand so the missing the
>> type1 release / vfio_iommu_unmap_unpin_all wouldn't be so obvious.
>> I definitely did multiple VM (re)starts and hot (un)plugs. But
>> while my test workloads did some I/O, the long-running one was
>> focused on the plug/unplug scenarios to recreate the initial issue
>> so the I/O (and thus pinning) done would have been minimal.
>
> That explains ccw/ap a bit but for PCI the iommu ownership wasn't
> released so it becomes impossible to re-attach a container to the
> group. eg a 2nd VM can never be started
>
> Ah well, thanks!
>
> Jason

Well, this bugged me enough that I traced the v1 series without fixup
and vfio-pci on s390 was OK because it was still calling
detach_container on vm shutdown via this chain:

vfio_pci_remove
 vfio_pci_core_unregister_device
  vfio_unregister_group_dev
   vfio_device_remove_group
    vfio_group_detach_container

I'd guess non-s390 vfio-pci would do the same. Alex also had the mtty
mdev, maybe that's relevant.