On 10/7/22 10:39 AM, Jason Gunthorpe wrote:
> On Fri, Oct 07, 2022 at 10:37:11AM -0400, Matthew Rosato wrote:
>> On 10/7/22 9:37 AM, Jason Gunthorpe wrote:
>>> On Thu, Oct 06, 2022 at 07:28:53PM -0400, Matthew Rosato wrote:
>>>
>>>>> Oh, I'm surprised the s390 testing didn't hit this!!
>>>>
>>>> Huh, me too, at least eventually - I think it's because we aren't
>>>> pinning everything upfront but rather on-demand, so the missing
>>>> type1 release / vfio_iommu_unmap_unpin_all wouldn't be so obvious.
>>>> I definitely did multiple VM (re)starts and hot (un)plugs.  But
>>>> while my test workloads did some I/O, the long-running one was
>>>> focused on the plug/unplug scenarios to recreate the initial issue,
>>>> so the I/O (and thus pinning) done would have been minimal.
>>>
>>> That explains ccw/ap a bit, but for PCI the iommu ownership wasn't
>>> released, so it becomes impossible to re-attach a container to the
>>> group, eg a 2nd VM can never be started.
>>>
>>> Ah well, thanks!
>>>
>>> Jason
>>
>> Well, this bugged me enough that I traced the v1 series without the
>> fixup, and vfio-pci on s390 was OK because it was still calling
>> detach_container on VM shutdown via this chain:
>>
>> vfio_pci_remove
>>   vfio_pci_core_unregister_device
>>     vfio_unregister_group_dev
>>       vfio_device_remove_group
>>         vfio_group_detach_container
>>
>> I'd guess non-s390 vfio-pci would do the same.  Alex also had the
>> mtty mdev, maybe that's relevant.
>
> As long as you are unplugging a driver the v1 series would work.  The
> failure mode is when you don't unplug the driver and just run a VM
> twice in a row.
>
> Jason

Oh, duh - and yep, all of my tests are using managed libvirt, so it's
unbinding from vfio-pci back to the default host driver on VM shutdown.

OK, if I force the point and leave vfio-pci bound, the 2nd guest boot
indeed fails setting up the container with unmodified v1.  I'll try
again with the new v2 now.
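
P.S. For anyone skimming the thread, a rough sketch of why the
managed-libvirt case hides the problem: the unbind at VM shutdown runs
the PCI driver's .remove callback, which is the top of the chain traced
above.  The body below is an illustration under assumptions, not the
actual drivers/vfio/pci source; only the function names from the trace
are taken as given.

#include <linux/pci.h>
#include <linux/vfio_pci_core.h>

/*
 * Illustrative sketch only.  Shows the path that still detaches the
 * container when managed libvirt unbinds the device from vfio-pci at
 * VM shutdown.  If vfio-pci stays bound (unmanaged device), nothing
 * unbinds, this path never runs, and with unmodified v1 the group's
 * iommu ownership is never released, so the 2nd VM cannot attach a
 * container.
 */
static void vfio_pci_remove(struct pci_dev *pdev)	/* driver unbind */
{
	struct vfio_pci_core_device *vdev = dev_get_drvdata(&pdev->dev);

	vfio_pci_core_unregister_device(vdev);
	/*
	 *  -> vfio_unregister_group_dev()
	 *       -> vfio_device_remove_group()
	 *            -> vfio_group_detach_container()  (container released)
	 */

	/* remaining cleanup elided; assumed for the sketch */
}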