On Thu, Oct 06, 2022 at 01:53:15PM -0600, Alex Williamson wrote: > On Thu, 6 Oct 2022 09:40:35 -0300 > Jason Gunthorpe <jgg@xxxxxxxxxx> wrote: > > > Testing has shown that virtnodedevd will leave the group FD open for long > > periods, even after all the cdevs have been destroyed. This blocks > > destruction of the VFIO device and is undesirable. > > > > That approach was selected to accomodate SPAPR which has an broken > > lifecyle model for the iommu_group. However, we can accomodate SPAPR by > > realizing that it doesn't use the iommu core at all, so rules about > > iommu_group lifetime do not apply to it. > > > > Giving the KVM code its own kref on the iommu_group allows the VFIO core > > code to release its iommu_group reference earlier and we can remove the > > sleep that only existed for SPAPR. > > > > Jason Gunthorpe (3): > > vfio: Add vfio_file_is_group() > > vfio: Hold a reference to the iommu_group in kvm for SPAPR > > vfio: Make the group FD disassociate from the iommu_group > > > > drivers/vfio/pci/vfio_pci_core.c | 2 +- > > drivers/vfio/vfio.h | 1 - > > drivers/vfio/vfio_main.c | 90 +++++++++++++++++++++----------- > > include/linux/vfio.h | 1 + > > virt/kvm/vfio.c | 45 +++++++++++----- > > 5 files changed, 94 insertions(+), 45 deletions(-) > > Containers aren't getting cleaned up with this series, starting and > shutting down a libvirt managed VM with vfio-pci devices, an mtty mdev > device, and making use of hugepages, /proc/meminfo shows the hugepages > are not released on VM shutdown and I'm unable to subsequently restart > the VM. Thanks, Oh, I'm surprised the s390 testing didn't hit this!! This hunk should remain since not all cases are closures due to device hot unplug diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index f9cb734d3991b3..62aba3a128fb8d 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -954,6 +954,13 @@ static int vfio_group_fops_release(struct inode *inode, struct file *filep) filep->private_data = NULL; mutex_lock(&group->group_lock); + /* + * Device FDs hold a group file reference, therefore the group release + * is only called when there are no open devices. + */ + WARN_ON(group->notifier.head); + if (group->container) + vfio_group_detach_container(group); group->opened_file = NULL; mutex_unlock(&group->group_lock); return 0;