On 1/5/23 6:34 PM, Jason Gunthorpe wrote:
> On Thu, Jan 05, 2023 at 03:09:30PM -0700, Alex Williamson wrote:
>> On Thu, 19 May 2022 14:33:11 -0400
>> Matthew Rosato <mjrosato@xxxxxxxxxxxxx> wrote:
>>
>>> Rather than relying on a notifier for associating the KVM with
>>> the group, let's assume that the association has already been
>>> made prior to device_open. The first time a device is opened
>>> associate the group KVM with the device.
>>>
>>> This fixes a user-triggerable oops in GVT.
>>
>> It seems this has traded an oops for a deadlock, which still exists
>> today in both GVT-g and vfio-ap. These are the only vfio drivers that
>> care about kvm, so they make use of kvm_{get,put}_kvm(), where the

vfio-pci-zdev also

>> latter is called by their .close_device() callbacks.

Huh, I've never seen this deadlock with vfio-pci-zdev or vfio-ap, but I
see what you're saying... I guess it's not seen under typical
circumstances with QEMU because kvm_vfio_group_del would have already
been triggered via KVM_DEV_VFIO_GROUP_DEL by the time we close the
device, such that the group would not be found during the
kvm_vfio_destroy call? (I'm not at all suggesting that we should rely
on userspace behaving in this order, just wondering why I never saw it
while testing)

>
> Bleck
>
> It is pretty common to run the final part of 'put' from a workqueue
> specifically to avoid stuff like this, eg fput does it
>
> Maybe that is the simplest?

Yeah, this is also what I was thinking, replace the direct kvm_put_kvm
calls with, say, schedule_delayed_work in each driver, where the
delayed task just does the kvm_put_kvm (along with a brief comment
explaining why we handle the put asynchronously).

Other than that..
The goal of this patch originally was to get the kvm reference at first open_device and release it with the very last close_device, so the only other option I could think of would be to take the responsibility back from the vfio drivers and do the kvm_get_kvm and kvm_put_kvm directly in vfio_main after dropping the (but that would result in some ugly symbol linkage and would acquire kvm references that a driver maybe does not care about so I don't really like that idea)
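
To make the workqueue idea above concrete, here is a rough userspace model of it. This is only a sketch, not kernel code: every name in it (model_lock, model_kvm, model_work, close_device_direct, close_device_deferred, run_pending_work) is made up for illustration, and the "lock" is just a flag so that the deadlock case is recorded instead of actually hanging. It assumes the problem is that the final put needs a lock which the close path still holds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a non-recursive lock held across close_device. */
struct model_lock {
	bool held;
};

/* Hypothetical stand-in for a refcounted kvm object. */
struct model_kvm {
	int refs;
	bool destroyed;
	bool would_deadlock;	/* final put ran while the lock was held */
	struct model_lock *lock;
};

/* Hypothetical deferred work item, one per pending put. */
struct model_work {
	struct model_kvm *kvm;
	struct model_work *next;
};

static struct model_work *pending;	/* simple LIFO list of queued work */

/* Drop a reference; the final put must take the lock to tear down. */
static void model_kvm_put(struct model_kvm *kvm)
{
	assert(kvm->refs > 0);
	if (--kvm->refs > 0)
		return;
	if (kvm->lock->held) {
		/* A real mutex would block forever here. */
		kvm->would_deadlock = true;
		return;
	}
	kvm->lock->held = true;
	kvm->destroyed = true;
	kvm->lock->held = false;
}

/* Direct put from the close path, i.e. with the lock still held. */
static void close_device_direct(struct model_kvm *kvm)
{
	kvm->lock->held = true;
	model_kvm_put(kvm);
	kvm->lock->held = false;
}

/* Deferred put: only queue the work while the lock is held. */
static void close_device_deferred(struct model_kvm *kvm,
				  struct model_work *work)
{
	kvm->lock->held = true;
	work->kvm = kvm;
	work->next = pending;
	pending = work;
	kvm->lock->held = false;
}

/* Models the worker: runs queued puts after the close path is done. */
static void run_pending_work(void)
{
	while (pending) {
		struct model_work *work = pending;

		pending = work->next;
		model_kvm_put(work->kvm);
	}
}
```

With a single reference, close_device_direct() sets would_deadlock and never destroys the object, while close_device_deferred() followed by run_pending_work() destroys it cleanly because the put runs only after the lock is dropped. In the drivers the queued item would presumably be a work_struct or delayed_work that just calls kvm_put_kvm, as described above.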