> From: Jason Gunthorpe <jgg@xxxxxxxxxx> > Sent: Friday, January 20, 2023 3:07 AM > > On Tue, Jan 17, 2023 at 05:49:34AM -0800, Yi Liu wrote: > > This is to avoid a circular refcount problem between the kvm struct and > > the device file. KVM modules holds device/group file reference when the > > device/group is added and releases it per removal or the last kvm > reference > > is released. This reference model is ok for the group since there is no > > kvm reference in the group paths. > > > > But it is a problem for device file since the vfio devices may get kvm > > reference in the device open path and put it in the device file release. > > e.g. Intel kvmgt. This would result in a circular issue since the kvm > > side won't put the device file reference if kvm reference is not 0, while > > the vfio device side needs to put kvm reference in the release callback. > > > > To solve this problem for device file, let vfio provide release() which > > would be called once kvm file is closed, it won't depend on the last kvm > > reference. Hence avoid circular refcount problem. > > > > Suggested-by: Kevin Tian <kevin.tian@xxxxxxxxx> > > Signed-off-by: Yi Liu <yi.l.liu@xxxxxxxxx> > > --- > > virt/kvm/vfio.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > Reviewed-by: Jason Gunthorpe <jgg@xxxxxxxxxx> > > From Alex's remarks please revise the commit message and add a Fixes > line of some kind that this solves the deadlock Matthew was working > on, and send it stand alone right away Hi Kevin, Jason, I got a minor question. Let me check your opinions. So after this change. Say we have thread A, which is the kvm-vfio device release. It will hold the kvm_lock and delete the kvm-vfio device from the kvm-device list. It will also call into vfio to set KVM==NULL. So it will try to hold group_lock. The locking order is kvm_lock -> group_lock. Say in the same time, we have thread B closes device, it will hold group_lock first and then calls kvm_put_kvm() which is the last reference, then it would loop the kvm-device list. Currently, it is not holding kvm_lock. But it also manipulating the kvm-device list, should it hold kvm_lock? If yes, then the locking order is group_lock -> kvm_lock. Then we will have A-B B-A lock issue. Regards, Yi Liu