On Wed, Mar 10, 2021 at 08:23:40AM +0100, Christoph Hellwig wrote: > On Tue, Mar 09, 2021 at 05:38:43PM -0400, Jason Gunthorpe wrote: > > The vfio_device is using a 'sleep until all refs go to zero' pattern for > > its lifetime, but it is indirectly coded by repeatedly scanning the group > > list waiting for the device to be removed on its own. > > > > Switch this around to be a direct representation, use a refcount to count > > the number of places that are blocking destruction and sleep directly on a > > completion until that counter goes to zero. kfree the device after other > > accesses have been excluded in vfio_del_group_dev(). This is a fairly > > common Linux idiom. > > > > This simplifies a couple of things: > > > > - kref_put_mutex() is very rarely used in the kernel. Here it is being > > used to prevent a zero ref device from being seen in the group > > list. Instead allow the zero ref device to continue to exist in the > > device_list and use refcount_inc_not_zero() to exclude it once refs go > > to zero. > > > > - get/putting the group while get/putting the device. The device already > > holds a reference to the group, set during vfio_group_create_device(), > > there is no need for extra reference traffic. Cleanly have the balancing > > group put in vfio_del_group_dev() before the kfree(). > > Might it be worth to split this further up into separate patches for > each of the changes? I split the put/get and it looks nicer. The other stuff is more tightly coupled, doesn't look like there is much win from splitting further Thanks, Jason