On Tue, Mar 09, 2021 at 05:38:43PM -0400, Jason Gunthorpe wrote: > The vfio_device is using a 'sleep until all refs go to zero' pattern for > its lifetime, but it is indirectly coded by repeatedly scanning the group > list waiting for the device to be removed on its own. > > Switch this around to be a direct representation, use a refcount to count > the number of places that are blocking destruction and sleep directly on a > completion until that counter goes to zero. kfree the device after other > accesses have been excluded in vfio_del_group_dev(). This is a fairly > common Linux idiom. > > This simplifies a couple of things: > > - kref_put_mutex() is very rarely used in the kernel. Here it is being > used to prevent a zero ref device from being seen in the group > list. Instead allow the zero ref device to continue to exist in the > device_list and use refcount_inc_not_zero() to exclude it once refs go > to zero. > > - get/putting the group while get/putting the device. The device already > holds a reference to the group, set during vfio_group_create_device(), > there is no need for extra reference traffic. Cleanly have the balancing > group put in vfio_del_group_dev() before the kfree(). Might it be worth to split this further up into separate patches for each of the changes? The actual changes do look good, though: Reviewed-by: Christoph Hellwig <hch@xxxxxx>