Re: [PATCH v2 02/14] vfio: Simplify the lifetime logic for vfio_device

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Mar 17, 2021 at 09:12:44AM +0100, Cornelia Huck wrote:
> On Tue, 16 Mar 2021 14:24:54 -0600
> Alex Williamson <alex.williamson@xxxxxxxxxx> wrote:
> 
> > On Tue, 16 Mar 2021 07:38:09 +0000
> > "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:
> > 
> > > > From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > > > Sent: Saturday, March 13, 2021 8:56 AM
> > > > 
> > > > The vfio_device is using a 'sleep until all refs go to zero' pattern for
> > > > its lifetime, but it is indirectly coded by repeatedly scanning the group
> > > > list waiting for the device to be removed on its own.
> > > > 
> > > > Switch this around to be a direct representation, use a refcount to count
> > > > the number of places that are blocking destruction and sleep directly on a
> > > > completion until that counter goes to zero. kfree the device after other
> > > > accesses have been excluded in vfio_del_group_dev(). This is a fairly
> > > > common Linux idiom.
> > > > 
> > > > Due to this we can now remove kref_put_mutex(), which is very rarely used
> > > > in the kernel. Here it is being used to prevent a zero ref device from
> > > > being seen in the group list. Instead allow the zero ref device to
> > > > continue to exist in the device_list and use refcount_inc_not_zero() to
> > > > exclude it once refs go to zero.
> > > > 
> > > > This patch is organized so the next patch will be able to alter the API to
> > > > allow drivers to provide the kfree.
> > > > 
> > > > Reviewed-by: Christoph Hellwig <hch@xxxxxx>
> > > > Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > > >  drivers/vfio/vfio.c | 79 ++++++++++++++-------------------------------
> > > >  1 file changed, 25 insertions(+), 54 deletions(-)
> > > > 
> 
> > > > @@ -935,32 +916,18 @@ void *vfio_del_group_dev(struct device *dev)
> > > >  	WARN_ON(!unbound);
> > > > 
> > > >  	vfio_device_put(device);
> > > > -
> > > > -	/*
> > > > -	 * If the device is still present in the group after the above
> > > > -	 * 'put', then it is in use and we need to request it from the
> > > > -	 * bus driver.  The driver may in turn need to request the
> > > > -	 * device from the user.  We send the request on an arbitrary
> > > > -	 * interval with counter to allow the driver to take escalating
> > > > -	 * measures to release the device if it has the ability to do so.
> > > > -	 */    
> > > 
> > > Above comment still makes sense even with this patch. What about
> > > keeping it? otherwise:  
> > 
> > The comment is not exactly correct after this code change either, the
> > device will always be present in the group after this 'put'.  Instead,
> > the completion now indicates the reference count has reached zero.  If
> > it's worthwhile to keep more context to the request callback, perhaps:
> > 
> > 	/*
> > 	 * If there are still outstanding device references, such as
> > 	 * from the device being in use, periodically kick the optional
> > 	 * device request callback while waiting.
> > 	 */
> 
> I like that comment; I don't think it hurts to be a bit verbose here.

I would prefer the comment explain why the driver should return from
request with refs held and what it is supposed to do on later
calls. This loop mechanism is strange, I didn't look at what the
drivers implement under this.

I don't see this approach in other places that are able to disconnect
their HW drivers from the uAPI (in RDMA land we call this
disassociation)

Jason



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux