Re: [PATCH 1/5] vfio: Delete vfio_get/put_group from vfio_iommu_group_notifier()

Alex Williamson <alex.williamson@xxxxxxxxxx> · Mon, 4 Oct 2021 22:01:52 -0600

On Mon, 4 Oct 2021 19:34:31 -0300
Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:

> On Mon, Oct 04, 2021 at 04:25:32PM -0600, Alex Williamson wrote:
> > On Fri,  1 Oct 2021 20:22:20 -0300
> > Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:
> >   
> > > iommu_group_register_notifier()/iommu_group_unregister_notifier() are
> > > built using a blocking_notifier_chain which integrates a rwsem. The
> > > notifier function cannot be running outside its registration.
> > > 
> > > When considering how the notifier function interacts with create/destroy
> > > of the group there are two fringe cases, the notifier starts before
> > > list_add(&vfio.group_list) and the notifier runs after the kref
> > > becomes 0.
> > > 
> > > Prior to vfio_create_group() unlocking and returning we have
> > >    container_users == 0
> > >    device_list == empty
> > > And this cannot change until the mutex is unlocked.
> > > 
> > > After the kref goes to zero we must also have
> > >    container_users == 0
> > >    device_list == empty
> > > 
> > > Both are required because they are balanced operations and a 0 kref means
> > > some caller became unbalanced. Add the missing assertion that
> > > container_users must be zero as well.
> > > 
> > > These two facts are important because when checking each operation we see:
> > > 
> > > - IOMMU_GROUP_NOTIFY_ADD_DEVICE
> > >    Empty device_list avoids the WARN_ON in vfio_group_nb_add_dev()
> > >    0 container_users ends the call
> > > - IOMMU_GROUP_NOTIFY_BOUND_DRIVER
> > >    0 container_users ends the call
> > > 
> > > Finally, we have IOMMU_GROUP_NOTIFY_UNBOUND_DRIVER, which only deletes
> > > items from the unbound list. During creation this list is empty, during
> > > kref == 0 nothing can read this list, and it will be freed soon.
> > > 
> > > Since the vfio_group_release() doesn't hold the appropriate lock to
> > > manipulate the unbound_list and could race with the notifier, move the
> > > cleanup to directly before the kfree.
> > > 
> > > This allows deleting all of the deferred group put code.
> > > 
> > > Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > >  drivers/vfio/vfio.c | 89 +++++----------------------------------------
> > >  1 file changed, 9 insertions(+), 80 deletions(-)
> > > 
> > > diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> > > index 08b27b64f0f935..32a53cb3598524 100644
> > > +++ b/drivers/vfio/vfio.c
> > > @@ -324,12 +324,20 @@ static void vfio_container_put(struct vfio_container *container)
> > >  
> > >  static void vfio_group_unlock_and_free(struct vfio_group *group)
> > >  {
> > > +	struct vfio_unbound_dev *unbound, *tmp;
> > > +
> > >  	mutex_unlock(&vfio.group_lock);
> > >  	/*
> > >  	 * Unregister outside of lock.  A spurious callback is harmless now
> > >  	 * that the group is no longer in vfio.group_list.
> > >  	 */  
> > 
> > This comment is indirectly referencing the vfio_group_try_get() in the
> > notifier callback, but as you describe in the commit log, it's actually
> > the container_users value that prevents this from racing group release
> > now.  Otherwise, tricky but looks good.  Thanks,  
> 
> Do you think the comment should be deleted in this commit? I think I
> got it later on..

I think the commit log argument is that notifies racing the group
release are harmless so long as the container is unused, and releasing
a group with active container users would be unbalanced, which
justifies the WARN_ON added here.  This comment does get removed in the
final patch, but maybe that logic can be noted in a comment by the new
sanity test.  Thanks,

Alex