On Mon, Oct 04, 2021 at 04:25:32PM -0600, Alex Williamson wrote: > On Fri, 1 Oct 2021 20:22:20 -0300 > Jason Gunthorpe <jgg@xxxxxxxxxx> wrote: > > > iommu_group_register_notifier()/iommu_group_unregister_notifier() are > > built using a blocking_notifier_chain which integrates a rwsem. The > > notifier function cannot be running outside its registration. > > > > When considering how the notifier function interacts with create/destroy > > of the group there are two fringe cases, the notifier starts before > > list_add(&vfio.group_list) and the notifier runs after the kref > > becomes 0. > > > > Prior to vfio_create_group() unlocking and returning we have > > container_users == 0 > > device_list == empty > > And this cannot change until the mutex is unlocked. > > > > After the kref goes to zero we must also have > > container_users == 0 > > device_list == empty > > > > Both are required because they are balanced operations and a 0 kref means > > some caller became unbalanced. Add the missing assertion that > > container_users must be zero as well. > > > > These two facts are important because when checking each operation we see: > > > > - IOMMU_GROUP_NOTIFY_ADD_DEVICE > > Empty device_list avoids the WARN_ON in vfio_group_nb_add_dev() > > 0 container_users ends the call > > - IOMMU_GROUP_NOTIFY_BOUND_DRIVER > > 0 container_users ends the call > > > > Finally, we have IOMMU_GROUP_NOTIFY_UNBOUND_DRIVER, which only deletes > > items from the unbound list. During creation this list is empty, during > > kref == 0 nothing can read this list, and it will be freed soon. > > > > Since the vfio_group_release() doesn't hold the appropriate lock to > > manipulate the unbound_list and could race with the notifier, move the > > cleanup to directly before the kfree. > > > > This allows deleting all of the deferred group put code. > > > > Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxx> > > drivers/vfio/vfio.c | 89 +++++---------------------------------------- > > 1 file changed, 9 insertions(+), 80 deletions(-) > > > > diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c > > index 08b27b64f0f935..32a53cb3598524 100644 > > +++ b/drivers/vfio/vfio.c > > @@ -324,12 +324,20 @@ static void vfio_container_put(struct vfio_container *container) > > > > static void vfio_group_unlock_and_free(struct vfio_group *group) > > { > > + struct vfio_unbound_dev *unbound, *tmp; > > + > > mutex_unlock(&vfio.group_lock); > > /* > > * Unregister outside of lock. A spurious callback is harmless now > > * that the group is no longer in vfio.group_list. > > */ > > This comment is indirectly referencing the vfio_group_try_get() in the > notifier callback, but as you describe in the commit log, it's actually > the container_users value that prevents this from racing group release > now. Otherwise, tricky but looks good. Thanks, Do you think the comment should be deleted in this commit? I think I got it later on.. Jason