On Wed, Jul 26, 2023 at 08:36:31PM -0300, Jason Gunthorpe wrote:
> On Wed, Jul 26, 2023 at 01:50:28PM -0700, Nicolin Chen wrote:
> > > >
> > > >  		rc = iopt_add_access(&new_ioas->iopt, access);
> > > >  		if (rc) {
> > > > -			mutex_unlock(&access->ioas_lock);
> > > >  			iommufd_put_object(&new_ioas->obj);
> > > > +			if (cur_ioas)
> > > > +				WARN_ON(iommufd_access_change_pt(access,
> > > > +								 cur_ioas->obj.id));
> > >
> > > We've already dropped our ref to cur_ioas, so this is also racy with
> > > destroy.
> >
> > Would it be better by calling iommufd_access_detach() that holds
> > the same mutex in the iommufd_access_destroy_object()? We could
> > also unwrap the detach and delay the refcount_dec, as you did in
> > your attaching patch.
>
> It is better just to integrate it with this algorithm so we don't have
> the refcounting issues, like I did

OK. I will send a patch adding iommufd_access_change_ioas() first,
and it can update iommufd_access_destroy_object() too.

> > > This is what I came up with:
> > >
> > > diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
> > > index 57c0e81f5073b2..e55d6e902edb98 100644
> > > --- a/drivers/iommu/iommufd/device.c
> > > +++ b/drivers/iommu/iommufd/device.c
> > > @@ -758,64 +758,101 @@ void iommufd_access_destroy(struct iommufd_access *access)
> > >  }
> > >  EXPORT_SYMBOL_NS_GPL(iommufd_access_destroy, IOMMUFD);
> > >
> > > -void iommufd_access_detach(struct iommufd_access *access)
> > > +static int iommufd_access_change_ioas(struct iommufd_access *access,
> > > +				      struct iommufd_ioas *new_ioas)
> > >  {
> > >  	struct iommufd_ioas *cur_ioas = access->ioas;
> > > +	int rc;
> > > +
> > > +	lockdep_assert_held(&access->ioas_lock);
> > > +
> > > +	/* We are racing with a concurrent detach, bail */
> > > +	if (access->ioas_unpin)
> > > +		return -EBUSY;
> >
> > I think this should check access->ioas too? I mean:
> >
> > > +	/* We are racing with a concurrent detach, bail */
> > +	if (!access->ioas && access->ioas_unpin)
> > +		return -EBUSY;
>
> Oh, yes, that should basically be 'cur_ioas != access->ioas_unpin' -
> ie any difference means we are racing with the unmap call.

Yea, will update to 'cur_ioas != access->ioas_unpin'.

> > > +	if (new_ioas) {
> > > +		rc = iopt_add_access(&new_ioas->iopt, access);
> > > +		if (rc) {
> > > +			iommufd_put_object(&new_ioas->obj);
> > > +			access->ioas = cur_ioas;
> > > +			return rc;
> > > +		}
> > > +		iommufd_ref_to_users(&new_ioas->obj);
> > > +	}
> > > +
> > > +	access->ioas = new_ioas;
> > > +	access->ioas_unpin = new_ioas;
> > >  	iopt_remove_access(&cur_ioas->iopt, access);
> >
> > There was a bug in my earlier version, having the same flow by
> > calling iopt_add_access() prior to iopt_remove_access(). But,
> > doing that would override the access->iopt_access_list_id and
> > it would then get unset by the following iopt_remove_access().
>
> Ah, I was wondering about that order but didn't check it.
>
> Maybe we just need to pass the ID into iopt_remove_access and keep the
> right version on the stack.
>
> > So, I came up with this version calling an iopt_remove_access()
> > prior to iopt_add_access(), which requires an add-back the old
> > ioas upon an failure at iopt_add_access(new_ioas).
>
> That is also sort of reasonable if the refcounting is organized like
> this does.

I just realized that both my v8 and your version call unmap() first
on the entire cur_ioas. So there seems to be no point in doing that
fallback re-add routine, since the cur_ioas is no longer in the same
state afterwards, which doesn't feel quite right to me...

Perhaps we should pass the ID into iopt_add/remove_access() like you
said above, and then attach the new_ioas prior to detaching the
cur_ioas?

Thanks
Nicolin
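
P.S. To make the ordering proposed above concrete, here is a rough
user-space model of it: attach to new_ioas first, keep the old list ID
on the stack, then detach from cur_ioas using that saved ID. This is
only a sketch under assumptions. Everything named toy_* is made up for
illustration, the ID-taking add/remove signatures are hypothetical (the
iopt_add_access()/iopt_remove_access() quoted in this thread take no ID
argument), and the unmap, locking and refcounting steps are not
modeled at all.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_MAX_ACCESS 4

/* Toy stand-in for an IOAS: a fixed table of attached accesses. */
struct toy_ioas {
	const void *slots[TOY_MAX_ACCESS];
};

/* Toy stand-in for struct iommufd_access. */
struct toy_access {
	struct toy_ioas *ioas;
	int list_id;	/* plays the role of iopt_access_list_id */
};

/*
 * Hypothetical add: returns the new ID through *id and does NOT write
 * to the access, so the ID of the current attachment is not clobbered.
 */
static int toy_add_access(struct toy_ioas *ioas,
			  const struct toy_access *access, int *id)
{
	for (int i = 0; i < TOY_MAX_ACCESS; i++) {
		if (!ioas->slots[i]) {
			ioas->slots[i] = access;
			*id = i;
			return 0;
		}
	}
	return -ENOSPC;
}

/* Hypothetical remove: the caller passes the ID it wants dropped. */
static void toy_remove_access(struct toy_ioas *ioas,
			      const struct toy_access *access, int id)
{
	assert(ioas->slots[id] == access);
	ioas->slots[id] = NULL;
}

/* Attach to new_ioas first, then detach from cur_ioas. */
static int toy_change_ioas(struct toy_access *access,
			   struct toy_ioas *new_ioas)
{
	struct toy_ioas *cur_ioas = access->ioas;
	int cur_id = access->list_id;	/* old ID kept on the stack */
	int new_id = -1;
	int rc;

	if (new_ioas) {
		rc = toy_add_access(new_ioas, access, &new_id);
		if (rc)
			return rc;	/* still attached to cur_ioas, no re-add */
	}

	if (cur_ioas)
		toy_remove_access(cur_ioas, access, cur_id);

	access->ioas = new_ioas;
	access->list_id = new_id;
	return 0;
}
```

With this ordering a failed add leaves the access untouched on
cur_ioas, so the model needs no fallback re-add path. Whether that
holds in the real code depends on when the unmap of cur_ioas happens,
which this model leaves out.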