On Thu, Aug 15, 2024 at 04:08:48PM -0300, Jason Gunthorpe wrote:
> On Wed, Aug 07, 2024 at 01:10:46PM -0700, Nicolin Chen wrote:
>
> > +int iommufd_viommu_set_vdev_id(struct iommufd_ucmd *ucmd)
> > +{
> > +	struct iommu_viommu_set_vdev_id *cmd = ucmd->cmd;
> > +	struct iommufd_hwpt_nested *hwpt_nested;
> > +	struct iommufd_vdev_id *vdev_id, *curr;
> > +	struct iommufd_hw_pagetable *hwpt;
> > +	struct iommufd_viommu *viommu;
> > +	struct iommufd_device *idev;
> > +	int rc = 0;
> > +
> > +	if (cmd->vdev_id > ULONG_MAX)
> > +		return -EINVAL;
> > +
> > +	idev = iommufd_get_device(ucmd, cmd->dev_id);
> > +	if (IS_ERR(idev))
> > +		return PTR_ERR(idev);
> > +	hwpt = idev->igroup->hwpt;
> > +
> > +	if (hwpt == NULL || hwpt->obj.type != IOMMUFD_OBJ_HWPT_NESTED) {
> > +		rc = -EINVAL;
> > +		goto out_put_idev;
> > +	}
> > +	hwpt_nested = container_of(hwpt, struct iommufd_hwpt_nested, common);
>
> This doesn't seem like a necessary check, the attached hwpt can change
> after this is established, so this can't be an invariant we enforce.
>
> If you want to do 1:1 then somehow directly check if the idev is
> already linked to a viommu.

But idev can't link to a viommu without a proxy hwpt_nested? Even a
stage-2-only configuration should have an identity hwpt_nested, right?

> > +static struct device *
> > +iommufd_viommu_find_device(struct iommufd_viommu *viommu, u64 id)
> > +{
> > +	struct iommufd_vdev_id *vdev_id;
> > +
> > +	xa_lock(&viommu->vdev_ids);
> > +	vdev_id = xa_load(&viommu->vdev_ids, (unsigned long)id);
> > +	xa_unlock(&viommu->vdev_ids);
>
> This lock doesn't do anything
>
> > +	if (!vdev_id || vdev_id->vdev_id != id)
> > +		return NULL;
>
> And this is unlocked
>
> > +	return vdev_id->dev;
> > +}
>
> This isn't good.. We can't return the struct device pointer here as
> there is no locking for it anymore. We can't even know it is still
> probed to VFIO anymore.
>
> It has to work by having the iommu driver directly access the xarray,
> and then entirely under the spinlock the iommu driver can translate
> the vSID to the pSID, then let go and push the invalidation to HW. No
> races.

Maybe the iommufd_viommu_invalidate ioctl handler should hold that
xa_lock around the viommu->ops->cache_invalidate call, and then add a
lock assert in iommufd_viommu_find_device? (Rough sketch in the P.S.
below.)

> > +int iommufd_viommu_unset_vdev_id(struct iommufd_ucmd *ucmd)
> > +{
> > +	struct iommu_viommu_unset_vdev_id *cmd = ucmd->cmd;
> > +	struct iommufd_vdev_id *vdev_id;
> > +	struct iommufd_viommu *viommu;
> > +	struct iommufd_device *idev;
> > +	int rc = 0;
> > +
> > +	idev = iommufd_get_device(ucmd, cmd->dev_id);
> > +	if (IS_ERR(idev))
> > +		return PTR_ERR(idev);
> > +
> > +	viommu = iommufd_get_viommu(ucmd, cmd->viommu_id);
> > +	if (IS_ERR(viommu)) {
> > +		rc = PTR_ERR(viommu);
> > +		goto out_put_idev;
> > +	}
> > +
> > +	if (idev->dev != iommufd_viommu_find_device(viommu, cmd->vdev_id)) {
>
> Swap the order around != to be more kernely

Ack. (Reordered check in the P.S. below.)

> > +		rc = -EINVAL;
> > +		goto out_put_viommu;
> > +	}
> > +
> > +	vdev_id = xa_erase(&viommu->vdev_ids, cmd->vdev_id);
>
> And this whole thing needs to be done under the xa_lock too.
>
>  xa_lock(&viommu->vdev_ids);
>  vdev_id = xa_load(&viommu->vdev_ids, cmd->vdev_id);
>  if (!vdev_id || vdev_id->vdev_id != cmd->vdev_id (????) ||
>      vdev_id->dev != idev->dev)
>          err
>  __xa_erase(&viommu->vdev_ids, cmd->vdev_id);
>  xa_unlock(&viommu->vdev_ids);

I've changed it to xa_cmpxchg() in my local tree. Would that be
simpler? (Sketch in the P.S. below too.)

Thanks
Nicolin
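
P.S. Rough sketches of the changes discussed above. All untested, and
anything not already defined by this series (the invalidate handler
shape in particular) is just my assumption of what it would look like.

For the invalidation locking, the ioctl handler would hold the xarray
lock across the driver callback, and the lookup helper would assert it
instead of taking it. Note this puts the driver's cache_invalidate in
atomic context, so the driver must finish the vSID-to-pSID translation
before it can sleep:

	/* In the iommufd_viommu_invalidate ioctl handler (sketch) */
	xa_lock(&viommu->vdev_ids);
	rc = viommu->ops->cache_invalidate(viommu, &data_array);
	xa_unlock(&viommu->vdev_ids);

	static struct device *
	iommufd_viommu_find_device(struct iommufd_viommu *viommu, u64 id)
	{
		struct iommufd_vdev_id *vdev_id;

		/* Called by the driver's cache_invalidate under xa_lock */
		lockdep_assert_held(&viommu->vdev_ids.xa_lock);

		vdev_id = xa_load(&viommu->vdev_ids, (unsigned long)id);
		if (!vdev_id || vdev_id->vdev_id != id)
			return NULL;
		return vdev_id->dev;
	}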
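
For the != ordering in iommufd_viommu_unset_vdev_id, the check simply
becomes:

	if (iommufd_viommu_find_device(viommu, cmd->vdev_id) != idev->dev) {
		rc = -EINVAL;
		goto out_put_viommu;
	}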
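
And the xa_cmpxchg() version of the unset path is roughly the
following: validate the entry first, then erase it only if it is still
the same entry, so a racing unset loses cleanly. It does still assume
the vdev_id pointer can't be freed and recycled between the load and
the cmpxchg:

	vdev_id = xa_load(&viommu->vdev_ids, cmd->vdev_id);
	if (!vdev_id || vdev_id->dev != idev->dev) {
		rc = -EINVAL;
		goto out_put_viommu;
	}
	/* Erase only if the entry didn't change since the check above */
	curr = xa_cmpxchg(&viommu->vdev_ids, cmd->vdev_id, vdev_id,
			  NULL, GFP_KERNEL);
	if (curr != vdev_id) {
		rc = xa_err(curr) ?: -EINVAL;
		goto out_put_viommu;
	}
	kfree(vdev_id);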