> From: Jason Gunthorpe <jgg@xxxxxxxxxx> > Sent: Wednesday, October 26, 2022 2:12 AM > > +/** > + * iommufd_device_bind - Bind a physical device to an iommu fd > + * @ictx: iommufd file descriptor > + * @dev: Pointer to a physical PCI device struct > + * @id: Output ID number to return to userspace for this device > + * > + * A successful bind establishes an ownership over the device and returns > + * struct iommufd_device pointer, otherwise returns error pointer. > + * > + * A driver using this API must set driver_managed_dma and must not touch > + * the device until this routine succeeds and establishes ownership. > + * > + * Binding a PCI device places the entire RID under iommufd control. > + * Then what about non-PCI device? clearer to say that calling this interface just places the entire physical device under iommufd control > + * FIXME: This is conceptually broken for iommufd since we want to > allow > + * userspace to change the domains, eg switch from an identity IOAS > to a > + * DMA IOAs. There is currently no way to create a MSI window that IOAs -> IOAS > + rc = iommu_get_msi_cookie(hwpt->domain, sw_msi_start); > + if (rc && rc != -ENODEV) > + return rc; I know this is copied from VFIO but a comment is appreciated why -ENODEV is considered sane to move forward. > + /* > + * Otherwise the platform has a MSI window that is not isolated. For > + * historical compat with VFIO allow a module parameter to ignore > the > + * insecurity. > + */ > + if (!(flags & > IOMMUFD_ATTACH_FLAGS_ALLOW_UNSAFE_INTERRUPT)) > + return -EPERM; Throw out a warning similar to vfio. > + > + rc = iopt_table_enforce_group_resv_regions(&hwpt->ioas->iopt, > idev->dev, > + idev->group, > &sw_msi_start); > + if (rc) > + goto out_iova; goto out_unlock since the function internally already called __iopt_remove_reserved_iova() upon error. > + /* > + * There is no differentiation when domains are allocated, so any > domain > + * that is willing to attach to the device is interchangeable with any > + * other. > + */ > + mutex_lock(&ioas->mutex); > + list_for_each_entry(hwpt, &ioas->hwpt_list, hwpt_item) { > + if (!hwpt->auto_domain || > + !refcount_inc_not_zero(&hwpt->obj.users)) users cannot be zero at this point. a new hwpt is added to the list with users==1 in attach. detach first removes the hwpt and then decrement users. Both are conducted by holding ioas->mutex. > + continue; > + > + rc = iommufd_device_do_attach(idev, hwpt, flags); > + refcount_dec(&hwpt->obj.users); with above I also wonder whether refcount_inc/dec are just redundant here... > +int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id, > + unsigned int flags) > +{ > + struct iommufd_object *pt_obj; > + int rc; > + > + pt_obj = iommufd_get_object(idev->ictx, *pt_id, > IOMMUFD_OBJ_ANY); > + if (IS_ERR(pt_obj)) > + return PTR_ERR(pt_obj); Is there a reason why get_object() is not required for idev? concurrent attach/unbind could lead to use-after-free here given users for idev is added only in the end of the function. > + > + switch (pt_obj->type) { > + case IOMMUFD_OBJ_HW_PAGETABLE: { > + struct iommufd_hw_pagetable *hwpt = > + container_of(pt_obj, struct iommufd_hw_pagetable, > obj); > + > + rc = iommufd_device_do_attach(idev, hwpt, flags); > + if (rc) > + goto out_put_pt_obj; > + > + mutex_lock(&hwpt->ioas->mutex); > + list_add_tail(&hwpt->hwpt_item, &hwpt->ioas->hwpt_list); > + mutex_unlock(&hwpt->ioas->mutex); > + break; > + } > + case IOMMUFD_OBJ_IOAS: { > + struct iommufd_ioas *ioas = > + container_of(pt_obj, struct iommufd_ioas, obj); > + > + rc = iommufd_device_auto_get_domain(idev, ioas, flags); > + if (rc) > + goto out_put_pt_obj; > + break; > + } > + default: > + rc = -EINVAL; > + goto out_put_pt_obj; > + } > + > + refcount_inc(&idev->obj.users); > + *pt_id = idev->hwpt->obj.id; What is the value of returning hwpt id of auto domain to user? IMHO this will add more complexity to the life cycle of auto domain. w/o allowing it the life cycle is simple i.e. the first attach which doesn't find a matching domain creates a new auto domain then the last detach frees it. now if allowing user to populate auto domain similar to user-created hwpt then detach also need get_object() to wait for concurrent ioctls similar to iommufd_destroy() does and more trick on destroying the object. If no big gain probably hiding it from userspace is simpler? > +void iommufd_device_detach(struct iommufd_device *idev) > +{ > + struct iommufd_hw_pagetable *hwpt = idev->hwpt; > + > + mutex_lock(&hwpt->ioas->mutex); > + mutex_lock(&hwpt->devices_lock); > + list_del(&idev->devices_item); > + if (!iommufd_hw_pagetable_has_group(hwpt, idev->group)) { > + if (list_empty(&hwpt->devices)) { > + iopt_table_remove_domain(&hwpt->ioas->iopt, > + hwpt->domain); > + list_del(&hwpt->hwpt_item); > + } > + iopt_remove_reserved_iova(&hwpt->ioas->iopt, idev->dev); this is always required. not just for last device in a group.