> From: Liu, Yi L <yi.l.liu@xxxxxxxxx> > Sent: Monday, February 27, 2023 7:12 PM [...] > +long vfio_device_ioctl_bind_iommufd(struct vfio_device_file *df, > + unsigned long arg) > +{ > + struct vfio_device *device = df->device; > + struct vfio_device_bind_iommufd bind; > + struct iommufd_ctx *iommufd = NULL; > + unsigned long minsz; > + int ret; > + > + minsz = offsetofend(struct vfio_device_bind_iommufd, out_devid); > + > + if (copy_from_user(&bind, (void __user *)arg, minsz)) > + return -EFAULT; > + > + if (bind.argsz < minsz || bind.flags) > + return -EINVAL; > + > + if (!device->ops->bind_iommufd) > + return -ENODEV; Hi Jason, Per the comment in vfio_iommufd_bind(), such device driver won't provide .bind_iommufd(). So shall we allow this ioctl to go longer to call .open_device() instead of failing it here? I think we need to allow it to go further. E.g. leave the check to be in vfio_iommufd_bind(). Otherwise, user may not able to use such devices. Is it? > + > + ret = vfio_device_block_group(device); > + if (ret) > + return ret; > + > + mutex_lock(&device->dev_set->lock); > + /* > + * If already been bound to an iommufd, or already set noiommu > + * then fail it. > + */ > + if (df->iommufd || df->noiommu) { > + ret = -EINVAL; > + goto out_unlock; > + } > + > + /* iommufd < 0 means noiommu mode */ > + if (bind.iommufd < 0) { > + if (!capable(CAP_SYS_RAWIO)) { > + ret = -EPERM; > + goto out_unlock; > + } > + df->noiommu = true; > + } else { > + iommufd = vfio_get_iommufd_from_fd(bind.iommufd); > + if (IS_ERR(iommufd)) { > + ret = PTR_ERR(iommufd); > + goto out_unlock; > + } > + } > + > + /* > + * Before the device open, get the KVM pointer currently > + * associated with the device file (if there is) and obtain > + * a reference. This reference is held until device closed. > + * Save the pointer in the device for use by drivers. > + */ > + vfio_device_get_kvm_safe(df); > + > + df->iommufd = iommufd; > + ret = vfio_device_open(df, &bind.out_devid, NULL); > + if (ret) > + goto out_put_kvm; [...] > > /* --------------- IOCTLs for DEVICE file descriptors --------------- */ > > +/* > + * VFIO_DEVICE_BIND_IOMMUFD - _IOR(VFIO_TYPE, VFIO_BASE + 19, > + * struct vfio_device_bind_iommufd) > + * > + * Bind a vfio_device to the specified iommufd. > + * > + * The user should provide a device cookie when calling this ioctl. The > + * cookie is carried only in event e.g. I/O fault reported to userspace > + * via iommufd. The user should use devid returned by this ioctl to mark > + * the target device in other ioctls (e.g. capability query via iommufd). > + * > + * User is not allowed to access the device before the binding operation > + * is completed. > + * > + * Unbind is automatically conducted when device fd is closed. > + * > + * @argsz: user filled size of this data. > + * @flags: reserved for future extension. > + * @dev_cookie: a per device cookie provided by userspace. > + * @iommufd: iommufd to bind. a negative value means noiommu. > + * @out_devid: the device id generated by this bind. > + * > + * Return: 0 on success, -errno on failure. > + */ > +struct vfio_device_bind_iommufd { > + __u32 argsz; > + __u32 flags; > + __aligned_u64 dev_cookie; > + __s32 iommufd; > + __u32 out_devid; As above, for the devices that do not do DMA, there is no .bind_iommufd op, hence no iommufd_device generated. This means no good value can be filled in this out_devid field. So this field is optional. Only for the devices which do DMA, should this out_devid field return a valid ID otherwise an invalid ID would be filled (e.g. value #0 is an invalid value in the iommufd object id pool). Userspace needs to check if the out_devid is valid or not before use. This ID can be further used in iommufd uAPIs like IOMMU_HWPT_ALLOC, IOMMU_DEVICE_GET_INFO and etc. > +}; > + > +#define VFIO_DEVICE_BIND_IOMMUFD _IO(VFIO_TYPE, VFIO_BASE > + 19) > + > /** > * VFIO_DEVICE_GET_INFO - _IOR(VFIO_TYPE, VFIO_BASE + 7, > * struct vfio_device_info) > -- > 2.34.1 Regards, Yi Liu