> From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> Sent: Wednesday, October 26, 2022 2:12 AM
>
> +int iommufd_ioas_allow_iovas(struct iommufd_ucmd *ucmd)
> +{
> +	struct iommu_ioas_allow_iovas *cmd = ucmd->cmd;
> +	struct rb_root_cached allowed_iova = RB_ROOT_CACHED;
> +	struct interval_tree_node *node;
> +	struct iommufd_ioas *ioas;
> +	struct io_pagetable *iopt;
> +	int rc = 0;
> +
> +	ioas = iommufd_get_ioas(ucmd, cmd->ioas_id);
> +	if (IS_ERR(ioas))
> +		return PTR_ERR(ioas);
> +	iopt = &ioas->iopt;

Missed the check of the __reserved field.

> +
> +int iommufd_ioas_copy(struct iommufd_ucmd *ucmd)
> +{
> +	struct iommu_ioas_copy *cmd = ucmd->cmd;
> +	struct iommufd_ioas *src_ioas;
> +	struct iommufd_ioas *dst_ioas;
> +	unsigned int flags = 0;
> +	LIST_HEAD(pages_list);
> +	unsigned long iova;
> +	int rc;
> +
> +	if ((cmd->flags &
> +	     ~(IOMMU_IOAS_MAP_FIXED_IOVA | IOMMU_IOAS_MAP_WRITEABLE |
> +	       IOMMU_IOAS_MAP_READABLE)))
> +		return -EOPNOTSUPP;
> +	if (cmd->length >= ULONG_MAX)
> +		return -EOVERFLOW;

... and also check for overflow on cmd->dest_iova/src_iova.

> +
> +	src_ioas = iommufd_get_ioas(ucmd, cmd->src_ioas_id);
> +	if (IS_ERR(src_ioas))
> +		return PTR_ERR(src_ioas);
> +	rc = iopt_get_pages(&src_ioas->iopt, cmd->src_iova, cmd->length,
> +			    &pages_list);
> +	iommufd_put_object(&src_ioas->obj);
> +	if (rc)
> +		goto out_pages;

This can be a direct return, given that iopt_get_pages() already called
iopt_free_pages_list() upon error. (A rough sketch of these validation and
cleanup changes follows after the ioctl table below.)

> +int iommufd_ioas_unmap(struct iommufd_ucmd *ucmd)
> +{
> +	struct iommu_ioas_unmap *cmd = ucmd->cmd;
> +	struct iommufd_ioas *ioas;
> +	unsigned long unmapped = 0;
> +	int rc;
> +
> +	ioas = iommufd_get_ioas(ucmd, cmd->ioas_id);
> +	if (IS_ERR(ioas))
> +		return PTR_ERR(ioas);
> +
> +	if (cmd->iova == 0 && cmd->length == U64_MAX) {
> +		rc = iopt_unmap_all(&ioas->iopt, &unmapped);
> +		if (rc)
> +			goto out_put;
> +	} else {
> +		if (cmd->iova >= ULONG_MAX || cmd->length >= ULONG_MAX) {
> +			rc = -EOVERFLOW;
> +			goto out_put;
> +		}

The above check can be moved before iommufd_get_ioas().

> +static int iommufd_option(struct iommufd_ucmd *ucmd)
> +{
> +	struct iommu_option *cmd = ucmd->cmd;
> +	int rc;
> +

This also lacks the __reserved check.

>  static struct iommufd_ioctl_op iommufd_ioctl_ops[] = {
>  	IOCTL_OP(IOMMU_DESTROY, iommufd_destroy, struct iommu_destroy, id),
> +	IOCTL_OP(IOMMU_IOAS_ALLOC, iommufd_ioas_alloc_ioctl,
> +		 struct iommu_ioas_alloc, out_ioas_id),
> +	IOCTL_OP(IOMMU_IOAS_ALLOW_IOVAS, iommufd_ioas_allow_iovas,
> +		 struct iommu_ioas_allow_iovas, allowed_iovas),
> +	IOCTL_OP(IOMMU_IOAS_COPY, iommufd_ioas_copy, struct iommu_ioas_copy,
> +		 src_iova),
> +	IOCTL_OP(IOMMU_IOAS_IOVA_RANGES, iommufd_ioas_iova_ranges,
> +		 struct iommu_ioas_iova_ranges, out_iova_alignment),
> +	IOCTL_OP(IOMMU_IOAS_MAP, iommufd_ioas_map, struct iommu_ioas_map,
> +		 __reserved),
> +	IOCTL_OP(IOMMU_IOAS_UNMAP, iommufd_ioas_unmap, struct iommu_ioas_unmap,
> +		 length),
> +	IOCTL_OP(IOMMU_OPTION, iommufd_option, struct iommu_option,
> +		 val64),
>  };

Just a personal preference - it reads better to me if the order above (and
the enum definition in iommufd.h) matched the order in which those commands
are defined/explained in iommufd.h.
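To make the validation comments above concrete, below is a rough sketch of
how the beginning of iommufd_ioas_copy() could look with the missing checks
folded in and with a direct return when iopt_get_pages() fails. The
destination field is written as dst_iova here and the exact ordering is my
assumption for illustration only, not taken from the patch; the __reserved
comments for iommufd_ioas_allow_iovas() and iommufd_option() would
presumably amount to the same one-line "if (cmd->__reserved) return
-EOPNOTSUPP;" check:

	if ((cmd->flags &
	     ~(IOMMU_IOAS_MAP_FIXED_IOVA | IOMMU_IOAS_MAP_WRITEABLE |
	       IOMMU_IOAS_MAP_READABLE)))
		return -EOPNOTSUPP;
	/* Reject anything that cannot be represented in an unsigned long */
	if (cmd->length >= ULONG_MAX || cmd->dst_iova >= ULONG_MAX ||
	    cmd->src_iova >= ULONG_MAX)
		return -EOVERFLOW;

	src_ioas = iommufd_get_ioas(ucmd, cmd->src_ioas_id);
	if (IS_ERR(src_ioas))
		return PTR_ERR(src_ioas);
	rc = iopt_get_pages(&src_ioas->iopt, cmd->src_iova, cmd->length,
			    &pages_list);
	iommufd_put_object(&src_ioas->obj);
	if (rc)
		return rc;	/* pages_list already freed by iopt_get_pages() */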
> +/**
> + * struct iommu_ioas_iova_ranges - ioctl(IOMMU_IOAS_IOVA_RANGES)
> + * @size: sizeof(struct iommu_ioas_iova_ranges)
> + * @ioas_id: IOAS ID to read ranges from
> + * @num_iovas: Input/Output total number of ranges in the IOAS
> + * @__reserved: Must be 0
> + * @allowed_iovas: Pointer to the output array of struct iommu_iova_range
> + * @out_iova_alignment: Minimum alignment required for mapping IOVA
> + *
> + * Query an IOAS for ranges of allowed IOVAs. Mapping IOVA outside these ranges
> + * is not allowed. out_num_iovas will be set to the total number of iovas and
> + * the out_valid_iovas[] will be filled in as space permits.

out_num_iovas and out_valid_iovas[] are stale.

> + *
> + * The allowed ranges are dependent on the HW path the DMA operation takes, and
> + * can change during the lifetime of the IOAS. A fresh empty IOAS will have a
> + * full range, and each attached device will narrow the ranges based on that
> + * devices HW restrictions. Detatching a device can widen the ranges.

devices -> device's

> +/**
> + * struct iommu_ioas_allow_iovas - ioctl(IOMMU_IOAS_ALLOW_IOVAS)
> + * @size: sizeof(struct iommu_ioas_allow_iovas)
> + * @ioas_id: IOAS ID to allow IOVAs from

Missed num_iovas and __reserved.

> + * @allowed_iovas: Pointer to array of struct iommu_iova_range
> + *
> + * Ensure a range of IOVAs are always available for allocation. If this call
> + * succeeds then IOMMU_IOAS_IOVA_RANGES will never return a list of IOVA ranges
> + * that are narrower than the ranges provided here. This call will fail if
> + * IOMMU_IOAS_IOVA_RANGES is currently narrower than the given ranges.
> + *
> + * When an IOAS is first created the IOVA_RANGES will be maximally sized, and as
> + * devices are attached the IOVA will narrow based on the device restrictions.
> + * When an allowed range is specified any narrowing will be refused, ie device
> + * attachment can fail if the device requires limiting within the allowed range.
> + *
> + * Automatic IOVA allocation is also impacted by this call. MAP will only
> + * allocate within the allowed IOVAs if they are present.

According to iopt_check_iova(), FIXED_IOVA can specify an iova which is not
in the allowed list but is in the list of reported IOVA_RANGES. Is that
correct, or would it make more sense to have FIXED_IOVA also guarded by the
allowed list (and fail the map call if it is violated)? (A rough sketch of
such a guard follows at the end of these comments.)

> +/**
> + * struct iommu_ioas_unmap - ioctl(IOMMU_IOAS_UNMAP)
> + * @size: sizeof(struct iommu_ioas_unmap)
> + * @ioas_id: IOAS ID to change the mapping of
> + * @iova: IOVA to start the unmapping at
> + * @length: Number of bytes to unmap, and return back the bytes unmapped
> + *
> + * Unmap an IOVA range. The iova/length must be a superset of a previously
> + * mapped range used with IOMMU_IOAS_PAGETABLE_MAP or COPY.

Remove 'PAGETABLE'.

> +/**
> + * enum iommufd_option
> + * @IOMMU_OPTION_RLIMIT_MODE:
> + *    Change how RLIMIT_MEMLOCK accounting works. The caller must have privilege
> + *    to invoke this. Value 0 (default) is user based accouting, 1 uses process
> + *    based accounting. Global option, object_id must be 0
> + * @IOMMU_OPTION_HUGE_PAGES:
> + *    Value 1 (default) allows contiguous pages to be combined when generating
> + *    iommu mappings. Value 0 disables combining, everything is mapped to
> + *    PAGE_SIZE. This can be useful for benchmarking. This is a per-IOAS
> + *    option, the object_id must be the IOAS ID.

What about an HWPT ID? Is there value in supporting HWPTs with different
mapping sizes attached to the same IOAS?
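Coming back to the FIXED_IOVA question above: the guard being asked about
could look roughly like the sketch below, assuming the allowed ranges are
kept in an interval tree on the io_pagetable. The helper name, the error
code and the locking annotation are hypothetical and only meant to
illustrate the semantics, not to describe the patch:

	/*
	 * Hypothetical: fail a FIXED_IOVA map/copy unless [iova, last] is
	 * fully covered by the allowed list; an empty list keeps the
	 * current behaviour.
	 */
	static int iopt_check_iova_allowed(struct io_pagetable *iopt,
					   unsigned long iova,
					   unsigned long last)
	{
		struct interval_tree_node *node;

		lockdep_assert_held(&iopt->iova_rwsem);

		if (RB_EMPTY_ROOT(&iopt->allowed_itree.rb_root))
			return 0;

		node = interval_tree_iter_first(&iopt->allowed_itree, iova, last);
		while (node) {
			if (node->start > iova)
				return -EPERM;	/* gap before this allowed range */
			if (node->last >= last)
				return 0;	/* fully covered */
			iova = node->last + 1;
			node = interval_tree_iter_next(node, iova, last);
		}
		return -EPERM;
	}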
> +/**
> + * @size: sizeof(struct iommu_option)
> + * @option_id: One of enum iommufd_option
> + * @op: One of enum iommufd_option_ops
> + * @__reserved: Must be 0
> + * @object_id: ID of the object if required
> + * @val64: Option value to set or value returned on get
> + *
> + * Change a simple option value. This multiplexor allows controlling a options
> + * on objects. IOMMU_OPTION_OP_SET will load an option and IOMMU_OPTION_OP_GET
> + * will return the current value.
> + */

This is quite generic. Does it imply that future device capability reporting
could also be implemented on top of this cmd, i.e. an OP_GET on a device
object?
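As an aside on how the multiplexor is used, a minimal userspace sketch of
setting and reading back the per-IOAS huge page option might look like the
following. iommufd and ioas_id are assumed to come from an earlier open()
of /dev/iommu and an IOMMU_IOAS_ALLOC, and the usual headers
(linux/iommufd.h, sys/ioctl.h, err.h, stdio.h) are assumed:

	struct iommu_option cmd = {
		.size = sizeof(cmd),
		.option_id = IOMMU_OPTION_HUGE_PAGES,
		.op = IOMMU_OPTION_OP_SET,
		.object_id = ioas_id,		/* per-IOAS option */
		.val64 = 0,			/* map everything at PAGE_SIZE */
	};

	if (ioctl(iommufd, IOMMU_OPTION, &cmd))
		err(1, "IOMMU_OPTION set failed");

	cmd.op = IOMMU_OPTION_OP_GET;		/* read the value back */
	if (ioctl(iommufd, IOMMU_OPTION, &cmd))
		err(1, "IOMMU_OPTION get failed");
	printf("huge page combining: %llu\n", (unsigned long long)cmd.val64);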