On Fri, Nov 04, 2022 at 08:32:30AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > Sent: Wednesday, October 26, 2022 2:12 AM
> >
> > +int iommufd_ioas_allow_iovas(struct iommufd_ucmd *ucmd)
> > +{
> > +	struct iommu_ioas_allow_iovas *cmd = ucmd->cmd;
> > +	struct rb_root_cached allowed_iova = RB_ROOT_CACHED;
> > +	struct interval_tree_node *node;
> > +	struct iommufd_ioas *ioas;
> > +	struct io_pagetable *iopt;
> > +	int rc = 0;
> > +
> > +	ioas = iommufd_get_ioas(ucmd, cmd->ioas_id);
> > +	if (IS_ERR(ioas))
> > +		return PTR_ERR(ioas);
> > +	iopt = &ioas->iopt;
>
> Missed the check of __reserved field

Done

> > +
> > +int iommufd_ioas_copy(struct iommufd_ucmd *ucmd)
> > +{
> > +	struct iommu_ioas_copy *cmd = ucmd->cmd;
> > +	struct iommufd_ioas *src_ioas;
> > +	struct iommufd_ioas *dst_ioas;
> > +	unsigned int flags = 0;
> > +	LIST_HEAD(pages_list);
> > +	unsigned long iova;
> > +	int rc;
> > +
> > +	if ((cmd->flags &
> > +	     ~(IOMMU_IOAS_MAP_FIXED_IOVA | IOMMU_IOAS_MAP_WRITEABLE |
> > +	       IOMMU_IOAS_MAP_READABLE)))
> > +		return -EOPNOTSUPP;
> > +	if (cmd->length >= ULONG_MAX)
> > +		return -EOVERFLOW;
>
> and overflow on cmd->dest_iova/src_iova

Yep

> > +
> > +	src_ioas = iommufd_get_ioas(ucmd, cmd->src_ioas_id);
> > +	if (IS_ERR(src_ioas))
> > +		return PTR_ERR(src_ioas);
> > +	rc = iopt_get_pages(&src_ioas->iopt, cmd->src_iova, cmd->length,
> > +			    &pages_list);
> > +	iommufd_put_object(&src_ioas->obj);
> > +	if (rc)
> > +		goto out_pages;
>
> direct return given iopt_get_pages() already called
> iopt_free_pages_list() upon error.

Ok

> > +int iommufd_ioas_unmap(struct iommufd_ucmd *ucmd)
> > +{
> > +	struct iommu_ioas_unmap *cmd = ucmd->cmd;
> > +	struct iommufd_ioas *ioas;
> > +	unsigned long unmapped = 0;
> > +	int rc;
> > +
> > +	ioas = iommufd_get_ioas(ucmd, cmd->ioas_id);
> > +	if (IS_ERR(ioas))
> > +		return PTR_ERR(ioas);
> > +
> > +	if (cmd->iova == 0 && cmd->length == U64_MAX) {
> > +		rc = iopt_unmap_all(&ioas->iopt, &unmapped);
> > +		if (rc)
> > +			goto out_put;
> > +	} else {
> > +		if (cmd->iova >= ULONG_MAX || cmd->length >= ULONG_MAX) {
> > +			rc = -EOVERFLOW;
> > +			goto out_put;
> > +		}
>
> Above check can be moved before iommufd_get_ioas().

They have to be after this:

	if (cmd->iova == 0 && cmd->length == U64_MAX) {

Or it will false trigger.
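The length == U64_MAX sentinel is itself >= ULONG_MAX, so a hoisted
check would -EOVERFLOW a legitimate unmap-all. Spelled out, the
ordering in iommufd_ioas_unmap() has to stay like this (just a sketch,
assuming the ranged path still ends in iopt_unmap_iova() with the
unwind as in the patch):

	if (cmd->iova == 0 && cmd->length == U64_MAX) {
		/* Sentinel: unmap everything in the IOAS */
		rc = iopt_unmap_all(&ioas->iopt, &unmapped);
		if (rc)
			goto out_put;
	} else {
		/*
		 * Only ranged unmaps reach this check; the sentinel was
		 * already consumed above, so U64_MAX cannot false trigger
		 * -EOVERFLOW here.
		 */
		if (cmd->iova >= ULONG_MAX || cmd->length >= ULONG_MAX) {
			rc = -EOVERFLOW;
			goto out_put;
		}
		rc = iopt_unmap_iova(&ioas->iopt, cmd->iova, cmd->length,
				     &unmapped);
		if (rc)
			goto out_put;
	}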
I prefer "keep sorted" for these kinds of lists, it is much easier to maintain in the long run > > + * Query an IOAS for ranges of allowed IOVAs. Mapping IOVA outside these > > ranges > > + * is not allowed. out_num_iovas will be set to the total number of iovas > > and > > + * the out_valid_iovas[] will be filled in as space permits. > > out_num_iovas and out_valid_iovas[] are stale. Done > > + * > > + * The allowed ranges are dependent on the HW path the DMA operation > > takes, and > > + * can change during the lifetime of the IOAS. A fresh empty IOAS will have a > > + * full range, and each attached device will narrow the ranges based on that > > + * devices HW restrictions. Detatching a device can widen the ranges. > > devices -> device's Ok > > +/** > > + * struct iommu_ioas_allow_iovas - ioctl(IOMMU_IOAS_ALLOW_IOVAS) > > + * @size: sizeof(struct iommu_ioas_allow_iovas) > > + * @ioas_id: IOAS ID to allow IOVAs from > > missed num_iovas and __reserved Hurm. how do I get make W=1 to cover the header files? > > + * @allowed_iovas: Pointer to array of struct iommu_iova_range > > + * > > + * Ensure a range of IOVAs are always available for allocation. If this call > > + * succeeds then IOMMU_IOAS_IOVA_RANGES will never return a list of > > IOVA ranges > > + * that are narrower than the ranges provided here. This call will fail if > > + * IOMMU_IOAS_IOVA_RANGES is currently narrower than the given ranges. > > + * > > + * When an IOAS is first created the IOVA_RANGES will be maximally sized, > > and as > > + * devices are attached the IOVA will narrow based on the device > > restrictions. > > + * When an allowed range is specified any narrowing will be refused, ie > > device > > + * attachment can fail if the device requires limiting within the allowed > > range. > > + * > > + * Automatic IOVA allocation is also impacted by this call. MAP will only > > + * allocate within the allowed IOVAs if they are present. > > According to iopt_check_iova() FIXED_IOVA can specify an iova which > is not in allowed list but in the list of reported IOVA_RANGES. Is it > correct or make more sense to have FIXED_IOVA also under guard of > the allowed list (if violating then fail the map call)? The concept was the allow list only really impacts domain attachment. When a user uses FIXED they have to know what they are doing. There is not a good reason to deny the user to use any IOVA that is not restricted by the reserved list. > > + * @length: Number of bytes to unmap, and return back the bytes > > unmapped > > + * > > + * Unmap an IOVA range. The iova/length must be a superset of a > > previously > > + * mapped range used with IOMMU_IOAS_PAGETABLE_MAP or COPY. > > remove 'PAGETABLE' Done > > > +/** > > + * enum iommufd_option > > + * @IOMMU_OPTION_RLIMIT_MODE: > > + * Change how RLIMIT_MEMLOCK accounting works. The caller must have > > privilege > > + * to invoke this. Value 0 (default) is user based accouting, 1 uses process > > + * based accounting. Global option, object_id must be 0 > > + * @IOMMU_OPTION_HUGE_PAGES: > > + * Value 1 (default) allows contiguous pages to be combined when > > generating > > + * iommu mappings. Value 0 disables combining, everything is mapped to > > + * PAGE_SIZE. This can be useful for benchmarking. This is a per-IOAS > > + * option, the object_id must be the IOAS ID. > > What about HWPT ID? Is there value of supporting HWPT's with different > mapping size attached to the same IOAS? This is a global IOAS flag, it just makes everything use PAGE_SIZE. 
Thanks,
Jason