On Mon, Jul 24, 2023 at 04:03:57AM -0700, Yi Liu wrote: > + switch (pt_obj->type) { > + case IOMMUFD_OBJ_IOAS: > + ioas = container_of(pt_obj, struct iommufd_ioas, obj); > + break; > + case IOMMUFD_OBJ_HW_PAGETABLE: > + /* pt_id points HWPT only when hwpt_type is !IOMMU_HWPT_TYPE_DEFAULT */ > + if (cmd->hwpt_type == IOMMU_HWPT_TYPE_DEFAULT) { > + rc = -EINVAL; > + goto out_put_pt; > + } > + > + parent = container_of(pt_obj, struct iommufd_hw_pagetable, obj); > + /* > + * Cannot allocate user-managed hwpt linking to auto_created > + * hwpt. If the parent hwpt is already a user-managed hwpt, > + * don't allocate another user-managed hwpt linking to it. > + */ > + if (parent->auto_domain || parent->parent) { > + rc = -EINVAL; > + goto out_put_pt; > + } > + ioas = parent->ioas; Why do we set ioas here? I would think it should be NULL. I think it is looking like a mistake to try and re-use iommufd_hw_pagetable_alloc() directly for the nested case. It should not have a IOAS argument, it should not do enforce_cc, or iopt_* functions So must of the function is removed. Probably better to make a new ioas-less function for the nesting case. > diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c > index 510db114fc61..5f4420626421 100644 > --- a/drivers/iommu/iommufd/main.c > +++ b/drivers/iommu/iommufd/main.c > @@ -426,7 +426,7 @@ static const struct iommufd_ioctl_op iommufd_ioctl_ops[] = { > IOCTL_OP(IOMMU_GET_HW_INFO, iommufd_get_hw_info, struct iommu_hw_info, > __reserved), > IOCTL_OP(IOMMU_HWPT_ALLOC, iommufd_hwpt_alloc, struct iommu_hwpt_alloc, > - __reserved), > + data_uptr), Nono, these can never change once we put them it. It specifies the hard minimum size that userspace must provide. If userspace gives less than this then the ioctl always fails. Changing it breaks all the existing software. The core code ensures that the trailing part of the cmd struct is zero'd the extended implementation must cope with Zero'd values, which this does. > diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h > index f2026cde2d64..73bf9af91e99 100644 > --- a/include/uapi/linux/iommufd.h > +++ b/include/uapi/linux/iommufd.h > @@ -364,12 +364,27 @@ enum iommu_hwpt_type { > * @pt_id: The IOAS to connect this HWPT to > * @out_hwpt_id: The ID of the new HWPT > * @__reserved: Must be 0 > + * @hwpt_type: One of enum iommu_hwpt_type > + * @data_len: Length of the type specific data > + * @data_uptr: User pointer to the type specific data > * > * Explicitly allocate a hardware page table object. This is the same object > * type that is returned by iommufd_device_attach() and represents the > * underlying iommu driver's iommu_domain kernel object. > * > - * A HWPT will be created with the IOVA mappings from the given IOAS. > + * A kernel-managed HWPT will be created with the mappings from the given > + * IOAS via the @pt_id. The @hwpt_type for this allocation can be set to > + * either IOMMU_HWPT_TYPE_DEFAULT or a pre-defined type corresponding to > + * an I/O page table type supported by the underlying IOMMU hardware. > + * A user-managed HWPT will be created from a given parent HWPT via the > + * @pt_id, in which the parent HWPT must be allocated previously via the > + * same ioctl from a given IOAS (@pt_id). In this case, the @hwpt_type > + * must be set to a pre-defined type corresponding to an I/O page table > + * type supported by the underlying IOMMU hardware. > + * > + * If the @hwpt_type is set to IOMMU_HWPT_TYPE_DEFAULT, both the @data_len > + * and the @data_uptr will be ignored. Otherwise, both must be > given. If the @hwpt_type is set to IOMMU_HWPT_TYPE_DEFAULT then @data_len must be zero. Jason