> -----Original Message-----
> From: Baolu Lu <baolu.lu@xxxxxxxxxxxxxxx>
> Sent: Friday, May 12, 2023 1:39 PM
> To: Liu, Yi L <yi.l.liu@xxxxxxxxx>; joro@xxxxxxxxxx; alex.williamson@xxxxxxxxxx;
> jgg@xxxxxxxxxx; Tian, Kevin <kevin.tian@xxxxxxxxx>; robin.murphy@xxxxxxx
> Cc: baolu.lu@xxxxxxxxxxxxxxx; cohuck@xxxxxxxxxx; eric.auger@xxxxxxxxxx;
> nicolinc@xxxxxxxxxx; kvm@xxxxxxxxxxxxxxx; mjrosato@xxxxxxxxxxxxx;
> chao.p.peng@xxxxxxxxxxxxxxx; yi.y.sun@xxxxxxxxxxxxxxx; peterx@xxxxxxxxxx;
> jasowang@xxxxxxxxxx; shameerali.kolothum.thodi@xxxxxxxxxx; lulu@xxxxxxxxxx;
> suravee.suthikulpanit@xxxxxxx; iommu@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> linux-kselftest@xxxxxxxxxxxxxxx; Duan, Zhenzhong <zhenzhong.duan@xxxxxxxxx>
> Subject: Re: [PATCH v3 3/4] iommufd: Add IOMMU_DEVICE_GET_HW_INFO
>
> On 5/11/23 10:30 PM, Yi Liu wrote:
> > Under nested IOMMU translation, userspace owns the stage-1 translation
> > table (e.g. the stage-1 page table of Intel VT-d or the context table
> > of ARM SMMUv3, and etc.). Stage-1 translation tables are vendor specific,
> > and needs to be compatiable with the underlying IOMMU hardware. Hence,
> > userspace should know the IOMMU hardware capability before creating and
> > configuring the stage-1 translation table to kernel.
> >
> > This adds IOMMU_DEVICE_GET_HW_INFO to query the IOMMU hardware information
> > for a given device. The returned data is vendor specific, userspace needs
> > to decode it with the structure mapped by the @out_data_type field.
> >
> > As only physical devices have IOMMU hardware, so this will return error
> > if the given device is not a physical device.
> >
> > Co-developed-by: Nicolin Chen <nicolinc@xxxxxxxxxx>
> > Signed-off-by: Nicolin Chen <nicolinc@xxxxxxxxxx>
> > Signed-off-by: Yi Liu <yi.l.liu@xxxxxxxxx>
> > ---
> >   drivers/iommu/iommufd/device.c          | 72 +++++++++++++++++++++++++
> >   drivers/iommu/iommufd/iommufd_private.h |  1 +
> >   drivers/iommu/iommufd/main.c            |  3 ++
> >   include/uapi/linux/iommufd.h            | 37 +++++++++++++
> >   4 files changed, 113 insertions(+)
> >
> > diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
> > index 051bd8e99858..bc99d092de8f 100644
> > --- a/drivers/iommu/iommufd/device.c
> > +++ b/drivers/iommu/iommufd/device.c
> > @@ -263,6 +263,78 @@ u32 iommufd_device_to_id(struct iommufd_device *idev)
> >   }
> >   EXPORT_SYMBOL_NS_GPL(iommufd_device_to_id, IOMMUFD);
> >
> > +static int iommufd_zero_fill_user(u64 ptr, int bytes)
> > +{
> > +	int index = 0;
> > +
> > +	for (; index < bytes; index++) {
> > +		if (put_user(0, (uint8_t __user *)u64_to_user_ptr(ptr + index)))
> > +			return -EFAULT;
> > +	}
> > +	return 0;
> > +}
> > +
> > +int iommufd_device_get_hw_info(struct iommufd_ucmd *ucmd)
> > +{
> > +	struct iommu_hw_info *cmd = ucmd->cmd;
> > +	unsigned int length = 0, data_len;
> > +	struct iommufd_device *idev;
> > +	const struct iommu_ops *ops;
> > +	void *data = NULL;
> > +	int rc = 0;
> > +
> > +	if (cmd->flags || cmd->__reserved || !cmd->data_len)
> > +		return -EOPNOTSUPP;
> > +
> > +	idev = iommufd_get_device(ucmd, cmd->dev_id);
> > +	if (IS_ERR(idev))
> > +		return PTR_ERR(idev);
> > +
> > +	ops = dev_iommu_ops(idev->dev);
> > +	if (!ops->hw_info)
> > +		goto done;
>
> If the iommu driver doesn't provide a hw_info callback, it still
> returns success?

Yes, as noted in the cover letter. This follows a remark from Jason.
In that case out_data_type is NULL, which means no driver-specific data
is filled into the buffer pointed to by cmd->data_ptr.

 - Let IOMMU_DEVICE_GET_HW_INFO succeed even if the underlying iommu
   driver does not have driver-specific data to report, per the remark
   below.
   https://lore.kernel.org/kvm/ZAcwJSK%2F9UVI9LXu@xxxxxxxxxx/

Regards,
Yi Liu

> > +
> > +	/* driver has hw_info callback should have a unique hw_info_type */
> > +	if (ops->hw_info_type == IOMMU_HW_INFO_TYPE_NONE) {
> > +		pr_warn_ratelimited("iommu driver set an invalid type\n");
> > +		rc = -ENODEV;
> > +		goto out_err;
> > +	}
> > +
> > +	data = ops->hw_info(idev->dev, &data_len);
> > +	if (IS_ERR(data)) {
> > +		rc = PTR_ERR(data);
> > +		goto out_err;
> > +	}
> > +
> > +	length = min(cmd->data_len, data_len);
> > +	if (copy_to_user(u64_to_user_ptr(cmd->data_ptr), data, length)) {
> > +		rc = -EFAULT;
> > +		goto out_err;
> > +	}
> > +
> > +	/*
> > +	 * Zero the trailing bytes if the user buffer is bigger than the
> > +	 * data size kernel actually has.
> > +	 */
> > +	if (length < cmd->data_len) {
> > +		rc = iommufd_zero_fill_user(cmd->data_ptr + length,
> > +					    cmd->data_len - length);
> > +		if (rc)
> > +			goto out_err;
> > +	}
> > +
> > +done:
> > +	cmd->data_len = length;
> > +	cmd->out_data_type = ops->hw_info_type;
> > +	rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
> > +
> > +out_err:
> > +	kfree(data);
> > +	iommufd_put_object(&idev->obj);
> > +	return rc;
> > +}
> > +
> >   static int iommufd_group_setup_msi(struct iommufd_group *igroup,
> >   				     struct iommufd_hw_pagetable *hwpt)
> >   {
>
> Best regards,
> baolu
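
For anyone following the thread, here is a minimal userspace sketch of the buffer semantics in the quoted hunk: copy min(user length, kernel length) bytes, zero any trailing user bytes, and leave the buffer untouched when the driver has no hw_info callback. It is only a model, not the iommufd code: fill_hw_info() is a hypothetical name, plain memcpy()/memset() stand in for copy_to_user()/put_user(), and a NULL kdata pointer is an assumed stand-in for the missing-callback case.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical userspace model of the copy/zero-fill logic in the
 * quoted iommufd_device_get_hw_info(); not the kernel implementation.
 *
 * user_buf/user_len model cmd->data_ptr/cmd->data_len.
 * kdata/kdata_len model the buffer returned by ops->hw_info().
 * kdata == NULL models a driver without a hw_info callback (assumption).
 *
 * Returns the value that would be written back to cmd->data_len.
 */
static size_t fill_hw_info(unsigned char *user_buf, size_t user_len,
			   const unsigned char *kdata, size_t kdata_len)
{
	size_t length;

	assert(user_buf != NULL && user_len > 0);

	/* Mirrors the "goto done" path: succeed, report 0 bytes, touch nothing. */
	if (kdata == NULL)
		return 0;

	/* Copy no more than either side can hold. */
	length = user_len < kdata_len ? user_len : kdata_len;
	memcpy(user_buf, kdata, length);

	/* Zero the trailing bytes when the user buffer is the larger one. */
	if (length < user_len)
		memset(user_buf + length, 0, user_len - length);

	return length;
}
```

With an 8-byte user buffer and 4 bytes of driver data, the call returns 4 and bytes 4..7 read back as zero. With no driver data it returns 0 and the buffer keeps whatever userspace put there, so the caller has to check the returned length and type before decoding.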