On Thu, May 23, 2024 at 01:43:45AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > Sent: Wednesday, May 22, 2024 9:39 PM
> >
> > VIOMMU contains:
> >  - A nesting parent
> >  - A KVM
> >  - Any global per-VM data the driver needs
> >    * In ARM case this is VMID, sometimes shared with KVM
>
> In which case is it not shared with KVM? I had the impression that
> VMID always comes from KVM in this VCMDQ usage. 😊

Not actually. I guess that shared VMID is for BTM.

> > On ARM the S2 is not divorced from the VIOMMU, ARM requires a single
> > VMID, shared with KVM, and localized to a single VM for some of the
> > bypass features (vBTM, vCMDQ). So to attach a S2 you actually have to
> > attach the VIOMMU to pick up the correct VMID.
> >
> > I imagine something like this:
> >  hwpt_alloc(deva, nesting_parent=true) = shared_s2
> >  viommu_alloc(deva, shared_s2) = viommu1
> >  viommu_alloc(devb, shared_s2) = viommu2
> >  hwpt_alloc(deva, viommu1, vste) = deva_vste
> >  hwpt_alloc(devb, viommu2, vste) = devb_vste
> >  attach(deva, deva_vste)
> >  attach(devb, devb_vste)
> >  attach(devc, shared_s2)
>
> I wonder whether we want to make viommu as the 1st-class citizen
> for any nested hwpt if it is desirable to enable it even for VT-d which
> lacks of a hw viommu concept at the moment.

I think Jason is simply using SMMU as an example here.

Also FWIW, I am trying a core-allocated core-managed viommu for
IOMMU_VIOMMU_TYPE_DEFAULT. So the VT-d driver doesn't need to hold
a viommu, while the VMM could still allocate one if it wants. And
the VIOMMU interface can provide some helpers if the driver wants
some info from the core-managed viommu: a virtual dev ID to
physical dev ID (returning a device pointer) translation, for
example. And we can add more after we brainstorm.

Sample change:

@@ -623,6 +625,18 @@ struct iommu_ops {
+ * @viommu_alloc: Allocate an iommufd_viommu associating to a nested parent
+ *                @domain as a user space IOMMU instance for HW-accelerated
+ *                features from the physical IOMMU behind the @dev. The
+ *                @viommu_type must be defined in include/uapi/linux/iommufd.h
+ *                It is suggested to call iommufd_viommu_alloc() helper for
+ *                a bundled allocation of the core and the driver structures,
+ *                using the given @ictx pointer.
+ * @default_viommu_ops: Driver can choose to use a default core-allocated core-
+ *                      managed viommu object by providing a default viommu ops.
+ *                      Otherwise, i.e. for a driver-managed viommu, viommu_ops
+ *                      should be passed in via iommufd_viommu_alloc() helper in
+ *                      its own viommu_alloc op.
[..]
+int iommufd_viommu_alloc_ioctl(struct iommufd_ucmd *ucmd)
+{
...
+	if (cmd->type == IOMMU_VIOMMU_TYPE_DEFAULT) {
+		viommu = __iommufd_viommu_alloc(
+			ucmd->ictx, sizeof(*viommu),
+			domain->ops->default_viommu_ops);
+	} else {
+		if (!domain->ops->viommu_alloc) {
+			rc = -EOPNOTSUPP;
+			goto out_put_hwpt;
+		}
+
+		viommu = domain->ops->viommu_alloc(domain, idev->dev,
+						   ucmd->ictx, cmd->type);
+	}
[..]

// Helper:
+struct device *
+iommufd_viommu_find_device(struct iommufd_viommu *viommu, u64 id);
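A driver could then use that helper roughly like below. This is only
an untested sketch of the idea: the surrounding driver function, how
it receives the virtual dev ID, and what it does with the physical
device are all made up for illustration.

/*
 * Hypothetical driver-side user of the helper: translate a virtual
 * dev ID coming from the VM into the physical device behind the
 * core-managed viommu, then act on that device. The function name,
 * flow and error code here are illustrative only.
 */
static int my_driver_handle_vdev_req(struct iommufd_viommu *viommu,
				     u64 virt_dev_id)
{
	struct device *dev;

	dev = iommufd_viommu_find_device(viommu, virt_dev_id);
	if (!dev)
		return -ENODEV;

	/* e.g. forward a device-scoped invalidation to the physical RID */
	return 0;
}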
> > The driver will then know it should program three different VMIDs for
> > the same S2 page table, which matches the ARM expectation for
> > VMID. That is to say we'd pass in the viommu as the pt_id for the
> > iommu_hwpt_alloc. The viommu would imply both the S2 page table and
> > any meta information like VMID the driver needs.
>
> Can you elaborate the aspect about "three different VMIDs"? They are
> all for the same VM hence sharing the same VMID per the earlier
> description.
>
> This is also echo-ed in patch14:
>
>   tegra241_cmdqv_viommu_alloc()
>     vintf->vmid = smmu_domain->vmid;

The design in this series is still the old one, using a 1:1
relationship between a viommu and an S2 domain. I think the "three"
is from his SMMU example above? Leaving it to Jason to reply though.

> > now. If someone needs them linked someday we can add a viommu_id to
> > the create pri queue command.
>
> I'm more worried about the potential conflict between the vqueue
> object here and the fault queue object in Baolu's series, if we want
> to introduce vIOMMU concept to platforms which lack of the hw
> support.

I actually see one argument about whether we should use a vqueue
vs. Baolu's fault queue object, and a counter argument about whether
we should also use a vqueue for viommu invalidation vs. an
array-based invalidation request that we similarly have for HWPT...

Thanks
Nicolin