Re: [PATCH v6 0/6] iommufd: Add nesting infrastructure (part 2/2)

Jason Gunthorpe <jgg@xxxxxxxxxx> · Wed, 13 Dec 2023 08:40:55 -0400

On Tue, Dec 12, 2023 at 12:05:41PM -0800, Nicolin Chen wrote:
> > > // iommufd_private.h
> > > 
> > > enum iommufd_object_type {
> > > 	...
> > > +	IOMMUFD_OBJ_VIOMMU,
> > > 	...
> > > };
> > > 
> > > +struct iommufd_viommu {
> > > +	struct iommufd_object obj;
> > > +	struct iommufd_hwpt_paging *hwpt;
> > > +	struct xarray devices;
> > > +};
> > > 
> > > struct iommufd_hwpt_paging hwpt {
> > > 	...
> > > +	struct list_head viommu_list;
> > > 	...
> > > };
> > 
> > I'd probably first try to go backwards and link the hwpt to the
> > viommu.
> 
> I think a VM should have only one hwpt_paging object while one
> or more viommu objects, so we could do only viommu->hwpt_paging
> and hwpt_paging->viommu_list. How to go backwards?

That is probably how things would work but I don't know if it makes
sense to enforce it in the kernel logic..

Point the S2 to a list of viommu objects it is linked to

> > > struct iommufd_group {
> > > 	...
> > > +	struct iommufd_viommu *viommu; // should we attach to viommu instead of hwpt?
> > > 	...
> > > };
> > 
> > No. Attach is a statement of translation so you still attach to the HWPT.
> 
> OK. It's probably not necessary since we know which piommu the
> device is behind. And we only need to link viommu and piommu,
> right?

Yes

> > The second version maybe we have the xarray, or maybe we just push the
> > xarray to the eventual viommu series.
> 
> I think that I still don't get the purpose of the xarray here.
> It was needed previously because a cache invalidate per hwpt
> doesn't know which device. Now IOMMUFD_DEV_INVALIDATE knows.
> 
> Maybe it's related to that narrative "logically we could have
> multiple mappings per iommufd" that you mentioned above. Mind
> elaborating a bit?
> 
> In my mind, viommu is allocated by VMM per piommu, by detecting
> the piommu_id via hw_info. In that case, viommu can only have
> one unique device list. If IOMMUFD_DEV_INVALIDATE passes in the
> dev_id, we don't really need a mapping of vRID-pRID in a multi-
> viommu case either? In another word, VMM already has a mapping
> from vRID to dev_id, so it could call the DEV_INVALIDATE ioctl
> in the first place?

The xarray exists to optimize the invalidation flow.

For SW you can imagine issuing an invalidation against the viommu
itself and *all* commands, be they ASID or ATC invalidations can be
processed in one shot. The xarray allows converting the vSID to pSID
to process ATC invalidations, and the viommu object forces a single
VMID to handle the ATC invalidations. If we want to do this, I don't
know.

For HW, the xarray holds the vSID to pSID mapping that must be
programmed into the HW operating the dedicated invalidation queue.

Jason