On Tue, Dec 12, 2023 at 11:13:37AM -0800, Nicolin Chen wrote: > On Tue, Dec 12, 2023 at 10:44:21AM -0400, Jason Gunthorpe wrote: > > On Mon, Dec 11, 2023 at 11:30:00PM -0800, Nicolin Chen wrote: > > > > > > > Could the structure just look like this? > > > > > struct iommu_dev_assign_virtual_id { > > > > > __u32 size; > > > > > __u32 dev_id; > > > > > __u32 id_type; > > > > > __u32 id; > > > > > }; > > > > > > > > It needs to take in the viommu_id also, and I'd make the id 64 bits > > > > just for good luck. > > > > > > What is viommu_id required for in this context? I thought we > > > already know which SMMU instance to issue commands via dev_id? > > > > The viommu_id would be the container that holds the xarray that maps > > the vRID to pRID > > > > Logically we could have multiple mappings per iommufd as we could have > > multiple iommu instances working here. > > I see. This is the object to hold a shared stage-2 HWPT/domain then. It could be done like that, yes. I wasn't thinking about linking the stage two so tightly but perhaps? If we can avoid putting the hwpt here that might be more general. > // iommufd_private.h > > enum iommufd_object_type { > ... > + IOMMUFD_OBJ_VIOMMU, > ... > }; > > +struct iommufd_viommu { > + struct iommufd_object obj; > + struct iommufd_hwpt_paging *hwpt; > + struct xarray devices; > +}; > > struct iommufd_hwpt_paging hwpt { > ... > + struct list_head viommu_list; > ... > }; I'd probably first try to go backwards and link the hwpt to the viommu. > struct iommufd_group { > ... > + struct iommufd_viommu *viommu; // should we attach to viommu instead of hwpt? > ... > }; No. Attach is a statement of translation so you still attach to the HWPT. > Question to finalize how we maps vRID-pRID in the xarray: > how should IOMMUFD_DEV_INVALIDATE work? The ioctl structure has > a dev_id and a list of commands that belongs to the device. So, > it forwards the struct device pointer to the driver along with > the commands. Then, doesn't the driver already know the pRID > from the dev pointer without looking up a vRID-pRID table? The first version of DEV_INVALIDATE should have no xarray. The invalidate commands are stripped of the SID and executed on the given dev_id period. VMM splits up the invalidate command list. The second version maybe we have the xarray, or maybe we just push the xarray to the eventual viommu series. > struct iommu_hwpt_alloc { > ... > + __u32 viommu_id; > }; > > +enum iommu_dev_virtual_id_type { > + IOMMU_DEV_VIRTUAL_ID_TYPE_AMD_VIOMMU_DID, // not sure how this fits the xarray in viommu obj. > + IOMMU_DEV_VIRTUAL_ID_TYPE_AMD_VIOMMU_RID, It is just DID. In both cases the ID is the index to the "STE" radix tree, whatever the driver happens to call it. > Then, I think that we also need an iommu_viommu_alloc structure > and ioctl to allocate an object, and that VMM should know if it > needs to allocate multiple viommu objects -- this probably needs > the hw_info ioctl to return a piommu_id so VMM gets the list of > piommus from the attached devices? Yes and yes > Another question, introducing the viommu obj complicates things > a lot. Do we want it to come with iommu_dev_assign_virtual_id, > or maybe put in a later series? We could stage the xarray in the > iommu_hwpt_paging struct for now, so a single-IOMMU system could > still work with that. All this would be in its own series to enable HW accelerated viommu support on ARM & AMD as we've been doing so far. I imagine it after we get the basic invalidation done > > > And should we rename the "cache_invalidate_user"? Would VT-d > > > still uses it for device cache? > > > > I think vt-d will not implement it > > Then should we "s/cache_invalidate_user/iotlb_sync_user"? I think cache_invalidate is still a fine name.. vt-d will generate ATC invalidations under that function too. Jason