> From: Auger Eric <eric.auger@xxxxxxxxxx> > Sent: Monday, July 6, 2020 9:01 PM > > On 7/6/20 2:20 PM, Liu, Yi L wrote: > > Hi Eric, > > > >> From: Auger Eric <eric.auger@xxxxxxxxxx> > >> Sent: Monday, July 6, 2020 5:34 PM > >> > >> On 7/4/20 1:26 PM, Liu Yi L wrote: > >>> IOMMUs that support nesting translation needs report the capability > >>> info > >> need to report > >>> to userspace, e.g. the format of first level/stage paging structures. > >>> > >>> This patch reports nesting info by DOMAIN_ATTR_NESTING. Caller can > >>> get nesting info after setting DOMAIN_ATTR_NESTING. > >>> > >>> Cc: Kevin Tian <kevin.tian@xxxxxxxxx> > >>> CC: Jacob Pan <jacob.jun.pan@xxxxxxxxxxxxxxx> > >>> Cc: Alex Williamson <alex.williamson@xxxxxxxxxx> > >>> Cc: Eric Auger <eric.auger@xxxxxxxxxx> > >>> Cc: Jean-Philippe Brucker <jean-philippe@xxxxxxxxxx> > >>> Cc: Joerg Roedel <joro@xxxxxxxxxx> > >>> Cc: Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx> > >>> Signed-off-by: Liu Yi L <yi.l.liu@xxxxxxxxx> > >>> Signed-off-by: Jacob Pan <jacob.jun.pan@xxxxxxxxxxxxxxx> > >>> --- > >>> v3 -> v4: > >>> *) split the SMMU driver changes to be a separate patch > >>> *) move the @addr_width and @pasid_bits from vendor specific > >>> part to generic part. > >>> *) tweak the description for the @features field of struct > >>> iommu_nesting_info. > >>> *) add description on the @data[] field of struct iommu_nesting_info > >>> > >>> v2 -> v3: > >>> *) remvoe cap/ecap_mask in iommu_nesting_info. > >>> *) reuse DOMAIN_ATTR_NESTING to get nesting info. > >>> *) return an empty iommu_nesting_info for SMMU drivers per Jean' > >>> suggestion. > >>> --- > >>> include/uapi/linux/iommu.h | 78 > >> ++++++++++++++++++++++++++++++++++++++++++++++ > >>> 1 file changed, 78 insertions(+) > >>> > >>> diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h > >>> index 1afc661..1bfc032 100644 > >>> --- a/include/uapi/linux/iommu.h > >>> +++ b/include/uapi/linux/iommu.h > >>> @@ -332,4 +332,82 @@ struct iommu_gpasid_bind_data { > >>> } vendor; > >>> }; > >>> > >>> +/* > >>> + * struct iommu_nesting_info - Information for nesting-capable IOMMU. > >>> + * user space should check it before using > >>> + * nesting capability. > >> alignment? > > > > oh, yes, will do it. > > > >>> + * > >>> + * @size: size of the whole structure > >>> + * @format: PASID table entry format, the same definition with > >>> + * @format of struct iommu_gpasid_bind_data. > >> the same definition as struct iommu_gpasid_bind_data @format? > > > > right. yours is much better. > > > >>> + * @features: supported nesting features. > >>> + * @flags: currently reserved for future extension. > >>> + * @addr_width: The output addr width of first level/stage translation > >>> + * @pasid_bits: Maximum supported PASID bits, 0 represents no PASID > >>> + * support. > >>> + * @data: vendor specific cap info. data[] structure type can be deduced > >>> + * from @format field. > >>> + * > >>> + * > >> > +===============+=================================================== > >> ===+ > >>> + * | feature | Notes | > >>> + * > >> > +===============+=================================================== > >> ===+ > >>> + * | SYSWIDE_PASID | PASIDs are managed in system-wide, instead of per | > >>> + * | | device. When a device is assigned to userspace or | > >>> + * | | VM, proper uAPI (userspace driver framework uAPI, | > >>> + * | | e.g. VFIO) must be used to allocate/free PASIDs for | > >>> + * | | the assigned device. | > >>> + * +---------------+------------------------------------------------------+ > >>> + * | BIND_PGTBL | The owner of the first level/stage page table must | > >>> + * | | explicitly bind the page table to associated PASID | > >>> + * | | (either the one specified in bind request or the | > >>> + * | | default PASID of iommu domain), through userspace | > >>> + * | | driver framework uAPI (e.g. VFIO_IOMMU_NESTING_OP). | > >>> + * +---------------+------------------------------------------------------+ > >>> + * | CACHE_INVLD | The owner of the first level/stage page table must | > >>> + * | | explicitly invalidate the IOMMU cache through uAPI | > >>> + * | | provided by userspace driver framework (e.g. VFIO) | > >>> + * | | according to vendor-specific requirement when | > >>> + * | | changing the page table. | > >>> + * > >>> + +---------------+------------------------------------------------- > >>> + -----+ > >> Do you foresee cases where BIND_PGTBL and CACHE_INVLD shouldn't be > >> exposed as features? > > > > sorry, I didn't quite get it. could you explain a little bit more. :-) > For SYSWIDE_PASID I understand SMMU won't advertise it. But do you foresee any > nested implementation not requesting the owner of the tables to bind and invalidate > caches. So I understand those 2 features would always be supported? I think BIND_PGTBL is optional as ARM will bind guest pasid table to host. As for CACHE_INVLD, per current nesting implementations, looks all stage-1/level-1 owner should issue cache invalidation when stage-1/level-1 changed. But still added it here per the comments from Kevin. :-) "So far this assumption is correct but it may not be true when thinking forward. For example, a vendor might find a way to allow the owner of 1st-level page table to directly invalidate cache w/o going through host IOMMU driver. From this angle I feel explicitly reporting this capability is more robust." https://lore.kernel.org/linux-iommu/MWHPR11MB1645B09EBDC76514ADC897A68C6F0@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/ Regards, Yi Liu > > > >>> + * > >>> + * @data[] types defined for @format: > >>> + * > >> > +================================+================================== > >> ===+ > >>> + * | @format | @data[] | > >>> + * > >> > +================================+================================== > >> ===+ > >>> + * | IOMMU_PASID_FORMAT_INTEL_VTD | struct iommu_nesting_info_vtd > | > >>> + * > >>> ++--------------------------------+--------------------------------- > >>> +----+ > >>> + * > >>> + */ > >>> +struct iommu_nesting_info { > >>> + __u32 size; > >>> + __u32 format; > >>> + __u32 features; > >>> +#define IOMMU_NESTING_FEAT_SYSWIDE_PASID (1 << 0) > >>> +#define IOMMU_NESTING_FEAT_BIND_PGTBL (1 << 1) > >>> +#define IOMMU_NESTING_FEAT_CACHE_INVLD (1 << 2) > >> In other structs the values seem to be defined before the field > > > > not sure. :-) I mimics the below struct from uapi/vfio.h > Yep I noticed that afterwards. In IOMMU uapi it looks the opposite though. So I > would alignto the style in the same file but that's not a big deal. I see, may align with other iommu uapi. :-) Regards, Yi Liu > > > > struct vfio_iommu_type1_dma_map { > > __u32 argsz; > > __u32 flags; > > #define VFIO_DMA_MAP_FLAG_READ (1 << 0) /* readable from device */ > > #define VFIO_DMA_MAP_FLAG_WRITE (1 << 1) /* writable from device */ > > __u64 vaddr; /* Process virtual address */ > > __u64 iova; /* IO virtual address */ > > __u64 size; /* Size of mapping (bytes) */ > > }; > > > >>> + __u32 flags; > >>> + __u16 addr_width; > >>> + __u16 pasid_bits; > >>> + __u32 padding; > >>> + __u8 data[]; > >>> +}; > >>> + > >>> +/* > >>> + * struct iommu_nesting_info_vtd - Intel VT-d specific nesting info > >>> + * > >> spurious line > > > > yes, will remove this line. > > > > Regards, > > Yi Liu > > > >>> + * > >>> + * @flags: VT-d specific flags. Currently reserved for future > >>> + * extension. > >>> + * @cap_reg: Describe basic capabilities as defined in VT-d capability > >>> + * register. > >>> + * @ecap_reg: Describe the extended capabilities as defined in VT-d > >>> + * extended capability register. > >>> + */ > >>> +struct iommu_nesting_info_vtd { > >>> + __u32 flags; > >>> + __u32 padding; > >>> + __u64 cap_reg; > >>> + __u64 ecap_reg; > >>> +}; > >>> + > >>> #endif /* _UAPI_IOMMU_H */ > >>> > >> Thanks > >> > >> Eric > > > > Thanks > > Eric