Hi Jean, > From: Jean-Philippe Brucker <jean-philippe@xxxxxxxxxx> > Sent: Thursday, April 9, 2020 4:15 PM > Subject: Re: [PATCH v1 5/8] vfio/type1: Report 1st-level/stage-1 format to > userspace > > On Wed, Apr 08, 2020 at 12:27:58PM +0200, Auger Eric wrote: > > Hi Yi, > > > > On 4/7/20 11:43 AM, Liu, Yi L wrote: > > > Hi Jean, > > > > > >> From: Jean-Philippe Brucker <jean-philippe@xxxxxxxxxx> > > >> Sent: Friday, April 3, 2020 4:23 PM > > >> To: Auger Eric <eric.auger@xxxxxxxxxx> > > >> userspace > > >> > > >> On Wed, Apr 01, 2020 at 03:01:12PM +0200, Auger Eric wrote: > > >>>>>> header = vfio_info_cap_add(caps, sizeof(*nesting_cap), > > >>>>>> > VFIO_IOMMU_TYPE1_INFO_CAP_NESTING, 1); > > >> @@ -2254,6 +2309,7 > > >>>>>> @@ static int vfio_iommu_info_add_nesting_cap(struct > > >>>>> vfio_iommu *iommu, > > >>>>>> /* nesting iommu type supports PASID requests (alloc/free) > */ > > >>>>>> nesting_cap->nesting_capabilities |= > VFIO_IOMMU_PASID_REQS; > > >>>>> What is the meaning for ARM? > > >>>> > > >>>> I think it's just a software capability exposed to userspace, on > > >>>> userspace side, it has a choice to use it or not. :-) The reason > > >>>> define it and report it in cap nesting is that I'd like to make the > > >>>> pasid alloc/free be available just for IOMMU with type > > >>>> VFIO_IOMMU_TYPE1_NESTING. Please feel free tell me if it is not good > > >>>> for ARM. We can find a proper way to report the availability. > > >>> > > >>> Well it is more a question for jean-Philippe. Do we have a system wide > > >>> PASID allocation on ARM? > > >> > > >> We don't, the PASID spaces are per-VM on Arm, so this function should consult > the > > >> IOMMU driver before setting flags. As you said on patch 3, nested doesn't > > >> necessarily imply PASID support. The SMMUv2 does not support PASID but does > > >> support nesting stages 1 and 2 for the IOVA space. > > >> SMMUv3 support of PASID depends on HW capabilities. So I think this needs to > be > > >> finer grained: > > >> > > >> Does the container support: > > >> * VFIO_IOMMU_PASID_REQUEST? > > >> -> Yes for VT-d 3 > > >> -> No for Arm SMMU > > >> * VFIO_IOMMU_{,UN}BIND_GUEST_PGTBL? > > >> -> Yes for VT-d 3 > > >> -> Sometimes for SMMUv2 > > >> -> No for SMMUv3 (if we go with BIND_PASID_TABLE, which is simpler due to > > >> PASID tables being in GPA space.) > > >> * VFIO_IOMMU_BIND_PASID_TABLE? > > >> -> No for VT-d > > >> -> Sometimes for SMMUv3 > > >> > > >> Any bind support implies VFIO_IOMMU_CACHE_INVALIDATE support. > > > > > > good summary. do you expect to see any > > > > > >> > > >>>>>> + nesting_cap->stage1_formats = formats; > > >>>>> as spotted by Kevin, since a single format is supported, rename > > >>>> > > >>>> ok, I was believing it may be possible on ARM or so. :-) will rename > > >>>> it. > > >> > > >> Yes I don't think an u32 is going to cut it for Arm :( We need to > > >> describe all sorts > of > > >> capabilities for page and PASID tables (granules, GPA size, ASID/PASID size, HW > > >> access/dirty, etc etc.) Just saying "Arm stage-1 format" wouldn't mean much. I > > >> guess we could have a secondary vendor capability for these? > > > > > > Actually, I'm wondering if we can define some formats to stands for a set of > > > capabilities. e.g. VTD_STAGE1_FORMAT_V1 which may indicates the 1st level > > > page table related caps (aw, a/d, SRE, EA and etc.). And vIOMMU can parse > > > the capabilities. > > > > But eventually do we really need all those capability getters? I mean > > can't we simply rely on the actual call to VFIO_IOMMU_BIND_GUEST_PGTBL() > > to detect any mismatch? Definitively the error handling may be heavier > > on userspace but can't we manage. > > I think we need to present these capabilities at boot time, long before > the guest triggers a bind(). For example if the host SMMU doesn't support > 16-bit ASID, we need to communicate that to the guest using vSMMU ID > registers or PROBE properties. Otherwise a bind() will succeed, but if the > guest uses 16-bit ASIDs in its CD, DMA will result in C_BAD_CD events > which we'll inject into the guest, for no apparent reason from their > perspective. > > In addition some VMMs may have fallbacks if shared page tables are not > available. They could fall back to a MAP/UNMAP interface, or simply not > present a vIOMMU to the guest. > Based on the comments, I think it would be a need to report iommu caps in detail. So I guess iommu uapi needs to provide something alike vfio cap chain in iommu uapi. Please feel free let me know your thoughts. :-) In vfio, we can define a cap as below: struct vfio_iommu_type1_info_cap_nesting { struct vfio_info_cap_header header; __u64 iommu_model; #define VFIO_IOMMU_PASID_REQS (1 << 0) #define VFIO_IOMMU_BIND_GPASID (1 << 1) #define VFIO_IOMMU_CACHE_INV (1 << 2) __u32 nesting_capabilities; __u32 pasid_bits; #define VFIO_IOMMU_VENDOR_SUB_CAP (1 << 3) __u32 flags; __u32 data_size; __u8 data[]; /*iommu info caps defined by iommu uapi */ }; VFIO needs new iommu APIs to ask iommu driver whether PASID/bind_gpasid/ cache_inv/bind_gpasid_table is available or not and also the pasid bits. After that VFIO will ask iommu driver about the iommu_cap_info and fill in the @data[] field. iommu uapi: struct iommu_info_cap_header { __u16 id; /* Identifies capability */ __u16 version; /* Version specific to the capability ID */ __u32 next; /* Offset of next capability */ }; #define IOMMU_INFO_CAP_INTEL_VTD 1 struct iommu_info_cap_intel_vtd { struct iommu_info_cap_header header; __u32 vaddr_width; /* VA addr_width*/ __u32 ipaddr_width; /* IPA addr_width, input of SL page table */ /* same definition with @flags instruct iommu_gpasid_bind_data_vtd */ __u64 flags; }; #define IOMMU_INFO_CAP_ARM_SMMUv3 2 struct iommu_info_cap_arm_smmuv3 { struct iommu_info_cap_header header; ... }; Regards, Yi Liu