Hi Yi, On 4/9/20 2:47 PM, Liu, Yi L wrote: > Hi Jean, > >> From: Jean-Philippe Brucker <jean-philippe@xxxxxxxxxx> >> Sent: Thursday, April 9, 2020 4:15 PM >> Subject: Re: [PATCH v1 5/8] vfio/type1: Report 1st-level/stage-1 format to >> userspace >> >> On Wed, Apr 08, 2020 at 12:27:58PM +0200, Auger Eric wrote: >>> Hi Yi, >>> >>> On 4/7/20 11:43 AM, Liu, Yi L wrote: >>>> Hi Jean, >>>> >>>>> From: Jean-Philippe Brucker <jean-philippe@xxxxxxxxxx> >>>>> Sent: Friday, April 3, 2020 4:23 PM >>>>> To: Auger Eric <eric.auger@xxxxxxxxxx> >>>>> userspace >>>>> >>>>> On Wed, Apr 01, 2020 at 03:01:12PM +0200, Auger Eric wrote: >>>>>>>>> header = vfio_info_cap_add(caps, sizeof(*nesting_cap), >>>>>>>>> >> VFIO_IOMMU_TYPE1_INFO_CAP_NESTING, 1); >>>>> @@ -2254,6 +2309,7 >>>>>>>>> @@ static int vfio_iommu_info_add_nesting_cap(struct >>>>>>>> vfio_iommu *iommu, >>>>>>>>> /* nesting iommu type supports PASID requests (alloc/free) >> */ >>>>>>>>> nesting_cap->nesting_capabilities |= >> VFIO_IOMMU_PASID_REQS; >>>>>>>> What is the meaning for ARM? >>>>>>> >>>>>>> I think it's just a software capability exposed to userspace, on >>>>>>> userspace side, it has a choice to use it or not. :-) The reason >>>>>>> define it and report it in cap nesting is that I'd like to make the >>>>>>> pasid alloc/free be available just for IOMMU with type >>>>>>> VFIO_IOMMU_TYPE1_NESTING. Please feel free tell me if it is not good >>>>>>> for ARM. We can find a proper way to report the availability. >>>>>> >>>>>> Well it is more a question for jean-Philippe. Do we have a system wide >>>>>> PASID allocation on ARM? >>>>> >>>>> We don't, the PASID spaces are per-VM on Arm, so this function should consult >> the >>>>> IOMMU driver before setting flags. As you said on patch 3, nested doesn't >>>>> necessarily imply PASID support. The SMMUv2 does not support PASID but does >>>>> support nesting stages 1 and 2 for the IOVA space. >>>>> SMMUv3 support of PASID depends on HW capabilities. So I think this needs to >> be >>>>> finer grained: >>>>> >>>>> Does the container support: >>>>> * VFIO_IOMMU_PASID_REQUEST? >>>>> -> Yes for VT-d 3 >>>>> -> No for Arm SMMU >>>>> * VFIO_IOMMU_{,UN}BIND_GUEST_PGTBL? >>>>> -> Yes for VT-d 3 >>>>> -> Sometimes for SMMUv2 >>>>> -> No for SMMUv3 (if we go with BIND_PASID_TABLE, which is simpler due to >>>>> PASID tables being in GPA space.) >>>>> * VFIO_IOMMU_BIND_PASID_TABLE? >>>>> -> No for VT-d >>>>> -> Sometimes for SMMUv3 >>>>> >>>>> Any bind support implies VFIO_IOMMU_CACHE_INVALIDATE support. >>>> >>>> good summary. do you expect to see any >>>> >>>>> >>>>>>>>> + nesting_cap->stage1_formats = formats; >>>>>>>> as spotted by Kevin, since a single format is supported, rename >>>>>>> >>>>>>> ok, I was believing it may be possible on ARM or so. :-) will rename >>>>>>> it. >>>>> >>>>> Yes I don't think an u32 is going to cut it for Arm :( We need to >>>>> describe all sorts >> of >>>>> capabilities for page and PASID tables (granules, GPA size, ASID/PASID size, HW >>>>> access/dirty, etc etc.) Just saying "Arm stage-1 format" wouldn't mean much. I >>>>> guess we could have a secondary vendor capability for these? >>>> >>>> Actually, I'm wondering if we can define some formats to stands for a set of >>>> capabilities. e.g. VTD_STAGE1_FORMAT_V1 which may indicates the 1st level >>>> page table related caps (aw, a/d, SRE, EA and etc.). And vIOMMU can parse >>>> the capabilities. >>> >>> But eventually do we really need all those capability getters? I mean >>> can't we simply rely on the actual call to VFIO_IOMMU_BIND_GUEST_PGTBL() >>> to detect any mismatch? Definitively the error handling may be heavier >>> on userspace but can't we manage. >> >> I think we need to present these capabilities at boot time, long before >> the guest triggers a bind(). For example if the host SMMU doesn't support >> 16-bit ASID, we need to communicate that to the guest using vSMMU ID >> registers or PROBE properties. Otherwise a bind() will succeed, but if the >> guest uses 16-bit ASIDs in its CD, DMA will result in C_BAD_CD events >> which we'll inject into the guest, for no apparent reason from their >> perspective. >> >> In addition some VMMs may have fallbacks if shared page tables are not >> available. They could fall back to a MAP/UNMAP interface, or simply not >> present a vIOMMU to the guest. >> > > Based on the comments, I think it would be a need to report iommu caps > in detail. So I guess iommu uapi needs to provide something alike vfio > cap chain in iommu uapi. Please feel free let me know your thoughts. :-) Yes to me it sounds sensible. > > In vfio, we can define a cap as below: > > struct vfio_iommu_type1_info_cap_nesting { > struct vfio_info_cap_header header; > __u64 iommu_model; > #define VFIO_IOMMU_PASID_REQS (1 << 0) I still think the name shall be changed > #define VFIO_IOMMU_BIND_GPASID (1 << 1) > #define VFIO_IOMMU_CACHE_INV (1 << 2) this operation seems mandated as soon as we have a nested paging based implementation? > __u32 nesting_capabilities; > __u32 pasid_bits; > #define VFIO_IOMMU_VENDOR_SUB_CAP (1 << 3) > __u32 flags; > __u32 data_size; > __u8 data[]; /*iommu info caps defined by iommu uapi */ > }; > > VFIO needs new iommu APIs to ask iommu driver whether PASID/bind_gpasid/ > cache_inv/bind_gpasid_table is available or not and also the pasid > bits. After that VFIO will ask iommu driver about the iommu_cap_info > and fill in the @data[] field. > > iommu uapi: > struct iommu_info_cap_header { > __u16 id; /* Identifies capability */ > __u16 version; /* Version specific to the capability ID */ > __u32 next; /* Offset of next capability */ > }; > > #define IOMMU_INFO_CAP_INTEL_VTD 1 > struct iommu_info_cap_intel_vtd { > struct iommu_info_cap_header header; > __u32 vaddr_width; /* VA addr_width*/ > __u32 ipaddr_width; /* IPA addr_width, input of SL page table */ > /* same definition with @flags instruct iommu_gpasid_bind_data_vtd */ > __u64 flags; > }; > > #define IOMMU_INFO_CAP_ARM_SMMUv3 2 > struct iommu_info_cap_arm_smmuv3 { > struct iommu_info_cap_header header; > ... > }; Thanks Eric > > Regards, > Yi Liu >