Hi Eric, > From: Auger Eric <eric.auger@xxxxxxxxxx> > Sent: Friday, April 10, 2020 11:28 AM > To: Liu, Yi L <yi.l.liu@xxxxxxxxx>; Jean-Philippe Brucker <jean- > Subject: Re: [PATCH v1 5/8] vfio/type1: Report 1st-level/stage-1 format to > userspace > > Hi Yi, > > On 4/9/20 2:47 PM, Liu, Yi L wrote: > > Hi Jean, > > > >> From: Jean-Philippe Brucker <jean-philippe@xxxxxxxxxx> > >> Sent: Thursday, April 9, 2020 4:15 PM > >> Subject: Re: [PATCH v1 5/8] vfio/type1: Report 1st-level/stage-1 > >> format to userspace > >> > >> On Wed, Apr 08, 2020 at 12:27:58PM +0200, Auger Eric wrote: > >>> Hi Yi, > >>> > >>> On 4/7/20 11:43 AM, Liu, Yi L wrote: > >>>> Hi Jean, > >>>> > >>>>> From: Jean-Philippe Brucker <jean-philippe@xxxxxxxxxx> > >>>>> Sent: Friday, April 3, 2020 4:23 PM > >>>>> To: Auger Eric <eric.auger@xxxxxxxxxx> userspace > >>>>> > >>>>> On Wed, Apr 01, 2020 at 03:01:12PM +0200, Auger Eric wrote: > >>>>>>>>> header = vfio_info_cap_add(caps, sizeof(*nesting_cap), > >>>>>>>>> > >> VFIO_IOMMU_TYPE1_INFO_CAP_NESTING, 1); > >>>>> @@ -2254,6 +2309,7 > >>>>>>>>> @@ static int vfio_iommu_info_add_nesting_cap(struct > >>>>>>>> vfio_iommu *iommu, > >>>>>>>>> /* nesting iommu type supports PASID requests > (alloc/free) > >> */ > >>>>>>>>> nesting_cap->nesting_capabilities |= > >> VFIO_IOMMU_PASID_REQS; > >>>>>>>> What is the meaning for ARM? > >>>>>>> > >>>>>>> I think it's just a software capability exposed to userspace, on > >>>>>>> userspace side, it has a choice to use it or not. :-) The reason > >>>>>>> define it and report it in cap nesting is that I'd like to make > >>>>>>> the pasid alloc/free be available just for IOMMU with type > >>>>>>> VFIO_IOMMU_TYPE1_NESTING. Please feel free tell me if it is not > >>>>>>> good for ARM. We can find a proper way to report the availability. > >>>>>> > >>>>>> Well it is more a question for jean-Philippe. Do we have a system > >>>>>> wide PASID allocation on ARM? > >>>>> > >>>>> We don't, the PASID spaces are per-VM on Arm, so this function > >>>>> should consult > >> the > >>>>> IOMMU driver before setting flags. As you said on patch 3, nested > >>>>> doesn't necessarily imply PASID support. The SMMUv2 does not > >>>>> support PASID but does support nesting stages 1 and 2 for the IOVA space. > >>>>> SMMUv3 support of PASID depends on HW capabilities. So I think > >>>>> this needs to > >> be > >>>>> finer grained: > >>>>> > >>>>> Does the container support: > >>>>> * VFIO_IOMMU_PASID_REQUEST? > >>>>> -> Yes for VT-d 3 > >>>>> -> No for Arm SMMU > >>>>> * VFIO_IOMMU_{,UN}BIND_GUEST_PGTBL? > >>>>> -> Yes for VT-d 3 > >>>>> -> Sometimes for SMMUv2 > >>>>> -> No for SMMUv3 (if we go with BIND_PASID_TABLE, which is simpler > due to > >>>>> PASID tables being in GPA space.) > >>>>> * VFIO_IOMMU_BIND_PASID_TABLE? > >>>>> -> No for VT-d > >>>>> -> Sometimes for SMMUv3 > >>>>> > >>>>> Any bind support implies VFIO_IOMMU_CACHE_INVALIDATE support. > >>>> > >>>> good summary. do you expect to see any > >>>> > >>>>> > >>>>>>>>> + nesting_cap->stage1_formats = formats; > >>>>>>>> as spotted by Kevin, since a single format is supported, rename > >>>>>>> > >>>>>>> ok, I was believing it may be possible on ARM or so. :-) will > >>>>>>> rename it. > >>>>> > >>>>> Yes I don't think an u32 is going to cut it for Arm :( We need to > >>>>> describe all sorts > >> of > >>>>> capabilities for page and PASID tables (granules, GPA size, > >>>>> ASID/PASID size, HW access/dirty, etc etc.) Just saying "Arm > >>>>> stage-1 format" wouldn't mean much. I guess we could have a secondary > vendor capability for these? > >>>> > >>>> Actually, I'm wondering if we can define some formats to stands for > >>>> a set of capabilities. e.g. VTD_STAGE1_FORMAT_V1 which may > >>>> indicates the 1st level page table related caps (aw, a/d, SRE, EA > >>>> and etc.). And vIOMMU can parse the capabilities. > >>> > >>> But eventually do we really need all those capability getters? I > >>> mean can't we simply rely on the actual call to > >>> VFIO_IOMMU_BIND_GUEST_PGTBL() to detect any mismatch? Definitively > >>> the error handling may be heavier on userspace but can't we manage. > >> > >> I think we need to present these capabilities at boot time, long > >> before the guest triggers a bind(). For example if the host SMMU > >> doesn't support 16-bit ASID, we need to communicate that to the guest > >> using vSMMU ID registers or PROBE properties. Otherwise a bind() will > >> succeed, but if the guest uses 16-bit ASIDs in its CD, DMA will > >> result in C_BAD_CD events which we'll inject into the guest, for no > >> apparent reason from their perspective. > >> > >> In addition some VMMs may have fallbacks if shared page tables are > >> not available. They could fall back to a MAP/UNMAP interface, or > >> simply not present a vIOMMU to the guest. > >> > > > > Based on the comments, I think it would be a need to report iommu caps > > in detail. So I guess iommu uapi needs to provide something alike vfio > > cap chain in iommu uapi. Please feel free let me know your thoughts. > > :-) > > Yes to me it sounds sensible. > > > > In vfio, we can define a cap as below: > > > > struct vfio_iommu_type1_info_cap_nesting { > > struct vfio_info_cap_header header; > > __u64 iommu_model; > > #define VFIO_IOMMU_PASID_REQS (1 << 0) > I still think the name shall be changed yes, I'll rename it per your suggestion.:-) > > #define VFIO_IOMMU_BIND_GPASID (1 << 1) > > #define VFIO_IOMMU_CACHE_INV (1 << 2) > this operation seems mandated as soon as we have a nested paging based > implementation? oh, yes, should be. will remove it and comment in the code. Regards, Yi Liu > > __u32 nesting_capabilities; > > __u32 pasid_bits; > > #define VFIO_IOMMU_VENDOR_SUB_CAP (1 << 3) > > __u32 flags; > > __u32 data_size; > > __u8 data[]; /*iommu info caps defined by iommu uapi */ > > }; > > > > VFIO needs new iommu APIs to ask iommu driver whether > > PASID/bind_gpasid/ cache_inv/bind_gpasid_table is available or not and > > also the pasid bits. After that VFIO will ask iommu driver about the > > iommu_cap_info and fill in the @data[] field. > > > > iommu uapi: > > struct iommu_info_cap_header { > > __u16 id; /* Identifies capability */ > > __u16 version; /* Version specific to the capability ID */ > > __u32 next; /* Offset of next capability */ > > }; > > > > #define IOMMU_INFO_CAP_INTEL_VTD 1 > > struct iommu_info_cap_intel_vtd { > > struct iommu_info_cap_header header; > > __u32 vaddr_width; /* VA addr_width*/ > > __u32 ipaddr_width; /* IPA addr_width, input of SL page table */ > > /* same definition with @flags instruct iommu_gpasid_bind_data_vtd */ > > __u64 flags; > > }; > > > > #define IOMMU_INFO_CAP_ARM_SMMUv3 2 > > struct iommu_info_cap_arm_smmuv3 { > > struct iommu_info_cap_header header; > > ... > > }; > > Thanks > > Eric > > > > Regards, > > Yi Liu > >