RE: [PATCH v1 5/8] vfio/type1: Report 1st-level/stage-1 format to userspace

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Jean,

> From: Jean-Philippe Brucker <jean-philippe@xxxxxxxxxx>
> Sent: Thursday, April 9, 2020 4:15 PM
> Subject: Re: [PATCH v1 5/8] vfio/type1: Report 1st-level/stage-1 format to
> userspace
> 
> On Wed, Apr 08, 2020 at 12:27:58PM +0200, Auger Eric wrote:
> > Hi Yi,
> >
> > On 4/7/20 11:43 AM, Liu, Yi L wrote:
> > > Hi Jean,
> > >
> > >> From: Jean-Philippe Brucker <jean-philippe@xxxxxxxxxx>
> > >> Sent: Friday, April 3, 2020 4:23 PM
> > >> To: Auger Eric <eric.auger@xxxxxxxxxx>
> > >> userspace
> > >>
> > >> On Wed, Apr 01, 2020 at 03:01:12PM +0200, Auger Eric wrote:
> > >>>>>>  	header = vfio_info_cap_add(caps, sizeof(*nesting_cap),
> > >>>>>>
> VFIO_IOMMU_TYPE1_INFO_CAP_NESTING, 1);
> > >> @@ -2254,6 +2309,7
> > >>>>>> @@ static int vfio_iommu_info_add_nesting_cap(struct
> > >>>>> vfio_iommu *iommu,
> > >>>>>>  		/* nesting iommu type supports PASID requests (alloc/free)
> */
> > >>>>>>  		nesting_cap->nesting_capabilities |=
> VFIO_IOMMU_PASID_REQS;
> > >>>>> What is the meaning for ARM?
> > >>>>
> > >>>> I think it's just a software capability exposed to userspace, on
> > >>>> userspace side, it has a choice to use it or not. :-) The reason
> > >>>> define it and report it in cap nesting is that I'd like to make the
> > >>>> pasid alloc/free be available just for IOMMU with type
> > >>>> VFIO_IOMMU_TYPE1_NESTING. Please feel free tell me if it is not good
> > >>>> for ARM. We can find a proper way to report the availability.
> > >>>
> > >>> Well it is more a question for jean-Philippe. Do we have a system wide
> > >>> PASID allocation on ARM?
> > >>
> > >> We don't, the PASID spaces are per-VM on Arm, so this function should consult
> the
> > >> IOMMU driver before setting flags. As you said on patch 3, nested doesn't
> > >> necessarily imply PASID support. The SMMUv2 does not support PASID but does
> > >> support nesting stages 1 and 2 for the IOVA space.
> > >> SMMUv3 support of PASID depends on HW capabilities. So I think this needs to
> be
> > >> finer grained:
> > >>
> > >> Does the container support:
> > >> * VFIO_IOMMU_PASID_REQUEST?
> > >>   -> Yes for VT-d 3
> > >>   -> No for Arm SMMU
> > >> * VFIO_IOMMU_{,UN}BIND_GUEST_PGTBL?
> > >>   -> Yes for VT-d 3
> > >>   -> Sometimes for SMMUv2
> > >>   -> No for SMMUv3 (if we go with BIND_PASID_TABLE, which is simpler due to
> > >>      PASID tables being in GPA space.)
> > >> * VFIO_IOMMU_BIND_PASID_TABLE?
> > >>   -> No for VT-d
> > >>   -> Sometimes for SMMUv3
> > >>
> > >> Any bind support implies VFIO_IOMMU_CACHE_INVALIDATE support.
> > >
> > > good summary. do you expect to see any
> > >
> > >>
> > >>>>>> +	nesting_cap->stage1_formats = formats;
> > >>>>> as spotted by Kevin, since a single format is supported, rename
> > >>>>
> > >>>> ok, I was believing it may be possible on ARM or so. :-) will rename
> > >>>> it.
> > >>
> > >> Yes I don't think an u32 is going to cut it for Arm :( We need to 
> > >> describe all sorts
> of
> > >> capabilities for page and PASID tables (granules, GPA size, ASID/PASID size, HW
> > >> access/dirty, etc etc.) Just saying "Arm stage-1 format" wouldn't mean much. I
> > >> guess we could have a secondary vendor capability for these?
> > >
> > > Actually, I'm wondering if we can define some formats to stands for a set of
> > > capabilities. e.g. VTD_STAGE1_FORMAT_V1 which may indicates the 1st level
> > > page table related caps (aw, a/d, SRE, EA and etc.). And vIOMMU can parse
> > > the capabilities.
> >
> > But eventually do we really need all those capability getters? I mean
> > can't we simply rely on the actual call to VFIO_IOMMU_BIND_GUEST_PGTBL()
> > to detect any mismatch? Definitively the error handling may be heavier
> > on userspace but can't we manage.
> 
> I think we need to present these capabilities at boot time, long before
> the guest triggers a bind(). For example if the host SMMU doesn't support
> 16-bit ASID, we need to communicate that to the guest using vSMMU ID
> registers or PROBE properties. Otherwise a bind() will succeed, but if the
> guest uses 16-bit ASIDs in its CD, DMA will result in C_BAD_CD events
> which we'll inject into the guest, for no apparent reason from their
> perspective.
> 
> In addition some VMMs may have fallbacks if shared page tables are not
> available. They could fall back to a MAP/UNMAP interface, or simply not
> present a vIOMMU to the guest.
> 

Based on the comments, I think it would be a need to report iommu caps
in detail. So I guess iommu uapi needs to provide something alike vfio
cap chain in iommu uapi. Please feel free let me know your thoughts. :-)

In vfio, we can define a cap as below:

struct vfio_iommu_type1_info_cap_nesting {
	struct  vfio_info_cap_header header;
	__u64	iommu_model;
#define VFIO_IOMMU_PASID_REQS		(1 << 0)
#define VFIO_IOMMU_BIND_GPASID		(1 << 1)
#define VFIO_IOMMU_CACHE_INV		(1 << 2)
	__u32	nesting_capabilities;
	__u32	pasid_bits;
#define VFIO_IOMMU_VENDOR_SUB_CAP	(1 << 3)
	__u32	flags;
	__u32	data_size;
	__u8	data[];  /*iommu info caps defined by iommu uapi */
};

VFIO needs new iommu APIs to ask iommu driver whether PASID/bind_gpasid/
cache_inv/bind_gpasid_table is available or not and also the pasid
bits. After that VFIO will ask iommu driver about the iommu_cap_info
and fill in the @data[] field.

iommu uapi:
struct iommu_info_cap_header {
	__u16	id;		/* Identifies capability */
	__u16	version;		/* Version specific to the capability ID */
	__u32	next;		/* Offset of next capability */
};

#define IOMMU_INFO_CAP_INTEL_VTD 1
struct iommu_info_cap_intel_vtd {
	struct	iommu_info_cap_header header;
	__u32   vaddr_width;   /* VA addr_width*/
	__u32   ipaddr_width; /* IPA addr_width, input of SL page table */
	/* same definition with @flags instruct iommu_gpasid_bind_data_vtd */
	__u64	flags;
};

#define IOMMU_INFO_CAP_ARM_SMMUv3 2
struct iommu_info_cap_arm_smmuv3 {
	struct	iommu_info_cap_header header;
	...
};

Regards,
Yi Liu




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux