Re: [PATCH] vfio/iommu_type1: report the IOMMU aperture info

On Fri, 1 Dec 2017 10:38:07 +0100
Pierre Morel <pmorel@xxxxxxxxxxxxxxxxxx> wrote:

> On 30/11/2017 19:30, Alex Williamson wrote:
> > On Thu, 30 Nov 2017 16:11:35 +0100
> > Pierre Morel <pmorel@xxxxxxxxxxxxxxxxxx> wrote:
> >   
> >> On 30/11/2017 15:08, Alex Williamson wrote:  
> >>> On Thu, 30 Nov 2017 12:34:38 +0100
> >>> Pierre Morel <pmorel@xxxxxxxxxxxxxxxxxx> wrote:
> >>>      
> >>>> When userland VFIO defines a new IOMMU for a guest it may
> >>>> want to specify to the guest the physical limits of
> >>>> the underlying host IOMMU to avoid access to forbidden
> >>>> memory ranges.
> >>>>
> >>>> Currently, the vfio_iommu_type1 driver does not report this
> >>>> information to userland.
> >>>>
> >>>> Let's extend the vfio_iommu_type1_info structure reported
> >>>> by the ioctl VFIO_IOMMU_GET_INFO command to report the
> >>>> IOMMU limits as new uint64_t entries aperture_start and
> >>>> aperture_end.
> >>>>
> >>>> Let's also extend the flags bit map to add a flag specifying
> >>>> if this extension of the info structure is reported or not.
> >>>>
> >>>> Signed-off-by: Pierre Morel <pmorel@xxxxxxxxxxxxxxxxxx>
> >>>> ---
> >>>>    drivers/vfio/vfio_iommu_type1.c | 42 +++++++++++++++++++++++++++++++++++++++++
> >>>>    include/uapi/linux/vfio.h       |  3 +++
> >>>>    2 files changed, 45 insertions(+)
> >>>>
> >>>> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> >>>> index 8549cb1..7da5fe0 100644
> >>>> --- a/drivers/vfio/vfio_iommu_type1.c
> >>>> +++ b/drivers/vfio/vfio_iommu_type1.c
> >>>> @@ -1526,6 +1526,40 @@ static int vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
> >>>>    	return ret;
> >>>>    }
> >>>>    
> >>>> +/**
> >>>> + * vfio_get_aperture - report minimal aperture of a vfio_iommu
> >>>> + * @iommu: the current vfio_iommu
> >>>> + * @start: a pointer to the aperture start
> >>>> + * @end  : a pointer to the aperture end
> >>>> + *
> >>>> + * This function iterates over the domains using the given vfio_iommu
> >>>> + * and restricts the aperture to the intersection of the apertures
> >>>> + * of all domains sharing this vfio_iommu.
> >>>> + */
> >>>> +static void vfio_get_aperture(struct vfio_iommu *iommu, uint64_t *start,
> >>>> +				uint64_t *end)
> >>>> +{
> >>>> +	struct iommu_domain_geometry geometry;
> >>>> +	struct vfio_domain *domain;
> >>>> +
> >>>> +	*start = 0;
> >>>> +	*end = U64_MAX;
> >>>> +
> >>>> +	mutex_lock(&iommu->lock);
> >>>> +	/* loop on all domains using this vfio_iommu */
> >>>> +	list_for_each_entry(domain, &iommu->domain_list, next) {
> >>>> +		iommu_domain_get_attr(domain->domain, DOMAIN_ATTR_GEOMETRY,
> >>>> +					&geometry);
> >>>> +		if (geometry.force_aperture) {
> >>>> +			if (geometry.aperture_start > *start)
> >>>> +				*start = geometry.aperture_start;
> >>>> +			if (geometry.aperture_end < *end)
> >>>> +				*end = geometry.aperture_end;
> >>>> +		}
> >>>> +	}
> >>>> +	mutex_unlock(&iommu->lock);
> >>>> +}
> >>>> +
> >>>>    static long vfio_iommu_type1_ioctl(void *iommu_data,
> >>>>    				   unsigned int cmd, unsigned long arg)
> >>>>    {
> >>>> @@ -1560,6 +1594,14 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
> >>>>    
> >>>>    		info.iova_pgsizes = vfio_pgsize_bitmap(iommu);
> >>>>    
> >>>> +		minsz = min_t(size_t, info.argsz, sizeof(info));
> >>>> +		if (minsz >= offsetofend(struct vfio_iommu_type1_info,
> >>>> +					 aperture_end)) {
> >>>> +			info.flags |= VFIO_IOMMU_INFO_APERTURE;
> >>>> +			vfio_get_aperture(iommu, &info.aperture_start,
> >>>> +					  &info.aperture_end);
> >>>> +		}
> >>>> +
> >>>>    		return copy_to_user((void __user *)arg, &info, minsz) ?
> >>>>    			-EFAULT : 0;
> >>>>    
> >>>> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> >>>> index 0fb25fb..780d909 100644
> >>>> --- a/include/uapi/linux/vfio.h
> >>>> +++ b/include/uapi/linux/vfio.h
> >>>> @@ -519,6 +519,9 @@ struct vfio_iommu_type1_info {
> >>>>    	__u32	flags;
> >>>>    #define VFIO_IOMMU_INFO_PGSIZES (1 << 0)	/* supported page sizes info */
> >>>>    	__u64	iova_pgsizes;		/* Bitmap of supported page sizes */
> >>>> +#define VFIO_IOMMU_INFO_APERTURE (1 << 1)	/* supported aperture info */
> >>>> +	__u64   aperture_start;		/* start of DMA aperture */
> >>>> +	__u64   aperture_end;		/* end of DMA aperture */
> >>>>    };
> >>>>    
> >>>>    #define VFIO_IOMMU_GET_INFO _IO(VFIO_TYPE, VFIO_BASE + 12)  
> >>>
> >>> This only supports the simplest topology; even x86 cannot claim to
> >>> have a single contiguous aperture, since it's typically bisected by an
> >>> MSI window.  I think we need an API that supports one or more apertures
> >>> out of the box.  Also, as Eric indicates, a capability is probably the
> >>> better option for creating a flexible structure.  Thanks,
> >>>
> >>> Alex
> >>>      
> >>
> >>
> >> Yes, I understand that a capability is a must here; I will go that way.
> >>
> >> As for multiple apertures and MSI protection, my understanding was that
> >> these are handled using windows and reserved regions.
> >> Can you point out where I am wrong?  
> > 
> > See the thread from Huawei, I don't think that's a solved problem:
> > 
> > https://lists.gnu.org/archive/html/qemu-arm/2017-11/msg00237.html
> > 
> > If you want sysfs to be consumed separately by the user and fed into
> > new QEMU command line options for creating a VM layout, perhaps that's
> > sufficient, but I think the vfio api for the iommu should encompass
> > describing available ranges of mappable iova space without cobbling
> > together arbitrary info from sysfs.  Thanks,
> > 
> > Alex
> >   
> 
> Hi Alex,
> 
> Let me summarize to check that I understood you correctly:
> 
> We may have physical IOMMUs with more complex access constraints that
> cannot be described by only the start and end of a read/write region.
> 
> Windows can be used to reserve regions for the VM, but that is not what
> we want. What we want is to know what the host can offer, which is a mix
> of aperture and windows.
> 
> To report this we can use capabilities in a positive way, describing 
> what the host offers rather than what it cannot provide.
> 
> To achieve this we have to use two interfaces:
> - VFIO user interface with VFIO_IOMMU_GET_INFO and capabilities
> - Physical IOMMU interface with both geometry and window iommu_ops 
> callbacks.
> 
> If this is close enough to what you had in mind, I will provide a new
> version in this direction.

I believe so.  VFIO would construct a set of mappable iova
regions/windows from information the IOMMU API provides through
iommu_ops, and expose it through the VFIO_IOMMU_GET_INFO ioctl as a new
capability that supports multiple such regions.  This ioctl would be
extended to support capabilities in the same way we've done for
other vfio ioctls.  Thanks,

Alex
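
To illustrate the "set of mappable iova regions" idea discussed above, here is a minimal standalone sketch of how one aperture could be split into several usable ranges by carving out reserved regions such as an MSI window. Everything in it is an assumption made for illustration: struct iova_range and iova_usable_ranges() are hypothetical names, not kernel or uapi definitions, and the code does not claim to match what a later version of the patch implements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inclusive IOVA range; not a kernel or uapi type. */
struct iova_range {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

/*
 * Carve reserved regions out of a single aperture.
 *
 * @resv must be sorted by start address and must not overlap.
 * At most @max ranges are written to @out. The return value is the
 * number of usable ranges found, which may exceed @max so that the
 * caller can detect truncation.
 */
static size_t iova_usable_ranges(struct iova_range aperture,
				 const struct iova_range *resv, size_t nr_resv,
				 struct iova_range *out, size_t max)
{
	uint64_t cur = aperture.start;	/* lowest address not yet classified */
	size_t n = 0;

	assert(aperture.start <= aperture.end);
	assert(out != NULL || max == 0);

	for (size_t i = 0; i < nr_resv; i++) {
		if (resv[i].end < cur)
			continue;	/* entirely below the cursor */
		if (resv[i].start > aperture.end)
			break;		/* entirely above the aperture */

		if (resv[i].start > cur) {
			/* usable gap in front of this reserved region */
			if (n < max) {
				out[n].start = cur;
				out[n].end = resv[i].start - 1;
			}
			n++;
		}

		if (resv[i].end >= aperture.end)
			return n;	/* rest of the aperture is reserved */
		cur = resv[i].end + 1;
	}

	/* tail of the aperture above the last reserved region */
	if (n < max) {
		out[n].start = cur;
		out[n].end = aperture.end;
	}
	return n + 1;
}
```

With a 32-bit aperture and one reserved window at 0xFEE00000-0xFEEFFFFF (used here only as an example of an MSI window), the function yields two ranges, one below and one above the window. A capability carrying a count followed by an array of such ranges would be one way to report them to userland in a single VFIO_IOMMU_GET_INFO call.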


