Re: [RFC PATCH v3 3/3] VFIO Type1 IOMMU change: to support with iommu and without iommu

On 05/04/2016 06:43 AM, Alex Williamson wrote:
> On Tue, 3 May 2016 00:10:41 +0530
> Kirti Wankhede <kwankhede@xxxxxxxxxx> wrote:
>> +
>> +/*
>> + * Pin a set of guest PFNs and return their associated host PFNs for vGPU.
>> + * @vaddr [in]: array of guest PFNs
>> + * @npage [in]: count of array elements
>> + * @prot [in] : protection flags
>> + * @pfn_base[out] : array of host PFNs
>> + */
>> +int vfio_pin_pages(void *iommu_data, dma_addr_t *vaddr, long npage,
>> +		   int prot, dma_addr_t *pfn_base)
>> +{
>> +	struct vfio_iommu *iommu = iommu_data;
>> +	struct vfio_domain *domain = NULL, *domain_vgpu = NULL;
>> +	int i = 0, ret = 0;
>> +	long retpage;
>> +	dma_addr_t remote_vaddr = 0;
>> +	dma_addr_t *pfn = pfn_base;
>> +	struct vfio_dma *dma;
>> +
>> +	if (!iommu || !pfn_base)
>> +		return -EINVAL;
>> +
>> +	if (list_empty(&iommu->domain_list)) {
>> +		ret = -EINVAL;
>> +		goto pin_done;
>> +	}
>> +
>> +	get_first_domains(iommu, &domain, &domain_vgpu);
>> +
>> +	// Return error if vGPU domain doesn't exist
> 
> No c++ style comments please.
> 
>> +	if (!domain_vgpu) {
>> +		ret = -EINVAL;
>> +		goto pin_done;
>> +	}
>> +
>> +	for (i = 0; i < npage; i++) {
>> +		struct vfio_vgpu_pfn *p;
>> +		struct vfio_vgpu_pfn *lpfn;
>> +		unsigned long tpfn;
>> +		dma_addr_t iova;
>> +
>> +		mutex_lock(&iommu->lock);
>> +
>> +		iova = vaddr[i] << PAGE_SHIFT;
>> +
>> +		dma = vfio_find_dma(iommu, iova, 0 /*  size */);
>> +		if (!dma) {
>> +			mutex_unlock(&iommu->lock);
>> +			ret = -EINVAL;
>> +			goto pin_done;
>> +		}
>> +
>> +		remote_vaddr = dma->vaddr + iova - dma->iova;
>> +
>> +		retpage = vfio_pin_pages_internal(domain_vgpu, remote_vaddr,
>> +						  (long)1, prot, &tpfn);
>> +		mutex_unlock(&iommu->lock);
>> +		if (retpage <= 0) {
>> +			WARN_ON(!retpage);
>> +			ret = (int)retpage;
>> +			goto pin_done;
>> +		}
>> +
>> +		pfn[i] = tpfn;
>> +
>> +		mutex_lock(&domain_vgpu->lock);
>> +
>> +		// search if pfn exist
>> +		if ((p = vfio_find_vgpu_pfn(domain_vgpu, tpfn))) {
>> +			atomic_inc(&p->ref_count);
>> +			mutex_unlock(&domain_vgpu->lock);
>> +			continue;
>> +		}
> 
> The only reason I can come up with for why we'd want to integrate an
> api-only domain into the existing type1 code would be to avoid page
> accounting issues where we count locked pages once for a normal
> assigned device and again for a vgpu, but that's not what we're doing
> here.  We're not only locking the pages again regardless of them
> already being locked, we're counting every time we lock them through
> this new interface.  So there's really no point at all to making type1
> become this unsupportable.  In that case we should be pulling out the
> common code that we want to share from type1 and making a new type1
> compatible vfio iommu backend rather than conditionalizing everything
> here.
> 
>> +
>> +		// add to pfn_list
>> +		lpfn = kzalloc(sizeof(*lpfn), GFP_KERNEL);
>> +		if (!lpfn) {
>> +			ret = -ENOMEM;
>> +			mutex_unlock(&domain_vgpu->lock);
>> +			goto pin_done;
>> +		}
>> +		lpfn->vmm_va = remote_vaddr;
>> +		lpfn->iova = iova;
>> +		lpfn->pfn = pfn[i];
>> +		lpfn->npage = 1;
>> +		lpfn->prot = prot;
>> +		atomic_inc(&lpfn->ref_count);
>> +		vfio_link_vgpu_pfn(domain_vgpu, lpfn);
>> +		mutex_unlock(&domain_vgpu->lock);
>> +	}
>> +
>> +	ret = i;
>> +
>> +pin_done:
>> +	return ret;
>> +}
>> +EXPORT_SYMBOL(vfio_pin_pages);
>> +

IIUC, an api-only domain is a VFIO domain *without* underlying IOMMU
hardware. As you said in another mail, "rather than programming them
into an IOMMU for a device, it simply stores the translations for use
by later requests".
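
Just to confirm my understanding, something like the sketch below is
what I picture such a domain storing per pinned page, i.e. roughly what
the vfio_vgpu_pfn entries in the patch above hold (the names here are
made up for illustration, they are not from the patch):

#include <linux/types.h>
#include <linux/rbtree.h>
#include <linux/atomic.h>

/*
 * Illustrative only: the "translation" kept by an api-only domain is
 * purely a software record; nothing is programmed into IOMMU hardware.
 */
struct api_only_entry {
	struct rb_node node;		/* keyed by iova in the domain */
	dma_addr_t iova;		/* guest physical address (GPA) */
	unsigned long pfn;		/* host PFN, i.e. HPA >> PAGE_SHIFT */
	int prot;			/* IOMMU_READ / IOMMU_WRITE */
	atomic_t ref_count;		/* times this GPA has been pinned */
};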

That imposes a constraint on the gfx driver: the hardware IOMMU must be
disabled. Otherwise, when an IOMMU is present, the gfx driver
eventually programs the hardware IOMMU with the IOVA returned by
pci_map_page or dma_map_page; meanwhile, the IOMMU backend for vGPU
only maintains GPA <-> HPA translations, with no knowledge of the
hardware IOMMU. How is the device model then supposed to get an IOVA
for a given GPA (and thereby for the HPA provided by the IOMMU backend
here)?
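
To illustrate the mismatch I mean, here is a hypothetical device-model
path with a hardware IOMMU active. map_guest_page() and its arguments
are made up for the example; vfio_pin_pages() is the function from this
patch:

#include <linux/device.h>
#include <linux/dma-mapping.h>
#include <linux/errno.h>
#include <linux/mm.h>

/* prototype as introduced by this patch */
extern int vfio_pin_pages(void *iommu_data, dma_addr_t *vaddr, long npage,
			  int prot, dma_addr_t *pfn_base);

/* Hypothetical device-model helper, not part of the patch. */
static int map_guest_page(struct device *dev, void *iommu_data,
			  dma_addr_t gfn, int prot, dma_addr_t *iova)
{
	dma_addr_t hpfn;
	struct page *pg;

	/* GPA -> HPA: recorded only inside the api-only vGPU domain. */
	if (vfio_pin_pages(iommu_data, &gfn, 1, prot, &hpfn) <= 0)
		return -EINVAL;

	/* HPA -> IOVA: recorded only inside the hardware IOMMU. */
	pg = pfn_to_page((unsigned long)hpfn);
	*iova = dma_map_page(dev, pg, 0, PAGE_SIZE, DMA_BIDIRECTIONAL);
	if (dma_mapping_error(dev, *iova))
		return -EFAULT;

	/* Nothing ties the two translations together for later lookups. */
	return 0;
}

The dma_addr returned by dma_map_page() is what the gfx hardware must
use, but only the device model ever sees it; the vGPU IOMMU backend
never does.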

If things go as guessed above, where vfio_pin_pages() pins and
translates a vaddr into a PFN, it will be very difficult for the device
model to figure out (see the sketch below):

	1. for a given GPA, how to avoid calling dma_map_page multiple times?
	2. for which page to call dma_unmap_page?
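
In other words, every device model would have to maintain its own
bookkeeping on top of vfio_pin_pages(), something like the hypothetical
record below (again, only to show what information the interface does
not give back):

#include <linux/types.h>
#include <linux/rbtree.h>

/*
 * Hypothetical per-GPA record the device model would need so that a
 * given GPA is dma_map_page()'d only once and so that the right
 * page/IOVA can be handed to dma_unmap_page() later.
 */
struct dm_gpa_mapping {
	struct rb_node node;	/* keyed by gpa */
	dma_addr_t gpa;		/* guest physical address */
	unsigned long pfn;	/* host PFN returned by vfio_pin_pages() */
	dma_addr_t iova;	/* what dma_map_page() returned */
	unsigned int refs;	/* pins minus unpins for this GPA */
};

On a repeated pin of the same GPA the device model would bump refs and
reuse iova instead of mapping again; on the last unpin it would call
dma_unmap_page() with that iova and free the record. That largely
duplicates what the IOMMU backend already tracks in its pfn_list.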

--
Thanks,
Jike



