Re: [RFC PATCH v3 3/3] VFIO Type1 IOMMU change: to support with iommu and without iommu

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 5/4/2016 4:13 AM, Alex Williamson wrote:
> On Tue, 3 May 2016 00:10:41 +0530
> Kirti Wankhede <kwankhede@xxxxxxxxxx> wrote:
>
[..]


>> +	if (domain->vfio_iommu_api_only)
>> +		mm = domain->vmm_mm;
>> +	else
>> +		mm = current->mm;
>> +
>> +	if (!mm)
>> +		return -ENODEV;
>> +
>> +	ret = vaddr_get_pfn(mm, vaddr, prot, pfn_base);
>
> We could pass domain->mm unconditionally to vaddr_get_pfn(), let it be
> NULL in the !api_only case and use it as a cue to vaddr_get_pfn() which
> gup variant to use.  Of course we need to deal with mmap_sem somewhere
> too without turning the code into swiss cheese.
>

Yes, I missed that. Thanks for pointing out. I'll fix it.

> Correct me if I'm wrong, but I assume the main benefit of interweaving
> this into type1 vs pulling out common code and making a new vfio iommu
> backend is the page accounting, ie. not over accounting locked pages.
> TBH, I don't know if it's worth it.  Any idea what the high water mark
> of pinned pages for a vgpu might be?
>

It depends in which VM (Linux/Windows) is running and what workload is running. On Windows DMA pages are managed by WDDM model. On Linux each user space application can DMA to pages and there are not restrictions.


> The only reason I can come up with for why we'd want to integrate an
> api-only domain into the existing type1 code would be to avoid page
> accounting issues where we count locked pages once for a normal
> assigned device and again for a vgpu, but that's not what we're doing
> here.  We're not only locking the pages again regardless of them
> already being locked, we're counting every time we lock them through
> this new interface.  So there's really no point at all to making type1
> become this unsupportable.  In that case we should be pulling out the
> common code that we want to share from type1 and making a new type1
> compatible vfio iommu backend rather than conditionalizing everything
> here.
>

I tried to add pfn tracking logic and use already locked pages, but that didn't worked somehow, I'll revisit it again. With this there will be additional pfn tracking logic for the case where device is directly assigned and vGPU device is not present.



>> +		// verify if pfn exist in pfn_list
>> +		if (!(p = vfio_find_vgpu_pfn(domain_vgpu, *(pfn + i)))) {
>> +			continue;
>
> How does the caller deal with this, the function returns number of
> pages unpinned which will not match the requested number of pages to
> unpin if there are any missing.  Also, no setting variables within a
> test when easily avoidable please, separate to a set then test.
>

Here we are following the current code logic. Do you have any suggestion how to deal with that?


>> +	(iommu, &domain, &domain_vgpu);
>> +
>> +	if (!domain)
>> +		return;
>> +
>> +	d = domain;
>>  	list_for_each_entry_continue(d, &iommu->domain_list, next) {
>> -		iommu_unmap(d->domain, dma->iova, dma->size);
>> -		cond_resched();
>> +		if (!d->vfio_iommu_api_only) {
>> +			iommu_unmap(d->domain, dma->iova, dma->size);
>> +			cond_resched();
>> +		}get_first_domains
>>  	}
>>
>>  	while (iova < end) {
>
> How do api-only domain not blowup on the iommu API code in this next
> code block?  Are you just getting lucky that the api-only domain is
> first in the list and the real domain is last?
>

Control will not come here if there is no domain with IOMMU due to below change:

>> +	if (!domain)
>> +		return;

get_first_domains() returns the first domain with IOMMU and first domain with api_only.


>> +		if (d->vfio_iommu_api_only)
>> +			continue;
>> +
>
> Really disliking all these switches everywhere, too many different code
> paths.
>

I'll move such APIs to inline functions such that this check would be within inline functions and code would look much cleaner.


>> +	// Skip pin and map only if domain without IOMMU is present
>> +	if (!domain_with_iommu_present) {
>> +		dma->size = size;
>> +		goto map_done;
>> +	}
>> +
>
> Yet more special cases, the code is getting unsupportable.

From vfio_dma_do_map(), if there is no devices pass-throughed then we don't want to pin all pages upfront. and that is the reason of this check.

>
> I'm really not convinced that pushing this into the type1 code is the
> right approach vs pulling out shareable code chunks where it makes
> sense and creating a separate iommu backend.  We're not getting
> anything but code complexity out of this approach it seems.

I find pulling out shared code is also not simple. I would like to revisit this again and sort out the concerns you raised rather than making separate module.

Thanks,
Kirti.


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux