On 3/14/22 5:38 PM, Jason Gunthorpe wrote:
On Mon, Mar 14, 2022 at 03:44:34PM -0400, Matthew Rosato wrote:
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 9394aa9444c1..0bec97077d61 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -77,6 +77,7 @@ struct vfio_iommu {
bool nesting;
bool dirty_page_tracking;
bool container_open;
+ bool kvm;
struct list_head emulated_iommu_groups;
};
@@ -2203,7 +2204,12 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
goto out_free_group;
ret = -EIO;
- domain->domain = iommu_domain_alloc(bus);
+
+ if (iommu->kvm)
+ domain->domain = iommu_domain_alloc_type(bus, IOMMU_DOMAIN_KVM);
+ else
+ domain->domain = iommu_domain_alloc(bus);
+
if (!domain->domain)
goto out_free_domain;
@@ -2552,6 +2558,9 @@ static void *vfio_iommu_type1_open(unsigned long arg)
case VFIO_TYPE1v2_IOMMU:
iommu->v2 = true;
break;
+ case VFIO_KVM_IOMMU:
+ iommu->kvm = true;
+ break;
Same remark for this - but more - this is called KVM but it doesn't
accept a kvm FD or anything else to link the domain to the KVM
in use.
Right... The name is poor, but with the current design the KVM
association comes shortly after. To summarize, with this series, the
following relevant steps occur:
1) VFIO_SET_IOMMU: Indicate we wish to use the alternate IOMMU domain
-> At this point, the IOMMU will reject any maps (no KVM, no guest
table anchor)
2) KVM ioctl "start":
-> Register the KVM with the IOMMU domain
-> At this point, IOMMU will still reject any maps (no guest table anchor)
3) KVM ioctl "register ioat"
-> Register the guest DMA table head with the IOMMU domain
-> At this point, IOMMU maps are allowed
The rationale for splitting steps 1 and 2 is that VFIO_SET_IOMMU
doesn't have a mechanism for specifying more than the type as an arg,
no? Otherwise yes, you could specify a kvm fd at this point and it
would have some other advantages (e.g. skip notifier). But we still
can't use the IOMMU for mapping until step 3.
The rationale for splitting steps 2 and 3 is twofold: 1) during init,
we simply don't know where the guest anchor will be when we allocate the
domain and 2) the guest can technically clear and re-initialize
its DMA space during the life of the guest, moving the location of the
table anchor. In that case we would receive another ioctl operation to
unregister the guest table anchor and again reject any map operation
until a new table location is provided.
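
To make the ordering concrete, here is a minimal userspace sketch of the
gating described above. This is not code from the series: every name
(sketch_kvm_domain, sketch_register_kvm, sketch_register_ioat, and so on)
is hypothetical, and the choice of -EINVAL as the rejection value is an
assumption made only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the state kept by the alternate IOMMU domain. */
struct sketch_kvm_domain {
	const void *kvm;	/* NULL until step 2 registers the KVM */
	uint64_t ioat_anchor;	/* guest DMA table head, valid if has_anchor */
	bool has_anchor;	/* set by step 3, cleared on unregister */
};

/* Step 1 (VFIO_SET_IOMMU): domain exists, but has no KVM and no anchor. */
void sketch_domain_init(struct sketch_kvm_domain *d)
{
	d->kvm = NULL;
	d->ioat_anchor = 0;
	d->has_anchor = false;
}

/* Step 2 (KVM ioctl "start"): associate the KVM with the domain. */
int sketch_register_kvm(struct sketch_kvm_domain *d, const void *kvm)
{
	if (!kvm || d->kvm)
		return -EINVAL;	/* assumed error value */
	d->kvm = kvm;
	return 0;
}

/* Step 3 (KVM ioctl "register ioat"): record the guest table anchor. */
int sketch_register_ioat(struct sketch_kvm_domain *d, uint64_t anchor)
{
	if (!d->kvm || d->has_anchor)
		return -EINVAL;	/* needs step 2 first; one anchor at a time */
	d->ioat_anchor = anchor;
	d->has_anchor = true;
	return 0;
}

/* Guest cleared its DMA space: drop the anchor, keep the KVM association. */
void sketch_unregister_ioat(struct sketch_kvm_domain *d)
{
	d->ioat_anchor = 0;
	d->has_anchor = false;
}

/* Maps are only accepted once both the KVM and the anchor are known. */
int sketch_map(const struct sketch_kvm_domain *d)
{
	if (!d->kvm || !d->has_anchor)
		return -EINVAL;	/* assumed error value */
	return 0;
}

/* Walk through steps 1-3, then a guest-initiated re-registration. */
void sketch_lifecycle(void)
{
	struct sketch_kvm_domain d;
	int kvm_token = 0;	/* placeholder for a real KVM reference */

	sketch_domain_init(&d);
	assert(sketch_map(&d) != 0);			/* after step 1 */
	assert(sketch_register_kvm(&d, &kvm_token) == 0);
	assert(sketch_map(&d) != 0);			/* after step 2 */
	assert(sketch_register_ioat(&d, 0x1000) == 0);
	assert(sketch_map(&d) == 0);			/* after step 3 */
	sketch_unregister_ioat(&d);
	assert(sketch_map(&d) != 0);			/* anchor removed */
	assert(sketch_register_ioat(&d, 0x2000) == 0);
	assert(sketch_map(&d) == 0);			/* new anchor */
}
```

The point of keeping the KVM pointer and the anchor as separate fields is
that unregistering the anchor returns the domain to the "after step 2"
state rather than tearing down the KVM association, which matches the
re-initialization case described above.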