> From: Robin Murphy <robin.murphy@xxxxxxx>
> Sent: Thursday, August 8, 2024 8:39 PM
>
> On 06/08/2024 9:25 am, Tian, Kevin wrote:
> >> From: Nicolin Chen <nicolinc@xxxxxxxxxx>
> >> Sent: Saturday, August 3, 2024 8:32 AM
> >>
> >> From: Robin Murphy <robin.murphy@xxxxxxx>
> >>
> >> Currently, iommu-dma is the only place outside of IOMMUFD and drivers
> >> which might need to be aware of the stage 2 domain encapsulated within
> >> a nested domain. This would be in the legacy-VFIO-style case where we're
> >
> > why is it legacy-VFIO-style? We only support nested in IOMMUFD.
>
> Because with proper nesting we ideally shouldn't need the host-managed
> MSI mess at all, which all stems from the old VFIO paradigm of
> completely abstracting interrupts from userspace. I'm still hoping

Yes, this was the consensus from the previous discussion, iirc. It's
just a bit distracting to call it the old VFIO paradigm. 😊

> IOMMUFD can grow its own interface for efficient MSI passthrough, where
> the VMM can simply map the physical MSI doorbell into whatever IPA (GPA)
> it wants it to appear at in the S2 domain, then whatever the guest does
> with S1, it can program the MSI address into the endpoint accordingly
> without us having to fiddle with it.
>
> FWIW, apart from IOMMUFD, we may also end up wanting something along
> those lines for Arm CCA (if non-Secure proxying of T=1 MSIs becomes an
> unavoidable thing).
>
> >> using host-managed MSIs with an identity mapping at stage 1, where it is
> >> the underlying stage 2 domain which owns an MSI cookie and holds the
> >> corresponding dynamic mappings. Hook up the new op to resolve what we
> >> need from a nested domain.
> >>
> >> Signed-off-by: Robin Murphy <robin.murphy@xxxxxxx>
> >> Signed-off-by: Nicolin Chen <nicolinc@xxxxxxxxxx>
> >> ---
> >>  drivers/iommu/dma-iommu.c | 18 ++++++++++++++++--
> >>  include/linux/iommu.h     |  4 ++++
> >>  2 files changed, 20 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> >> index 7b1dfa0665df6..05e04934a5f81 100644
> >> --- a/drivers/iommu/dma-iommu.c
> >> +++ b/drivers/iommu/dma-iommu.c
> >> @@ -1799,6 +1799,20 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
> >>  	return NULL;
> >>  }
> >>
> >> +/*
> >> + * Nested domains may not have an MSI cookie or accept mappings, but they may
> >> + * be related to a domain which does, so we let them tell us what they need.
> >> + */
> >> +static struct iommu_domain *iommu_dma_get_msi_mapping_domain(struct device *dev)
> >> +{
> >> +	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
> >> +
> >> +	if (domain && domain->type == IOMMU_DOMAIN_NESTED &&
> >> +	    domain->ops->get_msi_mapping_domain)
> >
> > I'm not sure the core should restrict it to the NESTED type. Given
> > there is a new domain op, any type restriction can be handled
> > inside the callback. Anyway, the driver should register the op
> > for a domain only when there is a need.
>
> Nested should remain the only case where domains are chained together
> such that the VFIO MSI cookie may be hiding somewhere else. This is
> effectively documenting that implementing the callback for any other
> domain type would be a bad smell. Much like how the mapping-related ops
> are explicitly short-circuited for non-paging domain types.

Then it's probably good to clarify that in the comment for
@get_msi_mapping_domain below, so driver developers can catch the
restriction easily w/o searching the related code (see the rough
wording sketched at the end of this mail).

>
> Thanks,
> Robin.
>
> >> +	domain = domain->ops->get_msi_mapping_domain(domain);
> >> +	return domain;
> >> +}
> >> +
> >>  /**
> >>   * iommu_dma_prepare_msi() - Map the MSI page in the IOMMU domain
> >>   * @desc: MSI descriptor, will store the MSI page
> >> @@ -1809,7 +1823,7 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
> >>  int iommu_dma_prepare_msi(struct msi_desc *desc, phys_addr_t msi_addr)
> >>  {
> >>  	struct device *dev = msi_desc_to_dev(desc);
> >> -	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
> >> +	struct iommu_domain *domain = iommu_dma_get_msi_mapping_domain(dev);
> >>  	struct iommu_dma_msi_page *msi_page;
> >>  	static DEFINE_MUTEX(msi_prepare_lock); /* see below */
> >>
> >> @@ -1842,7 +1856,7 @@ int iommu_dma_prepare_msi(struct msi_desc *desc, phys_addr_t msi_addr)
> >>  void iommu_dma_compose_msi_msg(struct msi_desc *desc, struct msi_msg *msg)
> >>  {
> >>  	struct device *dev = msi_desc_to_dev(desc);
> >> -	const struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
> >> +	const struct iommu_domain *domain = iommu_dma_get_msi_mapping_domain(dev);
> >>  	const struct iommu_dma_msi_page *msi_page;
> >>
> >>  	msi_page = msi_desc_get_iommu_cookie(desc);
> >> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> >> index 4d47f2c333118..69ed76f9c3ec4 100644
> >> --- a/include/linux/iommu.h
> >> +++ b/include/linux/iommu.h
> >> @@ -638,6 +638,8 @@ struct iommu_ops {
> >>   * @enable_nesting: Enable nesting
> >>   * @set_pgtable_quirks: Set io page table quirks (IO_PGTABLE_QUIRK_*)
> >>   * @free: Release the domain after use.
> >> + * @get_msi_mapping_domain: Return the related iommu_domain that should hold the
> >> + *                          MSI cookie and accept mapping(s).
> >>   */
> >>  struct iommu_domain_ops {
> >>  	int (*attach_dev)(struct iommu_domain *domain, struct device *dev);
> >> @@ -668,6 +670,8 @@ struct iommu_domain_ops {
> >>  					  unsigned long quirks);
> >>
> >>  	void (*free)(struct iommu_domain *domain);
> >> +	struct iommu_domain *
> >> +		(*get_msi_mapping_domain)(struct iommu_domain *domain);
> >>  };
> >>
> >>  /**
> >> --
> >> 2.43.0
> >>
> >
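
To be concrete about my suggestion above: something along the lines of
the below for the @get_msi_mapping_domain kernel-doc would address my
concern (just a rough wording sketch, not a demand):

 * @get_msi_mapping_domain: Return the related iommu_domain that should hold
 *                          the MSI cookie and accept the mapping(s). Only
 *                          meaningful for IOMMU_DOMAIN_NESTED, to resolve the
 *                          underlying stage 2 domain; drivers should not set
 *                          this op for any other domain type.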
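
And just to double-check that we mean the same thing on the driver
side, below is roughly the implementation I'd expect. A completely
untested sketch with made-up names (struct my_nested_domain,
my_paging_domain etc. are not from this patch or from any real
driver):

#include <linux/iommu.h>

/* Hypothetical driver-private wrappers around struct iommu_domain */
struct my_paging_domain {
	struct iommu_domain domain;	/* the stage 2, owns the MSI cookie */
	/* ... S2 page table state ... */
};

struct my_nested_domain {
	struct iommu_domain domain;	/* IOMMU_DOMAIN_NESTED */
	struct my_paging_domain *s2_parent;	/* set at allocation time */
};

static struct iommu_domain *
my_nested_get_msi_mapping_domain(struct iommu_domain *domain)
{
	struct my_nested_domain *nested =
		container_of(domain, struct my_nested_domain, domain);

	/* Redirect iommu-dma to the stage 2 domain that takes the mapping */
	return &nested->s2_parent->domain;
}

i.e. the driver just points .get_msi_mapping_domain in the nested
domain's iommu_domain_ops at the above, and iommu-dma then places the
MSI cookie mappings in the S2 as it does today.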