On 2021-11-15 13:24, Jason Gunthorpe via iommu wrote:
On Mon, Nov 15, 2021 at 05:19:02AM -0800, Christoph Hellwig wrote:
On Mon, Nov 15, 2021 at 10:05:43AM +0800, Lu Baolu wrote:
@@ -566,6 +567,12 @@ static int really_probe(struct device *dev, struct device_driver *drv)
 		goto done;
 	}
+	if (!drv->suppress_auto_claim_dma_owner) {
+		ret = iommu_device_set_dma_owner(dev, DMA_OWNER_KERNEL, NULL);
+		if (ret)
+			return ret;
+	}
I'd expect this to go into iommu_setup_dma_ops (and its arm and s390
equivalents), as that is what claims an IOMMU for in-kernel usage.
Whether iommu_device_set_dma_owner(dev_a) fails changes dynamically
depending on which iommu_device_set_dma_owner(dev_b, DMA_OWNER_USER)
calls have already been made.
The whole point here is that doing an
iommu_device_set_dma_owner(dev_b, DMA_OWNER_USER)
needs to revoke kernel usage from a whole bunch of other devices in
the same group.
Revoking kernel usage means it needs to ensure that no driver is bound
and prevent future drivers from being bound.
iommu_setup_dma_ops() is something done once early on in boot, not at
every driver probe, so I don't see how it can help??
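To make the dynamic behaviour described above concrete, here is a rough
userspace sketch, not the code from the series. The model_* names, the
refcount, and the -EBUSY return value are all assumptions made up for
illustration; only the DMA_OWNER_KERNEL/DMA_OWNER_USER idea comes from the
patch. It shows why the result of claiming dev_a depends on what was done
earlier to dev_b in the same group:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace model; these are not the kernel's structures. */
enum dma_owner { DMA_OWNER_NONE, DMA_OWNER_KERNEL, DMA_OWNER_USER };

struct model_group {
	enum dma_owner owner;	/* owner shared by every device in the group */
	unsigned int refcnt;	/* number of devices holding that ownership */
};

struct model_device {
	struct model_group *group;
};

/*
 * Claim ownership on behalf of one device. The claim fails if any other
 * device in the same group already holds a different kind of ownership.
 * Returning -EBUSY is an assumption of this sketch.
 */
int model_set_dma_owner(struct model_device *dev, enum dma_owner owner)
{
	struct model_group *grp = dev->group;

	assert(grp != NULL);
	if (grp->refcnt && grp->owner != owner)
		return -EBUSY;

	grp->owner = owner;
	grp->refcnt++;
	return 0;
}

/* Drop one claim; the group becomes unowned when the last claim goes. */
void model_release_dma_owner(struct model_device *dev)
{
	struct model_group *grp = dev->group;

	assert(grp != NULL && grp->refcnt > 0);
	if (--grp->refcnt == 0)
		grp->owner = DMA_OWNER_NONE;
}
```

In this model, once dev_b has been claimed with DMA_OWNER_USER, a later
DMA_OWNER_KERNEL claim for dev_a in the same group fails until dev_b is
released, which is why the check has to be repeated at every bind attempt.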
Note that there's some annoying inconsistency across architectures, and
with the {acpi,of}_dma_configure() code in general. I guess Christoph
might be thinking of the case where iommu_setup_dma_ops() *does* end up
being called off the back of the bus->dma_configure() hook a few lines
below the context above.
iommu_setup_dma_ops() itself is indeed not the appropriate place for
this (the fact that it can be called as late as driver probe is subtly
broken and still on my list to fix...) but
bus->dma_configure() definitely is. Only a handful of buses care about
IOMMUs, and possibly even fewer of them support VFIO, so I'm in full
agreement with Greg and Christoph that this absolutely warrants being
scoped per-bus. I mean, we literally already have infrastructure to
prevent drivers binding if the IOMMU/DMA configuration is broken or not
ready yet; why would we want a totally different mechanism to prevent
driver binding when the only difference is that that configuration *is*
ready and working to the point that someone's already claimed it for
other purposes?
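As a rough illustration of the per-bus scoping argued for above, here is a
simplified userspace sketch. It is not kernel code: the model_* names, the
claimed_by_user flag, and the -EBUSY return are assumptions invented for
the example. The only thing taken from the real driver core is the shape
of the flow, where probing calls an optional bus->dma_configure() hook and
gives up if the hook returns an error:

```c
#include <errno.h>
#include <stddef.h>

struct model_dev;

/* Hypothetical bus type; only IOMMU-aware buses install the hook. */
struct model_bus {
	int (*dma_configure)(struct model_dev *dev);
};

struct model_dev {
	const struct model_bus *bus;
	int claimed_by_user;	/* nonzero if the group is already user-owned */
};

/*
 * Hook for a bus that cares about IOMMUs: refuse to configure DMA for
 * kernel use when the device's group has been claimed for userspace.
 */
int model_iommu_bus_dma_configure(struct model_dev *dev)
{
	return dev->claimed_by_user ? -EBUSY : 0;
}

/* Simplified probe path: buses without a hook skip the check entirely. */
int model_really_probe(struct model_dev *dev)
{
	if (dev->bus->dma_configure) {
		int ret = dev->bus->dma_configure(dev);

		if (ret)
			return ret;	/* binding is refused */
	}

	return 0;	/* the driver's own probe would run here */
}
```

With this shape, a bus that never deals with IOMMUs pays nothing, and the
ownership check reuses the same failure path that already blocks binding
when the DMA configuration is broken or not ready.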
Robin.