On 14/02/2025 8:14 pm, Jason Gunthorpe wrote:
On Thu, Feb 13, 2025 at 11:49:00PM +0000, Robin Murphy wrote:
much just calling the same path twice. At client driver probe time,
dev->driver is obviously set; conversely at device_add(), or a
subsequent bus_iommu_probe(), any device waiting for an IOMMU really
Could you put the dev->driver test into iommu_device_use_default_domain()?
It looks like many of the cases are just guarding that call.
should *not* have a driver already, so we can use that as a condition to
disambiguate the two cases, and avoid recursing back into the IOMMU core
at the wrong times.
Which sounds like this:
+ mutex_unlock(&iommu_probe_device_lock);
+ dev->bus->dma_configure(dev);
+ mutex_lock(&iommu_probe_device_lock);
+ }
Shouldn't call iommu_device_use_default_domain() ?
Semantically it shouldn't really be called at this stage, but it won't
be anyway since "to_<x>_driver(NULL)->driver_managed_dma" is not false -
trouble is it's also not true ;)
But... I couldn't guess what the problem with calling it is?
In the not-probed case it will see dev->iommu_group is NULL and succeed.
The probed case could be prevented by checking dev->iommu_group sooner
in __iommu_probe_device()?
Anyhow, the approach seems OK
diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
index 9f4efa8f75a6..42b8f1833c3c 100644
--- a/drivers/acpi/scan.c
+++ b/drivers/acpi/scan.c
@@ -1619,6 +1619,9 @@ static int acpi_iommu_configure_id(struct device *dev, const u32 *id_in)
{
int err;
+ if (device_iommu_mapped(dev))
+ return 0;
This is unlocked and outside a driver context, it should have a
comment explaining why races with probe can't happen?
Sure, for now this is more just an opportunistic thing - since we'll now
expect to come through this path twice, if we see a group assigned then
we know for sure that someone else already set up the fwspec for that to
happen, so there's definitely nothing to do here and we can skip
potentially waiting for iommu_probe_device_lock.
+ /*
+ * For FDT-based systems and ACPI IORT/VIOT, the common firmware parsing
+ * is buried in the bus dma_configure path. Properly unpicking that is
+ * still a fairly big job, so for now just invoke the whole thing. Our
+ * bus_iommu_probe() walk may see devices with drivers already bound,
+ * but that must mean they're already configured - either probed by
+ * another IOMMU, or there was no IOMMU for iommu_fwspec_init() to wait
+ * for - so either way we can safely skip this and avoid worrying about
+ * those recursing back here thinking they need a replay call.
+ */
+ if (!dev->driver && dev->bus->dma_configure) {
+ mutex_unlock(&iommu_probe_device_lock);
+ dev->bus->dma_configure(dev);
+ mutex_lock(&iommu_probe_device_lock);
+ }
+
+ /*
+ * At this point, either valid devices now have a fwspec, or we can
+ * assume that only one of Intel, AMD, s390, PAMU or legacy SMMUv2 can
+ * be present, and that any of their registered instances has suitable
+ * ops for probing, and thus cheekily co-opt the same mechanism.
+ */
+ ops = iommu_fwspec_ops(dev_iommu_fwspec_get(dev));
+ if (!ops)
+ return -ENODEV;
+
/* Device is probed already if in a group */
if (dev->iommu_group)
return 0;
This is the test I mean, if iommu_group is set then
dev->iommu->iommu_dev->ops is supposed to be valid too. It seems like
it should be done earlier..
Yeah, looking at it now I'm really not sure why this ended up in this
order - I guess I was effectively adding the dma_configure() call to the
front of the existing iommu_fwspec_ops() check, and then I moved the
lockdep_assert() up to make more sense. But then the ops check probably
should have been after the group check to begin with, for much the same
reasoning as above. I'll sort that out for v2.
+ /*
+ * And if we do now see any replay calls, they would indicate someone
+ * misusing the dma_configure path outside bus code.
+ */
+ if (dev_iommu_fwspec_get(dev) && dev->driver)
+ dev_WARN(dev, "late IOMMU probe at driver bind, something fishy here!\n");
WARN_ON_ONCE or dump_stack() to get the stack trace out?
Indeed, hence dev_WARN() (!= dev_warn())
@@ -121,6 +121,9 @@ int of_iommu_configure(struct device *dev, struct device_node *master_np,
if (!master_np)
return -ENODEV;
+ if (device_iommu_mapped(dev))
+ return 0;
Same note
Ack.
@@ -151,7 +154,12 @@ int of_iommu_configure(struct device *dev, struct device_node *master_np,
iommu_fwspec_free(dev);
mutex_unlock(&iommu_probe_device_lock);
- if (!err && dev->bus)
+ /*
+ * If we have reason to believe the IOMMU driver missed the initial
+ * iommu_probe_device() call for dev, try to fix it up. This should
+ * no longer happen unless of_dma_configure() is being misused.
+ */
+ if (!err && dev->driver)
err = iommu_probe_device(dev);
This is being conservative? After some time of nobody complaining
it can be removed?
Indeed I feel sufficiently confident about the ACPI path (which at the
moment is effectively arm64-only) to remove it from there already, but
less so about all the assorted DT platforms. That said, I guess adding a
new dependency on dev->driver here might still represent a change of
behaviour for the sketchy direct calls of of_dma_configure() outside bus
code, since in a lot of those the target device doesn't actually have
its own driver either. Maybe I need to think about this a bit more...
Thanks,
Robin.