On 2024-11-19 7:10 pm, Pratyush Brahma wrote:
On 11/7/2024 8:31 PM, Robin Murphy wrote:
On 29/10/2024 4:15 pm, Will Deacon wrote:
On Fri, 04 Oct 2024 14:34:28 +0530, Pratyush Brahma wrote:
Null pointer dereference occurs due to a race between smmu
driver probe and client driver probe, when of_dma_configure()
for client is called after the iommu_device_register() for smmu driver
probe has executed but before the driver_bound() for smmu driver
has been called.
Following is how the race occurs:
[...]
Applied to will (for-joerg/arm-smmu/updates), thanks!
[1/1] iommu/arm-smmu: Defer probe of clients after smmu device bound
https://git.kernel.org/will/c/229e6ee43d2a
I've finally got to the point of proving to myself that this isn't the
right fix, since once we do get __iommu_probe_device() working properly
in the correct order, iommu_device_register() then runs into the same
condition itself. Diff below should make this issue go away - I'll write
up proper patches once I've tested it a little more.
Thanks,
Robin.
----->8-----
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/
iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 737c5b882355..b7dcb1494aa4 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3171,8 +3171,8 @@ static struct platform_driver arm_smmu_driver;
static
struct arm_smmu_device *arm_smmu_get_by_fwnode(struct fwnode_handle
*fwnode)
{
- struct device *dev =
driver_find_device_by_fwnode(&arm_smmu_driver.driver,
- fwnode);
+ struct device *dev =
bus_find_device_by_fwnode(&platform_bus_type, fwnode);
+ put_device(dev);
return dev ? dev_get_drvdata(dev) : NULL;
}
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/
arm/arm-smmu/arm-smmu.c
index 8321962b3714..aba315aa6848 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -1411,8 +1411,8 @@ static bool arm_smmu_capable(struct device *dev,
enum iommu_cap cap)
static
struct arm_smmu_device *arm_smmu_get_by_fwnode(struct fwnode_handle
*fwnode)
{
- struct device *dev =
driver_find_device_by_fwnode(&arm_smmu_driver.driver,
- fwnode);
+ struct device *dev =
bus_find_device_by_fwnode(&platform_bus_type, fwnode);
I think it would still follow this path:
bus_find_device_by_fwnode() -> bus_find_device() -> next_device()
next_device() would always return null until the driver is bound to the
device which
No, this is traversing the bus list, *not* the driver list, that's the
whole point. The SMMU device must exist on the platform bus before the
driver can bind, since the bus is responsible for matching the driver in
the first place.
happens much later in really_probe() after the iommu_device_register()
would be called
even as per this patch. That way the race would still occur, wouldn't it?
Can you please help me understand what I may be missing here?
Are you saying that these additional patches are required along with the
fix I've
posted?
I'm saying my change makes there be no race, i.e. the "if (!smmu)" case
can never be true, and so no longer needs working around.
Thanks,
Robin.
+
put_device(dev);
return dev ? dev_get_drvdata(dev) : NULL;
}
@@ -2232,21 +2232,6 @@ static int arm_smmu_device_probe(struct
platform_device *pdev)
i, irq);
}
- err = iommu_device_sysfs_add(&smmu->iommu, smmu->dev, NULL,
- "smmu.%pa", &smmu->ioaddr);
- if (err) {
- dev_err(dev, "Failed to register iommu in sysfs\n");
- return err;
- }
-
- err = iommu_device_register(&smmu->iommu, &arm_smmu_ops,
- using_legacy_binding ? NULL : dev);
- if (err) {
- dev_err(dev, "Failed to register iommu\n");
- iommu_device_sysfs_remove(&smmu->iommu);
- return err;
- }
-
platform_set_drvdata(pdev, smmu);
/* Check for RMRs and install bypass SMRs if any */
@@ -2255,6 +2240,18 @@ static int arm_smmu_device_probe(struct
platform_device *pdev)
arm_smmu_device_reset(smmu);
arm_smmu_test_smr_masks(smmu);
+ err = iommu_device_sysfs_add(&smmu->iommu, smmu->dev, NULL,
+ "smmu.%pa", &smmu->ioaddr);
+ if (err)
+ return dev_err_probe(dev, err, "Failed to register iommu in
sysfs\n");
+
+ err = iommu_device_register(&smmu->iommu, &arm_smmu_ops,
+ using_legacy_binding ? NULL : dev);
+ if (err) {
+ iommu_device_sysfs_remove(&smmu->iommu);
+ return dev_err_probe(dev, err, "Failed to register iommu\n");
+ }
+
/*
* We want to avoid touching dev->power.lock in fastpaths unless
* it's really going to do something useful - pm_runtime_enabled()