On 11/7/2024 8:31 PM, Robin Murphy wrote:
On 29/10/2024 4:15 pm, Will Deacon wrote:
On Fri, 04 Oct 2024 14:34:28 +0530, Pratyush Brahma wrote:
Null pointer dereference occurs due to a race between smmu
driver probe and client driver probe, when of_dma_configure()
for client is called after the iommu_device_register() for smmu driver
probe has executed but before the driver_bound() for smmu driver
has been called.
Following is how the race occurs:
[...]
Applied to will (for-joerg/arm-smmu/updates), thanks!
[1/1] iommu/arm-smmu: Defer probe of clients after smmu device bound
https://git.kernel.org/will/c/229e6ee43d2a
I've finally got to the point of proving to myself that this isn't the
right fix, since once we do get __iommu_probe_device() working properly
in the correct order, iommu_device_register() then runs into the same
condition itself. Diff below should make this issue go away - I'll write
up proper patches once I've tested it a little more.
Thanks,
Robin.
----->8-----
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 737c5b882355..b7dcb1494aa4 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3171,8 +3171,8 @@ static struct platform_driver arm_smmu_driver;
static
struct arm_smmu_device *arm_smmu_get_by_fwnode(struct fwnode_handle
*fwnode)
{
- struct device *dev =
driver_find_device_by_fwnode(&arm_smmu_driver.driver,
- fwnode);
+ struct device *dev =
bus_find_device_by_fwnode(&platform_bus_type, fwnode);
+ put_device(dev);
return dev ? dev_get_drvdata(dev) : NULL;
}
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c
b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 8321962b3714..aba315aa6848 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -1411,8 +1411,8 @@ static bool arm_smmu_capable(struct device *dev,
enum iommu_cap cap)
static
struct arm_smmu_device *arm_smmu_get_by_fwnode(struct fwnode_handle
*fwnode)
{
- struct device *dev =
driver_find_device_by_fwnode(&arm_smmu_driver.driver,
- fwnode);
+ struct device *dev =
bus_find_device_by_fwnode(&platform_bus_type, fwnode);
I think it would still follow this path:
bus_find_device_by_fwnode() -> bus_find_device() -> next_device()
next_device() would always return null until the driver is bound to the
device which
happens much later in really_probe() after the iommu_device_register()
would be called
even as per this patch. That way the race would still occur, wouldn't it?
Can you please help me understand what I may be missing here?
Are you saying that these additional patches are required along with the
fix I've
posted?
+
put_device(dev);
return dev ? dev_get_drvdata(dev) : NULL;
}
@@ -2232,21 +2232,6 @@ static int arm_smmu_device_probe(struct
platform_device *pdev)
i, irq);
}
- err = iommu_device_sysfs_add(&smmu->iommu, smmu->dev, NULL,
- "smmu.%pa", &smmu->ioaddr);
- if (err) {
- dev_err(dev, "Failed to register iommu in sysfs\n");
- return err;
- }
-
- err = iommu_device_register(&smmu->iommu, &arm_smmu_ops,
- using_legacy_binding ? NULL : dev);
- if (err) {
- dev_err(dev, "Failed to register iommu\n");
- iommu_device_sysfs_remove(&smmu->iommu);
- return err;
- }
-
platform_set_drvdata(pdev, smmu);
/* Check for RMRs and install bypass SMRs if any */
@@ -2255,6 +2240,18 @@ static int arm_smmu_device_probe(struct
platform_device *pdev)
arm_smmu_device_reset(smmu);
arm_smmu_test_smr_masks(smmu);
+ err = iommu_device_sysfs_add(&smmu->iommu, smmu->dev, NULL,
+ "smmu.%pa", &smmu->ioaddr);
+ if (err)
+ return dev_err_probe(dev, err, "Failed to register iommu in
sysfs\n");
+
+ err = iommu_device_register(&smmu->iommu, &arm_smmu_ops,
+ using_legacy_binding ? NULL : dev);
+ if (err) {
+ iommu_device_sysfs_remove(&smmu->iommu);
+ return dev_err_probe(dev, err, "Failed to register iommu\n");
+ }
+
/*
* We want to avoid touching dev->power.lock in fastpaths unless
* it's really going to do something useful - pm_runtime_enabled()
--
Thanks and Regards
Pratyush Brahma