Hi Saravana,

On 12.01.2021 21:51, Saravana Kannan wrote:
> On Mon, Jan 11, 2021 at 11:11 PM Marek Szyprowski
> <m.szyprowski@xxxxxxxxxxx> wrote:
>> On 11.01.2021 22:47, Saravana Kannan wrote:
>>> On Mon, Jan 11, 2021 at 6:18 AM Marek Szyprowski
>>> <m.szyprowski@xxxxxxxxxxx> wrote:
>>>> On 11.01.2021 12:12, Marek Szyprowski wrote:
>>>>> On 18.12.2020 04:17, Saravana Kannan wrote:
>>>>>> Cyclic dependencies in some firmware was one of the last remaining
>>>>>> reasons fw_devlink=on couldn't be set by default. Now that cyclic
>>>>>> dependencies don't block probing, set fw_devlink=on by default.
>>>>>>
>>>>>> Setting fw_devlink=on by default brings a bunch of benefits (currently,
>>>>>> only for systems with device tree firmware):
>>>>>> * Significantly cuts down deferred probes.
>>>>>> * Device probe is effectively attempted in graph order.
>>>>>> * Makes it much easier to load drivers as modules without having to
>>>>>>   worry about functional dependencies between modules (depmod is still
>>>>>>   needed for symbol dependencies).
>>>>>>
>>>>>> If this patch prevents some devices from probing, it's very likely due
>>>>>> to the system having one or more device drivers that "probe"/set up a
>>>>>> device (DT node with compatible property) without creating a struct
>>>>>> device for it. If we hit such cases, the device drivers need to be
>>>>>> fixed so that they populate struct devices and probe them like normal
>>>>>> device drivers so that the driver core is aware of the devices and their
>>>>>> status. See [1] for an example of such a case.
>>>>>>
>>>>>> [1] -
>>>>>> https://lore.kernel.org/lkml/CAGETcx9PiX==mLxB9PO8Myyk6u2vhPVwTMsA5NkD-ywH5xhusw@xxxxxxxxxxxxxx/
>>>>>> Signed-off-by: Saravana Kannan <saravanak@xxxxxxxxxx>
>>>>> This patch landed recently in linux next-20210111 as commit
>>>>> e590474768f1 ("driver core: Set fw_devlink=on by default"). Sadly it
>>>>> breaks Exynos IOMMU operation, what causes lots of devices being
>>>>> deferred and not probed at all. I've briefly checked and noticed that
>>>>> exynos_sysmmu_probe() is never called after this patch. This is really
>>>>> strange for me, as the SYSMMU controllers on Exynos platform are
>>>>> regular platform devices registered by the OF code. The driver code is
>>>>> here: drivers/iommu/exynos-iommu.c, example dts:
>>>>> arch/arm/boot/dts/exynos3250.dtsi (compatible = "samsung,exynos-sysmmu").
>>>> Okay, I found the source of this problem. It is caused by Exynos power
>>>> domain driver, which is not platform driver yet. I will post a patch,
>>>> which converts it to the platform driver.
>>> Thanks Marek! Hopefully the debug logs I added were sufficient to
>>> figure out the reason.
>> Frankly, it took me a while to figure out that device core waits for the
>> power domain devices. Maybe it would be possible to add some more debug
>> messages or hints? Like the reason of the deferred probe in
>> /sys/kernel/debug/devices_deferred ?
> There's already a /sys/devices/.../<device>/waiting_for_supplier file
> that tells you if the device is waiting for a supplier device to be
> added. That file goes away once the device probes. If the file has 1,
> then it's waiting for the supplier device to be added (like your
> case). If it's 0, then the device is just waiting on one of the
> existing suppliers to probe. You can find the existing suppliers
> through /sys/devices/.../<device>/supplier:*/supplier. Also, flip
> these dev_dbg() to dev_info() if you need more details about deferred
> probing.

Frankly speaking, I doubt that anyone will find those. Even an
experienced developer might need some time to figure it out. I would
expect such information to be available at least in the mentioned
/sys/kernel/debug/devices_deferred file. We already have infrastructure
for putting the deferred probe reason there; see the dev_err_probe()
function.
Even such a simple change makes debugging this issue much easier:

diff --git a/drivers/base/core.c b/drivers/base/core.c
index cd8e518fadd6..ceb5aed5a84c 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -937,12 +937,13 @@ int device_links_check_suppliers(struct device *dev)
 	mutex_lock(&fwnode_link_lock);
 	if (dev->fwnode && !list_empty(&dev->fwnode->suppliers) &&
 	    !fw_devlink_is_permissive()) {
-		dev_dbg(dev, "probe deferral - wait for supplier %pfwP\n",
+		ret = dev_err_probe(dev, -EPROBE_DEFER,
+			"probe deferral - wait for supplier %pfwP\n",
 			list_first_entry(&dev->fwnode->suppliers,
 			struct fwnode_link,
 			c_hook)->supplier);
 		mutex_unlock(&fwnode_link_lock);
-		return -EPROBE_DEFER;
+		return ret;
 	}
 	mutex_unlock(&fwnode_link_lock);
 
@@ -955,9 +956,9 @@ int device_links_check_suppliers(struct device *dev)
 		if (link->status != DL_STATE_AVAILABLE &&
 		    !(link->flags & DL_FLAG_SYNC_STATE_ONLY)) {
 			device_links_missing_supplier(dev);
-			dev_dbg(dev, "probe deferral - supplier %s not ready\n",
+			ret = dev_err_probe(dev, -EPROBE_DEFER,
+				"probe deferral - supplier %s not ready\n",
 				dev_name(link->supplier));
-			ret = -EPROBE_DEFER;
 			break;
 		}
 		WRITE_ONCE(link->status, DL_STATE_CONSUMER_PROBE);

After such a change:

# cat /sys/kernel/debug/devices_deferred
sound
13620000.sysmmu platform: probe deferral - supplier 10023c40.power-domain not ready
13630000.sysmmu platform: probe deferral - supplier 10023c40.power-domain not ready
12e20000.sysmmu platform: probe deferral - supplier 10023c20.power-domain not ready
11a20000.sysmmu platform: probe deferral - supplier 10023c00.power-domain not ready
11a30000.sysmmu platform: probe deferral - supplier 10023c00.power-domain not ready
11a40000.sysmmu platform: probe deferral - supplier 10023c00.power-domain not ready
11a50000.sysmmu platform: probe deferral - supplier 10023c00.power-domain not ready
11a60000.sysmmu platform: probe deferral - supplier 10023c00.power-domain not ready
11e20000.sysmmu platform: probe deferral - supplier 10023c80.power-domain not ready
12d00000.hdmi platform: probe deferral - supplier 10023c20.power-domain not ready
10048000.clock-controller platform: probe deferral - supplier 10023ca0.power-domain not ready
12260000.sysmmu platform: probe deferral - supplier 10048000.clock-controller not ready
12270000.sysmmu platform: probe deferral - supplier 10048000.clock-controller not ready
122a0000.sysmmu platform: probe deferral - supplier 10048000.clock-controller not ready
122b0000.sysmmu platform: probe deferral - supplier 10048000.clock-controller not ready
123b0000.sysmmu platform: probe deferral - supplier 10048000.clock-controller not ready
123c0000.sysmmu platform: probe deferral - supplier 10048000.clock-controller not ready
12c10000.mixer platform: probe deferral - supplier 10023c20.power-domain not ready
13000000.gpu platform: probe deferral - supplier 10023c60.power-domain not ready

The message could probably be adjusted a bit, but it would significantly
help in finding the source of the problem.

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland