Hi Saravana, On 13.01.2021 20:23, Saravana Kannan wrote: > On Tue, Jan 12, 2021 at 11:04 PM Marek Szyprowski > <m.szyprowski@xxxxxxxxxxx> wrote: >> On 12.01.2021 21:51, Saravana Kannan wrote: >>> On Mon, Jan 11, 2021 at 11:11 PM Marek Szyprowski >>> <m.szyprowski@xxxxxxxxxxx> wrote: >>>> On 11.01.2021 22:47, Saravana Kannan wrote: >>>>> On Mon, Jan 11, 2021 at 6:18 AM Marek Szyprowski >>>>> <m.szyprowski@xxxxxxxxxxx> wrote: >>>>>> On 11.01.2021 12:12, Marek Szyprowski wrote: >>>>>>> On 18.12.2020 04:17, Saravana Kannan wrote: >>>>>>>> Cyclic dependencies in some firmware was one of the last remaining >>>>>>>> reasons fw_devlink=on couldn't be set by default. Now that cyclic >>>>>>>> dependencies don't block probing, set fw_devlink=on by default. >>>>>>>> >>>>>>>> Setting fw_devlink=on by default brings a bunch of benefits (currently, >>>>>>>> only for systems with device tree firmware): >>>>>>>> * Significantly cuts down deferred probes. >>>>>>>> * Device probe is effectively attempted in graph order. >>>>>>>> * Makes it much easier to load drivers as modules without having to >>>>>>>> worry about functional dependencies between modules (depmod is still >>>>>>>> needed for symbol dependencies). >>>>>>>> >>>>>>>> If this patch prevents some devices from probing, it's very likely due >>>>>>>> to the system having one or more device drivers that "probe"/set up a >>>>>>>> device (DT node with compatible property) without creating a struct >>>>>>>> device for it. If we hit such cases, the device drivers need to be >>>>>>>> fixed so that they populate struct devices and probe them like normal >>>>>>>> device drivers so that the driver core is aware of the devices and their >>>>>>>> status. See [1] for an example of such a case. >>>>>>>> >>>>>>>> [1] - >>>>>>>> https://protect2.fireeye.com/v1/url?k=68f5d8ba-376ee1f5-68f453f5-0cc47a30d446-324e64700545ab93&q=1&e=fb455b9e-c8c7-40d0-8e3c-d9d3713d519b&u=https%3A%2F%2Flore.kernel.org%2Flkml%2FCAGETcx9PiX%3D%3DmLxB9PO8Myyk6u2vhPVwTMsA5NkD-ywH5xhusw%40mail.gmail.com%2F >>>>>>>> Signed-off-by: Saravana Kannan <saravanak@xxxxxxxxxx> >>>>>>> This patch landed recently in linux next-20210111 as commit >>>>>>> e590474768f1 ("driver core: Set fw_devlink=on by default"). Sadly it >>>>>>> breaks Exynos IOMMU operation, what causes lots of devices being >>>>>>> deferred and not probed at all. I've briefly checked and noticed that >>>>>>> exynos_sysmmu_probe() is never called after this patch. This is really >>>>>>> strange for me, as the SYSMMU controllers on Exynos platform are >>>>>>> regular platform devices registered by the OF code. The driver code is >>>>>>> here: drivers/iommu/exynos-iommu.c, example dts: >>>>>>> arch/arm/boot/dts/exynos3250.dtsi (compatible = "samsung,exynos-sysmmu"). >>>>>> Okay, I found the source of this problem. It is caused by Exynos power >>>>>> domain driver, which is not platform driver yet. I will post a patch, >>>>>> which converts it to the platform driver. >>>>> Thanks Marek! Hopefully the debug logs I added were sufficient to >>>>> figure out the reason. >>>> Frankly, it took me a while to figure out that device core waits for the >>>> power domain devices. Maybe it would be possible to add some more debug >>>> messages or hints? Like the reason of the deferred probe in >>>> /sys/kernel/debug/devices_deferred ? >>> There's already a /sys/devices/.../<device>/waiting_for_supplier file >>> that tells you if the device is waiting for a supplier device to be >>> added. That file goes away once the device probes. If the file has 1, >>> then it's waiting for the supplier device to be added (like your >>> case). If it's 0, then the device is just waiting on one of the >>> existing suppliers to probe. You can find the existing suppliers >>> through /sys/devices/.../<device>/supplier:*/supplier. Also, flip >>> these dev_dbg() to dev_info() if you need more details about deferred >>> probing. >> Frankly speaking I doubt that anyone will find those. Even experienced >> developer might need some time to figure it out. >> >> I expect that such information will be at least in the mentioned >> /sys/kernel/debug/devices_deferred file. We already have infrastructure >> for putting the deferred probe reason there, see dev_err_probe() >> function. Even such a simple change makes the debugging this issue much >> easier: >> >> diff --git a/drivers/base/core.c b/drivers/base/core.c >> index cd8e518fadd6..ceb5aed5a84c 100644 >> --- a/drivers/base/core.c >> +++ b/drivers/base/core.c >> @@ -937,12 +937,13 @@ int device_links_check_suppliers(struct device *dev) >> mutex_lock(&fwnode_link_lock); >> if (dev->fwnode && !list_empty(&dev->fwnode->suppliers) && >> !fw_devlink_is_permissive()) { >> - dev_dbg(dev, "probe deferral - wait for supplier %pfwP\n", >> + ret = dev_err_probe(dev, -EPROBE_DEFER, >> + "probe deferral - wait for supplier %pfwP\n", >> list_first_entry(&dev->fwnode->suppliers, >> struct fwnode_link, >> c_hook)->supplier); >> mutex_unlock(&fwnode_link_lock); >> - return -EPROBE_DEFER; >> + return ret; >> } >> mutex_unlock(&fwnode_link_lock); >> >> @@ -955,9 +956,9 @@ int device_links_check_suppliers(struct device *dev) >> if (link->status != DL_STATE_AVAILABLE && >> !(link->flags & DL_FLAG_SYNC_STATE_ONLY)) { >> device_links_missing_supplier(dev); >> - dev_dbg(dev, "probe deferral - supplier %s not >> ready\n", >> + ret = dev_err_probe(dev, -EPROBE_DEFER, >> + "probe deferral - supplier %s not ready\n", >> dev_name(link->supplier)); >> - ret = -EPROBE_DEFER; >> break; >> } >> WRITE_ONCE(link->status, DL_STATE_CONSUMER_PROBE); >> >> >> After such change: >> >> # cat /sys/kernet/debug/devices_deferred > Sweet! I wasn't aware of this file at all. > > However, on a side note, one of my TODO items is to not add devices to > the deferred probe list if they'll never probe yet (due to suppliers > not having probed). On a board I tested on, it cut down really_probe() > calls by 75%! So the probe attempt itself effectively happens in graph > order (which I think is pretty cool). So that's going to conflict with > this file. I'll have to see what to do about that. > > Thanks for this pointer. Let me sit on this for 2 weeks and see how I > can incorporate your suggestion while allowing for the above. And then > I'll send out a patch. Does that work? Fine for me. Even if you want to change the core not to probe devices that miss their suppliers (what's good imho), the 'devices_deferred' file might still contain all of them. For user it is just a list of devices that are not yet available in the system with the optional reasons for that. Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland