On Fri, Jul 1, 2022 at 1:10 AM Saravana Kannan <saravanak@xxxxxxxxxx> wrote: > > On Thu, Jun 30, 2022 at 11:12 PM Tony Lindgren <tony@xxxxxxxxxxx> wrote: > > > > * Tony Lindgren <tony@xxxxxxxxxxx> [220701 08:33]: > > > * Saravana Kannan <saravanak@xxxxxxxxxx> [220630 23:25]: > > > > On Thu, Jun 30, 2022 at 4:26 PM Rob Herring <robh@xxxxxxxxxx> wrote: > > > > > > > > > > On Thu, Jun 30, 2022 at 5:11 PM Saravana Kannan <saravanak@xxxxxxxxxx> wrote: > > > > > > > > > > > > On Mon, Jun 27, 2022 at 2:10 AM Tony Lindgren <tony@xxxxxxxxxxx> wrote: > > > > > > > > > > > > > > * Saravana Kannan <saravanak@xxxxxxxxxx> [220623 08:17]: > > > > > > > > On Thu, Jun 23, 2022 at 12:01 AM Tony Lindgren <tony@xxxxxxxxxxx> wrote: > > > > > > > > > > > > > > > > > > * Saravana Kannan <saravanak@xxxxxxxxxx> [220622 19:05]: > > > > > > > > > > On Tue, Jun 21, 2022 at 9:59 PM Tony Lindgren <tony@xxxxxxxxxxx> wrote: > > > > > > > > > > > This issue is no directly related fw_devlink. It is a side effect of > > > > > > > > > > > removing driver_deferred_probe_check_state(). We no longer return > > > > > > > > > > > -EPROBE_DEFER at the end of driver_deferred_probe_check_state(). > > > > > > > > > > > > > > > > > > > > Yes, I understand the issue. But driver_deferred_probe_check_state() > > > > > > > > > > was deleted because fw_devlink=on should have short circuited the > > > > > > > > > > probe attempt with an -EPROBE_DEFER before reaching the bus/driver > > > > > > > > > > probe function and hitting this -ENOENT failure. That's why I was > > > > > > > > > > asking the other questions. > > > > > > > > > > > > > > > > > > OK. So where is the -EPROBE_DEFER supposed to happen without > > > > > > > > > driver_deferred_probe_check_state() then? > > > > > > > > > > > > > > > > device_links_check_suppliers() call inside really_probe() would short > > > > > > > > circuit and return an -EPROBE_DEFER if the device links are created as > > > > > > > > expected. > > > > > > > > > > > > > > OK > > > > > > > > > > > > > > > > Hmm so I'm not seeing any supplier for the top level ocp device in > > > > > > > > > the booting case without your patches. I see the suppliers for the > > > > > > > > > ocp child device instances only. > > > > > > > > > > > > > > > > Hmmm... this is strange (that the device link isn't there), but this > > > > > > > > is what I suspected. > > > > > > > > > > > > > > Yup, maybe it's because of the supplier being a device in the child > > > > > > > interconnect for the ocp. > > > > > > > > > > > > Ugh... yeah, this is why the normal (not SYNC_STATE_ONLY) device link > > > > > > isn't being created. > > > > > > > > > > > > So the aggregated view is something like (I had to set tabs = 4 space > > > > > > to fit it within 80 cols): > > > > > > > > > > > > ocp: ocp { <========================= Consumer > > > > > > compatible = "simple-pm-bus"; > > > > > > power-domains = <&prm_per>; <=========== Supplier ref > > > > > > > > > > > > l4_wkup: interconnect@44c00000 { > > > > > > compatible = "ti,am33xx-l4-wkup", "simple-pm-bus"; > > > > > > > > > > > > segment@200000 { /* 0x44e00000 */ > > > > > > compatible = "simple-pm-bus"; > > > > > > > > > > > > target-module@0 { /* 0x44e00000, ap 8 58.0 */ > > > > > > compatible = "ti,sysc-omap4", "ti,sysc"; > > > > > > > > > > > > prcm: prcm@0 { > > > > > > compatible = "ti,am3-prcm", "simple-bus"; > > > > > > > > > > > > prm_per: prm@c00 { <========= Actual Supplier > > > > > > compatible = "ti,am3-prm-inst", "ti,omap-prm-inst"; > > > > > > }; > > > > > > }; > > > > > > }; > > > > > > }; > > > > > > }; > > > > > > }; > > > > > > > > > > > > The power-domain supplier is the great-great-great-grand-child of the > > > > > > consumer. It's not clear to me how this is valid. What does it even > > > > > > mean? > > > > > > > > > > > > Rob, is this considered a valid DT? > > > > > > > > > > Valid DT for broken h/w. > > > > > > > > I'm not sure even in that case it's valid. When the parent device is > > > > in reset (when the SoC is coming out of reset), there's no way the > > > > descendant is functional. And if the descendant is not functional, how > > > > is the parent device powered up? This just feels like an incorrect > > > > representation of the real h/w. > > > > > > It should be correct representation based on scanning the interconnects > > > and looking at the documentation. Some interconnect parts are wired > > > always-on and some interconnect instances may be dual-mapped. > > Thanks for helping to debug this. Appreciate it. > > > > > > > We have a quirk to probe prm/prcm first with pdata_quirks_init_clocks(). > > :'( > > I checked out the code. These prm devices just get populated with NULL > as the parent. So they are effectively top level devices from the > perspective of driver core. > > > > Maybe that also now fails in addition to the top level interconnect > > > probing no longer producing -EPROBE_DEFER. > > As far as I can tell pdata_quirks_init_clocks() is just adding these > prm devices (amongst other drivers). So I don't expect that to fail. > > > > > > > > > So the domain must be default on and then simple-pm-bus is going to > > > > > hold a reference to the domain preventing it from ever getting powered > > > > > off and things seem to work. Except what happens during suspend? > > > > > > > > But how can simple-pm-bus even get a reference? The PM domain can't > > > > get added until we are well into the probe of the simple-pm-bus and > > > > AFAICT the genpd attach is done before the driver probe is even > > > > called. > > > > > > The prm/prcm gets of_platform_populate() called on it early. > > :'( > > > The hackish patch below makes things boot for me, not convinced this > > is the preferred fix compared to earlier deferred probe handling though. > > Going back to the init level tinkering seems like a step back to me. > > The goal of fw_devlink is to avoid init level tinkering and it does > help with that in general. But these kinds of quirks are going to need > a few exceptions -- with them being quirks and all. And this change > will avoid an unnecessary deferred probe (that used to happen even > before my change). > > The other option to handle this quirk is to create the invalid > (consumer is parent of supplier) fwnode_link between the prm device > and its consumers when the prm device is populated. Then fw_devlink > will end up creating a device link when ocp gets added. But I'm not > sure if it's going to be easy to find and add all those consumers. > > I'd say, for now, let's go with this patch below. I'll see if I can > get fw_devlink to handle these odd quirks without breaking the normal > cases or making them significantly slower. But that'll take some time > and I'm not sure there'll be a nice solution. Can you check if this hack helps? If so, then I can think about whether we can pick it up without breaking everything else. Copy-paste tab mess up warning. -Saravana 8< ---------------- diff --git a/drivers/of/property.c b/drivers/of/property.c index 967f79b59016..f671a7528719 100644 --- a/drivers/of/property.c +++ b/drivers/of/property.c @@ -1138,18 +1138,6 @@ static int of_link_to_phandle(struct device_node *con_np, return -ENODEV; } - /* - * Don't allow linking a device node as a consumer of one of its - * descendant nodes. By definition, a child node can't be a functional - * dependency for the parent node. - */ - if (of_is_ancestor_of(con_np, sup_np)) { - pr_debug("Not linking %pOFP to %pOFP - is descendant\n", - con_np, sup_np); - of_node_put(sup_np); - return -EINVAL; - } - /* * Don't create links to "early devices" that won't have struct devices * created for them. @@ -1163,6 +1151,25 @@ static int of_link_to_phandle(struct device_node *con_np, of_node_put(sup_np); return -ENODEV; } + + /* + * Don't allow linking a device node as a consumer of one of its + * descendant nodes. By definition, a child node can't be a functional + * dependency for the parent node. + * + * However, if the child node already has a device while the parent is + * in the process of being added, it's probably some weird quirk + * handling. So, don't both checking if the consumer is an ancestor of + * the supplier. + */ + if (!sup_dev && of_is_ancestor_of(con_np, sup_np)) { + pr_debug("Not linking %pOFP to %pOFP - is descendant\n", + con_np, sup_np); + put_device(sup_dev); + of_node_put(sup_np); + return -EINVAL; + } +