Replying again because the previous email accidentally included HTML. Thanks for taking the time to reconsider the wording Frank. Your intention was clear to me in the first email too. A kernel command line option can also completely disable this functionality easily and cleanly. Can we pick that as an option? I've an implementation of that in the v5 series I sent out last week. -Saravana On Mon, Jul 15, 2019 at 7:39 AM Frank Rowand <frowand.list@xxxxxxxxx> wrote: > > On 7/15/19 7:26 AM, Frank Rowand wrote: > > HiRob, > > > > Sorry for such a late reply... > > > > > > On 7/1/19 8:25 PM, Saravana Kannan wrote: > >> On Mon, Jul 1, 2019 at 6:32 PM Rob Herring <robh+dt@xxxxxxxxxx> wrote: > >>> > >>> On Mon, Jul 1, 2019 at 6:48 PM Saravana Kannan <saravanak@xxxxxxxxxx> wrote: > >>>> > >>>> Add device-links after the devices are created (but before they are > >>>> probed) by looking at common DT bindings like clocks and > >>>> interconnects. > >>>> > >>>> Automatically adding device-links for functional dependencies at the > >>>> framework level provides the following benefits: > >>>> > >>>> - Optimizes device probe order and avoids the useless work of > >>>> attempting probes of devices that will not probe successfully > >>>> (because their suppliers aren't present or haven't probed yet). > >>>> > >>>> For example, in a commonly available mobile SoC, registering just > >>>> one consumer device's driver at an initcall level earlier than the > >>>> supplier device's driver causes 11 failed probe attempts before the > >>>> consumer device probes successfully. This was with a kernel with all > >>>> the drivers statically compiled in. This problem gets a lot worse if > >>>> all the drivers are loaded as modules without direct symbol > >>>> dependencies. > >>>> > >>>> - Supplier devices like clock providers, interconnect providers, etc > >>>> need to keep the resources they provide active and at a particular > >>>> state(s) during boot up even if their current set of consumers don't > >>>> request the resource to be active. This is because the rest of the > >>>> consumers might not have probed yet and turning off the resource > >>>> before all the consumers have probed could lead to a hang or > >>>> undesired user experience. > >>>> > >>>> Some frameworks (Eg: regulator) handle this today by turning off > >>>> "unused" resources at late_initcall_sync and hoping all the devices > >>>> have probed by then. This is not a valid assumption for systems with > >>>> loadable modules. Other frameworks (Eg: clock) just don't handle > >>>> this due to the lack of a clear signal for when they can turn off > >>>> resources. This leads to downstream hacks to handle cases like this > >>>> that can easily be solved in the upstream kernel. > >>>> > >>>> By linking devices before they are probed, we give suppliers a clear > >>>> count of the number of dependent consumers. Once all of the > >>>> consumers are active, the suppliers can turn off the unused > >>>> resources without making assumptions about the number of consumers. > >>>> > >>>> By default we just add device-links to track "driver presence" (probe > >>>> succeeded) of the supplier device. If any other functionality provided > >>>> by device-links are needed, it is left to the consumer/supplier > >>>> devices to change the link when they probe. > >>>> > >>>> Signed-off-by: Saravana Kannan <saravanak@xxxxxxxxxx> > >>>> --- > >>>> drivers/of/Kconfig | 9 ++++++++ > >>>> drivers/of/platform.c | 52 +++++++++++++++++++++++++++++++++++++++++++ > >>>> 2 files changed, 61 insertions(+) > >>>> > >>>> diff --git a/drivers/of/Kconfig b/drivers/of/Kconfig > >>>> index 37c2ccbefecd..7c7fa7394b4c 100644 > >>>> --- a/drivers/of/Kconfig > >>>> +++ b/drivers/of/Kconfig > >>>> @@ -103,4 +103,13 @@ config OF_OVERLAY > >>>> config OF_NUMA > >>>> bool > >>>> > >>>> +config OF_DEVLINKS > >>> > >>> I'd prefer this not be a config option. After all, we want one kernel > >>> build that works for all platforms. > >> > >> We need a lot more changes before one kernel build can work for all > >> platforms. At least until then, I think we need this. Lot less chance > >> of breaking existing platforms before all the missing pieces are > >> created. > >> > >>> A kernel command line option to disable might be useful for debugging. > >> > >> Or we can have a command line to enable this for platforms that want > >> to use it and have it default off. > > > > > Given the fragility of the current boot sequence (without this patch set) > > and the potential breakage of existing systems, I think that if we choose > > to accept this patch set that it should first bake in the -next tree for > > at least one major release cycle. Maybe even two major release cycles. > > I probably didn't state that very well. I was trying to not sound like > I was accusing this patch series of being fragile. The issue is that > adding the patches to systems that weren't expecting the new ordering > may cause boot problems for some systems. I'm not concerned with > pointing fingers, just want to make sure that we proceed cautiously > until we know that the resulting system is robust. > > -Frank > > > > > -Frank > > > > > >> > >>>> + bool "Device links from DT bindings" > >>>> + help > >>>> + Common DT bindings like clocks, interconnects, etc represent a > >>>> + consumer device's dependency on suppliers devices. This option > >>>> + creates device links from these common bindings so that consumers are > >>>> + probed only after all their suppliers are active and suppliers can > >>>> + tell when all their consumers are active. > >>>> + > >>>> endif # OF > >>>> diff --git a/drivers/of/platform.c b/drivers/of/platform.c > >>>> index 04ad312fd85b..a53717168aca 100644 > >>>> --- a/drivers/of/platform.c > >>>> +++ b/drivers/of/platform.c > >>>> @@ -61,6 +61,57 @@ struct platform_device *of_find_device_by_node(struct device_node *np) > >>>> EXPORT_SYMBOL(of_find_device_by_node); > >>>> > >>>> #ifdef CONFIG_OF_ADDRESS > >>>> +static int of_link_binding(struct device *dev, char *binding, char *cell) > >>> > >>> Under CONFIG_OF_ADDRESS seems like a strange location. > >> > >> Yeah, but the rest of the file seems to be under this. So I'm not > >> touching that. I can probably move this function further down (close > >> to platform populate) if you want that. > >>> > >>>> +{ > >>>> + struct of_phandle_args sup_args; > >>>> + struct platform_device *sup_dev; > >>>> + unsigned int i = 0, links = 0; > >>>> + u32 dl_flags = DL_FLAG_AUTOPROBE_CONSUMER; > >>>> + > >>>> + while (!of_parse_phandle_with_args(dev->of_node, binding, cell, i, > >>>> + &sup_args)) { > >>>> + i++; > >>>> + sup_dev = of_find_device_by_node(sup_args.np); > >>>> + if (!sup_dev) > >>>> + continue; > >>>> + if (device_link_add(dev, &sup_dev->dev, dl_flags)) > >>>> + links++; > >>>> + put_device(&sup_dev->dev); > >>>> + } > >>>> + if (links < i) > >>>> + return -ENODEV; > >>>> + return 0; > >>>> +} > >>>> + > >>>> +/* > >>>> + * List of bindings and their cell names (use NULL if no cell names) from which > >>>> + * device links need to be created. > >>>> + */ > >>>> +static char *link_bindings[] = { > >>> > >>> const > >> > >> Ack > >> > >>> > >>>> +#ifdef CONFIG_OF_DEVLINKS > >>>> + "clocks", "#clock-cells", > >>>> + "interconnects", "#interconnect-cells", > >>> > >>> Planning to add others? > >> > >> Not in this patch. > >> > >> Regulators are the other big missing piece that I'm aware of now but > >> they need a lot of discussion (see email from David and my reply). > >> > >> Not sure what other resources are shared where they can be "turned > >> off" and cause devices set up at boot to fail. For example, I don't > >> think interrupts need functional dependency tracking because they > >> aren't really turned off by consumer 1 in a way that breaks things for > >> consumer 2. Just masked and the consumer 2 can unmask and use it once > >> it probes. > >> > >> I'm only intimately familiar with clocks, interconnects and regulators > >> (to some extent). I'm open to adding other supplier categories in > >> future patches as I educate myself of those or if other people want to > >> add support for more categories. > >> > >> -Saravana > >> > >>>> +#endif > >>>> +}; > >>>> + > >>>> +static int of_link_to_suppliers(struct device *dev) > >>>> +{ > >>>> + unsigned int i = 0; > >>>> + bool done = true; > >>>> + > >>>> + if (unlikely(!dev->of_node)) > >>>> + return 0; > >>>> + > >>>> + for (i = 0; i < ARRAY_SIZE(link_bindings) / 2; i++) > >>>> + if (of_link_binding(dev, link_bindings[i * 2], > >>>> + link_bindings[i * 2 + 1])) > >>>> + done = false; > >>>> + > >>>> + if (!done) > >>>> + return -ENODEV; > >>>> + return 0; > >>>> +} > >>>> + > >>>> /* > >>>> * The following routines scan a subtree and registers a device for > >>>> * each applicable node. > >>>> @@ -524,6 +575,7 @@ static int __init of_platform_default_populate_init(void) > >>>> if (!of_have_populated_dt()) > >>>> return -ENODEV; > >>>> > >>>> + platform_bus_type.add_links = of_link_to_suppliers; > >>>> /* > >>>> * Handle certain compatibles explicitly, since we don't want to create > >>>> * platform_devices for every node in /reserved-memory with a > >>>> -- > >>>> 2.22.0.410.gd8fdbe21b5-goog > >>>> > >> > > > > >