On 7/15/19 11:40 AM, Saravana Kannan wrote: > Replying again because the previous email accidentally included HTML. > > Thanks for taking the time to reconsider the wording Frank. Your > intention was clear to me in the first email too. > > A kernel command line option can also completely disable this > functionality easily and cleanly. Can we pick that as an option? I've > an implementation of that in the v5 series I sent out last week. Yes, Rob suggested a command line option for debugging, and I am fine with that. But even with that, I would like a lot of testing so that we have a chance of finding systems that have trouble with the changes and could potentially be fixed before impacting a large number of users. -Frank > > -Saravana > > On Mon, Jul 15, 2019 at 7:39 AM Frank Rowand <frowand.list@xxxxxxxxx> wrote: >> >> On 7/15/19 7:26 AM, Frank Rowand wrote: >>> HiRob, >>> >>> Sorry for such a late reply... >>> >>> >>> On 7/1/19 8:25 PM, Saravana Kannan wrote: >>>> On Mon, Jul 1, 2019 at 6:32 PM Rob Herring <robh+dt@xxxxxxxxxx> wrote: >>>>> >>>>> On Mon, Jul 1, 2019 at 6:48 PM Saravana Kannan <saravanak@xxxxxxxxxx> wrote: >>>>>> >>>>>> Add device-links after the devices are created (but before they are >>>>>> probed) by looking at common DT bindings like clocks and >>>>>> interconnects. >>>>>> >>>>>> Automatically adding device-links for functional dependencies at the >>>>>> framework level provides the following benefits: >>>>>> >>>>>> - Optimizes device probe order and avoids the useless work of >>>>>> attempting probes of devices that will not probe successfully >>>>>> (because their suppliers aren't present or haven't probed yet). >>>>>> >>>>>> For example, in a commonly available mobile SoC, registering just >>>>>> one consumer device's driver at an initcall level earlier than the >>>>>> supplier device's driver causes 11 failed probe attempts before the >>>>>> consumer device probes successfully. This was with a kernel with all >>>>>> the drivers statically compiled in. This problem gets a lot worse if >>>>>> all the drivers are loaded as modules without direct symbol >>>>>> dependencies. >>>>>> >>>>>> - Supplier devices like clock providers, interconnect providers, etc >>>>>> need to keep the resources they provide active and at a particular >>>>>> state(s) during boot up even if their current set of consumers don't >>>>>> request the resource to be active. This is because the rest of the >>>>>> consumers might not have probed yet and turning off the resource >>>>>> before all the consumers have probed could lead to a hang or >>>>>> undesired user experience. >>>>>> >>>>>> Some frameworks (Eg: regulator) handle this today by turning off >>>>>> "unused" resources at late_initcall_sync and hoping all the devices >>>>>> have probed by then. This is not a valid assumption for systems with >>>>>> loadable modules. Other frameworks (Eg: clock) just don't handle >>>>>> this due to the lack of a clear signal for when they can turn off >>>>>> resources. This leads to downstream hacks to handle cases like this >>>>>> that can easily be solved in the upstream kernel. >>>>>> >>>>>> By linking devices before they are probed, we give suppliers a clear >>>>>> count of the number of dependent consumers. Once all of the >>>>>> consumers are active, the suppliers can turn off the unused >>>>>> resources without making assumptions about the number of consumers. >>>>>> >>>>>> By default we just add device-links to track "driver presence" (probe >>>>>> succeeded) of the supplier device. If any other functionality provided >>>>>> by device-links are needed, it is left to the consumer/supplier >>>>>> devices to change the link when they probe. >>>>>> >>>>>> Signed-off-by: Saravana Kannan <saravanak@xxxxxxxxxx> >>>>>> --- >>>>>> drivers/of/Kconfig | 9 ++++++++ >>>>>> drivers/of/platform.c | 52 +++++++++++++++++++++++++++++++++++++++++++ >>>>>> 2 files changed, 61 insertions(+) >>>>>> >>>>>> diff --git a/drivers/of/Kconfig b/drivers/of/Kconfig >>>>>> index 37c2ccbefecd..7c7fa7394b4c 100644 >>>>>> --- a/drivers/of/Kconfig >>>>>> +++ b/drivers/of/Kconfig >>>>>> @@ -103,4 +103,13 @@ config OF_OVERLAY >>>>>> config OF_NUMA >>>>>> bool >>>>>> >>>>>> +config OF_DEVLINKS >>>>> >>>>> I'd prefer this not be a config option. After all, we want one kernel >>>>> build that works for all platforms. >>>> >>>> We need a lot more changes before one kernel build can work for all >>>> platforms. At least until then, I think we need this. Lot less chance >>>> of breaking existing platforms before all the missing pieces are >>>> created. >>>> >>>>> A kernel command line option to disable might be useful for debugging. >>>> >>>> Or we can have a command line to enable this for platforms that want >>>> to use it and have it default off. >>> >> >>> Given the fragility of the current boot sequence (without this patch set) >>> and the potential breakage of existing systems, I think that if we choose >>> to accept this patch set that it should first bake in the -next tree for >>> at least one major release cycle. Maybe even two major release cycles. >> >> I probably didn't state that very well. I was trying to not sound like >> I was accusing this patch series of being fragile. The issue is that >> adding the patches to systems that weren't expecting the new ordering >> may cause boot problems for some systems. I'm not concerned with >> pointing fingers, just want to make sure that we proceed cautiously >> until we know that the resulting system is robust. >> >> -Frank >> >>> >>> -Frank >>> >>> >>>> >>>>>> + bool "Device links from DT bindings" >>>>>> + help >>>>>> + Common DT bindings like clocks, interconnects, etc represent a >>>>>> + consumer device's dependency on suppliers devices. This option >>>>>> + creates device links from these common bindings so that consumers are >>>>>> + probed only after all their suppliers are active and suppliers can >>>>>> + tell when all their consumers are active. >>>>>> + >>>>>> endif # OF >>>>>> diff --git a/drivers/of/platform.c b/drivers/of/platform.c >>>>>> index 04ad312fd85b..a53717168aca 100644 >>>>>> --- a/drivers/of/platform.c >>>>>> +++ b/drivers/of/platform.c >>>>>> @@ -61,6 +61,57 @@ struct platform_device *of_find_device_by_node(struct device_node *np) >>>>>> EXPORT_SYMBOL(of_find_device_by_node); >>>>>> >>>>>> #ifdef CONFIG_OF_ADDRESS >>>>>> +static int of_link_binding(struct device *dev, char *binding, char *cell) >>>>> >>>>> Under CONFIG_OF_ADDRESS seems like a strange location. >>>> >>>> Yeah, but the rest of the file seems to be under this. So I'm not >>>> touching that. I can probably move this function further down (close >>>> to platform populate) if you want that. >>>>> >>>>>> +{ >>>>>> + struct of_phandle_args sup_args; >>>>>> + struct platform_device *sup_dev; >>>>>> + unsigned int i = 0, links = 0; >>>>>> + u32 dl_flags = DL_FLAG_AUTOPROBE_CONSUMER; >>>>>> + >>>>>> + while (!of_parse_phandle_with_args(dev->of_node, binding, cell, i, >>>>>> + &sup_args)) { >>>>>> + i++; >>>>>> + sup_dev = of_find_device_by_node(sup_args.np); >>>>>> + if (!sup_dev) >>>>>> + continue; >>>>>> + if (device_link_add(dev, &sup_dev->dev, dl_flags)) >>>>>> + links++; >>>>>> + put_device(&sup_dev->dev); >>>>>> + } >>>>>> + if (links < i) >>>>>> + return -ENODEV; >>>>>> + return 0; >>>>>> +} >>>>>> + >>>>>> +/* >>>>>> + * List of bindings and their cell names (use NULL if no cell names) from which >>>>>> + * device links need to be created. >>>>>> + */ >>>>>> +static char *link_bindings[] = { >>>>> >>>>> const >>>> >>>> Ack >>>> >>>>> >>>>>> +#ifdef CONFIG_OF_DEVLINKS >>>>>> + "clocks", "#clock-cells", >>>>>> + "interconnects", "#interconnect-cells", >>>>> >>>>> Planning to add others? >>>> >>>> Not in this patch. >>>> >>>> Regulators are the other big missing piece that I'm aware of now but >>>> they need a lot of discussion (see email from David and my reply). >>>> >>>> Not sure what other resources are shared where they can be "turned >>>> off" and cause devices set up at boot to fail. For example, I don't >>>> think interrupts need functional dependency tracking because they >>>> aren't really turned off by consumer 1 in a way that breaks things for >>>> consumer 2. Just masked and the consumer 2 can unmask and use it once >>>> it probes. >>>> >>>> I'm only intimately familiar with clocks, interconnects and regulators >>>> (to some extent). I'm open to adding other supplier categories in >>>> future patches as I educate myself of those or if other people want to >>>> add support for more categories. >>>> >>>> -Saravana >>>> >>>>>> +#endif >>>>>> +}; >>>>>> + >>>>>> +static int of_link_to_suppliers(struct device *dev) >>>>>> +{ >>>>>> + unsigned int i = 0; >>>>>> + bool done = true; >>>>>> + >>>>>> + if (unlikely(!dev->of_node)) >>>>>> + return 0; >>>>>> + >>>>>> + for (i = 0; i < ARRAY_SIZE(link_bindings) / 2; i++) >>>>>> + if (of_link_binding(dev, link_bindings[i * 2], >>>>>> + link_bindings[i * 2 + 1])) >>>>>> + done = false; >>>>>> + >>>>>> + if (!done) >>>>>> + return -ENODEV; >>>>>> + return 0; >>>>>> +} >>>>>> + >>>>>> /* >>>>>> * The following routines scan a subtree and registers a device for >>>>>> * each applicable node. >>>>>> @@ -524,6 +575,7 @@ static int __init of_platform_default_populate_init(void) >>>>>> if (!of_have_populated_dt()) >>>>>> return -ENODEV; >>>>>> >>>>>> + platform_bus_type.add_links = of_link_to_suppliers; >>>>>> /* >>>>>> * Handle certain compatibles explicitly, since we don't want to create >>>>>> * platform_devices for every node in /reserved-memory with a >>>>>> -- >>>>>> 2.22.0.410.gd8fdbe21b5-goog >>>>>> >>>> >>> >>> >> >