On Fri, Oct 2, 2020 at 5:14 PM Laurent Pinchart <laurent.pinchart@xxxxxxxxxxxxxxxx> wrote: > > Hi Saravana, > > On Fri, Oct 02, 2020 at 12:56:30PM -0700, Saravana Kannan wrote: > > On Fri, Oct 2, 2020 at 11:35 AM 'Grygorii Strashko' via kernel-team wrote: > > > On 02/10/2020 21:27, Laurent Pinchart wrote: > > > > On Fri, Oct 02, 2020 at 10:58:55AM -0700, Saravana Kannan wrote: > > > >> On Fri, Oct 2, 2020 at 10:55 AM Laurent Pinchart wrote: > > > >>> On Fri, Oct 02, 2020 at 10:51:51AM -0700, Saravana Kannan wrote: > > > >>>> On Fri, Oct 2, 2020 at 7:08 AM Rob Herring <robh+dt@xxxxxxxxxx> wrote: > > > >>>>> On Thu, Oct 1, 2020 at 5:59 PM Saravana Kannan <saravanak@xxxxxxxxxx> wrote: > > > >>>>>> > > > >>>>>> When commit 93d2e4322aa7 ("of: platform: Batch fwnode parsing when > > > >>>>>> adding all top level devices") optimized the fwnode parsing when all top > > > >>>>>> level devices are added, it missed out optimizing this for platform > > > >>>>>> where the top level devices are added through the init_machine() path. > > > >>>>>> > > > >>>>>> This commit does the optimization for all paths by simply moving the > > > >>>>>> fw_devlink_pause/resume() inside of_platform_default_populate(). > > > >>>>>> > > > >>>>>> Reported-by: Tomi Valkeinen <tomi.valkeinen@xxxxxx> > > > >>>>>> Signed-off-by: Saravana Kannan <saravanak@xxxxxxxxxx> > > > >>>>>> --- > > > >>>>>> drivers/of/platform.c | 19 +++++++++++++++---- > > > >>>>>> 1 file changed, 15 insertions(+), 4 deletions(-) > > > >>>>>> > > > >>>>>> diff --git a/drivers/of/platform.c b/drivers/of/platform.c > > > >>>>>> index 071f04da32c8..79972e49b539 100644 > > > >>>>>> --- a/drivers/of/platform.c > > > >>>>>> +++ b/drivers/of/platform.c > > > >>>>>> @@ -501,8 +501,21 @@ int of_platform_default_populate(struct device_node *root, > > > >>>>>> const struct of_dev_auxdata *lookup, > > > >>>>>> struct device *parent) > > > >>>>>> { > > > >>>>>> - return of_platform_populate(root, of_default_bus_match_table, lookup, > > > >>>>>> - parent); > > > >>>>>> + int ret; > > > >>>>>> + > > > >>>>>> + /* > > > >>>>>> + * fw_devlink_pause/resume() are only safe to be called around top > > > >>>>>> + * level device addition due to locking constraints. > > > >>>>>> + */ > > > >>>>>> + if (!root) > > > >>>>>> + fw_devlink_pause(); > > > >>>>>> + > > > >>>>>> + ret = of_platform_populate(root, of_default_bus_match_table, lookup, > > > >>>>>> + parent); > > > >>>>> > > > >>>>> of_platform_default_populate() vs. of_platform_populate() is just a > > > >>>>> different match table. I don't think the behavior should otherwise be > > > >>>>> different. > > > >>>>> > > > >>>>> There's also of_platform_probe() which has slightly different matching > > > >>>>> behavior. It should not behave differently either with respect to > > > >>>>> devlinks. > > > >>>> > > > >>>> So I'm trying to do this only when the top level devices are added for > > > >>>> the first time. of_platform_default_populate() seems to be the most > > > >>>> common path. For other cases, I think we just need to call > > > >>>> fw_devlink_pause/resume() wherever the top level devices are added for > > > >>>> the first time. As I said in the other email, we can't add > > > >>>> fw_devlink_pause/resume() by default to of_platform_populate(). > > > >>>> > > > >>>> Do you have other ideas for achieving "call fw_devlink_pause/resume() > > > >>>> only when top level devices are added for the first time"? > > > >>> > > > >>> I'm not an expert in this domain, but before investigating it, would you > > > >>> be able to share a hack patch that implements this (in the most simple > > > >>> way) to check if it actually fixes the delays I experience on my system > > > >>> ? > > > >> > > > >> So I take it the patch I sent out didn't work for you? Can you tell me > > > >> what machine/DT you are using? > > > > > > > > I've replied to the patch: > > > > > > > > Based on v5.9-rc5, before the patch: > > > > > > > > [ 0.652887] cpuidle: using governor menu > > > > [ 12.349476] No ATAGs? > > > > > > > > After the patch: > > > > > > > > [ 0.650460] cpuidle: using governor menu > > > > [ 12.262101] No ATAGs? > > > > > > > > I'm using an AM57xx EVM, whose DT is not upstream, but it's essentially > > > > a am57xx-beagle-x15-revb1.dts (it includes that DTS) with a few > > > > additional nodes for GPIO keys, LCD panel, backlight and touchscreen. > > > > > > > > > > hope you are receiving my mails as I've provided you with all required information already [1] > > > > Laurent/Grygorii, > > > > Looks like I'm definitely missing emails. Sorry about the confusion. > > > > I have some other urgent things on my plate right now. Is it okay if I > > get to this in a day or two? In the end, we'll find a solution that > > addresses most/all of the delay. > > No issue on my side. Hi Laurent, Sorry it took awhile for me to get back to this. Can you try throwing around fw_devlink_pause/resume() around the of_platform_populate() call in arch/arm/mach-omap2/pdata-quirks.c? Just trying to verify the cause/fix. If it fixes the issue, then considering Rob's comments [1], a good short term solution might be to have the suggestion above and some way to do pause/resume only when the top level devices are added. > By the way, during initial investigations, I've traced code paths to > figure out if there was a particular step that would consume a large > amount of time, and found out that of_platform_populate() ends up > executing devlink-related code that seems to have an O(n^3) complexity > on the number of devices, with a few dozens of milliseconds for each > iteration. That's a very bad complexity. As you said, the complexity of fw_devlink parsing can be O(N^2). There are other ways to improve it to make it O(N) but it has a bunch of additional complexity and memory increase. When I tried to do it that way the first time, I was question whether O(N^2) actually translated to measurable difference. Looks like we do now :) I have something in mind for how to do it with O(N) complexity, but I expect it to take a while to get in. So in the meantime, I'm thinking of using fw_devlink_pause/resume() as a short term optimization. -Saravana [1] - https://lore.kernel.org/linux-omap/CAL_Jsq+6mxtFei3+1ic4c5XCftJ8nZK6_S5_d15yEXQ02BTNKw@xxxxxxxxxxxxxx/