On Fri, Oct 22, 2021 at 10:00 AM Amit Pundir <amit.pundir@xxxxxxxxxx> wrote:
>
> On Fri, 22 Oct 2021 at 05:13, Saravana Kannan <saravanak@xxxxxxxxxx> wrote:
> >
> > On Thu, Oct 21, 2021 at 4:21 AM Amit Pundir <amit.pundir@xxxxxxxxxx> wrote:
> > >
> > > Hi Saravana,
> > >
> > > This patch broke v5.15-rc6 on RB5 (sm8250 | qcom/qrb5165-rb5.dts).
> > > I can't boot past this point https://www.irccloud.com/pastebin/raw/Nv6ZwHmW.
> >
> > Amit top posting? How did that happen? :)
> >
> > The fact you are seeing this issue is super strange though. The driver
> > literally does nothing other than allowing some sync_state() callbacks
> > to happen. I also grepped for the occurence of "simple-bus" in
> > arch/arm64/boot/dts/qcom/ and the only instance for 8250 is for the
> > soc node.
> >
> > The only thing I can think of is that without my patch some
> > sync_state() callbacks weren't getting called and maybe it was masking
> > some other issue.
> >
> > Can you try to boot with this log (see log patch below) and see if the
> > device hangs right after a sync_state() callback? Also, looking at the
> > different sync_state() implementations in upstream, I'm guessing one
> > of the devices isn't voting for interconnect bandwidth when it should
> > have.
> >
> > Another thing you could do is boot without the simple-bus changes and
> > then look for all instances of "state_synced" in /sys/devices and then
> > see if any of them has the value "0" after boot up is complete.
>
> Turned out RB5 is not even reaching up to
> device_links_flush_sync_list() and seem to be stuck somewhere in
> device_links_driver_bound(). So I added more print logs to narrow down
> to any specific lock state but those additional prints seem to have
> added enough delay to unblock that particular driver (Serial:
> 8250/16550 driver if I understood the logs correctly) and I eventually
> booted to UI.

Ugh... I think I know what's going on. It popped into my head over the weekend.

Couple of ways to confirm my theory:

1. After it finishes booting in both cases, can you compare the output
of the command below? I'm expecting to see a significant drop in the
number of device links.

ls -l /sys/class/devlink | wc -l

2. Can you try out this terrible hack patch (not final fix, no code
reviews please) on top of Tot to see if it fixes your issue without
having to add hacky logs?

Thanks,
Saravana

--- a/drivers/bus/simple-pm-bus.c
+++ b/drivers/bus/simple-pm-bus.c
@@ -38,10 +38,12 @@ static int simple_pm_bus_probe(struct platform_device *pdev)
 	 * a device that has a more specific driver.
 	 */
 	if (match && match->data) {
-		if (of_property_match_string(np, "compatible", match->compatible) == 0)
+		if (of_property_match_string(np, "compatible", match->compatible) == 0) {
+			of_platform_populate(np, NULL, lookup, &pdev->dev);
 			return 0;
-		else
+		} else {
 			return -ENODEV;
+		}
 	}
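
For check 1 above, a minimal sketch of how the two boots could be compared
beyond a raw count. The helper names (count_devlinks, list_devlinks) and the
output file names are made up for illustration and are not part of the
original mail. Plain "ls -1" is used instead of "ls -l" so the "total" line
does not inflate the count, and the saved listings let you see which links
went missing rather than only how many.

```shell
#!/bin/sh
# Hypothetical helpers for comparing device links between two boots.
# The directory argument defaults to the devlink class directory from the mail.

# count_devlinks [DIR]: print the number of entries in DIR.
count_devlinks() {
    dir="${1:-/sys/class/devlink}"
    # One entry per line; tr strips the padding some wc implementations add.
    ls -1 "$dir" 2>/dev/null | wc -l | tr -d ' '
}

# list_devlinks [DIR]: print the entry names in DIR, sorted, one per line,
# so that two saved listings can be compared with comm or diff.
list_devlinks() {
    dir="${1:-/sys/class/devlink}"
    ls -1 "$dir" 2>/dev/null | sort
}
```

One way this might be used: run "list_devlinks > without-patch.txt" on the
boot without the simple-bus change and "list_devlinks > with-patch.txt" on the
boot with it, copy both files to one machine, then
"comm -23 without-patch.txt with-patch.txt" prints the links that exist only
in the first boot.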