2018-03-01 17:44 GMT+01:00 David Lechner <david@xxxxxxxxxxxxxx>: > On 03/01/2018 02:36 AM, Bartosz Golaszewski wrote: >> >> 2018-02-28 22:40 GMT+01:00 David Lechner <david@xxxxxxxxxxxxxx>: >>> >>> On 02/28/2018 06:38 AM, Bartosz Golaszewski wrote: >>>> >>>> >>>> >>>> I think I found the reason for the strange crashes we were >>>> experiencing (emac core->name being NULL) thanks to Sekhar who pointed >>>> me in the right direction. >>>> >>>> The mdio driver fails to probe with v7 due to the supplied clock rate >>>> being wrong. Before failing we register the emac clock with >>>> pm_clk_add_clk(). When clock_ops puts the clock, it decreases the >>>> reference count of the clock, but we never actually increased it in >>>> the first place in the line above. The core clock code then destroys >>>> the associated clk_core structure. When the next user comes around (in >>>> our case the clk debug functions) the system crashes. >>>> >>>> I believe there to be two issues: one is with v7 - we need to increase >>>> the clock reference count in davinci_psc_genpd_attach_dev(). >>>> >>>> Second is the error path in the clock framework - we should remove the >>>> destroyed clk_core from the debug list, which is not being done now. >>>> >>>> Why we even need to track the refcount of clk_core is a mistery for me >>>> though. Stephen, Mike? >>>> >>>> Best regards, >>>> Bartosz Golaszewski >>> >>> >>> >>> Great find. I figured it had to be something like this, but I wasn't >>> able to reproduce the problem yet. >>> >>> I suppose it is time to spin up a v8 with some fixes. >> >> >> I still don't know why the mdio clock rate is much lower than in >> mainline though. Any ideas? >> >> Thanks, >> Bart >> > > Now that you have fixed the crash, can you answer the questions I have > asked earlier? > >> Can you post the output of this command so that I can see how your > > clocks are setup: > > cat /sys/kernel/debug/clk/clk_summary > >> Using your workaround, can you run: > > > cat /sys/kernel/debug/pm_genpd/pm_genpd_summary > > If you see: > 1e27000.clock-controller: emac off-0 > > then genpd is not working like it is supposed to. You should see something > like this for device that are working: > 1e27000.clock-controller: uart2 on > /devices/platform/soc@1c00000/1d0d000.serial active Hi David, Sekhar, I tried booting the board today over tftp but didn't succeed. I then switched to a normal boot from SD card and the boot process froze at the same moment (right after the DHCP config, or after rtc config if I disabled DHCP in bootargs). I then realized that the emac clock can't be the culprit. After some digging I found out that the late_initcall to clk_disable_unused() disables sysclk6 - the parent of the arm clock, which of course freezes the device. If I remove the call to clk_disable_unused(), I can boot just fine. The following other clocks are disabled before pll0_sysclk6: pll1_sysclk3 pll0_obsclk pll0_sysclk7 davinci_lpsc_clk_enable() is never called for these clocks - in fact it's not called for any parent that's not explicitly defined in psc-da850.c - I believe this may be one of the reasons. I will get back to debugging it tomorrow. Best regards, Bartosz Golaszewski -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html