Hi Neil,

On Mon, 21 Oct 2019 at 19:55, Neil Armstrong <narmstrong@xxxxxxxxxxxx> wrote:
>
> Hi Anand,
>
> On 21/10/2019 16:11, Anand Moon wrote:
> > Hi Martin,
> >
> > On Fri, 18 Oct 2019 at 23:40, Martin Blumenstingl
> > <martin.blumenstingl@xxxxxxxxxxxxxx> wrote:
> >>
> >> Hi Anand,
> >>
> >> On Fri, Oct 18, 2019 at 4:04 PM Anand Moon <linux.amoon@xxxxxxxxx> wrote:
> >> [...]
> >>>> Next step is to try to narrow down the clock causing the issue.
> >>>> Remove clk_ignore_unused from the command line and add CLK_IGNORE_UNUSED
> >>>> to the flag of some clocks your clock controller (g12a I think) until
> >>>>
> >>>> The peripheral clock gates already have this flag (something we should
> >>>> fix someday) so don't bother looking there.
> >>>>
> >>>> Most likely the source of the pwm is getting disabled between the
> >>>> late_init call and the probe of the PWM module. Since the pwm is already
> >>>> active (w/o a driver), gating the clock source shuts down the power to
> >>>> the cores.
> >>>>
> >>>> Looking at the possible inputs in pwm driver, I'd bet on fdiv4.
> >>>>
> >>>
> >>> I had given the above steps a try but with little success.
> >>> I am still looking into this more closely.
> >> it's not clear to me if you have only tested with the PWM and/or
> >> FCLK_DIV4 clocks. can you please describe what you have tested so far?
> >>
> > Sorry for the delayed response.
> >
> > I had just looked into the clks related to SD_EMMC_A/B/C,
> > adding CLK_IGNORE/CRITICAL.
> > Also looked into clk_summary for eMMC and microSD card,
> > to identify the root cause, but I failed to move ahead.
> >
> >> for reference - my way of debugging this in the past was:
> >> 1. add some printks to clk_disable_unused_subtree (right after the
> >> clk_core_is_enabled check) to see which clocks are being disabled
> >> 2. add CLK_IGNORE_UNUSED or CLK_IS_CRITICAL to the clocks which are
> >> being disabled based on the information from step #1
> >> 3. (at some point I had a working kernel with lots of clocks with
> >> CLK_IGNORE_UNUSED/CLK_IS_CRITICAL)
> >> 4. start dropping the CLK_IGNORE_UNUSED/CLK_IS_CRITICAL flags again
> >> until you have traced it down to the clocks that are the actual issue
> >> (so far I always had only one clock which caused issues, but it may be
> >> multiple)
> >> 5. investigate (and/or ask on the mailing list, Amlogic developers are
> >> reading the mails here as well) for the few clocks from step #4
> >>
> >
> > Thanks for your valuable suggestion. I have your patch to debug this
> > [0] https://patchwork.kernel.org/patch/9725921/mbox/
> >
> > So from the first step I could identify that all the clk were getting closed
> > after some core cpu clk was failing. Here is the log.
> >
> > step1: [1] https://pastebin.com/p13F9HGG
> >
> > so I marked these clk as CLK_IGNORE_UNUSED and finally
> > I made it to boot using microSD card.
> >
> > After this I just converted these CLK to CLK_IS_CRITICAL
> > as mostly these are used for the CPU clk for now.
> > Here is the successful boot log as of now.
> >
> > Finally: [2] https://pastebin.com/qB6pMyGQ
> >
> > I know the clk maintainers are against marking flags as *CLK_IS_CRITICAL*
> > But this is just the step to move ahead.
>
> Thanks for the extensive debug.
>
> >
> > Attached is my local clk and dts patch, just for testing.
> > [3] clk_critical.patch
> >
>
> Could you test with only the following changes:
> diff --git a/drivers/clk/meson/g12a.c b/drivers/clk/meson/g12a.c
> index ea4c791f106d..f49f5463363e 100644
> --- a/drivers/clk/meson/g12a.c
> +++ b/drivers/clk/meson/g12a.c
> @@ -298,6 +298,7 @@ static struct clk_regmap g12a_fclk_div2 = {
>  			&g12a_fclk_div2_div.hw
>  		},
>  		.num_parents = 1,
> +		.flags = CLK_IS_CRITICAL,
>  	},
>  };
>
> @@ -672,7 +673,7 @@ static struct clk_regmap g12b_cpub_clk = {
>  			&g12a_sys_pll.hw
>  		},
>  		.num_parents = 2,
> -		.flags = CLK_SET_RATE_PARENT,
> +		.flags = CLK_SET_RATE_PARENT | CLK_IS_CRITICAL,
>  	},
>  };
>

Yes, these changes work at my end. I want to narrow down my changes,
and this looks pretty good.

Best Regards
-Anand
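
P.S. For anyone following the thread, here is a rough user-space sketch of why the two flags discussed above behave differently when unused clocks are swept at late init. It is not kernel code: struct model_clk, model_register() and model_disable_unused() are hypothetical names, the clock tree is flattened into an array, and the flag bit values are only illustrative. The assumption modelled is that a CLK_IS_CRITICAL clock is enabled at registration (which also keeps its parents on), while CLK_IGNORE_UNUSED only makes the sweep skip that one clock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Illustrative flag bits for this model only. */
#define CLK_IGNORE_UNUSED (1UL << 3)
#define CLK_IS_CRITICAL   (1UL << 11)

/* Hypothetical, simplified stand-in for a clock node. */
struct model_clk {
	const char *name;
	int parent;                /* index of the parent, -1 for a root */
	unsigned long flags;
	unsigned int enable_count; /* software users of this clock */
	int hw_enabled;            /* gate state left by the bootloader */
};

/* Critical clocks take an enable reference on themselves and all ancestors. */
static void model_register(struct model_clk *clks, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!(clks[i].flags & CLK_IS_CRITICAL))
			continue;
		for (int c = (int)i; c >= 0; c = clks[c].parent) {
			assert((size_t)c < n);
			clks[c].enable_count++;
			clks[c].hw_enabled = 1;
		}
	}
}

/* Gate every clock that is on in hardware but has no user; return the count. */
static size_t model_disable_unused(struct model_clk *clks, size_t n)
{
	size_t gated = 0;

	for (size_t i = 0; i < n; i++) {
		if (clks[i].enable_count)
			continue; /* in use, leave it running */
		if (clks[i].flags & CLK_IGNORE_UNUSED)
			continue; /* explicitly excluded from the sweep */
		if (clks[i].hw_enabled) {
			printf("disabling unused clock: %s\n", clks[i].name);
			clks[i].hw_enabled = 0;
			gated++;
		}
	}
	return gated;
}
```

In this model, an unflagged parent of a critical clock survives the sweep because of the reference taken at registration, whereas CLK_IGNORE_UNUSED protects only the clock that carries it.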