> -----Original Message----- > From: Lucas Stach [mailto:l.stach@xxxxxxxxxxxxxx] > Sent: 2021年1月6日 23:10 > To: Bough Chen <haibo.chen@xxxxxxx>; Fabio Estevam > <festevam@xxxxxxxxx>; Angus Ainslie <angus@xxxxxxxx>; Peng Fan > <peng.fan@xxxxxxx>; Abel Vesa <abel.vesa@xxxxxxx>; Stephen Boyd > <sboyd@xxxxxxxxxx>; Michael Turquette <mturquette@xxxxxxxxxxxx> > Cc: Ulf Hansson <ulf.hansson@xxxxxxxxxx>; Guido Günther <agx@xxxxxxxxxxx>; > linux-mmc <linux-mmc@xxxxxxxxxxxxxxx>; Adrian Hunter > <adrian.hunter@xxxxxxxxx>; dl-linux-imx <linux-imx@xxxxxxx>; Sascha Hauer > <kernel@xxxxxxxxxxxxxx>; moderated list:ARM/FREESCALE IMX / MXC ARM > ARCHITECTURE <linux-arm-kernel@xxxxxxxxxxxxxxxxxxx> > Subject: Re: sdhci timeout on imx8mq > > Hi Bough, > > Am Mittwoch, dem 06.01.2021 um 09:29 +0000 schrieb Bough Chen: > > > -----Original Message----- > > > From: Lucas Stach [mailto:l.stach@xxxxxxxxxxxxxx] > > > Sent: 2021年1月5日 23:07 > > > To: Bough Chen <haibo.chen@xxxxxxx>; Fabio Estevam > > > <festevam@xxxxxxxxx>; Angus Ainslie <angus@xxxxxxxx>; Leonard > > > Crestez <leonard.crestez@xxxxxxx>; Peng Fan <peng.fan@xxxxxxx>; Abel > > > Vesa <abel.vesa@xxxxxxx>; Stephen Boyd <sboyd@xxxxxxxxxx>; Michael > > > Turquette <mturquette@xxxxxxxxxxxx> > > > Cc: Ulf Hansson <ulf.hansson@xxxxxxxxxx>; Guido Günther < > > > agx@xxxxxxxxxxx>; linux-mmc <linux-mmc@xxxxxxxxxxxxxxx>; Adrian > > > Hunter <adrian.hunter@xxxxxxxxx>; dl-linux-imx <linux-imx@xxxxxxx>; > > > Sascha Hauer <kernel@xxxxxxxxxxxxxx>; moderated list:ARM/FREESCALE > > > IMX / MXC ARM ARCHITECTURE <linux-arm-kernel@xxxxxxxxxxxxxxxxxxx> > > > Subject: Re: sdhci timeout on imx8mq > > > > > > Hi all, > > > > > > Am Mittwoch, dem 08.07.2020 um 01:32 +0000 schrieb BOUGH CHEN: > > > > > -----Original Message----- > > > > > From: Fabio Estevam [mailto:festevam@xxxxxxxxx] > > > > > Sent: 2020年7月7日 20:45 > > > > > To: Angus Ainslie <angus@xxxxxxxx> > > > > > Cc: BOUGH CHEN <haibo.chen@xxxxxxx>; Ulf Hansson > > > > > <ulf.hansson@xxxxxxxxxx>; Guido Günther <agx@xxxxxxxxxxx>; > > > > > linux- > > > > > mmc <linux-mmc@xxxxxxxxxxxxxxx>; Adrian Hunter > > > > > <adrian.hunter@xxxxxxxxx>; dl-linux-imx <linux-imx@xxxxxxx>; > > > > > Sascha Hauer < kernel@xxxxxxxxxxxxxx>; moderated > > > > > list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE > > > > > <linux-arm-kernel@xxxxxxxxxxxxxxxxxxx> > > > > > Subject: Re: sdhci timeout on imx8mq > > > > > > > > > > Hi Angus, > > > > > > > > > > On Tue, Jun 30, 2020 at 4:39 PM Angus Ainslie <angus@xxxxxxxx> > > > > > wrote: > > > > > > > > > > > Has there been any progress with this. I'm getting this on > > > > > > about 50% of > > > > > > > > > > Not from my side, sorry. > > > > > > > > > > Bough, > > > > > > > > > > Do you know why this problem affects the imx8mq-evk versions > > > > > that are populated with the Micron eMMC and not the ones with > > > > > Sandisk eMMC? > > > > > > > > Hi Angus, > > > > > > > > Can you show me the full fail log? I do not meet this issue on my > > > > side, besides, which kind of uboot do you use? > > > > > > I was finally able to bisect this issue, which wasn't that much fun > > > due to the issue not being reproducible 100%. :/ Turns out that the > > > issue is even more interesting than I thought and likely doesn't > > > have anything to do with SDHCI or used bootloader versions. Here's > > > my current debugging state: > > > > > > I've bisected the issue down to b04383b6a558 (clk: imx8mq: Define > > > gates for > > > pll1/2 fixed dividers). The change itself looks fine to me, still > > > CC'ed Leonard for good measure. > > > > > > In my testing the following partial revert fixes the issue: > > > > > > --- a/drivers/clk/imx/clk-imx8mq.c > > > +++ b/drivers/clk/imx/clk-imx8mq.c > > > @@ -365,7 +365,7 @@ static int imx8mq_clocks_probe(struct > > > platform_device *pdev) > > > hws[IMX8MQ_SYS1_PLL_133M_CG] = > > > imx_clk_hw_gate("sys1_pll_133m_cg", "sys1_pll_out", base + 0x30, > > > 15); > > > hws[IMX8MQ_SYS1_PLL_160M_CG] = > > > imx_clk_hw_gate("sys1_pll_160m_cg", "sys1_pll_out", base + 0x30, > > > 17); > > > hws[IMX8MQ_SYS1_PLL_200M_CG] = > > > imx_clk_hw_gate("sys1_pll_200m_cg", "sys1_pll_out", base + 0x30, 19); > > > - hws[IMX8MQ_SYS1_PLL_266M_CG] = > > > imx_clk_hw_gate("sys1_pll_266m_cg", "sys1_pll_out", base + 0x30, > > > 21); > > > hws[IMX8MQ_SYS1_PLL_400M_CG] = > > > imx_clk_hw_gate("sys1_pll_400m_cg", "sys1_pll_out", base + 0x30, > > > 23); > > > hws[IMX8MQ_SYS1_PLL_800M_CG] = > > > imx_clk_hw_gate("sys1_pll_800m_cg", "sys1_pll_out", base + 0x30, > > > 25); > > > > > > @@ -375,7 +375,7 @@ static int imx8mq_clocks_probe(struct > > > platform_device *pdev) > > > hws[IMX8MQ_SYS1_PLL_133M] = > > > imx_clk_hw_fixed_factor("sys1_pll_133m", "sys1_pll_133m_cg", 1, 6); > > > hws[IMX8MQ_SYS1_PLL_160M] = > > > imx_clk_hw_fixed_factor("sys1_pll_160m", "sys1_pll_160m_cg", 1, 5); > > > hws[IMX8MQ_SYS1_PLL_200M] = > > > imx_clk_hw_fixed_factor("sys1_pll_200m", "sys1_pll_200m_cg", 1, 4); > > > - hws[IMX8MQ_SYS1_PLL_266M] = > > > imx_clk_hw_fixed_factor("sys1_pll_266m", "sys1_pll_266m_cg", 1, 3); > > > + hws[IMX8MQ_SYS1_PLL_266M] = > > > + imx_clk_hw_fixed_factor("sys1_pll_266m", "sys1_pll_out", 1, 3); > > > hws[IMX8MQ_SYS1_PLL_400M] = > > > imx_clk_hw_fixed_factor("sys1_pll_400m", "sys1_pll_400m_cg", 1, 2); > > > hws[IMX8MQ_SYS1_PLL_800M] = > > > imx_clk_hw_fixed_factor("sys1_pll_800m", "sys1_pll_800m_cg", 1, 1); > > > > > > The sys1_pll_266m is the parent of nand_usdhc_bus. I've validated > > > that the SDHCI driver properly enables this bus clock across the > > > problematic card access. > > > So what I think is happening here is that both nand_usdhc_bus and > > > sys1_pll_266m are initially enabled. Sometime during boot > > > sys1_pll_266m gets disabled due to runtime PM on the enet_axi clock, > > > which is a direct child of sys1_pll_266m. At this point > > > nand_usdhc_bus is still enabled, but no consumer has claimed the > > > clock yet, so the parent clock gets disabled while this branch of > > > the clock tree is still active. > > > > Hi Lucas, > > > > According to the clock tree, if nand_usdhc_bus is still enabled, then > > sys1_pll_266m has no chance to disable. > > This statement is only correct after the SDHCI driver is probed an has enabled > nand_usdhc_bus. Before the driver probes the refcounts on the clocks are not > synchronized, so sys1_pll_266m_cg can be disabled, while nand_usdhc_bus is > enabled (from software running before Linux), even though no consumer is > using nand_usdhc_bus, yet. Yes, agree. For current case, uboot gate on the sys1_pll_266m, then boot the Linux. In Linux, after clock driver probe, due the the support of sys1_pll_266m_cg, this sys1_pll_266m is gate off by clock driver due to no default consumer. > > > sys1_pll_266m_cg 1 1 0 > 800000000 0 0 50000 Y > > sys1_pll_266m 1 1 0 > 266666666 0 0 50000 Y > > nand_usdhc_bus 0 0 0 > 266666666 0 0 50000 N > > nand_usdhc_rawnand_clk 0 0 0 > 266666666 0 0 50000 N > > enet_axi 1 1 0 > 266666666 0 0 50000 Y > > enet1_root_clk 2 2 0 > 266666666 0 0 50000 Y > > > > > > This issue seems related with the following errta: > > > > e11232: USDHC: uSDHC setting requirement for IPG_CLK and AHB_BUS > > clocks > > Description: uSDHC AHB_BUS and IPG_CLK clocks must be synchronized. > > Due to current physical design implementation, AHB_BUS and IPG_CLK > > must come from same clock source to maintain clock sync. > > Workaround: Set AHB_BUS and IPG_CLK to clock source from PLL1. > > > > After sys1_pll_266m gate off/on, seems need to sync the USDHC AHB bus > > and USDHC IPG_clk again. (Here usdhc AHB BUS source from > > nand_usdhc_bus.) > > This sync is handle by hardware, and maybe need some time, during this > > sync period, usdhc operation may has issue. > > Where in HW is this synchronization done? If it's at the uSDHC controller side, I > would expect this issue to show up even with the commit reverted, as > nand_usdhc_bus gets gated due to runtime PM from the controller side. The > only difference with the commit in question is that now the clock branch can be > gated _before_ nand_usdhc_bus. If the synchronization is done somewhere in > the clock tree than this might be an issue. > Not in uSDHC side. This synchronization should be done somewhere in clock tree(hardware side). > > > > I just double check our local v5.10 branch, already revert the commit > > b04383b6a558 (clk: imx8mq: Define gates for pll1/2 fixed dividers). > > So to fix this issue, one method is revert this patch, another method > > is keep the 'nand_usdhc_bus' always on. Add change like this: > > > > diff --git a/drivers/clk/imx/clk-imx8mq.c > > b/drivers/clk/imx/clk-imx8mq.c index 779ea69e639c..939806b36916 100644 > > --- a/drivers/clk/imx/clk-imx8mq.c > > +++ b/drivers/clk/imx/clk-imx8mq.c > > @@ -433,7 +433,7 @@ static int imx8mq_clocks_probe(struct > > platform_device *pdev) > > /* BUS */ > > hws[IMX8MQ_CLK_MAIN_AXI] = > > imx8m_clk_hw_composite_bus_critical("main_axi", imx8mq_main_axi_sels, > > base + 0x8800); > > hws[IMX8MQ_CLK_ENET_AXI] = > imx8m_clk_hw_composite_bus("enet_axi", imx8mq_enet_axi_sels, base + > 0x8880); > > - hws[IMX8MQ_CLK_NAND_USDHC_BUS] = > imx8m_clk_hw_composite_bus("nand_usdhc_bus", imx8mq_nand_usdhc_sels, > base + 0x8900); > > + hws[IMX8MQ_CLK_NAND_USDHC_BUS] = > > + imx8m_clk_hw_composite_bus_critical("nand_usdhc_bus", > > + imx8mq_nand_usdhc_sels, base + 0x8900); > > hws[IMX8MQ_CLK_VPU_BUS] = > > imx8m_clk_hw_composite_bus("vpu_bus", imx8mq_vpu_bus_sels, base + > > 0x8980); > > hws[IMX8MQ_CLK_DISP_AXI] = > > imx8m_clk_hw_composite_bus("disp_axi", imx8mq_disp_axi_sels, base + > > 0x8a00); > > hws[IMX8MQ_CLK_DISP_APB] = > > imx8m_clk_hw_composite_bus("disp_apb", imx8mq_disp_apb_sels, base + > > 0x8a80); > > > > > > What you think? Or any other suggestion? > > This is suboptimal, as it will not allow to gate the uSDHC controller AHB clock in > runtime suspend. Also my testing shows that it's the gate _before_ the > nand_usdhc_bus slice that's causing the issue. So my minimal fix from the > previous mail would still be better, as it allows to gate the nand_usdhc_bus > clock, while keeping sys1_pll_266m enabled. Whether to choose your minimal fix or revert the commit, let's involve clock team member, Abel/Jacky, any comment? Our local tree just revert this commit, I think there are some other reason, Jacky, could you help clarify that? Best Regards Haibo Chen > > Regards, > Lucas > >