Hi all,

> Hi Jacky,
>
> On Thursday, 07.01.2021 at 01:30 +0000, Jacky Bai wrote:
> > > -----Original Message-----
> > > From: Fabio Estevam [mailto:festevam@xxxxxxxxx]
> > > Sent: Thursday, January 7, 2021 2:57 AM
> > > To: Lucas Stach <l.stach@xxxxxxxxxxxxxx>
> > > Cc: Bough Chen <haibo.chen@xxxxxxx>; Angus Ainslie <angus@xxxxxxxx>;
> > > Leonard Crestez <leonard.crestez@xxxxxxx>; Peng Fan <peng.fan@xxxxxxx>;
> > > Abel Vesa <abel.vesa@xxxxxxx>; Stephen Boyd <sboyd@xxxxxxxxxx>;
> > > Michael Turquette <mturquette@xxxxxxxxxxxx>; Ulf Hansson
> > > <ulf.hansson@xxxxxxxxxx>; Guido Günther <agx@xxxxxxxxxxx>;
> > > linux-mmc <linux-mmc@xxxxxxxxxxxxxxx>; Adrian Hunter
> > > <adrian.hunter@xxxxxxxxx>; dl-linux-imx <linux-imx@xxxxxxx>;
> > > Sascha Hauer <kernel@xxxxxxxxxxxxxx>; moderated list:ARM/FREESCALE
> > > IMX / MXC ARM ARCHITECTURE <linux-arm-kernel@xxxxxxxxxxxxxxxxxxx>
> > > Subject: Re: sdhci timeout on imx8mq
> > >
> > > Hi Lucas,
> > >
> > > On Tue, Jan 5, 2021 at 12:06 PM Lucas Stach <l.stach@xxxxxxxxxxxxxx>
> > > wrote:
> > >
> > > > The reference manual states about this situation: "For any clock,
> > > > its source must be left on when it is kept on. Behavior is
> > > > undefined if this rule is violated."
> > > > And it seems this is exactly what's happening here: some kind of
> > > > glitch is introduced in the nand_usdhc_bus clock, which prevents
> > > > the SDHCI controller from working, even though the clock branch is
> > > > properly enabled later on. On my system the SDHCI timeout and
> > > > following runtime suspend/resume cycle on the nand_usdhc_bus
> > > > clock seem to get it back into a working state.
> > >
> > > I think your analysis is correct and I recall helping a customer
> > > with a similar issue:
> > > https://community.nxp.com/t5/i-MX-Processors/External-clock-that-provide-root-clock-for-SAI3-and-SPDIF/m-p/1019834
> > >
> >
> > For the customer case, it seems not to be the same issue. The customer
> > issue is caused by a clock source change while the parent has no clock
> > output. This is an inherent limitation of the CCM clock slice when
> > using the smart interface to change the clock parent.
> >
> > For the current mmc timeout issue, I think we can try gating the
> > nand_usdhc_bus clock at the beginning of kernel boot, by directly
> > modifying the gate bit in the nand_usdhc_bus clock's HW register in
> > clock-imx8mq.c.
>
> While this might be an option to fix this specific issue, I would hope
> we can come up with something more generic, as the current clock
> framework behavior allows violating the system specification
> constraint that parent clocks must not be disabled when any of the
> children are active. This seems like a fundamental issue and might hurt
> us also with other clocks than this specific nand_usdhc_bus clock.

I am not sure if an error in the fec driver has the same or a similar
cause as this, but I noticed that the SoC hangs when accessing the
timecounter register while the FEC is down. CCGR10 seems to gate
ENET_REF_CLK_ROOT and ENET_TIMER_CLK_ROOT. The clocks are disabled as
soon as the interface is down.
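
To make the constraint from the discussion above concrete (a parent clock
must not be gated while a child is still running), here is a minimal
sketch. It is only a simplified model, not kernel code: struct clk_node,
has_running_child() and gate_if_unused() are hypothetical names that do
not exist in the common clock framework, and the "hardware state" is just
a bool, assumed to reflect what the bootloader left behind.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a clock node; NOT the kernel's struct clk_core. */
struct clk_node {
	const char *name;
	struct clk_node *parent;
	bool hw_enabled;       /* gate state as left by bootloader/hardware */
	unsigned int sw_refs;  /* enable references known to software */
};

/* True if any node whose parent is 'clk' is still running in hardware. */
static bool has_running_child(const struct clk_node *clk,
			      const struct clk_node *nodes, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (nodes[i].parent == clk && nodes[i].hw_enabled)
			return true;
	return false;
}

/*
 * Gate 'clk' only if software holds no reference to it AND no child is
 * still running in hardware. Returns true if the clock was gated.
 */
static bool gate_if_unused(struct clk_node *clk,
			   const struct clk_node *nodes, size_t n)
{
	assert(clk != NULL);

	if (clk->sw_refs > 0 || !clk->hw_enabled)
		return false;
	if (has_running_child(clk, nodes, n))
		return false;   /* gating now would violate the rule */

	clk->hw_enabled = false;
	return true;
}
```

With a parent and a child that are both running and unreferenced, calling
gate_if_unused() on the parent first is refused; it only succeeds after
the child has been gated. Whether a check like this belongs in the
framework or in the SoC clock driver is left open here.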
[1] https://lore.kernel.org/lkml/20210220065654.25598-1-heiko.thiery@xxxxxxxxx/

> Can you tell us if there were other issues found with the PLL1/2 gating
> patch? The fact that, according to Bough, it's reverted in your tree
> seems to suggest this.
>
> Regards,
> Lucas

Thank you
--
Heiko