> -----Original Message-----
> From: Tim Harvey [mailto:tharvey@xxxxxxxxxxxxx]
> Sent: November 4, 2021 0:50
> To: Bough Chen <haibo.chen@xxxxxxx>
> Cc: Linux MMC List <linux-mmc@xxxxxxxxxxxxxxx>; Marcel Ziswiler
> <marcel@xxxxxxxxxxxx>; Fabio Estevam <festevam@xxxxxxxxx>; Schrempf
> Frieder <frieder.schrempf@xxxxxxxxxx>; Adam Ford <aford173@xxxxxxxxx>;
> Lucas Stach <l.stach@xxxxxxxxxxxxxx>; Peng Fan <peng.fan@xxxxxxx>; Frank
> Li <frank.li@xxxxxxx>; Adrian Hunter <adrian.hunter@xxxxxxxxx>; Shawn Guo
> <shawnguo@xxxxxxxxxx>; Ulf Hansson <ulf.hansson@xxxxxxxxxx>; Sascha
> Hauer <s.hauer@xxxxxxxxxxxxxx>; Pengutronix Kernel Team
> <kernel@xxxxxxxxxxxxxx>; dl-linux-imx <linux-imx@xxxxxxx>; Cale Collins
> <ccollins@xxxxxxxxxxxxx>
> Subject: Re: IMX8MM eMMC CQHCI timeout
>
> On Sun, Oct 31, 2021 at 6:57 PM Bough Chen <haibo.chen@xxxxxxx> wrote:
> >
> > > -----Original Message-----
> > > From: Tim Harvey [mailto:tharvey@xxxxxxxxxxxxx]
> > > Sent: October 30, 2021 4:47
> > > To: Linux MMC List <linux-mmc@xxxxxxxxxxxxxxx>; Marcel Ziswiler
> > > <marcel@xxxxxxxxxxxx>; Fabio Estevam <festevam@xxxxxxxxx>; Schrempf
> > > Frieder <frieder.schrempf@xxxxxxxxxx>; Adam Ford
> > > <aford173@xxxxxxxxx>; Bough Chen <haibo.chen@xxxxxxx>; Lucas Stach
> > > <l.stach@xxxxxxxxxxxxxx>; Peng Fan <peng.fan@xxxxxxx>; Frank Li
> > > <frank.li@xxxxxxx>
> > > Cc: Adrian Hunter <adrian.hunter@xxxxxxxxx>; Shawn Guo
> > > <shawnguo@xxxxxxxxxx>; Ulf Hansson <ulf.hansson@xxxxxxxxxx>; Sascha
> > > Hauer <s.hauer@xxxxxxxxxxxxxx>; Pengutronix Kernel Team
> > > <kernel@xxxxxxxxxxxxxx>; dl-linux-imx <linux-imx@xxxxxxx>; Cale
> > > Collins <ccollins@xxxxxxxxxxxxx>
> > > Subject: IMX8MM eMMC CQHCI timeout
> > >
> > > Greetings,
> > >
> > > I've encountered the following MMC CQHCI timeout message a couple of
> > > times now on IMX8MM boards with eMMC with a 5.10 based kernel:
> > >
> > > [ 224.356283] mmc2: cqhci: ============ CQHCI REGISTER DUMP ===========
> > > [ 224.362764] mmc2: cqhci: Caps:      0x0000310a | Version:  0x00000510
> > > [ 224.369250] mmc2: cqhci: Config:    0x00001001 | Control:  0x00000000
> > > [ 224.375726] mmc2: cqhci: Int stat:  0x00000000 | Int enab: 0x00000006
> > > [ 224.382197] mmc2: cqhci: Int sig:   0x00000006 | Int Coal: 0x00000000
> > > [ 224.388665] mmc2: cqhci: TDL base:  0x8003f000 | TDL up32: 0x00000000
> > > [ 224.395129] mmc2: cqhci: Doorbell:  0xbf01dfff | TCN:      0x00000000
> > > [ 224.401598] mmc2: cqhci: Dev queue: 0x00000000 | Dev Pend: 0x08000000
> > > [ 224.408064] mmc2: cqhci: Task clr:  0x00000000 | SSC1:     0x00011000
> > > [ 224.414532] mmc2: cqhci: SSC2:      0x00000001 | DCMD rsp: 0x00000800
> > > [ 224.420997] mmc2: cqhci: RED mask:  0xfdf9a080 | TERRI:    0x00000000
> > > [ 224.427467] mmc2: cqhci: Resp idx:  0x0000000d | Resp arg: 0x00000000
> > > [ 224.433934] mmc2: sdhci: ============ SDHCI REGISTER DUMP ===========
> > > [ 224.440404] mmc2: sdhci: Sys addr:  0x7c722000 | Version:  0x00000002
> > > [ 224.446877] mmc2: sdhci: Blk size:  0x00000200 | Blk cnt:  0x00000020
> > > [ 224.453346] mmc2: sdhci: Argument:  0x00018000 | Trn mode: 0x00000023
> > > [ 224.459811] mmc2: sdhci: Present:   0x01f88008 | Host ctl: 0x00000030
> > > [ 224.466281] mmc2: sdhci: Power:     0x00000002 | Blk gap:  0x00000080
> > > [ 224.472752] mmc2: sdhci: Wake-up:   0x00000008 | Clock:    0x0000000f
> > > [ 224.479225] mmc2: sdhci: Timeout:   0x0000008f | Int stat: 0x00000000
> > > [ 224.485690] mmc2: sdhci: Int enab:  0x107f4000 | Sig enab: 0x107f4000
> > > [ 224.492161] mmc2: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000502
> > > [ 224.498628] mmc2: sdhci: Caps:      0x07eb0000 | Caps_1:   0x8000b407
> > > [ 224.505097] mmc2: sdhci: Cmd:       0x00000d1a | Max curr: 0x00ffffff
> > > [ 224.511575] mmc2: sdhci: Resp[0]:   0x00000000 | Resp[1]:  0xffc003ff
> > > [ 224.518043] mmc2: sdhci: Resp[2]:   0x328f5903 | Resp[3]:  0x00d07f01
> > > [ 224.524512] mmc2: sdhci: Host ctl2: 0x00000088
> > > [ 224.528986] mmc2: sdhci: ADMA Err:  0x00000000 | ADMA Ptr: 0xfe179020
> > > [ 224.535451] mmc2: sdhci-esdhc-imx: ========= ESDHC IMX DEBUG STATUS DUMP ====
> > > [ 224.543052] mmc2: sdhci-esdhc-imx: cmd debug status:  0x2120
> > > [ 224.548740] mmc2: sdhci-esdhc-imx: data debug status:  0x2200
> > > [ 224.554510] mmc2: sdhci-esdhc-imx: trans debug status:  0x2300
> > > [ 224.560368] mmc2: sdhci-esdhc-imx: dma debug status:  0x2400
> > > [ 224.566054] mmc2: sdhci-esdhc-imx: adma debug status:  0x2510
> > > [ 224.571826] mmc2: sdhci-esdhc-imx: fifo debug status:  0x2680
> > > [ 224.577608] mmc2: sdhci-esdhc-imx: async fifo debug status:  0x2750
> > > [ 224.583900] mmc2: sdhci: ============================================
> > >
> > > I don't know how to make the issue occur; both times it occurred
> > > while simply reading a file in the rootfs ext4 fs on the eMMC.
> > >
> > > Some research shows:
> > > - https://community.nxp.com/t5/i-MX-Processors/The-issues-on-quot-mmc0-cqhci-timeout-for-tag-0-quot/m-p/993779
> > > - http://git.toradex.com/cgit/linux-toradex.git/commit/?h=toradex_5.4-2.3.x-imx&id=fd33531be843566c59a5fc655f204bbd36d7f3c6
> > >
> > > I'm not clear if this info is up-to-date. The NXP 5.4 kernel did not
> > > enable this feature, but if I'm not mistaken CQHCI support itself
> > > didn't land in mainline until a later kernel, so it would make sense
> > > that it was not enabled at that time. I do see the NXP 5.10 kernels
> > > have this enabled, so I'm curious if it is an issue there.
> > >
> > > Do any other IMX8MM or other SoC users know what this could be about,
> > > or what I could do for a test to try to reproduce it so I can see if
> > > it occurs in other kernel versions?
> >
> > Hi Tim,
> >
> > I have been debugging this issue these days, but unfortunately have
> > not found the root cause yet.
> > The register values of Doorbell, Dev Queue, and Dev Pend seem abnormal.
> > This issue happens on all i.MX SoCs that support the cmdq feature when
> > CPU load is high. I currently lack an mmc logic analyzer, which makes
> > this issue hard to debug, so I still need some time. Sorry about that.
> > If you want mmc to work stably, you can disable cmdq as a workaround.
> >
> > Best Regards
> > Haibo Chen
>
> Haibo,
>
> Thanks for the information. Do you know how to easily reproduce it
> reliably for testing?

Not yet. I can only hit this issue randomly, after a few hours of stress testing under high CPU load.
My next steps are:
1. Find a way to reproduce this issue easily.
2. Get an eMMC logic analyzer.

>
> I have tried the following on an eMMC filesystem:
> stress --cpu 32 --io 32 &
> dd if=/dev/zero of=foo bs=1M count=1000 & dd if=/dev/zero of=foo bs=1M count=1000 & rm foo
>
> I'm unable to reproduce the issue that way, and it has only happened
> randomly once or twice.
>
> Perhaps we should disable CMDQ for now until you can sort this out? I can
> submit a patch for that.

Yes, please.

Best Regards
Haibo Chen

>
> Best regards,
>
> Tim
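
As a reading aid for the Doorbell and Dev Pend values in the register dump quoted above, here is a minimal sketch that lists which bits are set in each. It assumes the usual CQHCI convention that bit n of these registers corresponds to task tag n. The helper name `set_bits` is hypothetical and not part of any kernel tool, and the sketch only decodes the numbers; it says nothing about the root cause.

```python
def set_bits(value: int, width: int = 32) -> list[int]:
    """Return the positions of the bits set in a register value, lowest first."""
    return [bit for bit in range(width) if value & (1 << bit)]


# Values copied from the CQHCI register dump quoted in this thread.
doorbell = 0xBF01DFFF
dev_queue = 0x00000000
dev_pend = 0x08000000

# Assumption: bit n maps to task tag n in each of these registers.
print("Doorbell tags: ", set_bits(doorbell))
print("Dev queue tags:", set_bits(dev_queue))
print("Dev Pend tags: ", set_bits(dev_pend))
```

Under that assumption, the dump shows many tags still outstanding in Doorbell while Dev Pend reports a single tag and Dev queue reports none, which may be the mismatch described as abnormal above.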