> Hello,
>
> >> Some eMMC devices (e.g., BGSD4R and AIM20F) may enter an unresponsive
> >> state after encountering CRC errors during RPMB writes (CMD25). This
> >> prevents the device from switching back to the main partition via
> >> CMD6, blocking further I/O operations.
> >Different cards on the same platform?
> >Can you share which platform, and few lines from the log supporting your
> >analysis?
>
> I tested on R-Car Gen3/4 platforms, which use the same host controller IP and
> the tmio_mmc host driver.
> The tests were conducted on different board and eMMC combinations:
> - Gen3 Board with Samsung eMMC (BGSD4R) → Issue observed
> - Gen3 Board with Micron eMMC (AIM20F, new version) → Issue observed
> - Gen3 Board with Micron eMMC (AIM20F, old version) → No issue
> - Gen4 Board with Micron eMMC (G1M15L) → No issue
>
> The issue only occurs in the RPMB partition during write operations, where a
> CRC error is triggered.
> To investigate further, I hacked the host driver to generate a dummy CRC
> during the CMD25 data phase.
> The reproduced log is as follows:
> $ ./mmc rpmb read-counter /dev/mmcblk0rpmb
> [ 75.557848] w_t: -->START_CMD6 (arg: 3b30301)
> [ 75.557863] w_t: resp[0]=900
> [ 75.557875] w_t: -->START_CMD13 (arg: 10000)
> [ 75.557884] w_t: resp[0]=900
> [ 75.557894] w_t: -->START_CMD23 (arg: 1)
> [ 75.557903] w_t: resp[0]=900
> [ 75.557915] w_t: -->START_CMD25 (arg: 0)
> [ 75.557924] w_t: resp[0]=900
> [ 75.557931] !!!!!!!!!!!!!!!!, make a dummy write CRC on DAT
> [ 75.563631] w_t: (data_err) -84 stat=20820604 error=5800 (which means the eMMC device returned a negative CRC status)
> [ 75.563672] renesas_sdhi_internal_dmac ee140000.sd: __mmc_blk_ioctl_cmd: data error -84
> [ 75.573112] w_t: -->START_CMD6 (arg: 3b30001)
> [ 75.573132] w_t: (cmd_err -110) stat=20c00401 error=12000
> [ 75.573154] w_t: -->START_CMD6 (arg: 3b30001)
> [ 75.573169] w_t: (cmd_err -110) stat=20c00401 error=12000
> [ 75.573183] w_t: -->START_CMD6 (arg: 3b30001)
> [ 75.573197] w_t: (cmd_err -110) stat=20c00401 error=12000
> [ 75.573211] w_t: -->START_CMD6 (arg: 3b30001)
> [ 75.573225] w_t: (cmd_err -110) stat=20c00401 error=12000
> After this issue occurs, the eMMC device no longer responds to CMD6, even
> subsequent accesses to the main partition proceed abnormally.
> However, if we perform an eMMC card reset at this point, the retry of CMD6
> works as expected.
Thank you for sharing it.

>
> BTW,
> I now believe that sending CMD12 is a better solution in this case rather than
> performing a reset.
> According to information from the eMMC vendor, even in a closed-end write
> operation (CMD23 + CMD25), CMD12 is required if any communication error
> occurs.
> The JESD84 specification also mentions a similar requirement: "A stop
> command is not required at the end of this type of multiple block write unless
> terminated with an error."
> I just simply tested this approach on the affected board, and it can work
> successfully.
OK. Please note that some host controllers do that as auto-cmd.
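
As a reading aid for the log quoted above, here is a minimal user-space C
sketch (not kernel code) that decodes the CMD6 arguments and the R1 status
words. The helper and struct names are hypothetical. The field layout is the
one the kernel uses when building a SWITCH argument (access mode, EXT_CSD
index, value, command set), and the R1 layout assumes CURRENT_STATE in bits
12:9 and READY_FOR_DATA in bit 8.

```c
#include <stdint.h>

/* Fields packed into a CMD6 (SWITCH) argument. Names are illustrative. */
struct switch_arg {
	unsigned int access;   /* bits 25:24, 3 = write byte */
	unsigned int index;    /* bits 23:16, EXT_CSD byte index */
	unsigned int value;    /* bits 15:8, value written to that byte */
	unsigned int cmd_set;  /* bits 2:0, command set */
};

/* Split a raw CMD6 argument, as printed in the log, into its fields. */
static struct switch_arg decode_switch_arg(uint32_t arg)
{
	struct switch_arg s = {
		.access  = (arg >> 24) & 0x3,
		.index   = (arg >> 16) & 0xff,
		.value   = (arg >> 8) & 0xff,
		.cmd_set = arg & 0x7,
	};
	return s;
}

/* PARTITION_ACCESS is the low three bits of EXT_CSD[179] (PARTITION_CONFIG). */
static unsigned int partition_access(struct switch_arg s)
{
	return s.value & 0x7;
}

/* CURRENT_STATE field of an R1 response, 4 = tran. */
static unsigned int r1_current_state(uint32_t status)
{
	return (status >> 9) & 0xf;
}

/* READY_FOR_DATA bit of an R1 response. */
static int r1_ready_for_data(uint32_t status)
{
	return (status >> 8) & 0x1;
}
```

With the values from the log, 0x03b30301 decodes to a write-byte access to
EXT_CSD index 179 with value 3 (RPMB), and 0x03b30001 to the same byte with
value 0 (user area), i.e. the switch back to the main partition that times
out. A status of 0x900 decodes to state 4 (tran) with READY_FOR_DATA set.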
>
> >>
> >> The root cause is suspected to be a firmware/hardware issue in
> >> specific eMMC models. A workaround is to perform a hardware reset via
> >> mmc_hw_reset()
> >> when the partition switch fails, followed by a retry.
> >Same fw bug in 2 different products?
> >
> >Why do we need to fix it here?
> >The ioctl will eventually return an error, and reset is needed anyway.
> >If the eMMC is the primary storage, the platform is rebooting without being
> >aware what went wrong.
>
> In the main partition, a similar reset operation is already implemented in
> mmc_blk_issue_rw_rq(), So I believe applying the same approach for RPMB
> should be acceptable.
>         case MMC_BLK_ABORT:
>                 if (!mmc_blk_reset(md, card->host, type))
>                         break;
>                 mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
>                 mmc_blk_rw_try_restart(mq, new_req, mqrq_cur);
>                 return;
The code that you are citing no longer exists.
It was removed a while ago - see
https://lore.kernel.org/linux-block/1511962879-24262-23-git-send-email-adrian.hunter@xxxxxxxxx/

My point is that you are silently recovering from an ioctl error that the
sender should be made aware of, so that it can handle the recovery itself.

Thanks,
Avri

>
>
> Best Regards,
> Guan Wang