Huijin Park wrote on Thu 3/03/22 21:16: > Hi, sorry for the late reply. > I guess the problem occurs depending on the eMMC model. > Because I tested again and there was no problem. > The eMMC model used for the test are as follows. > (THGBMJG6C1LBAIL, KLM8G1GETF-B041) > Anyway I guess the cause is interval time. > I wrote a debug patch assuming that the reason was that some mmc models > could not process CMD1 delivered at short intervals. > I copied the polling function and adds interval minimum time parameter. > I set the minimum time to 1ms. You can adjust it. > Please test if there is no problem with the debug patch. > > >Hi, > > > >> Am 02.03.2022 um 09:20 schrieb Jean Rene Dawin <jdawin@xxxxxxxxxxxxxxxxxxxxx>: > >> > >> Ulf Hansson wrote on Tue 1/03/22 14:38: > >>> On Thu, 17 Feb 2022 at 21:12, H. Nikolaus Schaller <hns@xxxxxxxxxxxxx> wrote: > >>>> > >>> > >>> From: Ulf Hansson <ulf.hansson@xxxxxxxxxx> > >>> Date: Tue, 1 Mar 2022 14:24:21 +0100 > >>> Subject: [PATCH] mmc: core: Extend timeout to 2s for MMC_SEND_OP_COND > >>> > >>> It looks like the timeout for the MMC_SEND_OP_COND (CMD1) might have become > >>> a bit too small due to recent changes. Therefore, let's extend it to 2s, > >>> which is probably more inline with its previous value, to fix the reported > >>> timeout problems. > >>> > >>> While at it, let's add a define for the timeout value, rather than using > >>> a hard-coded value for it. > >>> > >>> Reported-by: Jean Rene Dawin <jdawin@xxxxxxxxxxxxxxxxxxxxx> > >>> Reported-by: H. Nikolaus Schaller <hns@xxxxxxxxxxxxx> > >>> Cc: Huijin Park <huijin.park@xxxxxxxxxxx> > >>> Fixes: 76bfc7ccc2fa ("mmc: core: adjust polling interval for CMD1") > >>> Signed-off-by: Ulf Hansson <ulf.hansson@xxxxxxxxxx> > >>> --- > >>> drivers/mmc/core/mmc_ops.c | 4 +++- > >>> 1 file changed, 3 insertions(+), 1 deletion(-) > >>> > >>> diff --git a/drivers/mmc/core/mmc_ops.c b/drivers/mmc/core/mmc_ops.c > >>> index d63d1c735335..1f57174b3cf3 100644 > >>> --- a/drivers/mmc/core/mmc_ops.c > >>> +++ b/drivers/mmc/core/mmc_ops.c > >>> @@ -21,6 +21,7 @@ > >>> > >>> #define MMC_BKOPS_TIMEOUT_MS (120 * 1000) /* 120s */ > >>> #define MMC_SANITIZE_TIMEOUT_MS (240 * 1000) /* 240s */ > >>> +#define MMC_OP_COND_TIMEOUT_MS 2000 /* 2s */ > >>> > >>> static const u8 tuning_blk_pattern_4bit[] = { > >>> 0xff, 0x0f, 0xff, 0x00, 0xff, 0xcc, 0xc3, 0xcc, > >>> @@ -232,7 +233,8 @@ int mmc_send_op_cond(struct mmc_host *host, u32 > >>> ocr, u32 *rocr) > >>> cmd.arg = mmc_host_is_spi(host) ? 0 : ocr; > >>> cmd.flags = MMC_RSP_SPI_R1 | MMC_RSP_R3 | MMC_CMD_BCR; > >>> > >>> - err = __mmc_poll_for_busy(host, 1000, &__mmc_send_op_cond_cb, &cb_data); > >>> + err = __mmc_poll_for_busy(host, MMC_OP_COND_TIMEOUT_MS, > >>> + &__mmc_send_op_cond_cb, &cb_data); > >>> if (err) > >>> return err; > >>> > >>> -- > >>> 2.25.1 > >> > >> Hi, > >> > >> thanks. But testing with this patch still gives the same errors: > >> > >> [ 52.259940] mmc1: Card stuck being busy! __mmc_poll_for_busy > >> [ 52.273380] mmc1: error -110 doing runtime resume > >> > >> and the system gets stuck eventually. > > > >Same result from my tests. > > > >BR and thanks, > >Nikolaus > > > From c2458cb998dd8e275fefba52dd2532beb153c82d Mon Sep 17 00:00:00 2001 > From: Huijin Park <huijin.park@xxxxxxxxxxx> > Date: Thu, 3 Mar 2022 20:43:22 +0900 > Subject: [PATCH] mmc: core: extend timeout and set min time for op_cond > > This patch extends timeout to 2s and sets minimal interval time for > checking op_cond stuck problem. > > Signed-off-by: Huijin Park <huijin.park@xxxxxxxxxxx> > > diff --git a/drivers/mmc/core/mmc_ops.c b/drivers/mmc/core/mmc_ops.c > index d63d1c735335..ccad6379d183 100644 > --- a/drivers/mmc/core/mmc_ops.c > +++ b/drivers/mmc/core/mmc_ops.c > @@ -21,6 +21,7 @@ > > #define MMC_BKOPS_TIMEOUT_MS (120 * 1000) /* 120s */ > #define MMC_SANITIZE_TIMEOUT_MS (240 * 1000) /* 240s */ > +#define MMC_OP_COND_TIMEOUT_MS ( 2 * 1000) /* 2s */ > > static const u8 tuning_blk_pattern_4bit[] = { > 0xff, 0x0f, 0xff, 0x00, 0xff, 0xcc, 0xc3, 0xcc, > @@ -179,6 +180,47 @@ int mmc_go_idle(struct mmc_host *host) > return err; > } > > +static int ____mmc_poll_for_busy(struct mmc_host *host, unsigned int udelay_min, > + unsigned int timeout_ms, > + int (*busy_cb)(void *cb_data, bool *busy), > + void *cb_data) > +{ > + int err; > + unsigned long timeout; > + unsigned int udelay = udelay_min, udelay_max = 32768; > + bool expired = false; > + bool busy = false; > + > + timeout = jiffies + msecs_to_jiffies(timeout_ms) + 1; > + do { > + /* > + * Due to the possibility of being preempted while polling, > + * check the expiration time first. > + */ > + expired = time_after(jiffies, timeout); > + > + err = (*busy_cb)(cb_data, &busy); > + if (err) > + return err; > + > + /* Timeout if the device still remains busy. */ > + if (expired && busy) { > + pr_err("%s: Card stuck being busy! %s\n", > + mmc_hostname(host), __func__); > + return -ETIMEDOUT; > + } > + > + /* Throttle the polling rate to avoid hogging the CPU. */ > + if (busy) { > + usleep_range(udelay, udelay * 2); > + if (udelay < udelay_max) > + udelay *= 2; > + } > + } while (busy); > + > + return 0; > +} > + > static int __mmc_send_op_cond_cb(void *cb_data, bool *busy) > { > struct mmc_op_cond_busy_data *data = cb_data; > @@ -232,7 +274,8 @@ int mmc_send_op_cond(struct mmc_host *host, u32 ocr, u32 *rocr) > cmd.arg = mmc_host_is_spi(host) ? 0 : ocr; > cmd.flags = MMC_RSP_SPI_R1 | MMC_RSP_R3 | MMC_CMD_BCR; > > - err = __mmc_poll_for_busy(host, 1000, &__mmc_send_op_cond_cb, &cb_data); > + err = ____mmc_poll_for_busy(host, 1000, MMC_OP_COND_TIMEOUT_MS, > + &__mmc_send_op_cond_cb, &cb_data); > if (err) > return err; > > -- > 2.17.1 > Hi, thanks for the patch. I started with a value of 10 ms : err = ____mmc_poll_for_busy(host, 10, MMC_OP_COND_TIMEOUT_MS, &__mmc_send_op_cond_cb, &cb_data); and the result was agian [ 23.349932] mmc1: Card stuck being busy! __mmc_poll_for_busy [ 23.355936] mmc1: error -110 doing runtime resume Same with 100 ms and 500 ms. The messages seem to come from __mmc_poll_for_busy and not ____mmc_poll_for_busy. Yet, changing the value in the call of ____mmc_poll_for_busy to 1000 ms changed the behaviour: err = ____mmc_poll_for_busy(host, 1000, MMC_OP_COND_TIMEOUT_MS, &__mmc_send_op_cond_cb, &cb_data); With this I didn't get any "stuck" errors and the board seemed stable. Regards, Jean Rene Dawin