Re: [BUG] mmc: core: adjust polling interval for CMD1

Ulf Hansson <ulf.hansson@xxxxxxxxxx> · Fri, 4 Mar 2022 11:08:58 +0100

On Fri, 4 Mar 2022 at 10:28, Jean Rene Dawin
<jdawin@xxxxxxxxxxxxxxxxxxxxx> wrote:
>
> Ulf Hansson wrote on Thu  3/03/22 11:40:
> > On Wed, 2 Mar 2022 at 10:40, H. Nikolaus Schaller <hns@xxxxxxxxxxxxx> wrote:
> >
> > Alright, it starts to sound like we might need a revert (or at least a
> > way to restore the previous behaviour) - even if that would just paper
> > over the real problem. The real problem is more likely related to how
> > the host driver manages restoring of the power to the card, which
> > happens when runtime resuming it.
> >
> > In any case, just to make sure the loop timeout itself isn't the
> > problem, can you run the below debug patch please? The intent is to
> > figure out how long the worst case timeout is, when it's working with
> > CMD1. As soon as the timeout exceeds the worst case, there is a
> > message printed to the log.
> >
> > The below patch is based upon that the offending commit has been reverted.
> >
> > Kind regards
> > Uffe
> >
> > From: Ulf Hansson <ulf.hansson@xxxxxxxxxx>
> > Date: Thu, 3 Mar 2022 11:00:04 +0100
> > Subject: [PATCH] mmc: core: DEBUG - Measure and log worst case CMD1 loop
> >
> > Signed-off-by: Ulf Hansson <ulf.hansson@xxxxxxxxxx>
> > ---
> >  drivers/mmc/core/mmc_ops.c | 11 +++++++++++
> >  include/linux/mmc/host.h   |  1 +
> >  2 files changed, 12 insertions(+)
> >
> > diff --git a/drivers/mmc/core/mmc_ops.c b/drivers/mmc/core/mmc_ops.c
> > index 9946733a34c6..3f03d9e8a3a4 100644
> > --- a/drivers/mmc/core/mmc_ops.c
> > +++ b/drivers/mmc/core/mmc_ops.c
> > @@ -177,11 +177,15 @@ int mmc_send_op_cond(struct mmc_host *host, u32
> > ocr, u32 *rocr)
> >  {
> >         struct mmc_command cmd = {};
> >         int i, err = 0;
> > +       s64 cmd1_ms;
> > +       ktime_t time_start;
> >
> >         cmd.opcode = MMC_SEND_OP_COND;
> >         cmd.arg = mmc_host_is_spi(host) ? 0 : ocr;
> >         cmd.flags = MMC_RSP_SPI_R1 | MMC_RSP_R3 | MMC_CMD_BCR;
> >
> > +       time_start = ktime_get();
> > +
> >         for (i = 100; i; i--) {
> >                 err = mmc_wait_for_cmd(host, &cmd, 0);
> >                 if (err)
> > @@ -211,6 +215,13 @@ int mmc_send_op_cond(struct mmc_host *host, u32
> > ocr, u32 *rocr)
> >                         cmd.arg = cmd.resp[0] | BIT(30);
> >         }
> >
> > +       cmd1_ms = ktime_to_ms(ktime_sub(ktime_get(), time_start));
> > +       if (!err && cmd1_ms > host->cmd1_ms) {
> > +               pr_warn("%s: CMD1 timeout increased to %lld ms, loop=%d\n",
> > +                       mmc_hostname(host), cmd1_ms, i);
> > +               host->cmd1_ms = cmd1_ms;
> > +       }
> > +
> >         if (rocr && !mmc_host_is_spi(host))
> >                 *rocr = cmd.resp[0];
> >
> > diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
> > index 7afb57cab00b..c2ca3bb05620 100644
> > --- a/include/linux/mmc/host.h
> > +++ b/include/linux/mmc/host.h
> > @@ -301,6 +301,7 @@ struct mmc_host {
> >         unsigned int            f_min;
> >         unsigned int            f_max;
> >         unsigned int            f_init;
> > +       s64                     cmd1_ms;
> >         u32                     ocr_avail;
> >         u32                     ocr_avail_sdio; /* SDIO-specific OCR */
> >         u32                     ocr_avail_sd;   /* SD-specific OCR */
> > --
> > 2.25.1
>
> Hi,
>
> thanks. With this patch applied with reverted 76bfc7ccc2fa9d382576f6013b57a0ef93d5a722
> I don't get the "stuck" errors in __mmc_poll_for_busy and only one message of
>
> [    1.095194] mmc1: CMD1 timeout increased to 64 ms, loop=98

Okay, so for some reason polling doesn't work, if we poll too
frequently. Clearly the 1s total timeout should be sufficient.

I will prepare a patch that restores the old behaviour and post it
within a few minutes.

Kind regards
Uffe