On Sat, 19 Mar 2022 at 02:52, Brian Norris <briannorris@xxxxxxxxxxxx> wrote: > > Way back in commit 4f25580fb84d ("mmc: core: changes frequency to > hs_max_dtr when selecting hs400es"), Rockchip engineers noticed that > some eMMC don't respond to SEND_STATUS commands very reliably if they're > still running at a low initial frequency. As mentioned in that commit, > JESD84-B51 P49 suggests a sequence in which the host: > 1. sets HS_TIMING > 2. bumps the clock ("<= 52 MHz") > 3. sends further commands > > It doesn't exactly require that we don't use a lower-than-52MHz > frequency, but in practice, these eMMC don't like it. > > The aforementioned commit tried to get that right for HS400ES, although > it's unclear whether this ever truly worked as committed into mainline, > as other changes/refactoring adjusted the sequence in conflicting ways: > > 08573eaf1a70 ("mmc: mmc: do not use CMD13 to get status after speed mode > switch") > > 53e60650f74e ("mmc: core: Allow CMD13 polling when switching to HS mode > for mmc") > > In any case, today we do step 3 before step 2. Let's fix that, and also > apply the same logic to HS200/400, where this eMMC has problems too. > > Resolves errors like this seen when booting some RK3399 Gru/Scarlet > systems: > > [ 2.058881] mmc1: CQHCI version 5.10 > [ 2.097545] mmc1: SDHCI controller on fe330000.mmc [fe330000.mmc] using ADMA > [ 2.209804] mmc1: mmc_select_hs400es failed, error -84 > [ 2.215597] mmc1: error -84 whilst initialising MMC card > [ 2.417514] mmc1: mmc_select_hs400es failed, error -110 > [ 2.423373] mmc1: error -110 whilst initialising MMC card > [ 2.605052] mmc1: mmc_select_hs400es failed, error -110 > [ 2.617944] mmc1: error -110 whilst initialising MMC card > [ 2.835884] mmc1: mmc_select_hs400es failed, error -110 > [ 2.841751] mmc1: error -110 whilst initialising MMC card > > Fixes: 08573eaf1a70 ("mmc: mmc: do not use CMD13 to get status after speed mode switch") > Fixes: 53e60650f74e ("mmc: core: Allow CMD13 polling when switching to HS mode for mmc") > Fixes: 4f25580fb84d ("mmc: core: changes frequency to hs_max_dtr when selecting hs400es") > Cc: Shawn Lin <shawn.lin@xxxxxxxxxxxxxx> > Signed-off-by: Brian Norris <briannorris@xxxxxxxxxxxx> > --- > > Changes in v2: > * Use ext_csd.hs200_max_dtr for HS200 > * Retest on top of 3b6c472822f8 ("mmc: core: Improve fallback to speed > modes if eMMC HS200 fails") > > drivers/mmc/core/mmc.c | 15 +++++++++++++-- > 1 file changed, 13 insertions(+), 2 deletions(-) > > diff --git a/drivers/mmc/core/mmc.c b/drivers/mmc/core/mmc.c > index e7ea45386c22..b75fa4b67964 100644 > --- a/drivers/mmc/core/mmc.c > +++ b/drivers/mmc/core/mmc.c > @@ -1385,12 +1385,17 @@ static int mmc_select_hs400es(struct mmc_card *card) > } > > mmc_set_timing(host, MMC_TIMING_MMC_HS); > + > + /* > + * Bump to HS frequency. Some cards don't handle SEND_STATUS reliably > + * at the initial frequency. > + */ > + mmc_set_clock(host, card->ext_csd.hs_max_dtr); > + > err = mmc_switch_status(card, true); > if (err) > goto out_err; > > - mmc_set_clock(host, card->ext_csd.hs_max_dtr); > - > /* Switch card to DDR with strobe bit */ > val = EXT_CSD_DDR_BUS_WIDTH_8 | EXT_CSD_BUS_WIDTH_STROBE; > err = mmc_switch(card, EXT_CSD_CMD_SET_NORMAL, > @@ -1482,6 +1487,12 @@ static int mmc_select_hs200(struct mmc_card *card) > old_timing = host->ios.timing; > mmc_set_timing(host, MMC_TIMING_MMC_HS200); > > + /* > + * Bump to HS200 frequency. Some cards don't handle SEND_STATUS > + * reliably at the initial frequency. > + */ > + mmc_set_clock(host, card->ext_csd.hs200_max_dtr); > + If the mmc_switch_status() fails with -EBADMSG, we should probably restore the clock to its previous value. Otherwise I think we could potentially break the fallback implemented in 3b6c472822f8 ("mmc: core: Improve fallback to speed modes if eMMC HS200 fails") Moreover, this change means that we will be calling mmc_set_bus_speed() from mmc_select_timing(), to actually set the same HS200 clock rate again. That is unnecessary, can we please avoid that in some way. > /* > * For HS200, CRC errors are not a reliable way to know the > * switch failed. If there really is a problem, we would expect Kind regards Uffe