On Fri 30 Aug 2019 at 13:07, Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote: > On Fri, 30 Aug 2019 at 12:01, Jan Kaisrlik <ja.kaisrlik@xxxxxxxxx> wrote: >> >> On Thu, Aug 29, 2019 at 9:50 AM Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote: >> > >> > On Wed, 28 Aug 2019 at 12:44, Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote: >> > > >> > > On Tue, 27 Aug 2019 at 17:55, Jan Kaisrlik <ja.kaisrlik@xxxxxxxxx> wrote: >> > > > >> > > > On Tue, Aug 27, 2019 at 3:14 PM Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote: >> > > > > >> > > > > On Fri, 23 Aug 2019 at 11:36, Jan Kaisrlik <ja.kaisrlik@xxxxxxxxx> wrote: >> > > > > > >> > > > > > čt 22. 8. 2019 v 15:59 odesílatel Ulf Hansson <ulf.hansson@xxxxxxxxxx> napsal: >> > > > > > > >> > > > > > > + some meson driver folkz >> > > > > > > >> > > > > > > On Thu, 22 Aug 2019 at 10:27, Jan Kaisrlik <ja.kaisrlik@xxxxxxxxx> wrote: >> > > > > > > > >> > > > > > > > st 21. 8. 2019 v 17:12 odesílatel Ulf Hansson <ulf.hansson@xxxxxxxxxx> napsal: >> > > > > > > > > >> > > > > > > > > + Chaotian Jing >> > > > > > > > > >> > > > > > > > > On Tue, 20 Aug 2019 at 13:42, <ja.kaisrlik@xxxxxxxxx> wrote: >> > > > > > > > > > >> > > > > > > > > > From: Jan Kaisrlik <ja.kaisrlik@xxxxxxxxx> >> > > > > > > > > > >> > > > > > > > > > This reverts commit 3a0681c7448b174e5dcfd19e9079cdd281c35f1a. >> > > > > > > > > > >> > > > > > > > > > Turns out the patch breaks initialization of Toshiba THGBMNG5. >> > > > > > > > > > [ 1.648951] mmc0: mmc_select_hs200 failed, error -84 >> > > > > > > > > > [ 1.648988] mmc0: error -84 whilst initialising MMC card >> > > > > > > > > >> > > > > > > > > For exactly this reason, when getting CRC errors on the first attempt, >> > > > > > > > > doing a retry makes little sense. >> > > > > > > > > >> > > > > > > > > I have looped in Chaotian who has some more details about the problem. >> > > > > > > > > >> > > > > > > > > In any case, Jan, what HW and mmc controller are you using? >> > > > > > > > >> > > > > > > > It's a custom board based on Amlogic A113D. The compatibility in dts >> > > > > > > > is set to "alogic,meson-axg-mmc". >> > > > > > > >> > > > > > > Good. I have looped in some of the relevant developers/maintainers. >> > > > > > > >> > > > > > > > >> > > > > > > > In the different revision of HW we are using Kingston EMMC04G. The >> > > > > > > > card has no such problem and is working fine without this patch. >> > > > > > > > We observed it only on mention Toshiba card. >> > > > > > > >> > > > > > > I see. Of course it would also be interesting to see what CMD6 command >> > > > > > > that is that fails. Would you mind adding some debug/trace to find out >> > > > > > > what command it is that fails? >> > > > > > >> > > > > > Providing a log with following debug option `dyndbg="func mmc_mrq_pr_debug +p` >> > > > > >> > > > > Thanks! >> > > > > >> > > > > > >> > > > > > # dmesg | grep mmc0 >> > > > > > [ 1.557984] mmc0: starting CMD52 arg 00000c00 flags 00000195 >> > > > > > [ 1.563989] mmc0: starting CMD52 arg 80000c08 flags 00000195 >> > > > > > [ 1.575219] mmc0: starting CMD0 arg 00000000 flags 000000c0 >> > > > > > [ 1.593142] mmc0: starting CMD8 arg 000001aa flags 000002f5 >> > > > > > [ 1.604439] mmc0: starting CMD5 arg 00000000 flags 000002e1 >> > > > > > [ 1.623875] mmc0: starting CMD55 arg 00000000 flags 000000f5 >> > > > > > [ 1.631024] mmc0: starting CMD55 arg 00000000 flags 000000f5 >> > > > > > [ 1.640221] mmc0: starting CMD55 arg 00000000 flags 000000f5 >> > > > > > [ 1.646802] mmc0: starting CMD55 arg 00000000 flags 000000f5 >> > > > > > [ 1.652397] mmc0: starting CMD1 arg 00000000 flags 000000e1 >> > > > > > [ 1.669894] mmc0: starting CMD1 arg 40ff8080 flags 000000e1 >> > > > > > [ 1.682925] mmc0: starting CMD1 arg 40ff8080 flags 000000e1 >> > > > > > [ 1.697073] mmc0: starting CMD1 arg 40ff8080 flags 000000e1 >> > > > > > [ 1.704327] mmc0: starting CMD0 arg 00000000 flags 000000c0 >> > > > > > [ 1.722251] mmc0: starting CMD1 arg 40200000 flags 000000e1 >> > > > > > [ 1.745317] mmc0: starting CMD1 arg 40200000 flags 000000e1 >> > > > > > [ 1.752813] mmc0: starting CMD2 arg 00000000 flags 00000007 >> > > > > > [ 1.771731] mmc0: starting CMD3 arg 00010000 flags 00000015 >> > > > > > [ 1.784771] mmc0: starting CMD9 arg 00010000 flags 00000007 >> > > > > > [ 1.790433] mmc0: starting CMD7 arg 00010000 flags 00000015 >> > > > > > [ 1.795691] mmc0: starting CMD8 arg 00000000 flags 000000b5 >> > > > > > [ 1.800800] mmc0: blksz 512 blocks 1 flags 00000200 tsac 50 ms nsac 0 >> > > > > > [ 1.818402] mmc0: starting CMD6 arg 03af0101 flags 0000049d >> > > > > > [ 1.818845] mmc0: starting CMD13 arg 00010000 flags 00000195 >> > > > > > [ 1.824349] mmc0: starting CMD6 arg 03220101 flags 0000049d >> > > > > > [ 1.829992] mmc0: starting CMD13 arg 00010000 flags 00000195 >> > > > > > [ 1.835493] mmc0: starting CMD6 arg 03b70201 flags 0000049d >> > > > > > [ 1.841119] mmc0: starting CMD13 arg 00010000 flags 00000195 >> > > > > > [ 1.846628] mmc0: starting CMD8 arg 00000000 flags 000000b5 >> > > > > > [ 1.851719] mmc0: blksz 512 blocks 1 flags 00000200 tsac 50 ms nsac 0 >> > > > > > [ 1.860347] mmc0: starting CMD6 arg 03b90201 flags 0000049d >> > > > > > [ 1.864421] mmc0: mmc_select_hs200 failed, error -84 >> > > > > > [ 1.868892] mmc0: error -84 whilst initialising MMC card >> > > > > >> > > > > Alright, so the CMD6 command that tries to switch the card into HS200 >> > > > > mode is the one that fails. >> > > > > >> > > > > > >> > > > > > I cannot provide more verbose logs. When I enable more it (f.e. file >> > > > > > core.c) the initialization of card was successful. >> > > > > >> > > > > That's an interesting observation! >> > > > > >> > > > > Perhaps the card is still in some kind of busy state, after the CMD8 >> > > > > has been sent to read the EXT_CSD, when verifying the earlier bus >> > > > > width switch (mmc_compare_ext_csds()). >> > > > >> > > > I'm curious, do you know if there is a command that says if the card >> > > > is busy/ready? >> > > >> > > Yes, the CMD13. >> > > >> > > However, CMD13 is not allowed to be sent for some command sequences. >> > > For example sending CMD6 to switch to HS200 mode, is one case where we >> > > must avoid it. >> > > >> > > For these scenarios, either we rely on the host HW to detect when the >> > > card stop signals busy or we simply insert a delay (according to spec) >> > > after sending the CMD6 command. >> > > >> > > > >> > > > > >> > > > > > If you want to see any another logs fell free to ask. >> > > > > >> > > > > Would you mind trying one thing, in a way to narrow down the problem? >> > > > > >> > > > > Add a delay (msleep() or usleep_range() with some different values up >> > > > > to 50 ms) somewhere after mmc_compare_ext_csds() has been executed, >> > > > > but also before having mmc_select_bus_width() to return the error >> > > > > code? >> > > > >> > > > I've added msleep just before exit point of mmc_select_bus_width() and >> > > > observation is following >> > > > * 1ms or 2 ms - failing (time to time it was successfully initialized) >> > > > * 5 ms - success in all attempts >> > > >> > > Very good! >> > > >> > > Let me post a debug patch in short while, that can try to verify if >> > > the card is busy during this period. >> > >> > Here it is, please give it a try and see what happens. >> > >> > You may also want to run a second test, changing the second parameter >> > to false when calling poll_for_busy(), as that switches from CMD13 to >> > use of the host driver's ->card_busy() callback, when checking for >> > busy. >> >> I've applied the patch and test poll_for_busy() with different >> send_status parameter as you recommend (true/false). >> The outputs are the same >> [ 1.526554] meson-gx-mmc ffe07000.mmc: allocated mmc-pwrseq >> [ 1.682886] mmc0: mmc_select_hs200 failed, error -84 >> [ 1.693081] mmc0: error -84 whilst initialising MMC card > > Alright, my guess that the card was busy was solely incorrect. > >> >> > >> > Kind regards >> > Uffe >> > >> > From: Ulf Hansson <ulf.hansson@xxxxxxxxxx> >> > Date: Thu, 29 Aug 2019 09:42:11 +0200 >> > Subject: [PATCH] mmc: core: DEBUG for mmc_select_bus_width() >> > >> > Signed-off-by: Ulf Hansson <ulf.hansson@xxxxxxxxxx> >> > --- >> > drivers/mmc/core/mmc.c | 47 ++++++++++++++++++++++++++++++++++++++++++ >> > 1 file changed, 47 insertions(+) >> > >> > diff --git a/drivers/mmc/core/mmc.c b/drivers/mmc/core/mmc.c >> > index c8804895595f..ff3a4c166f20 100644 >> > --- a/drivers/mmc/core/mmc.c >> > +++ b/drivers/mmc/core/mmc.c >> > @@ -975,6 +975,52 @@ static void mmc_set_bus_speed(struct mmc_card *card) >> > mmc_set_clock(card->host, max_dtr); >> > } >> > >> > +static void poll_for_busy(struct mmc_card *card, bool send_status) >> > +{ >> > + struct mmc_host *host = card->host; >> > + int err; >> > + unsigned long timeout; >> > + unsigned int timeout_ms = 5000; >> > + u32 status = 0; >> > + bool expired = false; >> > + bool busy = false; >> > + >> > + if (!send_status && !host->ops->card_busy) { >> > + mmc_delay(timeout_ms); >> >> fyi this code hasn't been executed. > > Yeah, thanks, expected. > >> >> > + return; >> > + } >> > + >> > + timeout = jiffies + msecs_to_jiffies(timeout_ms) + 1; >> > + do { >> > + expired = time_after(jiffies, timeout); >> > + if (send_status) { >> > + err = __mmc_send_status(card, &status, 0); >> >> I've added some print here as well to see whats the value of the status >> >> + printk("%s: %s status=0x%x >> R1_CURRENT_STATE(status)=0x%x \n", >> + mmc_hostname(host),__func__, status, >> R1_CURRENT_STATE(status)); > > Great! > >> >> [ 1.676970] mmc0: poll_for_busy status=0x900 R1_CURRENT_STATE(status)=0x4 > > This means the card reports that it's "READY_FOR_DATA" and the current > state is the transfer state. > > Nothing weird is going on at card side, it seems. > >> [ 1.685821] mmc0: mmc_select_hs200 failed, error -84 >> [ 1.698029] mmc0: error -84 whilst initialising MMC card > > I am out of ideas. :-( > > Perhaps the meson developers can think of something? Could it be that > the driver isn't behaving as it should when switching bus width? Maybe > something happens on pinctrl level? I have not seen this particular issue myself. Since the last round of updates we did a few month ago, things have been quite stable We get our information only from the datasheet and vendor code. These are publicly available and it has proved quite insufficient judging by the number of issue we had in the past years. IOW, we would love to know more about this controller too ;) So, to answer your question, AFAIK, there is nothing special regarding bus width switch ... Regarding pinctrl, I think there a bias-pull-up on data line set in DT. Maybe it's worth trying to disable that a see how it goes ? OTOH, it this was the problem, I wonder how it could have got this far and why a delay would help ? > > Anyway, if no progress is made the next few days, let's do the revert. > And thanks a lot for helping out with the tests, etc! > > Kind regards > Uffe