On 6/16/23 12:02, Ulf Hansson wrote:
On Thu, 15 Jun 2023 at 17:37, Marek Vasut <marex@xxxxxxx> wrote:
On 6/15/23 17:35, Adrian Hunter wrote:
On 15/06/23 18:14, Ulf Hansson wrote:
On Mon, 12 Jun 2023 at 10:59, Marek Vasut <marex@xxxxxxx> wrote:
On 6/12/23 06:59, Adrian Hunter wrote:
On 7/06/23 23:43, Marek Vasut wrote:
On 6/4/23 18:30, Adrian Hunter wrote:
[...]
diff --git a/drivers/mmc/core/sd.c b/drivers/mmc/core/sd.c
index 72b664ed90cf..9c3123867a99 100644
--- a/drivers/mmc/core/sd.c
+++ b/drivers/mmc/core/sd.c
@@ -1313,6 +1313,8 @@ static int sd_flush_cache(struct mmc_host *host)
{
struct mmc_card *card = host->card;
u8 *reg_buf, fno, page;
+ unsigned long timeout;
+ bool expired;
u16 offset;
int err;
@@ -1338,11 +1340,15 @@ static int sd_flush_cache(struct mmc_host *host)
goto out;
}
+ timeout = jiffies + msecs_to_jiffies(SD_WRITE_EXTR_SINGLE_TIMEOUT_MS) + 1;
+again:
err = mmc_poll_for_busy(card, SD_WRITE_EXTR_SINGLE_TIMEOUT_MS, false,
MMC_BUSY_EXTR_SINGLE);
if (err)
goto out;
+ expired = time_after(jiffies, timeout);
+
/*
* Read the Flush Cache bit. The card shall reset it, to confirm that
* it's has completed the flushing of the cache.
@@ -1354,8 +1360,12 @@ static int sd_flush_cache(struct mmc_host *host)
goto out;
}
- if (reg_buf[0] & BIT(0))
- err = -ETIMEDOUT;
+ if (reg_buf[0] & BIT(0)) {
I am getting here, multiple times, with expired=0.
So either the host controller's busy detection does not work, or the
card is not indicating busy by pulling down DAT0.
Can you try to figure out which it is?
Bit 0 of byte 261 is never cleared. I had this looping for an hour and the 'Flush Cache' bit never got cleared. SD spec 6.00 and 9.00 both say the card should clear the bit once the cache flush has completed.
I have now tried three different controllers (STM32MP15xx ARM MMCI, i.MX6Q uSDHC, laptop rtsx_pci_sdmmc); they all fail.
I tried to find another card that also has a cache, but could not; all the others report no cache. The Kingston card SSR (see the 2ff in the 6th field; bit 2 of the last f is the cache-supported indication, SSR bit 330):
00000000:08000000:04009000:011b391e:00080000:0002ff00:03000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000:
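To make the bit arithmetic above explicit, here is a small standalone sketch that pulls one bit out of an SSR dump like this one. It assumes the sixteen colon-separated fields are 32-bit words with SSR bit 511 at the top of the first field; `ssr_bit()` and the macro names are hypothetical helpers, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

#define SSR_WORDS 16              /* 512-bit SD Status as 16 x 32-bit words */
#define SSR_CACHE_SUPPORT_BIT 330 /* bit number as quoted in this thread */

/* Return the value (0 or 1) of SSR bit `bit`, where ssr[0] holds
 * bits 511..480 and ssr[15] holds bits 31..0 (assumed layout). */
static unsigned ssr_bit(const uint32_t ssr[SSR_WORDS], unsigned bit)
{
	assert(bit < 32u * SSR_WORDS);
	return (ssr[SSR_WORDS - 1 - bit / 32] >> (bit % 32)) & 1u;
}
```

Bit 330 lands in the sixth field (index 5) at bit 10, which is bit 2 of the second-lowest non-zero nibble in 0x0002ff00, matching the description above.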
So either this card is weird, or the cards with cache are so rare that nobody noticed the problem yet.
The patch set cover letter says it was tested with a 64GB Sandisk Extreme PRO UHS-I A2 card:
https://lore.kernel.org/linux-mmc/20210506145829.198823-1-ulf.hansson@xxxxxxxxxx/
I got that one now, tested it, the cache bit is being cleared correctly. I also tested a few more cards and dumped their SSR too:
Kingston Canvas Go! Plus:
80000000:08000000:04009000:011b391e:00080000:0002ff00:03000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000:
Flush never finishes
Sandisk Extreme PRO A2 64GiB:
80000000:08000000:04009000:0f05391e:00080000:0002fc00:03000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000:
mmc0: flushing cache took 5 ms, 1 iterations, error 0
Goodram IRDM V30 A2 64GiB:
80000000:08000000:0400a001:00fd3a1e:00080000:00023c00:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000:
mmc0: flushing cache took 5 ms, 1 iterations, error 0
Samsung Pro Plus 512GiB V30 A2 (ext reg general info is all zeroes, cache not enabled):
80000000:08000000:04009000:0811391e:00080000:0002fc00:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000:00000000:
I ordered a Kingston Canvas Go Plus card as you described but it won't arrive for a week.
I'm really interested in what you would find with that one.
It worked just fine, but maybe it is a newer version of hw / firmware - the date is 04/2023
$ grep -H . /sys/class/mmc_host/mmc0/mmc0\:5048/*
grep: /sys/class/mmc_host/mmc0/mmc0:5048/block: Is a directory
/sys/class/mmc_host/mmc0/mmc0:5048/cid:9f54495344363447614b1004af017400
/sys/class/mmc_host/mmc0/mmc0:5048/csd:400e00325b590001cf9f7f800a400000
/sys/class/mmc_host/mmc0/mmc0:5048/date:04/2023
grep: /sys/class/mmc_host/mmc0/mmc0:5048/driver: Is a directory
/sys/class/mmc_host/mmc0/mmc0:5048/dsr:0x404
/sys/class/mmc_host/mmc0/mmc0:5048/erase_size:512
/sys/class/mmc_host/mmc0/mmc0:5048/fwrev:0x1
/sys/class/mmc_host/mmc0/mmc0:5048/hwrev:0x6
/sys/class/mmc_host/mmc0/mmc0:5048/manfid:0x00009f
/sys/class/mmc_host/mmc0/mmc0:5048/name:SD64G
/sys/class/mmc_host/mmc0/mmc0:5048/ocr:0x00200000
/sys/class/mmc_host/mmc0/mmc0:5048/oemid:0x5449
grep: /sys/class/mmc_host/mmc0/mmc0:5048/power: Is a directory
/sys/class/mmc_host/mmc0/mmc0:5048/preferred_erase_size:4194304
/sys/class/mmc_host/mmc0/mmc0:5048/rca:0x5048
/sys/class/mmc_host/mmc0/mmc0:5048/scr:0205848701006432
/sys/class/mmc_host/mmc0/mmc0:5048/serial:0x4b1004af
/sys/class/mmc_host/mmc0/mmc0:5048/ssr:000000000800000004009000011b391e000800000002ff0003000000000000000000000000000000000000000000000000000000000000000000000000000000
grep: /sys/class/mmc_host/mmc0/mmc0:5048/subsystem: Is a directory
/sys/class/mmc_host/mmc0/mmc0:5048/type:SD
/sys/class/mmc_host/mmc0/mmc0:5048/uevent:DRIVER=mmcblk
/sys/class/mmc_host/mmc0/mmc0:5048/uevent:MMC_TYPE=SD
/sys/class/mmc_host/mmc0/mmc0:5048/uevent:MMC_NAME=SD64G
/sys/class/mmc_host/mmc0/mmc0:5048/uevent:MODALIAS=mmc:block
This one I have here is certainly older (this time tested on STM32MP135F):
$ grep -H . /sys/class/mmc_host/mmc0/mmc0\:5048/*
/sys/class/mmc_host/mmc0/mmc0:5048/cid:9f544953443634476136980065013b7e
/sys/class/mmc_host/mmc0/mmc0:5048/csd:400e00325b590001cfff7f800a4000fa
/sys/class/mmc_host/mmc0/mmc0:5048/date:11/2019
/sys/class/mmc_host/mmc0/mmc0:5048/dsr:0x404
/sys/class/mmc_host/mmc0/mmc0:5048/erase_size:512
/sys/class/mmc_host/mmc0/mmc0:5048/fwrev:0x1
/sys/class/mmc_host/mmc0/mmc0:5048/hwrev:0x6
/sys/class/mmc_host/mmc0/mmc0:5048/manfid:0x00009f
/sys/class/mmc_host/mmc0/mmc0:5048/name:SD64G
/sys/class/mmc_host/mmc0/mmc0:5048/ocr:0x00300000
/sys/class/mmc_host/mmc0/mmc0:5048/oemid:0x5449
/sys/class/mmc_host/mmc0/mmc0:5048/preferred_erase_size:4194304
/sys/class/mmc_host/mmc0/mmc0:5048/rca:0x5048
/sys/class/mmc_host/mmc0/mmc0:5048/scr:0205848701006432
/sys/class/mmc_host/mmc0/mmc0:5048/serial:0x36980065
/sys/class/mmc_host/mmc0/mmc0:5048/ssr:000000000800000004009000011b391e000800000002ff0003000000000000000000000000000000000000000000000000000000000000000000000000000000
/sys/class/mmc_host/mmc0/mmc0:5048/type:SD
/sys/class/mmc_host/mmc0/mmc0:5048/uevent:DRIVER=mmcblk
/sys/class/mmc_host/mmc0/mmc0:5048/uevent:MMC_TYPE=SD
/sys/class/mmc_host/mmc0/mmc0:5048/uevent:MMC_NAME=SD64G
/sys/class/mmc_host/mmc0/mmc0:5048/uevent:MODALIAS=mmc:block
cid, csd, date, ocr, serial differ.
I have been trying to follow the progress around this matter. If I
understand correctly we are leaning towards making a card quirk for
this particular SD card, to keep us from turning on and using a broken
cache feature.
Or what are you thinking?
That is probably the simplest option.
Just give me a day or two to test the other newer card.
What would you base that quirk on? Date? We don't know when the
"fixed" cards started to be produced.
Right. It seems like the best we can do is to make a quirk for that
particular version of card that you proved to have failed.
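As a rough illustration of what matching on "that particular version of card" could look like, here is a standalone sketch that compares the identification fields from the sysfs dump of the failing card (manfid 0x9f, oemid 0x5449, name SD64G, date 11/2019). `struct card_ident` and `sd_cache_known_broken()` are hypothetical; this is not the kernel's fixup mechanism, only the matching idea.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical subset of the CID-derived fields shown in sysfs. */
struct card_ident {
	unsigned manfid;
	unsigned oemid;
	const char *name;
	unsigned year;
	unsigned month;
};

/* True only for the exact card version reported as failing in this
 * thread. Other production dates are deliberately not matched, since
 * it is unknown when (or whether) the behaviour changed. */
static bool sd_cache_known_broken(const struct card_ident *c)
{
	assert(c != NULL && c->name != NULL);
	return c->manfid == 0x9f &&
	       c->oemid == 0x5449 &&
	       strcmp(c->name, "SD64G") == 0 &&
	       c->year == 2019 && c->month == 11;
}
```

Matching on the exact date keeps the 04/2023 card, which flushed correctly, out of the quirk.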
I just sent a few more data points. It is either date, or C_SIZE, or
ERASE_TIMEOUT (if I decode it right). I can also just archive the card,
since our sample size of defective cards is 1. It could just
be a defective card after all, although the fact that only the cache would
be defective is unusual.
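For anyone wanting to check the C_SIZE difference by hand, a minimal sketch of the decode, assuming the sysfs csd string is the 128-bit register split into four 32-bit words (most significant first) and that both cards use CSD version 2.0, where C_SIZE occupies bits 69:48. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Extract C_SIZE (bits 69:48) from a version 2.0 CSD.
 * csd[0] holds bits 127..96, csd[3] holds bits 31..0 (assumed layout). */
static uint32_t csd_v2_c_size(const uint32_t csd[4])
{
	/* CSD_STRUCTURE is bits 127:126; the value 1 means version 2.0. */
	assert((csd[0] >> 30) == 1);
	return ((csd[1] & 0x3f) << 16) | (csd[2] >> 16);
}

/* CSD version 2.0 capacity: (C_SIZE + 1) * 512 KiB. */
static uint64_t csd_v2_capacity_bytes(const uint32_t csd[4])
{
	return ((uint64_t)csd_v2_c_size(csd) + 1) * 512 * 1024;
}
```

Under these assumptions the 11/2019 card decodes to C_SIZE 0x1cfff (exactly 58 GiB) and the 04/2023 card to C_SIZE 0x1cf9f, so the two cards do report slightly different capacities.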