On 19/12/23 23:18, Liming Sun wrote:
>
>
>> -----Original Message-----
>> From: Adrian Hunter <adrian.hunter@xxxxxxxxx>
>> Sent: Monday, December 11, 2023 6:39 AM
>> To: Liming Sun <limings@xxxxxxxxxx>; Christian Loehle
>> <christian.loehle@xxxxxxx>; Ulf Hansson <ulf.hansson@xxxxxxxxxx>; David
>> Thompson <davthompson@xxxxxxxxxx>
>> Cc: linux-mmc@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
>> Subject: Re: [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk for
>> BlueField-3 SoC
>>
>> On 30/11/23 15:19, Liming Sun wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Christian Loehle <christian.loehle@xxxxxxx>
>>>> Sent: Monday, November 27, 2023 8:36 AM
>>>> To: Liming Sun <limings@xxxxxxxxxx>; Adrian Hunter
>>>> <adrian.hunter@xxxxxxxxx>; Ulf Hansson <ulf.hansson@xxxxxxxxxx>; David
>>>> Thompson <davthompson@xxxxxxxxxx>
>>>> Cc: linux-mmc@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
>>>> Subject: Re: [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk for
>>>> BlueField-3 SoC
>>>>
>>>> On 18/11/2023 13:46, Liming Sun wrote:
>>>>> This commit enables SDHCI_QUIRK_BROKEN_TIMEOUT_VAL to solve the
>>>>> intermittent eMMC timeout issue reported on some cards under eMMC
>>>>> stress test.
>>>>>
>>>>> Reported error message:
>>>>>   dwcmshc MLNXBF30:00: __mmc_blk_ioctl_cmd: data error -110
>>>>>
>>>>> Signed-off-by: Liming Sun <limings@xxxxxxxxxx>
>>>>> ---
>>>>>  drivers/mmc/host/sdhci-of-dwcmshc.c | 3 ++-
>>>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/mmc/host/sdhci-of-dwcmshc.c b/drivers/mmc/host/sdhci-of-dwcmshc.c
>>>>> index 3a3bae6948a8..3c8fe8aec558 100644
>>>>> --- a/drivers/mmc/host/sdhci-of-dwcmshc.c
>>>>> +++ b/drivers/mmc/host/sdhci-of-dwcmshc.c
>>>>> @@ -365,7 +365,8 @@ static const struct sdhci_pltfm_data sdhci_dwcmshc_pdata = {
>>>>>  #ifdef CONFIG_ACPI
>>>>>  static const struct sdhci_pltfm_data sdhci_dwcmshc_bf3_pdata = {
>>>>>          .ops = &sdhci_dwcmshc_ops,
>>>>> -        .quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN,
>>>>> +        .quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN |
>>>>> +                  SDHCI_QUIRK_BROKEN_TIMEOUT_VAL,
>>>>>          .quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN |
>>>>>                     SDHCI_QUIRK2_ACMD23_BROKEN,
>>>>>  };
>>>>
>>>> __mmc_blk_ioctl_cmd: data error ?
>>>> What stresstest are you running that issues ioctl commands?
>>>> On which commands does the timeout occur?
>>>> Anyway you should be able to increase the timeout in ioctl structure
>>>> directly, i.e. in userspace, or does that not work?
>>>
>>> It's running stress test with tool like "fio --name=randrw_stress_round_1 --ioengine=libaio --direct=1 --time_based=1 --end_fsync=1 --ramp_time=5 --norandommap=1 --randrepeat=0 --group_reporting=1 --numjobs=4 --iodepth=128 --rw=randrw --overwrite=1 --runtime=36000 --bssplit=4K/44:8K/1:12K/1:16K/1:24K/1:28K/1:32K/1:40K/32:64K/5:68K/7:72K/3:76K/3 --filename=/dev/mmcblk0"
>>> The tool(application) is owned by user or with some standard tool.
>>
>> fio does not send mmc ioctls, so I am also a bit confused about
>> how you get "__mmc_blk_ioctl_cmd: data error -110" ?
>
> There are other activities or background task going on. I assume it's other
> MMC access which are affected by the stress FIO and got timeout. Would it make sense?
>

It depends on whether the IOCTL is overriding the timeout.
In struct mmc_ioc_cmd there is data_timeout_ns which overrides
the mmc core data timeout calculated by mmc_set_data_timeout().
There is also cmd_timeout_ms for commands.

You need to check whether "__mmc_blk_ioctl_cmd: data error -110"
is because data_timeout_ns was set too low (but non-zero) by the caller
of the IOCTL.
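
To illustrate the override described above, here is a minimal userspace sketch. It assumes the Linux UAPI header <linux/mmc/ioctl.h> is available. The helper names ioc_cmd_init() and effective_data_timeout_ns() are hypothetical and exist only for this illustration; the second one models the "non-zero caller value wins" rule and is not the kernel's code. The sketch does not issue MMC_IOC_CMD itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <linux/mmc/ioctl.h>	/* struct mmc_ioc_cmd */

/*
 * Hypothetical helper: fill in the timeout-related fields of an
 * ioctl command descriptor. All other fields are zeroed.
 *
 * data_timeout_ns == 0 leaves the data timeout to the mmc core;
 * a non-zero value replaces the core-calculated timeout.
 */
static inline void ioc_cmd_init(struct mmc_ioc_cmd *ic, uint32_t opcode,
				uint32_t arg, unsigned int data_timeout_ns,
				unsigned int cmd_timeout_ms)
{
	assert(ic != NULL);
	memset(ic, 0, sizeof(*ic));
	ic->opcode = opcode;
	ic->arg = arg;
	ic->data_timeout_ns = data_timeout_ns;
	ic->cmd_timeout_ms = cmd_timeout_ms;
}

/*
 * Hypothetical model of the override: core_timeout_ns stands in for
 * whatever mmc_set_data_timeout() would have calculated.
 */
static inline uint64_t effective_data_timeout_ns(const struct mmc_ioc_cmd *ic,
						 uint64_t core_timeout_ns)
{
	assert(ic != NULL);
	return ic->data_timeout_ns ? ic->data_timeout_ns : core_timeout_ns;
}
```

With data_timeout_ns left at 0 the core value is used unchanged, whereas a caller passing a small non-zero value (say 1000 ns) would end up with that value as the data timeout, which is the case worth ruling out for the -110 (ETIMEDOUT) error.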