On 25/04/24 19:18, Ulf Hansson wrote: > + Wolfram, Adrian (to see if they have some input) > > On Tue, 23 Apr 2024 at 22:02, Kamal Dasu <kamal.dasu@xxxxxxxxxxxx> wrote: >> >> When erase/trim/discard completion was converted to mmc_poll_for_busy(), >> optional ->card_busy() host ops support was added. sdhci card->busy() >> could return busy for long periods to cause mmc_do_erase() to block during >> discard operation as shown below during mkfs.f2fs : >> >> Info: [/dev/mmcblk1p9] Discarding device >> [ 39.597258] sysrq: Show Blocked State >> [ 39.601183] task:mkfs.f2fs state:D stack:0 pid:1561 tgid:1561 ppid:1542 flags:0x0000000d >> [ 39.610609] Call trace: >> [ 39.613098] __switch_to+0xd8/0xf4 >> [ 39.616582] __schedule+0x440/0x4f4 >> [ 39.620137] schedule+0x2c/0x48 >> [ 39.623341] schedule_hrtimeout_range_clock+0xe0/0x114 >> [ 39.628562] schedule_hrtimeout_range+0x10/0x18 >> [ 39.633169] usleep_range_state+0x5c/0x90 >> [ 39.637253] __mmc_poll_for_busy+0xec/0x128 >> [ 39.641514] mmc_poll_for_busy+0x48/0x70 >> [ 39.645511] mmc_do_erase+0x1ec/0x210 >> [ 39.649237] mmc_erase+0x1b4/0x1d4 >> [ 39.652701] mmc_blk_mq_issue_rq+0x35c/0x6ac >> [ 39.657037] mmc_mq_queue_rq+0x18c/0x214 >> [ 39.661022] blk_mq_dispatch_rq_list+0x3a8/0x528 >> [ 39.665722] __blk_mq_sched_dispatch_requests+0x3a0/0x4ac >> [ 39.671198] blk_mq_sched_dispatch_requests+0x28/0x5c >> [ 39.676322] blk_mq_run_hw_queue+0x11c/0x12c >> [ 39.680668] blk_mq_flush_plug_list+0x200/0x33c >> [ 39.685278] blk_add_rq_to_plug+0x68/0xd8 >> [ 39.689365] blk_mq_submit_bio+0x3a4/0x458 >> [ 39.693539] __submit_bio+0x1c/0x80 >> [ 39.697096] submit_bio_noacct_nocheck+0x94/0x174 >> [ 39.701875] submit_bio_noacct+0x1b0/0x22c >> [ 39.706042] submit_bio+0xac/0xe8 >> [ 39.709424] blk_next_bio+0x4c/0x5c >> [ 39.712973] blkdev_issue_secure_erase+0x118/0x170 >> [ 39.717835] blkdev_common_ioctl+0x374/0x728 >> [ 39.722175] blkdev_ioctl+0x8c/0x2b0 >> [ 39.725816] vfs_ioctl+0x24/0x40 >> [ 39.729117] __arm64_sys_ioctl+0x5c/0x8c >> [ 39.733114] invoke_syscall+0x68/0xec >> [ 39.736839] el0_svc_common.constprop.0+0x70/0xd8 >> [ 39.741609] do_el0_svc+0x18/0x20 >> [ 39.744981] el0_svc+0x68/0x94 >> [ 39.748107] el0t_64_sync_handler+0x88/0x124 >> [ 39.752455] el0t_64_sync+0x168/0x16c > > Thanks for the detailed log! > >> >> Fix skips the card->busy() and uses MMC_SEND_STATUS and R1_STATUS >> check for MMC_ERASE_BUSY busy_cmd case in the mmc_busy_cb() function. >> >> Fixes: 0d84c3e6a5b2 ("mmc: core: Convert to mmc_poll_for_busy() for erase/trim/discard") >> Signed-off-by: Kamal Dasu <kamal.dasu@xxxxxxxxxxxx> >> --- >> drivers/mmc/core/mmc_ops.c | 3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/mmc/core/mmc_ops.c b/drivers/mmc/core/mmc_ops.c >> index 3b3adbddf664..603fbd78c342 100644 >> --- a/drivers/mmc/core/mmc_ops.c >> +++ b/drivers/mmc/core/mmc_ops.c >> @@ -464,7 +464,8 @@ static int mmc_busy_cb(void *cb_data, bool *busy) >> u32 status = 0; >> int err; >> >> - if (data->busy_cmd != MMC_BUSY_IO && host->ops->card_busy) { >> + if (data->busy_cmd != MMC_BUSY_IO && >> + data->busy_cmd != MMC_BUSY_ERASE && host->ops->card_busy) { >> *busy = host->ops->card_busy(host); >> return 0; >> } > > So it seems like the ->card_busy() callback is broken in for your mmc > host-driver and platform. Can you perhaps provide the information > about what HW/driver you are using? > > The point with using the ->card_busy() callback, is to avoid sending > the CMD13. Ideally it should be cheaper/faster and in most cases it > translates to a read of a register. For larger erases, we would > probably end up sending the CMD13 periodically every 32-64 ms, which > shouldn't be a problem. However, for smaller erases and discards, we > may want the benefit the ->card_busy() callback provides us. > > I would suggest that we first try to fix the implementation of the > ->card_busy() callback for your HW. If that isn't possible or fails, > then let's consider the approach you have taken in the $subject patch. Note, sdhci drivers can override host->ops. For example, sdhci-omap.c has: host->mmc_host_ops.card_busy = sdhci_omap_card_busy; Probably, if ->card_busy() cannot be supported, then setting it to NULL would work. host->mmc_host_ops.card_busy = NULL; /* Cannot detect card busy */