On Tue, 9 Mar 2021 at 05:11, Kiwoong Kim <kwmad.kim@xxxxxxxxxxx> wrote: > > Dear All > > I want to discuss about this topic with you guys. > > There is an application to put many IO requests to a SD card right after completion of system resume. > Current MMC stack invokes mmc_rescan asynchronously for PM_POST_SUSPEND. > As reported to me, there could be a race between an IO thread and the kworker for mmc_rescan, > especially when the application mentioned before is installed and the function of mmc_rescan is run later than expected > For a series of IO requests, mmc_rescan, particularily at __mmc_claim_host called in mmc_sd_detect, didn't acquire a host for longer than expected. That's not a problem, in general. As long as the card is inserted and functional, the I/O operations should be completed successfully. It doesn't matter if mmc_rescan() is waiting to claim the host, as it's not important that it gets to run as long as the card remains inserted/functional. > > Below is the call stacks shown after the symptom happened and the system tried to enter into suspend again. > In this case, mmc pm notifier is called with PM_SUSPEND_PREPARE, so it waits for completion or cancelling the work for mmc_rescan. > For the latency, mobile users can see black screen for a long time sometimes, even with pushing a power button to wake up the system. I think I understand what you are saying, but please correct me if I am wrong. The I/O requests keep flowing into the blk queue even after PM_SUSPEND_PREPARE has been fired, thus preventing the earlier scheduled mmc_rescan() from claiming the host? This sounds quite reasonable that it could happen, at least theoretically. Although, I am a bit surprised that nobody has reported about this problem, until now. The design in the mmc core, has remained unchanged in regards to this behaviour, for a very very long time. Let me try to reproduce the problem, I will get back to you soon. In the meantime, I would also appreciate it if you could share, more exactly, how to trigger this problem at your side. > > If there is another way to fix it w/o any change, please let me know. > Or I want to post a patch not to do _mmc_detect_change conditionally > because it helps remove the race and the case assumes SD card initialization is already done. Unfortunately, it's not as easy as that, because we need to detect whether a removable card has been removed after the system has resumed. There are other ways to solve this, but allow me to think a little more about it and get back to you. > > -- > <6>[ 3965.259816] I[0: swapper/0: 0] 22631 0 227 3902403994884 D(2) 3 ffffffc0007ee338 0 ffffffc85678e000 * kworker/u16:4 [__mmc_claim_host] > <6>[ 3965.259839] I[0: swapper/0: 0] Workqueue: kmmcd mmc_rescan > <0>[ 3965.259854] I[0: swapper/0: 0] Call trace: > <4>[ 3965.259870] I[0: swapper/0: 0] [<ffffffc00013e674>] __switch_to+0x80/0x90 > <4>[ 3965.259886] I[0: swapper/0: 0] [<ffffffc000b137f4>] __schedule+0x530/0x76c > <4>[ 3965.259902] I[0: swapper/0: 0] [<ffffffc000b13a9c>] schedule+0x6c/0x7c > <4>[ 3965.259919] I[0: swapper/0: 0] [<ffffffc0007ee338>] __mmc_claim_host+0xb8/0x188 > <4>[ 3965.259937] I[0: swapper/0: 0] [<ffffffc0007eebfc>] mmc_get_card+0x24/0x38 > <4>[ 3965.259956] I[0: swapper/0: 0] [<ffffffc0007f5c44>] mmc_sd_detect+0x90/0xe8 > <4>[ 3965.259973] I[0: swapper/0: 0] [<ffffffc0007f08d8>] mmc_rescan+0xa0/0x2ec > <4>[ 3965.259991] I[0: swapper/0: 0] [<ffffffc0001688e0>] process_one_work+0x26c/0x434 > <4>[ 3965.260008] I[0: swapper/0: 0] [<ffffffc000168dc8>] worker_thread+0x320/0x444 > <4>[ 3965.260026] I[0: swapper/0: 0] [<ffffffc00016dac0>] kthread+0xdc/0xec > .. > <0>[ 3965.259854] I[0: swapper/0: 0] Call trace: > <4>[ 3965.259870] I[0: swapper/0: 0] [<ffffffc00013e674>] __switch_to+0x80/0x90 > <4>[ 3965.259886] I[0: swapper/0: 0] [<ffffffc000b137f4>] __schedule+0x530/0x76c > <4>[ 3965.259902] I[0: swapper/0: 0] [<ffffffc000b13a9c>] schedule+0x6c/0x7c > <4>[ 3965.259919] I[0: swapper/0: 0] [<ffffffc0007ee338>] __mmc_claim_host+0xb8/0x188 > <4>[ 3965.259937] I[0: swapper/0: 0] [<ffffffc0007eebfc>] mmc_get_card+0x24/0x38 > <4>[ 3965.259956] I[0: swapper/0: 0] [<ffffffc0007f5c44>] mmc_sd_detect+0x90/0xe8 > <4>[ 3965.259973] I[0: swapper/0: 0] [<ffffffc0007f08d8>] mmc_rescan+0xa0/0x2ec > <4>[ 3965.259991] I[0: swapper/0: 0] [<ffffffc0001688e0>] process_one_work+0x26c/0x434 > <4>[ 3965.260008] I[0: swapper/0: 0] [<ffffffc000168dc8>] worker_thread+0x320/0x444 > <4>[ 3965.260026] I[0: swapper/0: 0] [<ffffffc00016dac0>] kthread+0xdc/0xec > .. > <0>[ 3965.203379] I[0: swapper/0: 0] Call trace: > <4>[ 3965.203397] I[0: swapper/0: 0] [<ffffffc00013e674>] __switch_to+0x80/0x90 > <4>[ 3965.203414] I[0: swapper/0: 0] [<ffffffc000b137f4>] __schedule+0x530/0x76c > <4>[ 3965.203430] I[0: swapper/0: 0] [<ffffffc000b13a9c>] schedule+0x6c/0x7c > <4>[ 3965.203448] I[0: swapper/0: 0] [<ffffffc000b160e8>] schedule_timeout+0x28/0x214 > <4>[ 3965.203464] I[0: swapper/0: 0] [<ffffffc000b145a0>] wait_for_common+0x114/0x15c > <4>[ 3965.203482] I[0: swapper/0: 0] [<ffffffc000b14710>] wait_for_completion+0x10/0x20 > <4>[ 3965.203498] I[0: swapper/0: 0] [<ffffffc000169990>] flush_work+0xf0/0x11c > <4>[ 3965.203516] I[0: swapper/0: 0] [<ffffffc000169ad0>] __cancel_work_timer+0x114/0x1a0 > <4>[ 3965.203533] I[0: swapper/0: 0] [<ffffffc000169c44>] cancel_delayed_work_sync+0xc/0x1c > <4>[ 3965.203554] I[0: swapper/0: 0] [<ffffffc0007f1000>] mmc_pm_notify+0x78/0x12c > <4>[ 3965.203573] I[0: swapper/0: 0] [<ffffffc00016ebf4>] notifier_call_chain+0x7c/0xec > <4>[ 3965.203590] I[0: swapper/0: 0] [<ffffffc00016ef08>] __blocking_notifier_call_chain+0x44/0x68 > <4>[ 3965.203608] I[0: swapper/0: 0] [<ffffffc00016ef3c>] blocking_notifier_call_chain+0x10/0x20 > <4>[ 3965.203628] I[0: swapper/0: 0] [<ffffffc0001964ec>] pm_notifier_call_chain+0x1c/0x44 > <4>[ 3965.203644] I[0: swapper/0: 0] [<ffffffc000197ad0>] pm_suspend+0x2e0/0x5fc > <4>[ 3965.203662] I[0: swapper/0: 0] [<ffffffc0001961cc>] state_store+0x4c/0x8c > <4>[ 3965.203683] I[0: swapper/0: 0] [<ffffffc0003f35ec>] kobj_attr_store+0x14/0x2c > <4>[ 3965.203703] I[0: swapper/0: 0] [<ffffffc0002aac2c>] sysfs_kf_write+0x40/0x50 > <4>[ 3965.203721] I[0: swapper/0: 0] [<ffffffc0002aa000>] kernfs_fop_write+0xfc/0x148 > <4>[ 3965.203739] I[0: swapper/0: 0] [<ffffffc000249b4c>] vfs_write+0xb8/0x190 > <4>[ 3965.203755] I[0: swapper/0: 0] [<ffffffc000249cf0>] SyS_write+0x44/0x88 Kind regards Uffe