On 11/2/21 23:56, Adrian Hunter wrote:
> On 03/11/2021 02:05, Bart Van Assche wrote:
>> The following deadlock has been observed on a test setup:
>> * All tags allocated.
>> * The SCSI error handler calls ufshcd_eh_host_reset_handler()
>> * ufshcd_eh_host_reset_handler() queues work that calls ufshcd_err_handler()
>> * ufshcd_err_handler() locks up as follows:
>>
>> Workqueue: ufs_eh_wq_0 ufshcd_err_handler.cfi_jt
>> Call trace:
>>  __switch_to+0x298/0x5d8
>>  __schedule+0x6cc/0xa94
>>  schedule+0x12c/0x298
>>  blk_mq_get_tag+0x210/0x480
>>  __blk_mq_alloc_request+0x1c8/0x284
>>  blk_get_request+0x74/0x134
>>  ufshcd_exec_dev_cmd+0x68/0x640
>>  ufshcd_verify_dev_init+0x68/0x35c
>>  ufshcd_probe_hba+0x12c/0x1cb8
>>  ufshcd_host_reset_and_restore+0x88/0x254
>>  ufshcd_reset_and_restore+0xd0/0x354
>>  ufshcd_err_handler+0x408/0xc58
>>  process_one_work+0x24c/0x66c
>>  worker_thread+0x3e8/0xa4c
>>  kthread+0x150/0x1b4
>>  ret_from_fork+0x10/0x30
>>
>> Fix this lockup by making ufshcd_exec_dev_cmd() allocate a reserved
>> request.
>
> It is worth noting that the error handler itself could always find
> a free slot, either by waiting for one, or by taking the reset
> path which clears all slots.

I do not agree. As mentioned in the patch description, this patch fixes a
scenario in which ufshcd_eh_host_reset_handler() waits until
ufshcd_err_handler() finishes. ufshcd_err_handler() cannot finish because
no tags are available, and no tags will be freed because freeing them is
the responsibility of ufshcd_eh_host_reset_handler(), which is itself
blocked ...
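
To illustrate why a reserved request breaks this cycle, here is a minimal
user-space sketch. It is not the patch and not the blk-mq code: struct
tag_set, get_tag(), put_tag(), NR_TAGS and NR_RESERVED are names made up
for this example, and returning -1 stands in for the point where a real
allocator would sleep waiting for a tag. The only idea it is meant to show
is that a tag range which regular I/O can never consume stays available to
the error handler even when every other tag is in use.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sizes, chosen only for illustration. */
#define NR_TAGS     4	/* total number of tags */
#define NR_RESERVED 1	/* tags set aside for error handling */

/* Toy tag set; not the blk-mq data structure. */
struct tag_set {
	bool busy[NR_TAGS];
};

/*
 * Tags [0, NR_RESERVED) are reserved; regular I/O may only use
 * [NR_RESERVED, NR_TAGS). Returns the allocated tag, or -1 where a
 * real allocator would put the caller to sleep.
 */
int get_tag(struct tag_set *ts, bool reserved)
{
	int first = reserved ? 0 : NR_RESERVED;
	int last = reserved ? NR_RESERVED : NR_TAGS;

	for (int i = first; i < last; i++) {
		if (!ts->busy[i]) {
			ts->busy[i] = true;
			return i;
		}
	}
	return -1;
}

/* Release a previously allocated tag. */
void put_tag(struct tag_set *ts, int tag)
{
	ts->busy[tag] = false;
}

/*
 * Simulate the reported scenario: regular I/O holds every regular tag,
 * then the error handler asks for a reserved tag.
 */
int error_handler_tag_after_exhaustion(void)
{
	struct tag_set ts = { .busy = { false } };

	/* Regular I/O consumes every non-reserved tag. */
	while (get_tag(&ts, false) >= 0)
		;

	/* A further regular allocation would block here. */
	assert(get_tag(&ts, false) == -1);

	/* The reserved range is untouched, so this succeeds. */
	return get_tag(&ts, true);
}
```

With these assumptions error_handler_tag_after_exhaustion() returns a valid
tag although all regular tags are taken, whereas a non-reserved allocation
from the error handler would wait for a completion that only the error
handler itself can trigger.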

> For UFS-specific patch sets please always cc me on all patches
> in a series including the cover letter.

I will do that.
Thanks,
Bart.