On 03/11/2021 15:40, Bart Van Assche wrote:
> On 11/2/21 23:56, Adrian Hunter wrote:
>> On 03/11/2021 02:05, Bart Van Assche wrote:
>>> The following deadlock has been observed on a test setup:
>>> * All tags allocated.
>>> * The SCSI error handler calls ufshcd_eh_host_reset_handler()
>>> * ufshcd_eh_host_reset_handler() queues work that calls ufshcd_err_handler()
>>> * ufshcd_err_handler() locks up as follows:
>>>
>>> Workqueue: ufs_eh_wq_0 ufshcd_err_handler.cfi_jt
>>> Call trace:
>>>  __switch_to+0x298/0x5d8
>>>  __schedule+0x6cc/0xa94
>>>  schedule+0x12c/0x298
>>>  blk_mq_get_tag+0x210/0x480
>>>  __blk_mq_alloc_request+0x1c8/0x284
>>>  blk_get_request+0x74/0x134
>>>  ufshcd_exec_dev_cmd+0x68/0x640
>>>  ufshcd_verify_dev_init+0x68/0x35c
>>>  ufshcd_probe_hba+0x12c/0x1cb8
>>>  ufshcd_host_reset_and_restore+0x88/0x254
>>>  ufshcd_reset_and_restore+0xd0/0x354
>>>  ufshcd_err_handler+0x408/0xc58
>>>  process_one_work+0x24c/0x66c
>>>  worker_thread+0x3e8/0xa4c
>>>  kthread+0x150/0x1b4
>>>  ret_from_fork+0x10/0x30
>>>
>>> Fix this lockup by making ufshcd_exec_dev_cmd() allocate a reserved
>>> request.
>>
>> It is worth noting that the error handler itself could always find
>> a free slot, either by waiting for one, or by taking the reset
>> path which clears all slots.
>
> I do not agree. As mentioned in the patch description, this patch is a
> fix for a scenario in which ufshcd_eh_host_reset_handler() waits until
> ufshcd_err_handler() finishes. ufshcd_err_handler() does not finish
> since there are no tags and no tags will be freed since that is the
> responsibility of ufshcd_eh_host_reset_handler() but it is blocked ...

I am referring to the host controller slots, not block layer tags.
The error handler does not need a free tag, it only needs a free slot.
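
To make the "reserved request" idea from the patch description concrete,
here is a minimal userspace sketch. It is not the ufshcd or blk-mq code;
tag_pool, get_tag(), put_tag(), NR_TAGS and NR_RESERVED are hypothetical
names chosen for illustration only. It models block layer tags, not host
controller slots, and assumes a small range of tags is set aside that
ordinary I/O can never take, so a caller asking for a reserved tag does
not have to wait for ordinary I/O to complete.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_TAGS     4   /* hypothetical total number of tags */
#define NR_RESERVED 1   /* hypothetical number of tags set aside */

/* Toy tag pool: busy[i] is true while tag i is handed out. */
struct tag_pool {
	bool busy[NR_TAGS];
};

/*
 * Hand out a free tag, or return -1 if the caller's range is exhausted.
 * Reserved callers draw from [0, NR_RESERVED); ordinary callers draw
 * from [NR_RESERVED, NR_TAGS). The two ranges never overlap, so
 * exhausting the ordinary range cannot starve a reserved caller.
 */
static int get_tag(struct tag_pool *pool, bool reserved)
{
	int start = reserved ? 0 : NR_RESERVED;
	int end = reserved ? NR_RESERVED : NR_TAGS;

	for (int i = start; i < end; i++) {
		if (!pool->busy[i]) {
			pool->busy[i] = true;
			return i;
		}
	}
	return -1;
}

/* Return a previously allocated tag to the pool. */
static void put_tag(struct tag_pool *pool, int tag)
{
	assert(tag >= 0 && tag < NR_TAGS);
	assert(pool->busy[tag]);
	pool->busy[tag] = false;
}
```

In this toy model, once every ordinary tag is in use, get_tag(pool, false)
fails while get_tag(pool, true) still succeeds. A real allocator would
sleep instead of returning -1, which is where the trace above is stuck in
blk_mq_get_tag().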