On 14/10/2021 07:14, Bart Van Assche wrote: > On 10/13/21 08:01, Adrian Hunter wrote: >> __ufshcd_issue_tm_cmd() clears req->end_io_data after timing out, >> which races with the completion function ufshcd_tmc_handler() which >> expects req->end_io_data to have a value. >> >> Note __ufshcd_issue_tm_cmd() and ufshcd_tmc_handler() are already >> synchronized using hba->tmf_rqs and hba->outstanding_tasks under the >> host_lock spinlock. >> >> It is also not necessary (nor typical) to clear req->end_io_data because >> the block layer does it before allocating out requests e.g. via >> blk_get_request(). >> >> So fix by not clearing it. >> >> Fixes: f5ef336fd2e4c3 ("scsi: ufs: core: Fix task management completion") >> Signed-off-by: Adrian Hunter <adrian.hunter@xxxxxxxxx> >> --- >> drivers/scsi/ufs/ufshcd.c | 5 ----- >> 1 file changed, 5 deletions(-) >> >> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c >> index 95be7ecdfe10..f34b3994d1aa 100644 >> --- a/drivers/scsi/ufs/ufshcd.c >> +++ b/drivers/scsi/ufs/ufshcd.c >> @@ -6550,11 +6550,6 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba, >> err = wait_for_completion_io_timeout(&wait, >> msecs_to_jiffies(TM_CMD_TIMEOUT)); >> if (!err) { >> - /* >> - * Make sure that ufshcd_compl_tm() does not trigger a >> - * use-after-free. >> - */ >> - req->end_io_data = NULL; >> ufshcd_add_tm_upiu_trace(hba, task_tag, UFS_TM_ERR); >> dev_err(hba->dev, "%s: task management cmd 0x%.2x timed-out\n", >> __func__, tm_function); > > With this patch applied ufshcd_tmc_handler() can trigger a use-after-free of the stack memory used for the 'wait' completion. AFAICT that would only happen because blk_mq_tagset_busy_iter() is not synchronized with respect to blk_put_request(). But ufshcd_tmc_handler() does not use blk_mq_tagset_busy_iter() anymore, so that can't happen. Wouldn't it be better to keep the code that clears req->end_io_data and to change complete(c) into if(c) complete(c) in ufshcd_tmc_handler()? If that were needed, it would imply the synchronization was broken i.e. why are we referencing a request that has already been through blk_put_request()?