On 6/25/24 8:56 PM, Peter Wang (王信友) wrote:
Sorry, I did not explain the root cause clearly. I will add a clearer root-cause analysis in the next version. It is not that an invalid pointer is passed to blk_mq_unique_tag(); I mean that blk_mq_unique_tag() itself tries to access a NULL pointer. Those are different things, and my wording caused the misunderstanding. The NULL pointer that blk_mq_unique_tag() tries to access is: rq->mq_hctx(NULL)->queue_num.

The racing flow is:

Thread A
ufshcd_err_handler                       step 1
  ufshcd_cmd_inflight(true)              step 3
  ufshcd_mcq_req_to_hwq
    blk_mq_unique_tag
      rq->mq_hctx->queue_num             step 5

Thread B
ufs_mtk_mcq_intr(cq complete ISR)        step 2
  scsi_done
  ...
  __blk_mq_free_request
    rq->mq_hctx = NULL;                  step 4

How about surrounding the blk_mq_unique_tag() call with atomic_inc_not_zero(&req->ref) / atomic_dec(&req->ref)?

Thanks,

Bart.
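
To make the suggestion concrete, here is a minimal userspace sketch of the increment-if-not-zero pattern, written with C11 atomics rather than kernel APIs. Every name in it (fake_hctx, fake_req, fake_ref_inc_not_zero, fake_ref_put, fake_req_to_hwq) is a hypothetical stand-in and not something from the UFS driver or the block layer. It only shows the idea that the reader dereferences mq_hctx solely while it holds a reference it managed to take.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct blk_mq_hw_ctx and struct request. */
struct fake_hctx {
	unsigned int queue_num;
};

struct fake_req {
	atomic_int ref;            /* 0 means the request has been freed */
	struct fake_hctx *mq_hctx; /* cleared when the request is freed */
};

/* Take a reference unless the count has already dropped to zero. */
static bool fake_ref_inc_not_zero(struct fake_req *rq)
{
	int old = atomic_load(&rq->ref);

	while (old != 0) {
		/* On failure 'old' is reloaded and the zero check repeats. */
		if (atomic_compare_exchange_weak(&rq->ref, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; the last put clears mq_hctx, like the free path. */
static void fake_ref_put(struct fake_req *rq)
{
	int old = atomic_fetch_sub(&rq->ref, 1);

	assert(old > 0);
	if (old == 1)
		rq->mq_hctx = NULL;
}

/* Returns the queue number, or -1 if the request was already freed. */
static int fake_req_to_hwq(struct fake_req *rq)
{
	int qnum;

	if (!fake_ref_inc_not_zero(rq))
		return -1;
	qnum = (int)rq->mq_hctx->queue_num;
	fake_ref_put(rq);
	return qnum;
}
```

In this sketch the release side is a put that performs the teardown when it drops the last reference, instead of a plain decrement. That is an assumption of the sketch: if the completion path drops its own reference while the reader still holds one, the reader's release becomes the final one and somebody has to do the free.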