On 2024/6/26 11:56, Peter Wang (王信友) wrote: > On Tue, 2024-06-25 at 09:42 -0700, Bart Van Assche wrote: >> >> >> Please include a full root cause analysis when reposting fixes for >> the >> reported crashes. It is not clear to me how it is possible that an >> invalid pointer is passed to blk_mq_unique_tag() (0x194). As I >> mentioned >> in my previous email, freeing a request does not modify the request >> pointer and does not modify the SCSI command pointer either. As one >> can >> derive from the blk_mq_alloc_rqs() call stack, memory for struct >> request >> and struct scsi_cmnd is allocated at request queue allocation time >> and >> is not freed until the request queue is freed. Hence, for a given >> tag, >> neither the request pointer nor the SCSI command pointer changes as >> long >> as a request queue exists. Hence my request for an explanation how it >> is >> possible that an invalid pointer was passed to blk_mq_unique_tag(). >> >> Thanks, >> >> Bart. >> > > Hi Bart, > > Sorry I have not explain root-cause clearly. > I will add more clear root-cause analyze next version. > > And it is not an invalid pointer is passed to blk_mq_unique_tag(), > I means blk_mq_unique_tag function try access null pointer. > It is differnt and cause misunderstanding. > > The null pinter blk_mq_unique_tag try access is: > rq->mq_hctx(NULL)->queue_num. > Hi Peter, What is queue_num's offset of blk_mq_hw_ctx in your machine? gdb vmlinux (gdb) print /x (int)&((struct blk_mq_hw_ctx *)0)->queue_num $5 = 0x164 I read your descriptions and wondered a same race flow as you described following. But I found the offset mismatch, if the racing flow is correct, then the address accessed in blk_mq_unique_tag() should be 0x164, not 0x194. Maybe the offset is different between our machine? What's more, if the racing flow is correct, I did not get how your changes can address this racing flow. > The racing flow is: > > Thread A > ufshcd_err_handler step 1 > ufshcd_cmd_inflight(true) step 3 > ufshcd_mcq_req_to_hwq > blk_mq_unique_tag > rq->mq_hctx->queue_num step 5 > > Thread B > ufs_mtk_mcq_intr(cq complete ISR) step 2 > scsi_done > ... > __blk_mq_free_request > rq->mq_hctx = NULL; step 4 > > Thanks. > Peter > > > >