This is a note to let you know that I've just added the patch titled block: fix request.queuelist usage in flush to the 6.9-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: block-fix-request.queuelist-usage-in-flush.patch and it can be found in the queue-6.9 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable@xxxxxxxxxxxxxxx> know about it. commit 63f2509974d6a40e5d315f68c854a7d4e0808bad Author: Chengming Zhou <chengming.zhou@xxxxxxxxx> Date: Sat Jun 8 22:31:15 2024 +0800 block: fix request.queuelist usage in flush [ Upstream commit d0321c812d89c5910d8da8e4b10c891c6b96ff70 ] Friedrich Weber reported a kernel crash problem and bisected to commit 81ada09cc25e ("blk-flush: reuse rq queuelist in flush state machine"). The root cause is that we use "list_move_tail(&rq->queuelist, pending)" in the PREFLUSH/POSTFLUSH sequences. But rq->queuelist.next == xxx since it's popped out from plug->cached_rq in __blk_mq_alloc_requests_batch(). We don't initialize its queuelist just for this first request, although the queuelist of all later popped requests will be initialized. Fix it by changing to use "list_add_tail(&rq->queuelist, pending)" so rq->queuelist doesn't need to be initialized. It should be ok since rq can't be on any list when PREFLUSH or POSTFLUSH, has no move actually. Please note the commit 81ada09cc25e ("blk-flush: reuse rq queuelist in flush state machine") also has another requirement that no drivers would touch rq->queuelist after blk_mq_end_request() since we will reuse it to add rq to the post-flush pending list in POSTFLUSH. If this is not true, we will have to revert that commit IMHO. This updated version adds "list_del_init(&rq->queuelist)" in flush rq callback since the dm layer may submit request of a weird invalid format (REQ_FSEQ_PREFLUSH | REQ_FSEQ_POSTFLUSH), which causes double list_add if without this "list_del_init(&rq->queuelist)". The weird invalid format problem should be fixed in dm layer. Reported-by: Friedrich Weber <f.weber@xxxxxxxxxxx> Closes: https://lore.kernel.org/lkml/14b89dfb-505c-49f7-aebb-01c54451db40@xxxxxxxxxxx/ Closes: https://lore.kernel.org/lkml/c9d03ff7-27c5-4ebd-b3f6-5a90d96f35ba@xxxxxxxxxxx/ Fixes: 81ada09cc25e ("blk-flush: reuse rq queuelist in flush state machine") Cc: Christoph Hellwig <hch@xxxxxx> Cc: ming.lei@xxxxxxxxxx Cc: bvanassche@xxxxxxx Tested-by: Friedrich Weber <f.weber@xxxxxxxxxxx> Signed-off-by: Chengming Zhou <chengming.zhou@xxxxxxxxx> Reviewed-by: Christoph Hellwig <hch@xxxxxx> Link: https://lore.kernel.org/r/20240608143115.972486-1-chengming.zhou@xxxxxxxxx Signed-off-by: Jens Axboe <axboe@xxxxxxxxx> Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx> diff --git a/block/blk-flush.c b/block/blk-flush.c index b0f314f4bc149..9944414bf7eed 100644 --- a/block/blk-flush.c +++ b/block/blk-flush.c @@ -183,7 +183,7 @@ static void blk_flush_complete_seq(struct request *rq, /* queue for flush */ if (list_empty(pending)) fq->flush_pending_since = jiffies; - list_move_tail(&rq->queuelist, pending); + list_add_tail(&rq->queuelist, pending); break; case REQ_FSEQ_DATA: @@ -261,6 +261,7 @@ static enum rq_end_io_ret flush_end_io(struct request *flush_rq, unsigned int seq = blk_flush_cur_seq(rq); BUG_ON(seq != REQ_FSEQ_PREFLUSH && seq != REQ_FSEQ_POSTFLUSH); + list_del_init(&rq->queuelist); blk_flush_complete_seq(rq, fq, seq, error); }