On 2024/6/6 16:44, Friedrich Weber wrote:
> On 05/06/2024 16:27, Chengming Zhou wrote:
>> On 2024/6/5 21:34, Friedrich Weber wrote:
>>> On 05/06/2024 12:54, Friedrich Weber wrote:
>>> [...]
>>>
>>> My results:
>>>
>>> Booting the Debian (virtual) machine with mainline kernel v6.10-rc2
>>> (c3f38fa61af77b49866b006939479069cd451173):
>>> works fine, no crash
>>>
>>> Booting the Debian (virtual) machine with patch "block: fix
>>> request.queuelist usage in flush" applied on top of v6.10-rc2: The
>>> Debian (virtual) machine crashes during boot with [1].
>>>
>>> Hope this helps! If I can provide anything else, just let me know.
>>
>> Thanks for your help, I still can't reproduce it myself, don't know why.
>
> Weird -- when booting the Debian machine into mainline kernel v6.10-rc2
> with "block: fix request.queuelist usage in flush" applied on top, it
> crashes reliably for me. The machine having its root on LVM seems to be
> essential to reproduce the crash, though.

Yeah, right. It seems LVM may create a special request that has only
PREFLUSH | POSTFLUSH without any DATA, and that request goes into the
flush state machine. This causes a double list_add_tail() on the request
without a list_del_init() in between.

I don't know the reason behind such a request, but it is allowed by the
current flush code.

>
> Maybe the fact that I'm running the Debian machine virtualized makes the
> crash more likely to trigger. I'll try to reproduce on bare metal to
> narrow down the reproducer and get back to you.

Thanks a lot for your very detailed work on that thread!

>
>> Could you help to test with this diff?
>>
>> diff --git a/block/blk-flush.c b/block/blk-flush.c
>> index e7aebcf00714..cca4f9131f79 100644
>> --- a/block/blk-flush.c
>> +++ b/block/blk-flush.c
>> @@ -263,6 +263,7 @@ static enum rq_end_io_ret flush_end_io(struct request *flush_rq,
>>  		unsigned int seq = blk_flush_cur_seq(rq);
>>
>>  		BUG_ON(seq != REQ_FSEQ_PREFLUSH && seq != REQ_FSEQ_POSTFLUSH);
>> +		list_del_init(&rq->queuelist);
>>  		blk_flush_complete_seq(rq, fq, seq, error);
>>  	}
>
> I used mainline kernel v6.10-rc2 as base and applied:
>
> - "block: fix request.queuelist usage in flush"
> - Your `list_del_init` addition from above
>
> and if I boot the Debian machine into this kernel, I do not get the
> crash anymore.

Good to hear. So can I merge these two diffs into one patch and add your
Tested-by?

>
> Happy to run more tests for you, just let me know.

Thanks again!