On 07/06/2024 04:37, Chengming Zhou wrote: > On 2024/6/6 16:44, Friedrich Weber wrote: >> [...] >> >> I used mainline kernel v6.10-rc2 as base and applied: >> >> - "block: fix request.queuelist usage in flush" >> - Your `list_del_init` addition from above >> >> and if I boot the Debian machine into this kernel, I do not get the >> crash anymore. > > Good to hear. So can I merge these two diffs into one patch and add > your Tested-by? I applied your merged patch [1] on top of mainline v6.10-rc2 (c3f38fa61a): - I cannot reproduce the crash from [0] anymore in the (virtual) machine with root (on LVM) on software RAID1 - I cannot reproduce the `blk_mq_requeue_work` crash from this thread anymore in the Debian VM with root on LVM. With cache=writeback for the VM disk, I get the expected in-guest WARNING [2] on VM boot. - No crashes on bare-metal either. If write caching is enabled, I get the expected WARNING. So this looks good to me: Tested-by: Friedrich Weber <f.weber@xxxxxxxxxxx> We might backport the merged patch (minus the WARN, possibly) to our downstream Proxmox VE kernel 6.9 to fix the software RAID crash [0] -- if I understand correctly, the merged patch should be safe for now until dm is fixed. Thanks a lot for your work on this! Best, Friedrich [0] https://lore.kernel.org/all/14b89dfb-505c-49f7-aebb-01c54451db40@xxxxxxxxxxx/ [1] diff --git a/block/blk-flush.c b/block/blk-flush.c index c17cf8ed8113..3d72393a1710 100644 --- a/block/blk-flush.c +++ b/block/blk-flush.c @@ -185,7 +185,7 @@ static void blk_flush_complete_seq(struct request *rq, /* queue for flush */ if (list_empty(pending)) fq->flush_pending_since = jiffies; - list_move_tail(&rq->queuelist, pending); + list_add_tail(&rq->queuelist, pending); break; case REQ_FSEQ_DATA: @@ -263,6 +263,7 @@ static enum rq_end_io_ret flush_end_io(struct request *flush_rq, unsigned int seq = blk_flush_cur_seq(rq); BUG_ON(seq != REQ_FSEQ_PREFLUSH && seq != REQ_FSEQ_POSTFLUSH); + list_del_init(&rq->queuelist); blk_flush_complete_seq(rq, fq, seq, error); } @@ -402,6 +403,12 @@ bool blk_insert_flush(struct request *rq) unsigned int policy = blk_flush_policy(fflags, rq); struct blk_flush_queue *fq = blk_get_flush_queue(q, rq->mq_ctx); + /* + * PREFLUSH | POSTFLUSH is a weird invalid format, + * need to fix in the upper layer, catch it here. + */ + WARN_ON_ONCE(policy == (REQ_FSEQ_PREFLUSH | REQ_FSEQ_POSTFLUSH)); + /* FLUSH/FUA request must never be merged */ WARN_ON_ONCE(rq->bio != rq->biotail); [2] [ 2.142204] ------------[ cut here ]------------ [ 2.142206] WARNING: CPU: 0 PID: 179 at block/blk-flush.c:410 blk_insert_flush+0xff/0x270 [ 2.142211] Modules linked in: efi_pstore(E) dmi_sysfs(E) qemu_fw_cfg(E) ip_tables(E) x_tables(E) autofs4(E) hid_generic(E) usbhid(E) hid(E) psmouse(E) bochs(E) drm_vram_helper(E) drm_ttm_helper(E) ahci(E) ttm(E) i2c_piix4(E) uhci_hcd(E) ehci_hcd(E) libahci(E) pata_acpi(E) floppy(E) [ 2.142225] CPU: 0 PID: 179 Comm: jbd2/dm-0-8 Tainted: G E 6.10.0-rc2-nohardened-patch0607+ #41 [ 2.142226] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 [ 2.142227] RIP: 0010:blk_insert_flush+0xff/0x270 [ 2.142229] Code: cc cc cc cc a9 00 00 04 00 74 3d 44 89 e2 83 ca 01 4d 85 c0 75 69 a9 00 00 02 00 0f 84 15 01 00 00 45 85 e4 0f 85 59 01 00 00 <0f> 0b 48 39 ce 0f 85 44 01 00 00 25 ff ff f9 ff 41 bc 05 00 00 00 [ 2.142230] RSP: 0018:ffffa608c0303a30 EFLAGS: 00010246 [ 2.142231] RAX: 0000000000069801 RBX: ffff93b70dc89600 RCX: ffff93b70dd7baf8 [ 2.142233] RDX: 0000000000000001 RSI: ffff93b70dd7baf8 RDI: ffff93b70d545e00 [ 2.142233] RBP: ffffa608c0303a48 R08: 0000000000000000 R09: 0000000000000000 [ 2.142234] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ 2.142235] R13: ffff93b70d127980 R14: 0000000000000000 R15: ffff93b70dc89600 [ 2.142236] FS: 0000000000000000(0000) GS:ffff93b837c00000(0000) knlGS:0000000000000000 [ 2.142237] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2.142238] CR2: 00005d28ed347fb8 CR3: 000000010d62e000 CR4: 00000000000006f0 [ 2.142240] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 2.142240] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 2.142241] Call Trace: [ 2.142244] <TASK> [ 2.142248] ? show_regs+0x6c/0x80 [ 2.142251] ? __warn+0x88/0x140 [ 2.142253] ? blk_insert_flush+0xff/0x270 [ 2.142254] ? report_bug+0x182/0x1b0 [ 2.142256] ? handle_bug+0x46/0x90 [ 2.142258] ? exc_invalid_op+0x18/0x80 [ 2.142259] ? asm_exc_invalid_op+0x1b/0x20 [ 2.142261] ? blk_insert_flush+0xff/0x270 [ 2.142262] blk_mq_submit_bio+0x5c9/0x740 [ 2.142265] __submit_bio+0x6e/0x250 [ 2.142267] submit_bio_noacct_nocheck+0x1a3/0x3c0 [ 2.142269] submit_bio_noacct+0x1dc/0x650 [ 2.142271] submit_bio+0xb1/0x110 [ 2.142272] submit_bh_wbc+0x163/0x1a0 [ 2.142274] submit_bh+0x12/0x20 [ 2.142275] journal_submit_commit_record+0x1c5/0x250 [ 2.142278] jbd2_journal_commit_transaction+0x120d/0x1960 [ 2.142281] ? __schedule+0x408/0x15d0 [ 2.142284] kjournald2+0xaa/0x280 [ 2.142285] ? __pfx_autoremove_wake_function+0x10/0x10 [ 2.142288] ? __pfx_kjournald2+0x10/0x10 [ 2.142289] kthread+0xe4/0x110 [ 2.142291] ? __pfx_kthread+0x10/0x10 [ 2.142292] ret_from_fork+0x47/0x70 [ 2.142294] ? __pfx_kthread+0x10/0x10 [ 2.142295] ret_from_fork_asm+0x1a/0x30 [ 2.142298] </TASK> [ 2.142298] ---[ end trace 0000000000000000 ]---