From: Guoqing Jiang <jiangguoqing@xxxxxxxxxx>

md now uses the generic io accounting functions to manage its io stats.
As part of that, md also changes the bi_private and bi_end_io of a split
bio, which can trigger a double fault [1].

1146 <0>[33685.629591] traps: PANIC: double fault, error_code: 0x0
1147 <4>[33685.629593] double fault: 0000 [#1] SMP NOPTI
1148 <4>[33685.629594] CPU: 10 PID: 2118287 Comm: kworker/10:0 Tainted: P OE 5.11.8-051108-generic #202103200636
1149 <4>[33685.629595] Hardware name: ASUSTeK COMPUTER INC. KRPG-U8 Series/KRPG-U8 Series, BIOS 4201 09/25/2020
1150 <4>[33685.629595] Workqueue: xfs-conv/md12 xfs_end_io [xfs]
1151 <4>[33685.629596] RIP: 0010:__slab_free+0x23/0x340
1152 <4>[33685.629597] Code: 4c fe ff ff 0f 1f 00 0f 1f 44 00 00 55 48 89 e5 41 57 49 89 cf 41 56 49 89 fe 41 55 41 54 49 89 f4 53 48 83 e4 f0 48 83 ec 70 <48> 89 54 24 28 0f 1f 44 00 00 41 8b 46 28 4d 8b 6c 24 20 49 8b 5c
1153 <4>[33685.629598] RSP: 0018:ffffa9bc00848fa0 EFLAGS: 00010086
1154 <4>[33685.629599] RAX: ffff94c04d8b10a0 RBX: ffff94437a34a880 RCX: ffff94437a34a880
1155 <4>[33685.629599] RDX: ffff94437a34a880 RSI: ffffcec745e8d280 RDI: ffff944300043b00
1156 <4>[33685.629599] RBP: ffffa9bc00849040 R08: 0000000000000001 R09: ffffffff82a5d6de
1157 <4>[33685.629600] R10: 0000000000000001 R11: 000000009c109000 R12: ffffcec745e8d280
1158 <4>[33685.629600] R13: ffff944300043b00 R14: ffff944300043b00 R15: ffff94437a34a880
1159 <4>[33685.629601] FS: 0000000000000000(0000) GS:ffff94c04d880000(0000) knlGS:0000000000000000
1160 <4>[33685.629601] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
1161 <4>[33685.629602] CR2: ffffa9bc00848f98 CR3: 000000014d04e000 CR4: 0000000000350ee0
1162 <4>[33685.629602] Call Trace:
1163 <4>[33685.629603] <IRQ>
1164 <4>[33685.629603] ? kfree+0x3bc/0x3e0
1165 <4>[33685.629603] ? mempool_kfree+0xe/0x10
1166 <4>[33685.629603] ? mempool_kfree+0xe/0x10
1167 <4>[33685.629604] ? mempool_free+0x2f/0x80
1168 <4>[33685.629604] ? md_end_io+0x4a/0x70
1169 <4>[33685.629604] ? bio_endio+0xdc/0x130
1170 <4>[33685.629605] ? bio_chain_endio+0x2d/0x40
1171 <4>[33685.629605] ? md_end_io+0x5c/0x70
1172 <4>[33685.629605] ? bio_endio+0xdc/0x130
1173 <4>[33685.629605] ? bio_chain_endio+0x2d/0x40
1174 <4>[33685.629606] ? md_end_io+0x5c/0x70
1175 <4>[33685.629606] ? bio_endio+0xdc/0x130
1176 <4>[33685.629606] ? bio_chain_endio+0x2d/0x40
1177 <4>[33685.629607] ? md_end_io+0x5c/0x70
... repeated ...
1436 <4>[33685.629677] ? bio_endio+0xdc/0x130
1437 <4>[33685.629677] ? bio_chain_endio+0x2d/0x40
1438 <4>[33685.629677] ? md_end_io+0x5c/0x70
1439 <4>[33685.629677] ? bio_endio+0xdc/0x130
1440 <4>[33685.629678] ? bio_chain_endio+0x2d/0x40
1441 <4>[33685.629678] ? md_
1442 <4>[33685.629679] Lost 357 message(s)!

It looks like a stack overflow happened for the split bio. To fix this,
leave split bios untouched in md_submit_bio. As a side effect,
bio_chain_endio needs to be exported.

[1]. https://lore.kernel.org/linux-raid/3bf04253-3fad-434a-63a7-20214e38cf26@xxxxxxxxx/T/#t

Reported-by: Paweł Wiejacha <pawel.wiejacha@xxxxxxxxxxxx>
Tested-by: Paweł Wiejacha <pawel.wiejacha@xxxxxxxxxxxx>
Fixes: 41d2d848e5c0 ("md: improve io stats accounting")
Signed-off-by: Guoqing Jiang <jiangguoqing@xxxxxxxxxx>
---
 block/bio.c         | 3 ++-
 drivers/md/md.c     | 2 +-
 include/linux/bio.h | 1 +
 3 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 44205dfb6b60..759da1f6ab61 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -283,10 +283,11 @@ static struct bio *__bio_chain_endio(struct bio *bio)
 	return parent;
 }
 
-static void bio_chain_endio(struct bio *bio)
+void bio_chain_endio(struct bio *bio)
 {
 	bio_endio(__bio_chain_endio(bio));
 }
+EXPORT_SYMBOL(bio_chain_endio);
 
 /**
  * bio_chain - chain bio completions
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 49f897fbb89b..02fd272ff6f7 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -489,7 +489,7 @@ static blk_qc_t md_submit_bio(struct bio *bio)
 		return BLK_QC_T_NONE;
 	}
 
-	if (bio->bi_end_io != md_end_io) {
+	if (bio->bi_end_io != md_end_io && bio->bi_end_io != bio_chain_endio) {
 		struct md_io *md_io;
 
 		md_io = mempool_alloc(&mddev->md_io_pool, GFP_NOIO);
diff --git a/include/linux/bio.h b/include/linux/bio.h
index a0b4cfdf62a4..6ea48fa1ad64 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -465,6 +465,7 @@ extern void bio_init(struct bio *bio, struct bio_vec *table,
 extern void bio_uninit(struct bio *);
 extern void bio_reset(struct bio *);
 void bio_chain(struct bio *, struct bio *);
+extern void bio_chain_endio(struct bio *bio);
 
 extern int bio_add_page(struct bio *, struct page *, unsigned int,unsigned int);
 extern int bio_add_pc_page(struct request_queue *, struct bio *, struct page *,
-- 
2.25.1