Re: PROBLEM: double fault in md_end_io

Paweł Wiejacha <pawel.wiejacha@xxxxxxxxxxxx> · Tue, 4 May 2021 23:17:44 +0200

Guoqing's patch fixes the problem. Here's the actual patch I am using:

-static void bio_chain_endio(struct bio *bio)
+void bio_chain_endio(struct bio *bio)
 {
    bio_endio(__bio_chain_endio(bio));
 }
+EXPORT_SYMBOL(bio_chain_endio);

 /**
  * bio_chain - chain bio completions

diff --git drivers/md/md.c drivers/md/md.c
index 04384452a7ab..f157bd6e0478 100644
--- drivers/md/md.c
+++ drivers/md/md.c
@@ -507,7 +507,8 @@ static blk_qc_t md_submit_bio(struct bio *bio)
        return BLK_QC_T_NONE;
    }

-   if (bio->bi_end_io != md_end_io) {
+   if (bio->bi_end_io != md_end_io && bio->bi_end_io !=
+                bio_chain_endio) {
        struct md_io *md_io;

        md_io = mempool_alloc(&mddev->md_io_pool, GFP_NOIO);
diff --git include/linux/bio.h include/linux/bio.h
index 1edda614f7ce..bfb5bd0be397 100644
--- include/linux/bio.h
+++ include/linux/bio.h
@@ -427,6 +427,7 @@ static inline struct bio *bio_kmalloc(gfp_t gfp_mask, unsigned int nr_iovecs)
 extern blk_qc_t submit_bio(struct bio *);

 extern void bio_endio(struct bio *);
+extern void bio_chain_endio(struct bio *bio);
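
A rough userspace sketch of why the extra check seems to matter. This is
not kernel code: struct bio, struct md_io and the "remaining" counter are
simplified stand-ins, and md_hijack() and max_completion_depth() are
hypothetical names used only for illustration. The assumption (a reading of
the call trace, not something confirmed in this thread) is that bio_endio()
normally walks a bio_chain_endio chain in a loop, but once md replaces
bi_end_io with md_end_io on a chained bio, every link completes through
md_end_io -> bio_chain_endio -> bio_endio, one nesting level per link.

```c
/* Userspace model only: simplified stand-ins for the kernel structures. */
#include <assert.h>
#include <stdlib.h>

struct bio;
typedef void (*bio_end_io_t)(struct bio *);

struct bio {
    bio_end_io_t bi_end_io;
    void *bi_private;
    int remaining;              /* stands in for __bi_remaining */
};

struct md_io {                  /* what md saves when it hijacks bi_end_io */
    bio_end_io_t orig_bi_end_io;
    void *orig_bi_private;
};

static int depth, max_depth;    /* nesting of bio_endio() calls */

static void bio_endio(struct bio *bio);

static void bio_chain_endio(struct bio *bio)
{
    bio_endio(bio->bi_private); /* bi_private holds the parent bio */
}

static void md_end_io(struct bio *bio)
{
    struct md_io *md_io = bio->bi_private;

    /* Restore the saved callback, then call it. */
    bio->bi_end_io = md_io->orig_bi_end_io;
    bio->bi_private = md_io->orig_bi_private;
    free(md_io);
    if (bio->bi_end_io)
        bio->bi_end_io(bio);
}

static void bio_endio(struct bio *bio)
{
    if (++depth > max_depth)
        max_depth = depth;
again:
    if (--bio->remaining > 0)
        goto out;
    /* Plain chains are walked in a loop, without recursing. */
    if (bio->bi_end_io == bio_chain_endio) {
        bio = bio->bi_private;
        goto again;
    }
    if (bio->bi_end_io)
        bio->bi_end_io(bio);
out:
    depth--;
}

/* Hypothetical helper modelling the check in md_submit_bio(). */
static void md_hijack(struct bio *bio, int skip_chained)
{
    struct md_io *md_io;

    if (bio->bi_end_io == md_end_io)
        return;
    if (skip_chained && bio->bi_end_io == bio_chain_endio)
        return;                 /* the extra condition from the patch */
    md_io = malloc(sizeof(*md_io));
    assert(md_io);
    md_io->orig_bi_end_io = bio->bi_end_io;
    md_io->orig_bi_private = bio->bi_private;
    bio->bi_end_io = md_end_io;
    bio->bi_private = md_io;
}

/* Build a chain of n bios, complete it, return the deepest nesting seen. */
int max_completion_depth(int n, int skip_chained)
{
    struct bio *bios;
    int i;

    assert(n > 0);
    bios = calloc(n, sizeof(*bios));
    assert(bios);
    for (i = 0; i < n; i++)
        bios[i].remaining = 1;
    for (i = 0; i + 1 < n; i++) {   /* like bio_chain(&bios[i], &bios[i + 1]) */
        bios[i].bi_end_io = bio_chain_endio;
        bios[i].bi_private = &bios[i + 1];
        bios[i + 1].remaining++;
    }
    for (i = 0; i < n; i++)
        md_hijack(&bios[i], skip_chained);

    depth = max_depth = 0;
    for (i = n - 1; i >= 0; i--)    /* parents finish their own I/O first */
        bio_endio(&bios[i]);
    free(bios);
    return max_depth;
}
```

In this model, max_completion_depth(1000, 0) (the original check) returns
1000, while max_completion_depth(1000, 1) (with the extra bio_chain_endio
check) returns 1. The real kernel stack is much smaller than a userspace
one, so a long enough chain would be expected to overflow it.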

Thanks,
Paweł Wiejacha

On Fri, 23 Apr 2021 at 08:44, Song Liu <song@xxxxxxxxxx> wrote:
>
> On Thu, Apr 22, 2021 at 7:36 PM Guoqing Jiang <jgq516@xxxxxxxxx> wrote:
> >
> >
> >
> > On 2021/4/10 5:40 AM, Paweł Wiejacha wrote:
> > > Hello,
> > >
> > > Two of my machines constantly crash with a double fault like this:
> > >
> > > 1146  <0>[33685.629591] traps: PANIC: double fault, error_code: 0x0
> > > 1147  <4>[33685.629593] double fault: 0000 [#1] SMP NOPTI
> > > 1148  <4>[33685.629594] CPU: 10 PID: 2118287 Comm: kworker/10:0
> > > Tainted: P           OE     5.11.8-051108-generic #202103200636
> > > 1149  <4>[33685.629595] Hardware name: ASUSTeK COMPUTER INC. KRPG-U8
> > > Series/KRPG-U8 Series, BIOS 4201 09/25/2020
> > > 1150  <4>[33685.629595] Workqueue: xfs-conv/md12 xfs_end_io [xfs]
> > > 1151  <4>[33685.629596] RIP: 0010:__slab_free+0x23/0x340
> > > 1152  <4>[33685.629597] Code: 4c fe ff ff 0f 1f 00 0f 1f 44 00 00 55
> > > 48 89 e5 41 57 49 89 cf 41 56 49 89 fe 41 55 41 54 49 89 f4 53 48 83
> > > e4 f0 48 83 ec 70 <48> 89 54 24 28 0f 1f 44 00 00 41 8b 46 28 4d 8b 6c
> > > 24 20 49 8b 5c
> > > 1153  <4>[33685.629598] RSP: 0018:ffffa9bc00848fa0 EFLAGS: 00010086
> > > 1154  <4>[33685.629599] RAX: ffff94c04d8b10a0 RBX: ffff94437a34a880
> > > RCX: ffff94437a34a880
> > > 1155  <4>[33685.629599] RDX: ffff94437a34a880 RSI: ffffcec745e8d280
> > > RDI: ffff944300043b00
> > > 1156  <4>[33685.629599] RBP: ffffa9bc00849040 R08: 0000000000000001
> > > R09: ffffffff82a5d6de
> > > 1157  <4>[33685.629600] R10: 0000000000000001 R11: 000000009c109000
> > > R12: ffffcec745e8d280
> > > 1158  <4>[33685.629600] R13: ffff944300043b00 R14: ffff944300043b00
> > > R15: ffff94437a34a880
> > > 1159  <4>[33685.629601] FS:  0000000000000000(0000)
> > > GS:ffff94c04d880000(0000) knlGS:0000000000000000
> > > 1160  <4>[33685.629601] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > 1161  <4>[33685.629602] CR2: ffffa9bc00848f98 CR3: 000000014d04e000
> > > CR4: 0000000000350ee0
> > > 1162  <4>[33685.629602] Call Trace:
> > > 1163  <4>[33685.629603]  <IRQ>
> > > 1164  <4>[33685.629603]  ? kfree+0x3bc/0x3e0
> > > 1165  <4>[33685.629603]  ? mempool_kfree+0xe/0x10
> > > 1166  <4>[33685.629603]  ? mempool_kfree+0xe/0x10
> > > 1167  <4>[33685.629604]  ? mempool_free+0x2f/0x80
> > > 1168  <4>[33685.629604]  ? md_end_io+0x4a/0x70
> > > 1169  <4>[33685.629604]  ? bio_endio+0xdc/0x130
> > > 1170  <4>[33685.629605]  ? bio_chain_endio+0x2d/0x40
> > > 1171  <4>[33685.629605]  ? md_end_io+0x5c/0x70
> > > 1172  <4>[33685.629605]  ? bio_endio+0xdc/0x130
> > > 1173  <4>[33685.629605]  ? bio_chain_endio+0x2d/0x40
> > > 1174  <4>[33685.629606]  ? md_end_io+0x5c/0x70
> > > 1175  <4>[33685.629606]  ? bio_endio+0xdc/0x130
> > > 1176  <4>[33685.629606]  ? bio_chain_endio+0x2d/0x40
> > > 1177  <4>[33685.629607]  ? md_end_io+0x5c/0x70
> > > ... repeated ...
> > > 1436  <4>[33685.629677]  ? bio_endio+0xdc/0x130
> > > 1437  <4>[33685.629677]  ? bio_chain_endio+0x2d/0x40
> > > 1438  <4>[33685.629677]  ? md_end_io+0x5c/0x70
> > > 1439  <4>[33685.629677]  ? bio_endio+0xdc/0x130
> > > 1440  <4>[33685.629678]  ? bio_chain_endio+0x2d/0x40
> > > 1441  <4>[33685.629678]  ? md_
> > > 1442  <4>[33685.629679] Lost 357 message(s)!
> > >
> > > This happens on:
> > > 5.11.8-051108-generic #202103200636 SMP Sat Mar 20 11:17:32 UTC 2021
> > > and on 5.8.0-44-generic #50~20.04.1-Ubuntu
> > > (https://changelogs.ubuntu.com/changelogs/pool/main/l/linux/linux_5.8.0-44.50/changelog)
> > > which contains backported
> > > https://github.com/torvalds/linux/commit/41d2d848e5c09209bdb57ff9c0ca34075e22783d
> > > ("md: improve io stats accounting").
> > > 5.8.18-050818-generic #202011011237 SMP Sun Nov 1 12:40:15 UTC 2020,
> > > which does not contain the suspected change above, does not crash.
> > >
> > > If there's a better way/place to report this bug just let me know. If
> > > not, here are steps to reproduce:
> > >
> > > 1. Create a RAID 0 device using three Micron_9300_MTFDHAL7T6TDP disks.
> > > mdadm --create --verbose /dev/md12 --level=stripe --raid-devices=3
> > > /dev/nvme0n1p1 /dev/nvme1n1p1 /dev/nvme2n1p1
> > >
> > > 2. Set up XFS on it and mount it (at /mnt/md12 for step 3):
> > > mkfs.xfs /dev/md12
> > >
> > > 3. Write to a file on this filesystem:
> > > while true; do rm -rf /mnt/md12/crash* ; for i in `seq 8`; do dd
> > > if=/dev/zero of=/mnt/md12/crash$i bs=32K count=50000000 & done; wait;
> > > done
> > > Wait for a crash (usually less than 20 min).
> > >
> > > I couldn't reproduce it with a single dd process (maybe I have to wait
> > > a little longer), but a single cat
> > > /very/large/file/on/cephfs/over100GbE > /mnt/md12/crash is enough for
> > > this double fault to occur.
> >
> > I guess it is related to bio splitting: if raid0_make_request calls
> > bio_chain for the split bio, then its bi_end_io is changed to
> > bio_chain_endio. Could you try this?
> >
> > --- a/drivers/md/md.c
> > +++ b/drivers/md/md.c
> > @@ -489,7 +489,7 @@ static blk_qc_t md_submit_bio(struct bio *bio)
> >                  return BLK_QC_T_NONE;
> >          }
> >
> > -       if (bio->bi_end_io != md_end_io) {
> > +       if (bio->bi_end_io != md_end_io && bio->bi_end_io != bio_chain_endio) {
> >
> > If the above works, we could then miss the statistics for the split bio
> > from blk_queue_split, which is called before bi_end_io is hijacked, so we
> > may need to move blk_queue_split after the bi_end_io hijack.
>
> Thanks Guoqing! This is likely the problem here.
>
> Pawel, please give this a try.
>
> Thanks,
> Song