Hi Kuai

On Thu, Nov 14, 2024 at 3:35 AM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
>
> Hi,
>
> On 2024/11/14 0:46, Haris Iqbal wrote:
> > On Wed, Nov 13, 2024 at 8:46 AM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
> >>
> >> Hi,
> >>
> >> On 2024/11/11 21:56, Haris Iqbal wrote:
> >>> On Mon, Nov 11, 2024 at 2:39 PM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> On 2024/11/11 21:29, Haris Iqbal wrote:
> >>>>> Hello,
> >>>>>
> >>>>> I gave both the patches a try, and here are my findings.
> >>>>>
> >>>>
> >>>> Thanks for the test!
> >>>>
> >>>>> With the first patch by Yu, I did not see any hang or errors. I tried
> >>>>> a number of bitmap chunk sizes, and ran fio for a few hours, and there
> >>>>> was no hang.
> >>>>
> >>>> This is good news! However, there is still a long road for my approach
> >>>> to land; this requires a lot of other changes to work.
> >>>>>
> >>>>> With the second patch by Xiao, I hit the following BUG_ON in the first
> >>>>> minute of my fio run.
> >>>>
> >>>> This is sad. :(
> >>>>>
> >>>>> [ 113.902982] Oops: invalid opcode: 0000 [#1] PREEMPT SMP PTI
> >>>>> [ 113.903315] CPU: 38 UID: 0 PID: 9767 Comm: kworker/38:3H Kdump:
> >>>>> loaded Not tainted 6.11.5-storage
> >>>>> #6.11.5-1+feature+v6.11+20241111.0643+cbe84cc3~deb12
> >>>>> [ 113.904120] Hardware name: Supermicro X10DRi/X10DRi, BIOS 3.3 03/03/2021
> >>>>> [ 113.904519] Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
> >>>>> [ 113.904888] RIP: 0010:__add_stripe_bio+0x23f/0x250 [raid456]
> >>>>
> >>>> Can you provide the addr2line of this?
> >>>>
> >>>> gdb raid456.ko
> >>>> list *(__add_stripe_bio+0x23f)
> >>>
> >>> Sorry. I missed the first line while copying.
> >>>
> >>> [ 113.902680] kernel BUG at drivers/md/raid5.c:3525!
> >>
> >> Can you give the following patch a test again, on top of Xiao's
> >> patch.
> >>
> >> Thanks,
> >> Kuai
> >>
> >> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> >> index 6e318598a7b6..189f784aed00 100644
> >> --- a/drivers/md/raid5.c
> >> +++ b/drivers/md/raid5.c
> >> @@ -3516,7 +3516,7 @@ static void __add_stripe_bio(struct stripe_head
> >> *sh, struct bio *bi,
> >>                 bip = &sh->dev[dd_idx].toread;
> >>         }
> >>
> >> -       while (*bip && (*bip)->bi_iter.bi_sector < bi->bi_iter.bi_sector)
> >> +       while (*bip && (*bip)->bi_iter.bi_sector <= bi->bi_iter.bi_sector)
> >>                 bip = &(*bip)->bi_next;
> >>
> >>         if (!forwrite || previous)
> >
> > Still hangs. Following is the stack trace.
> >
> > [ 22.702034] netconsole-setup: Test log message to verify netconsole
> > configuration.
> > [ 134.949923] Oops: general protection fault, probably for
> > non-canonical address 0x761acac3b7d57b17: 0000 [#1] PREEMPT SMP PTI
> > [ 134.950621] CPU: 35 UID: 0 PID: 833 Comm: md300_raid5 Kdump: loaded
> > Not tainted 6.11.5-storage
> > #6.11.5-1+feature+v6.11+20241113.0858+ed8e31b5~deb12
> > [ 134.951414] Hardware name: Supermicro X10DRi/X10DRi, BIOS 3.3 03/03/2021
> > [ 134.951814] RIP: 0010:rnbd_dev_bi_end_io+0x1b/0x70 [rnbd_server]
> > [ 134.952185] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3
> > 0f 1e fa 0f 1f 44 00 00 55 b8 ff ff ff ff 53 48 8b 6f 40 48 89 fb 48
> > 8b 55 08 <f0
> > [ 134.953311] RSP: 0018:ffffb5b94818fb80 EFLAGS: 00010282
> > [ 134.953624] RAX: 00000000ffffffff RBX: ffff96e6a1d8aa80 RCX: 00000000802a0016
> > [ 134.954051] RDX: 761acac3b7d57aa7 RSI: 00000000802a0016 RDI: ffff96e6a1d8aa80
> > [ 134.954476] RBP: ffff96d705c7d8b0 R08: 0000000000000001 R09: 0000000000000001
> > [ 134.954901] R10: ffff96d730c59d40 R11: 0000000000000000 R12: ffff96d71b3e5000
> > [ 134.955326] R13: 0000000000000000 R14: ffff96d730c589d8 R15: ffff96d715882e20
> > [ 134.955752] FS: 0000000000000000(0000) GS:ffff96f63fbc0000(0000)
> > knlGS:0000000000000000
> > [ 134.956237] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 134.956578] CR2: 00007fb5962bbe00 CR3: 000000060882c006 CR4: 00000000001706f0
> > [ 134.957003] Call Trace:
> > [ 134.957151] <TASK>
> > [ 134.957274] ? die_addr+0x36/0x90
> > [ 134.957480] ? exc_general_protection+0x1bc/0x3c0
> > [ 134.957762] ? asm_exc_general_protection+0x26/0x30
> > [ 134.958054] ? rnbd_dev_bi_end_io+0x1b/0x70 [rnbd_server]
> > [ 134.958377] md_end_clone_io+0x42/0xa0
> > [ 134.958602] md_end_clone_io+0x42/0xa0
> > [ 134.958826] handle_stripe_clean_event+0x240/0x430 [raid456]
> > [ 134.959168] handle_stripe+0x783/0x1cb0 [raid456]
> > [ 134.959452] ? common_interrupt+0x13/0xa0
> > [ 134.959690] handle_active_stripes.constprop.0+0x353/0x540 [raid456]
> > [ 134.960073] raid5d+0x41a/0x600 [raid456]
> >
> > Maybe the same BIO was handled twice - and so the clone (for IO-acct) got
> > put again (somehow) into md_account_bio()?
>
> I think the last change is reasonable, the BUG_ON() can be avoided and
> bio chain won't be messed up.
>
Yes, it seems we are making progress, thx!
> The problem here looks like the bio reference is not correct. I'll need
> some time to sort that out; it is too complicated in raid5.
>
> Meanwhile, can you try the following workaround? I just reverted the
> changes that I think introduced this problem; note that performance can
> be degraded.
Do you want us to try the following change alone on top of the
md/md-6.13 branch, without Xiao's patch and your fixup, or with all of
them combined?

BTW: we have been hitting a similar hang since kernel 4.19.

Thx!
>
> Thanks,
> Kuai
>
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index f09e7677ee9f..07aa453bdb2f 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -5874,17 +5874,6 @@ static int add_all_stripe_bios(struct r5conf *conf,
>                         wait_on_bit(&dev->flags, R5_Overlap,
> TASK_UNINTERRUPTIBLE);
>                         return 0;
>                 }
> -       }
> -
> -       for (dd_idx = 0; dd_idx < sh->disks; dd_idx++) {
> -               struct r5dev *dev = &sh->dev[dd_idx];
> -
> -               if (dd_idx == sh->pd_idx || dd_idx == sh->qd_idx)
> -                       continue;
> -
> -               if (dev->sector < ctx->first_sector ||
> -                   dev->sector >= ctx->last_sector)
> -                       continue;
>
>                 __add_stripe_bio(sh, bi, dd_idx, forwrite, previous);
>                 clear_bit((dev->sector - ctx->first_sector) >>
>
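
For reference, the `<` to `<=` change to __add_stripe_bio() quoted earlier in this thread affects where a bio lands in the sorted per-device list when its start sector equals one already queued. The following is only a minimal userspace sketch of that walk, not kernel code: `struct node`, `sorted_insert()` and the `skip_equal` flag are hypothetical names made up for illustration, and the sketch says nothing about the bio reference counting that is still under investigation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a bio: only the start sector and the link. */
struct node {
	unsigned long long sector;
	struct node *next;
};

/*
 * Insert 'item' into the list at *head, kept sorted by ascending sector.
 *
 * skip_equal == false models the '<' comparison: the walk stops at the
 * first node whose sector is >= item->sector, so 'item' is linked in
 * front of any node with an equal sector.
 *
 * skip_equal == true models the '<=' comparison: the walk also steps
 * past nodes with an equal sector, so 'item' is linked behind them.
 */
void sorted_insert(struct node **head, struct node *item, bool skip_equal)
{
	struct node **pos = head;

	assert(item->next == NULL);	/* item must not already be linked */

	while (*pos && ((*pos)->sector < item->sector ||
			(skip_equal && (*pos)->sector == item->sector)))
		pos = &(*pos)->next;

	item->next = *pos;
	*pos = item;
}
```

With nodes at sectors 10 and 20 already queued, inserting another node at sector 20 with `skip_equal` false places it between the two, while `skip_equal` true places it at the tail, behind the existing sector-20 node.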