Re: Experiencing md raid5 hang and CPU lockup on kernel v6.11

Hi Kuai

On Thu, Nov 14, 2024 at 3:35 AM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
>
> Hi,
>
> > On 2024/11/14 0:46, Haris Iqbal wrote:
> > On Wed, Nov 13, 2024 at 8:46 AM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
> >>
> >> Hi,
> >>
> >> On 2024/11/11 21:56, Haris Iqbal wrote:
> >>> On Mon, Nov 11, 2024 at 2:39 PM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> On 2024/11/11 21:29, Haris Iqbal wrote:
> >>>>> Hello,
> >>>>>
> >>>>> I gave both the patches a try, and here are my findings.
> >>>>>
> >>>>
> >>>> Thanks for the test!
> >>>>
> >>>>> With the first patch, by Yu, I did not see any hangs or errors. I
> >>>>> tried a number of bitmap chunk sizes and ran fio for a few hours,
> >>>>> and there was no hang.
> >>>>
> >>>> This is good news! However, there is still a long road for my approach
> >>>> to land; it requires a lot of other changes to work.
> >>>>>
> >>>>> With the second patch, by Xiao, I hit the following BUG_ON() in the
> >>>>> first minute of my fio run.
> >>>>
> >>>> This is sad. :(
> >>>>>
> >>>>> [  113.902982] Oops: invalid opcode: 0000 [#1] PREEMPT SMP PTI
> >>>>> [  113.903315] CPU: 38 UID: 0 PID: 9767 Comm: kworker/38:3H Kdump:
> >>>>> loaded Not tainted 6.11.5-storage
> >>>>> #6.11.5-1+feature+v6.11+20241111.0643+cbe84cc3~deb12
> >>>>> [  113.904120] Hardware name: Supermicro X10DRi/X10DRi, BIOS 3.3 03/03/2021
> >>>>> [  113.904519] Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
> >>>>> [  113.904888] RIP: 0010:__add_stripe_bio+0x23f/0x250 [raid456]
> >>>>
> >>>> Can you provide the addr2line output for this?
> >>>>
> >>>> gdb raid456.ko
> >>>> list *(__add_stripe_bio+0x23f)
> >>>
> >>> Sorry. I missed the first line while copying.
> >>>
> >>> [  113.902680] kernel BUG at drivers/md/raid5.c:3525!
> >>
> >> Can you give the following patch a test again, on top of Xiao's
> >> patch?
> >>
> >> Thanks,
> >> Kuai
> >>
> >> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> >> index 6e318598a7b6..189f784aed00 100644
> >> --- a/drivers/md/raid5.c
> >> +++ b/drivers/md/raid5.c
> >> @@ -3516,7 +3516,7 @@ static void __add_stripe_bio(struct stripe_head
> >> *sh, struct bio *bi,
> >>                   bip = &sh->dev[dd_idx].toread;
> >>           }
> >>
> >> -       while (*bip && (*bip)->bi_iter.bi_sector < bi->bi_iter.bi_sector)
> >> +       while (*bip && (*bip)->bi_iter.bi_sector <= bi->bi_iter.bi_sector)
> >>                   bip = &(*bip)->bi_next;
> >>
> >>           if (!forwrite || previous)
> >
> > Still hangs. Following is the stack trace.
> >
> > [   22.702034] netconsole-setup: Test log message to verify netconsole
> > configuration.
> > [  134.949923] Oops: general protection fault, probably for
> > non-canonical address 0x761acac3b7d57b17: 0000 [#1] PREEMPT SMP PTI
> > [  134.950621] CPU: 35 UID: 0 PID: 833 Comm: md300_raid5 Kdump: loaded
> > Not tainted 6.11.5-storage
> > #6.11.5-1+feature+v6.11+20241113.0858+ed8e31b5~deb12
> > [  134.951414] Hardware name: Supermicro X10DRi/X10DRi, BIOS 3.3 03/03/2021
> > [  134.951814] RIP: 0010:rnbd_dev_bi_end_io+0x1b/0x70 [rnbd_server]
> > [  134.952185] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3
> > 0f 1e fa 0f 1f 44 00 00 55 b8 ff ff ff ff 53 48 8b 6f 40 48 89 fb 48
> > 8b 55 08 <f0
> > [  134.953311] RSP: 0018:ffffb5b94818fb80 EFLAGS: 00010282
> > [  134.953624] RAX: 00000000ffffffff RBX: ffff96e6a1d8aa80 RCX: 00000000802a0016
> > [  134.954051] RDX: 761acac3b7d57aa7 RSI: 00000000802a0016 RDI: ffff96e6a1d8aa80
> > [  134.954476] RBP: ffff96d705c7d8b0 R08: 0000000000000001 R09: 0000000000000001
> > [  134.954901] R10: ffff96d730c59d40 R11: 0000000000000000 R12: ffff96d71b3e5000
> > [  134.955326] R13: 0000000000000000 R14: ffff96d730c589d8 R15: ffff96d715882e20
> > [  134.955752] FS:  0000000000000000(0000) GS:ffff96f63fbc0000(0000)
> > knlGS:0000000000000000
> > [  134.956237] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  134.956578] CR2: 00007fb5962bbe00 CR3: 000000060882c006 CR4: 00000000001706f0
> > [  134.957003] Call Trace:
> > [  134.957151]  <TASK>
> > [  134.957274]  ? die_addr+0x36/0x90
> > [  134.957480]  ? exc_general_protection+0x1bc/0x3c0
> > [  134.957762]  ? asm_exc_general_protection+0x26/0x30
> > [  134.958054]  ? rnbd_dev_bi_end_io+0x1b/0x70 [rnbd_server]
> > [  134.958377]  md_end_clone_io+0x42/0xa0
> > [  134.958602]  md_end_clone_io+0x42/0xa0
> > [  134.958826]  handle_stripe_clean_event+0x240/0x430 [raid456]
> > [  134.959168]  handle_stripe+0x783/0x1cb0 [raid456]
> > [  134.959452]  ? common_interrupt+0x13/0xa0
> > [  134.959690]  handle_active_stripes.constprop.0+0x353/0x540 [raid456]
> > [  134.960073]  raid5d+0x41a/0x600 [raid456]
> >
> > Maybe the same bio was handled twice, so the clone (created for IO
> > accounting in md_account_bio()) was put a second time somehow?
>
> I think the last change is reasonable; the BUG_ON() can be avoided and
> the bio chain won't be messed up.
>
Yes, it seems we are making progress, thanks!
> The problem here looks like bio reference is not correct, I'll need some
> time to sort that out, too complicated in raid5.
>
> Meanwhile, can you try the following workaround? I just reverted the
> changes that I think introduced this problem; note that performance may
> be degraded.
Do you want us to try the following change alone on top of the md/md-6.13
branch, without Xiao's patch and your fixup, or should we combine them all
together?

BTW: we have been hitting similar hangs since kernel 4.19.

Thx!
>
> Thanks,
> Kuai
>
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index f09e7677ee9f..07aa453bdb2f 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -5874,17 +5874,6 @@ static int add_all_stripe_bios(struct r5conf *conf,
>                          wait_on_bit(&dev->flags, R5_Overlap,
> TASK_UNINTERRUPTIBLE);
>                          return 0;
>                  }
> -       }
> -
> -       for (dd_idx = 0; dd_idx < sh->disks; dd_idx++) {
> -               struct r5dev *dev = &sh->dev[dd_idx];
> -
> -               if (dd_idx == sh->pd_idx || dd_idx == sh->qd_idx)
> -                       continue;
> -
> -               if (dev->sector < ctx->first_sector ||
> -                   dev->sector >= ctx->last_sector)
> -                       continue;
>
>                  __add_stripe_bio(sh, bi, dd_idx, forwrite, previous);
>                  clear_bit((dev->sector - ctx->first_sector) >>
>




