Re: regression: CPU soft lockup with raid10: check slab-out-of-bounds in md_bitmap_get_counter

Hi,

On 2024/05/06 20:44, Heinz Mauelshagen wrote:
Hi,

what fields are you referring to?

For this problem, the field is dev_sectors: for raid10 it is the rdev size,
while for raid456 it is the array size. And dm-raid is using it as the bitmap size.
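
To make the problem concrete, here is a rough userspace sketch of the sizing
arithmetic, not kernel code: the 4-disk near=2 raid10 layout, the 1 TiB rdev
size and the 64 MiB bitmap chunk are made-up example numbers, and the 4 KiB
counter page with 16-bit counters is only my reading of md-bitmap, so please
treat all of it as assumptions:

#include <stdio.h>
#include <stdint.h>

/* counter pages needed to cover 'sectors', one 16-bit counter per bitmap chunk */
static uint64_t pages_for(uint64_t sectors, uint64_t chunk_sectors)
{
	uint64_t chunks = (sectors + chunk_sectors - 1) / chunk_sectors;
	uint64_t counters_per_page = 4096 / 2;  /* assumed 4 KiB page, 2-byte counter */

	return (chunks + counters_per_page - 1) / counters_per_page;
}

int main(void)
{
	uint64_t dev_sectors = 1ULL << 31;          /* example: 1 TiB per rdev, 512-byte sectors */
	uint64_t raid_disks = 4, near_copies = 2;   /* example raid10 layout */
	uint64_t resync_max_sectors = dev_sectors * raid_disks / near_copies;
	uint64_t chunk_sectors = (64ULL << 20) >> 9;  /* example 64 MiB bitmap chunk */

	printf("pages if sized from dev_sectors:        %llu\n",
	       (unsigned long long)pages_for(dev_sectors, chunk_sectors));
	printf("pages if sized from resync_max_sectors: %llu\n",
	       (unsigned long long)pages_for(resync_max_sectors, chunk_sectors));
	return 0;
}

With these example numbers a bitmap sized from dev_sectors has only half the
counter pages that the raid10 resync range can index, which lines up with the
page > bitmap->pages condition discussed further down in this thread.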

And while reviewing the related code, I found the following quite strange as well:

mddev->resync_max_sectors = mddev->dev_sectors;
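
For reference, my understanding (paraphrased from memory of md-bitmap.c, so
please correct me if I'm wrong) is that md itself sizes the in-memory bitmap
from resync_max_sectors when the bitmap is created, roughly:

	/* md_bitmap_create(), approximately (not a verbatim quote) */
	sector_t blocks = mddev->resync_max_sectors;

	err = md_bitmap_resize(bitmap, blocks, mddev->bitmap_info.chunksize, 1);

So any code that treats dev_sectors and resync_max_sectors as interchangeable
only works for the personalities where the two happen to be equal.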

I'm still checking the following fields now, for both md/raid and dm-raid,
with respect to how sync_thread should work:

dev_sectors
resync_max_sectors
array_sectors
recovery_cp
recovery_offset
reshape_position

Thanks,
Kuai


Thanks,
Heinz

On Mon, May 6, 2024 at 8:19 AM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:

    Hi,

    On 2024/04/30 19:07, Nigel Croxon wrote:
     >
     > On 4/25/24 12:52 PM, Song Liu wrote:
     >> On Thu, Apr 25, 2024 at 5:10 AM Nigel Croxon <ncroxon@xxxxxxxxxx> wrote:
     >>>
     >>> On 4/24/24 2:57 AM, Yu Kuai wrote:
     >>>> Hi, Nigel
     >>>>
     >>>>> On 2024/04/21 20:30, Nigel Croxon wrote:
     >>>>> On 4/20/24 2:09 AM, Yu Kuai wrote:
     >>>>>> Hi,
     >>>>>>
     >>>>>>> On 2024/04/20 3:49, Nigel Croxon wrote:
     >>>>>>> There is a problem with this commit, it causes a CPU#x soft lockup
     >>>>>>>
     >>>>>>> commit 301867b1c16805aebbc306aafa6ecdc68b73c7e5
     >>>>>>> Author: Li Nan <linan122@xxxxxxxxxx>
     >>>>>>> Date:   Mon May 15 21:48:05 2023 +0800
     >>>>>>> md/raid10: check slab-out-of-bounds in md_bitmap_get_counter
     >>>>>>>
     >>>>>> Did you find this commit by bisect?
     >>>>>>
     >>>>> Yes, found this issue by bisecting...
     >>>>>
     >>>>>>> Message from syslogd@rhel9 at Apr 19 14:14:55 ...
     >>>>>>>    kernel:watchdog: BUG: soft lockup - CPU#3 stuck for 26s! [mdX_resync:6976]
     >>>>>>>
     >>>>>>> dmesg:
     >>>>>>>
     >>>>>>> [  104.245585] CPU: 7 PID: 3588 Comm: mdX_resync Kdump: loaded Not tainted 6.9.0-rc4-next-20240419 #1
     >>>>>>> [  104.245588] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-1.fc38 04/01/2014
     >>>>>>> [  104.245590] RIP: 0010:_raw_spin_unlock_irq+0x13/0x30
     >>>>>>> [  104.245598] Code: 00 00 00 00 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 c6 07 00 90 90 90 fb 65 ff 0d 95 9f 75 76 <74> 05 c3 cc cc cc cc 0f 1f 44 00 00 c3 cc cc cc cc cc cc cc cc cc
     >>>>>>> [  104.245601] RSP: 0018:ffffb2d74a81bbf8 EFLAGS: 00000246
     >>>>>>> [  104.245603] RAX: 0000000000000000 RBX: 0000000001000000 RCX: 000000000000000c
     >>>>>>> [  104.245604] RDX: 0000000000000000 RSI: 0000000001000000 RDI: ffff926160ccd200
     >>>>>>> [  104.245606] RBP: ffffb2d74a81bcd0 R08: 0000000000000013 R09: 0000000000000000
     >>>>>>> [  104.245607] R10: 0000000000000000 R11: ffffb2d74a81bad8 R12: 0000000000000000
     >>>>>>> [  104.245608] R13: 0000000000000000 R14: ffff926160ccd200 R15: ffff926151019000
     >>>>>>> [  104.245611] FS:  0000000000000000(0000) GS:ffff9273f9580000(0000) knlGS:0000000000000000
     >>>>>>> [  104.245613] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
     >>>>>>> [  104.245614] CR2: 00007f23774d2584 CR3: 0000000104098003 CR4: 0000000000370ef0
     >>>>>>> [  104.245616] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
     >>>>>>> [  104.245617] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
     >>>>>>> [  104.245618] Call Trace:
     >>>>>>> [  104.245620]  <IRQ>
     >>>>>>> [  104.245623]  ? watchdog_timer_fn+0x1e3/0x260
     >>>>>>> [  104.245630]  ? __pfx_watchdog_timer_fn+0x10/0x10
     >>>>>>> [  104.245634]  ? __hrtimer_run_queues+0x112/0x2a0
     >>>>>>> [  104.245638]  ? hrtimer_interrupt+0xff/0x240
     >>>>>>> [  104.245640]  ? sched_clock+0xc/0x30
     >>>>>>> [  104.245644]  ? __sysvec_apic_timer_interrupt+0x54/0x140
     >>>>>>> [  104.245649]  ? sysvec_apic_timer_interrupt+0x6c/0x90
     >>>>>>> [  104.245652]  </IRQ>
     >>>>>>> [  104.245653]  <TASK>
     >>>>>>> [  104.245654]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20
     >>>>>>> [  104.245659]  ? _raw_spin_unlock_irq+0x13/0x30
     >>>>>>> [  104.245661]  md_bitmap_start_sync+0x6b/0xf0
     >>>> Can you give the following patch a test as well? I believe this is
     >>>> the root cause why page > bitmap->pages: dm-raid is using the wrong
     >>>> bitmap size.
     >>>>
     >>>> diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
     >>>> index abe88d1e6735..d9c65ef9c9fb 100644
     >>>> --- a/drivers/md/dm-raid.c
     >>>> +++ b/drivers/md/dm-raid.c
     >>>> @@ -4052,7 +4052,8 @@ static int raid_preresume(struct dm_target *ti)
     >>>>                 mddev->bitmap_info.chunksize != to_bytes(rs->requested_bitmap_chunk_sectors)))) {
     >>>>                  int chunksize = to_bytes(rs->requested_bitmap_chunk_sectors) ?: mddev->bitmap_info.chunksize;
     >>>>
     >>>> -               r = md_bitmap_resize(mddev->bitmap, mddev->dev_sectors, chunksize, 0);
     >>>> +               r = md_bitmap_resize(mddev->bitmap, mddev->resync_max_sectors,
     >>>> +                                    chunksize, 0);
     >>>>                  if (r)
     >>>>                          DMERR("Failed to resize bitmap");
     >>>>          }
     >>>>
     >>>> Thanks,
     >>>> Kuai
     >>> Hello Kuai,
     >>>
     >>> Tested and found no issues. Good to go.
     >>>
     >>> -Nigel
     >> Thanks for the fixes and the tests.
     >>
     >> For the next step, do we need both patches or just one of them?
     >>
     >> Song
     >>
     > They both fix the problem independently of the other.

    Sorry that I forgot to reply here; we discussed this on Slack...

    For md/raid, we already applied the first patch to fix the soft lockup
    problem. For dm-raid, besides the second patch to fix the wrong bitmap
    size, we still need more changes, because some fields in mddev are
    different for dm-raid10 and dm-raid5, while dm-raid doesn't distinguish
    them. I'm working on that; however, I'm not that familiar with dm-raid
    and I need more time. :)

    Thanks,
    Kuai

     >
     > -Nigel
     >
     > .
     >





