Re: mdadm grow raid 5 to 6 failure (crash)

OK, well I'm willing to try anything at this point. Do you need
anything from me for a patch? Here are my current kernel details:

Linux homer 5.14.0-305.el9.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Apr 27
11:32:15 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
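
In case it helps, this is the sort of loop I can run to grab the kernel
stacks of every D-state task in one go (just a minimal sketch built from the
ps/grep and /proc/<pid>/stack steps suggested below; it needs root):

# dump the kernel stack of every task in uninterruptible (D) sleep
for pid in $(ps -eo pid,stat | awk '$2 ~ /^D/ {print $1}'); do
    echo "=== PID $pid ($(cat /proc/$pid/comm 2>/dev/null)) ==="
    cat /proc/$pid/stack
done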



On Mon, May 8, 2023 at 3:53 AM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
>
> Hi,
>
> On 2023/05/08 16:27, David Gilmour wrote:
> > This seems to be kicked off somehow by the mdadm assemble command.
> > There is no additional mdadm process before I run the assemble command,
> > but there is one after:
>
> If this is the case, I'm sorry, but I'm not familiar with how mdadm works
> or how this deadlock can be bypassed. If you can compile a new kernel, I
> can give you a patch that makes such I/O fail, but I'm not sure whether
> mdadm could still make progress or would just fail to assemble.
>
> Thanks,
> Kuai
> >
> > #ps -elf | grep " D "
> > 0 S root        3993    3953  0  80   0 - 55416 pipe_r 02:18 pts/1
> > 00:00:00 grep --color=auto  D
> >
> > #mdadm --assemble --verbose --backup-file=/root/mdadm5-6_backup_md127
> > --invalid-backup /dev/md127 /dev/sda /dev/sdh /dev/sdg /dev/sdc
> > /dev/sde /dev/sdf --force
> > mdadm: looking for devices for /dev/md127
> > mdadm: /dev/sda is identified as a member of /dev/md127, slot 0.
> > mdadm: /dev/sdh is identified as a member of /dev/md127, slot 1.
> > mdadm: /dev/sdg is identified as a member of /dev/md127, slot 2.
> > mdadm: /dev/sdc is identified as a member of /dev/md127, slot 3.
> > mdadm: /dev/sde is identified as a member of /dev/md127, slot 4.
> > mdadm: /dev/sdf is identified as a member of /dev/md127, slot 5.
> > mdadm: /dev/md127 has an active reshape - checking if critical section
> > needs to be restored
> > mdadm: No backup metadata on /root/mdadm5-6_backup_md127
> > mdadm: Failed to find backup of critical section
> > mdadm: continuing without restoring backup
> > mdadm: added /dev/sdh to /dev/md127 as 1
> > mdadm: added /dev/sdg to /dev/md127 as 2
> > mdadm: added /dev/sdc to /dev/md127 as 3
> > mdadm: added /dev/sde to /dev/md127 as 4
> > mdadm: added /dev/sdf to /dev/md127 as 5 (possibly out of date)
> > mdadm: added /dev/sda to /dev/md127 as 0
> > <hangs here>
> >
> > #Then in a separate terminal:
> > #ps -elf | grep " D "
> > 1 D root        1249       1  0  80   0 -   936 wait_w 02:15 ?
> > 00:00:00 /sbin/mdadm --monitor --scan --syslog -f
> > --pid-file=/run/mdadm/mdadm.pid
> > 4 D root        4086    4038  0  80   0 -   953 mddev_ 02:22 pts/2
> > 00:00:00 mdadm --assemble --verbose
> > --backup-file=/root/mdadm5-6_backup_md127 --invalid-backup /dev/md127
> > /dev/sda /dev/sdh /dev/sdg /dev/sdc /dev/sde /dev/sdf --force
> > 0 S root        4102    3869  0  80   0 - 55449 pipe_r 02:25 pts/0
> > 00:00:00 grep --color=auto  D
> >
> >
> >
> > On Mon, May 8, 2023 at 1:56 AM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
> >>
> >> Hi,
> >>
> >> On 2023/05/08 15:43, David Gilmour wrote:
> >>> Two mdadm processes show up:
> >>>
> >>> #ps -elf | grep " D "
> >>> 1 D root        1251       1  0  80   0 -   936 wait_w 00:28 ?
> >>> 00:00:00 /sbin/mdadm --monitor --scan --syslog -f
> >>> --pid-file=/run/mdadm/mdadm.pid
> >>
> >> So this is the one accessing the array; can you stop it before
> >> assembling?
> >>
> >> Thanks,
> >> Kuai
> >>> 4 D root        4130    4091  0  80   0 -   953 mddev_ 01:33 pts/1
> >>> 00:00:00 mdadm --assemble --verbose
> >>> --backup-file=/root/mdadm5-6_backup_md127 --invalid-backup /dev/md127
> >>> /dev/sda /dev/sdh /dev/sdg /dev/sdc /dev/sde /dev/sdf --force
> >>>
> >>> Process 1251 (mdadm) shows raid5_make_request in its stack:
> >>>
> >>> #cat /proc/1251/stack
> >>> [<0>] wait_woken+0x50/0x70
> >>> [<0>] raid5_make_request+0x2cb/0x3e0 [raid456]
> >>> [<0>] md_handle_request+0x135/0x1e0
> >>> [<0>] __submit_bio+0x89/0x130
> >>> [<0>] __submit_bio_noacct+0x81/0x1f0
> >>> [<0>] submit_bh_wbc+0x11e/0x140
> >>> [<0>] block_read_full_folio+0x1d5/0x290
> >>> [<0>] filemap_read_folio+0x43/0x300
> >>> [<0>] do_read_cache_folio+0x112/0x3f0
> >>> [<0>] read_cache_page+0x15/0x90
> >>> [<0>] read_part_sector+0x3a/0x160
> >>> [<0>] read_lba+0xff/0x260
> >>> [<0>] is_gpt_valid.part.0+0x66/0x3d0
> >>> [<0>] find_valid_gpt.constprop.0+0x20e/0x540
> >>> [<0>] efi_partition+0x80/0x390
> >>> [<0>] check_partition+0x103/0x1d0
> >>> [<0>] bdev_disk_changed.part.0+0xb5/0x200
> >>> [<0>] blkdev_get_whole+0x7a/0x90
> >>> [<0>] blkdev_get_by_dev.part.0+0x13b/0x300
> >>> [<0>] blkdev_open+0x4c/0x90
> >>> [<0>] do_dentry_open+0x14f/0x380
> >>> [<0>] do_open+0x21a/0x3d0
> >>> [<0>] path_openat+0x10f/0x2b0
> >>> [<0>] do_filp_open+0xb2/0x160
> >>> [<0>] do_sys_openat2+0x9a/0x160
> >>> [<0>] __x64_sys_openat+0x53/0xa0
> >>> [<0>] do_syscall_64+0x5c/0x90
> >>> [<0>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
> >>>
> >>> #cat /proc/4130/stack
> >>> [<0>] mddev_suspend.part.0+0xdf/0x150
> >>> [<0>] suspend_lo_store+0xc5/0xf0
> >>> [<0>] md_attr_store+0x83/0xf0
> >>> [<0>] kernfs_fop_write_iter+0x124/0x1b0
> >>> [<0>] new_sync_write+0xff/0x190
> >>> [<0>] vfs_write+0x1ef/0x280
> >>> [<0>] ksys_write+0x5f/0xe0
> >>> [<0>] do_syscall_64+0x5c/0x90
> >>> [<0>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
> >>>
> >>>
> >>> On Mon, May 8, 2023 at 1:08 AM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> On 2023/05/08 13:54, David Gilmour wrote:
> >>>>> I'm not sure what I'm looking for here, but here is the output of the
> >>>>> inflight file immediately after the mdadm assemble hangs. Does this
> >>>>> indicate that something is accessing the array?
> >>>>>
> >>>>> #cat /sys/block/md127/inflight
> >>>>>           1        0
> >>>>>
> >>>>
> >>>> Yes, something is accessing the array. Did you try to grep for all the
> >>>> tasks that are in "D" state?
> >>>>
> >>>> ps -elf | grep " D "
> >>>>
> >>>> Is there any task stuck in raid5_make_request?
> >>>>
> >>>> cat /proc/$pid/stack
> >>>>
> >>>>> Also attached is an strace of my mdadm command that hung, in case it
> >>>>> reveals something relevant:
> >>>>> strace mdadm --assemble --verbose
> >>>>> --backup-file=/root/mdadm5-6_backup_md127 --invalid-backup /dev/md127
> >>>>> /dev/sda /dev/sdh /dev/sdg /dev/sdc /dev/sde /dev/sdf --force 2>&1 |
> >>>>> tee mdadm_strace_output.txt
> >>>>
> >>>> I don't think this will be helpful; mdadm is unlikely to be the task
> >>>> that is accessing the array.
> >>>>
> >>>> Thanks,
> >>>> Kuai
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Sun, May 7, 2023 at 7:23 PM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> On 2023/05/06 21:19, David Gilmour wrote:
> >>>>>>> From what I can tell it does look very similar. I stopped the
> >>>>>>> systemd-udevd service and renamed it to systemd-udevd.bak. My system
> >>>>>>> still hung on the assemble command. I'm not savvy enough to decode the
> >>>>>>> details here, but does the "mddev_suspend.part.0+0xdf/0x150" line in
> >>>>>>> the process stack output suggest the same blocked I/O that the other
> >>>>>>> post describes?
> >>>>>>>
> >>>>>>> × systemd-udevd.service - Rule-based Manager for Device Events and Files
> >>>>>>>          Loaded: loaded (/usr/lib/systemd/system/systemd-udevd.service; static)
> >>>>>>>          Active: failed (Result: exit-code) since Sat 2023-05-06 06:59:11
> >>>>>>> MDT; 1min 27s ago
> >>>>>>>        Duration: 1d 20h 16min 29.633s
> >>>>>>> TriggeredBy: × systemd-udevd-kernel.socket
> >>>>>>>                  × systemd-udevd-control.socket
> >>>>>>>            Docs: man:systemd-udevd.service(8)
> >>>>>>>                  man:udev(7)
> >>>>>>>         Process: 27440 ExecStart=/usr/lib/systemd/systemd-udevd
> >>>>>>> (code=exited, status=203/EXEC)
> >>>>>>>        Main PID: 27440 (code=exited, status=203/EXEC)
> >>>>>>>             CPU: 5ms
> >>>>>>>
> >>>>>>> ----------------------
> >>>>>>> #mdadm --assemble --verbose --backup-file=/root/mdadm5-6_backup_md127
> >>>>>>> --invalid-backup /dev/md127 /dev/sda /dev/sdh /dev/sdg /dev/sdc
> >>>>>>> /dev/sdb /dev/sdf --force
> >>>>>>> mdadm: looking for devices for /dev/md127
> >>>>>>> mdadm: /dev/sda is identified as a member of /dev/md127, slot 0.
> >>>>>>> mdadm: /dev/sdh is identified as a member of /dev/md127, slot 1.
> >>>>>>> mdadm: /dev/sdg is identified as a member of /dev/md127, slot 2.
> >>>>>>> mdadm: /dev/sdc is identified as a member of /dev/md127, slot 3.
> >>>>>>> mdadm: /dev/sdb is identified as a member of /dev/md127, slot 4.
> >>>>>>> mdadm: /dev/sdf is identified as a member of /dev/md127, slot 5.
> >>>>>>> mdadm: /dev/md127 has an active reshape - checking if critical section
> >>>>>>> needs to be restored
> >>>>>>> mdadm: No backup metadata on /root/mdadm5-6_backup_md127
> >>>>>>> mdadm: Failed to find backup of critical section
> >>>>>>> mdadm: continuing without restoring backup
> >>>>>>> mdadm: added /dev/sdh to /dev/md127 as 1
> >>>>>>> mdadm: added /dev/sdg to /dev/md127 as 2
> >>>>>>> mdadm: added /dev/sdc to /dev/md127 as 3
> >>>>>>> mdadm: added /dev/sdb to /dev/md127 as 4
> >>>>>>> mdadm: added /dev/sdf to /dev/md127 as 5 (possibly out of date)
> >>>>>>> mdadm: added /dev/sda to /dev/md127 as 0
> >>>>>>>
> >>>>>>> #hangs indefinitely at this point in the output
> >>>>>>>
> >>>>>>> ------------------------------------------
> >>>>>>>
> >>>>>>>
> >>>>>>> root       27454  0.0  0.0   3812  2656 pts/1    D+   07:00   0:00
> >>>>>>> mdadm --assemble --verbose --backup-file=/root/mdadm5-6_backup_md127
> >>>>>>> --invalid-backup /dev/md127 /dev/sda /dev/sdh /dev/sdg /dev/sdc
> >>>>>>> /dev/sdb /dev/sdf --force
> >>>>>>> root       27457  0.0  0.0      0     0 ?        S    07:00   0:00 [md127_raid6]
> >>>>>>>
> >>>>>>> #cat /proc/27454/stack
> >>>>>>> [<0>] mddev_suspend.part.0+0xdf/0x150
> >>>>>>> [<0>] suspend_lo_store+0xc5/0xf0
> >>>>>>> [<0>] md_attr_store+0x83/0xf0
> >>>>>>> [<0>] kernfs_fop_write_iter+0x124/0x1b0
> >>>>>>> [<0>] new_sync_write+0xff/0x190
> >>>>>>> [<0>] vfs_write+0x1ef/0x280
> >>>>>>> [<0>] ksys_write+0x5f/0xe0
> >>>>>>> [<0>] do_syscall_64+0x5c/0x90
> >>>>>>> [<0>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
> >>>>>>>
> >>>>>>> #cat /proc/27457/stack
> >>>>>>> [<0>] md_thread+0x122/0x160
> >>>>>>> [<0>] kthread+0xe0/0x100
> >>>>>>> [<0>] ret_from_fork+0x22/0x30
> >>>>>>>
> >>>>>>
> >>>>>> Is there any thread stuck at raid5_make_request? Something like below:
> >>>>>>
> >>>>>> Apr 23 19:17:22 atom kernel: task:systemd-udevd   state:D stack:    0
> >>>>>> pid: 8121 ppid:   706 flags:0x00000006
> >>>>>> Apr 23 19:17:22 atom kernel: Call Trace:
> >>>>>> Apr 23 19:17:22 atom kernel:  <TASK>
> >>>>>> Apr 23 19:17:22 atom kernel:  __schedule+0x20a/0x550
> >>>>>> Apr 23 19:17:22 atom kernel:  schedule+0x5a/0xc0
> >>>>>> Apr 23 19:17:22 atom kernel:  schedule_timeout+0x11f/0x160
> >>>>>> Apr 23 19:17:22 atom kernel:  ? make_stripe_request+0x284/0x490 [raid456]
> >>>>>> Apr 23 19:17:22 atom kernel:  wait_woken+0x50/0x70
> >>>>>> Apr 23 19:17:22 atom kernel:  raid5_make_request+0x2cb/0x3e0 [raid456]
> >>>>>> Apr 23 19:17:22 atom kernel:  ? sched_show_numa+0xf0/0xf0
> >>>>>> Apr 23 19:17:22 atom kernel:  md_handle_request+0x132/0x1e0
> >>>>>> Apr 23 19:17:22 atom kernel:  ? do_mpage_readpage+0x282/0x6b0
> >>>>>> Apr 23 19:17:22 atom kernel:  __submit_bio+0x86/0x130
> >>>>>> Apr 23 19:17:22 atom kernel:  __submit_bio_noacct+0x81/0x1f0
> >>>>>> Apr 23 19:17:22 atom kernel:  mpage_readahead+0x15c/0x1d0
> >>>>>> Apr 23 19:17:22 atom kernel:  ? blkdev_write_begin+0x20/0x20
> >>>>>> Apr 23 19:17:22 atom kernel:  read_pages+0x58/0x2f0
> >>>>>> Apr 23 19:17:22 atom kernel:  page_cache_ra_unbounded+0x137/0x180
> >>>>>> Apr 23 19:17:22 atom kernel:  force_page_cache_ra+0xc5/0xf0
> >>>>>> Apr 23 19:17:22 atom kernel:  filemap_get_pages+0xe4/0x350
> >>>>>> Apr 23 19:17:22 atom kernel:  filemap_read+0xbe/0x3c0
> >>>>>> Apr 23 19:17:22 atom kernel:  ? make_kgid+0x13/0x20
> >>>>>> Apr 23 19:17:22 atom kernel:  ? deactivate_locked_super+0x90/0xa0
> >>>>>> Apr 23 19:17:22 atom kernel:  blkdev_read_iter+0xaf/0x170
> >>>>>> Apr 23 19:17:22 atom kernel:  new_sync_read+0xf9/0x180
> >>>>>> Apr 23 19:17:22 atom kernel:  vfs_read+0x13c/0x190
> >>>>>> Apr 23 19:17:22 atom kernel:  ksys_read+0x5f/0xe0
> >>>>>> Apr 23 19:17:22 atom kernel:  do_syscall_64+0x59/0x90
> >>>>>>
> >>>>>> By the way, cat /sys/block/mdxx/inflight can confirm this as well.
> >>>>>>
> >>>>>> If this is the case, can you find out who is accessing the array?
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Kuai
> >>>>>>
> >>>>
> >>> .
> >>>
> >>
> > .
> >
>
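
One more note on the suggestion above to stop whatever is accessing the array
before assembling: the hung /sbin/mdadm --monitor instance looks like the one
started by the mdmonitor systemd unit on this EL9 box, so stopping it before
the next assemble attempt might look roughly like the commands below (the unit
name and the pid-file fallback are my assumptions, not something I have
verified here):

# stop the md monitor daemon first so it cannot issue I/O to /dev/md127
systemctl stop mdmonitor.service
# fallback if the unit name differs: signal the daemon via its pid file
kill "$(cat /run/mdadm/mdadm.pid)"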



