Re: Raid5 to raid6 grow interrupted, mdadm hangs on assemble command

Hi,

On 2023/05/05 2:02, Jove wrote:
Hi Kuai,

the mdadm --assemble command also hangs in the kernel. It never completes.

root         142     112  1 19:01 tty1     00:00:00 mdadm --assemble
/dev/md0 /dev/ubdb /dev/ubdc /dev/ubdd /dev/ubde --backup-file
mdadm_raid6_backup.md0 --invalid-backup
root         145       2  0 19:01 ?        00:00:00 [md0_raid6]

[root@LXCNAME ~]# cat /proc/142/stack
[<0>] __switch_to+0x50/0x7f
[<0>] __schedule+0x39c/0x3dd
[<0>] schedule+0x78/0xb9
[<0>] mddev_suspend+0x10b/0x1e8
mddev_suspend is waiting for the read io to be done, while that read io
is waiting for the reshape to make progress.

So whether this hang happens just depends on whether there is a read io
beyond the reshape position at the moment mdadm is executed. (A
simplified sketch of this wait follows the trace below.)

[<0>] suspend_lo_store+0x72/0xbb
[<0>] md_attr_store+0x6c/0x8d
[<0>] sysfs_kf_write+0x34/0x37
[<0>] kernfs_fop_write_iter+0x167/0x1d0
[<0>] new_sync_write+0x68/0xd8
[<0>] vfs_write+0xe7/0x12b
[<0>] ksys_write+0x6d/0xa6
[<0>] sys_write+0x10/0x12
[<0>] handle_syscall+0x81/0xb1
[<0>] userspace+0x3db/0x598
[<0>] fork_handler+0x94/0x96
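
For reference, a simplified sketch of that wait in mddev_suspend()
(paraphrased from drivers/md/md.c; the exact active_io bookkeeping
differs between kernel versions, so mddev_has_active_io() below is
just a placeholder for the version-specific check):

/*
 * Simplified sketch of mddev_suspend(), not the literal source.
 * mddev_has_active_io() stands in for the version-specific check
 * (atomic_t counter in older kernels, percpu_ref in newer ones).
 */
void mddev_suspend(struct mddev *mddev)
{
	if (mddev->suspended++)
		return;

	/*
	 * Wait until every io already submitted to the array has
	 * completed.  Here a read io sits beyond the reshape position
	 * and can only finish once the reshape advances, so this wait
	 * never returns and suspend_lo_store() hangs in 'D' state.
	 */
	wait_event(mddev->sb_wait, !mddev_has_active_io(mddev));

	mddev->pers->quiesce(mddev, 1);
}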

[root@LXCNAME ~]# cat /proc/145/stack
[<0>] __switch_to+0x50/0x7f
[<0>] __schedule+0x39c/0x3dd
[<0>] schedule+0x78/0xb9
[<0>] schedule_timeout+0xd2/0xfb
[<0>] md_thread+0x12c/0x18a
[<0>] kthread+0x11d/0x122
[<0>] new_thread_handler+0x81/0xb2

I have had one case in which mdadm didn't hang and in which the
reshape continued. Sadly, I was using sparse overlay files and the
filesystem could not handle the full 4x 4TB. I had to terminate the
reshape.

This sounds like a dead end for now; normal io beyond the reshape
position must wait:

raid5_make_request
 make_stripe_request
  ahead_of_reshape
   wait_woken
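
Condensed, the relevant part of raid5_make_request() looks roughly
like this (not the literal drivers/md/raid5.c source; setup of
ctx/logical_sector and most of the submission logic are omitted to
show only where the io sleeps):

	/*
	 * Condensed sketch of the retry loop in raid5_make_request(),
	 * drivers/md/raid5.c.  Error handling is omitted.
	 */
	DEFINE_WAIT_FUNC(wait, woken_wake_function);
	enum stripe_result res;

	add_wait_queue(&conf->wait_for_overlap, &wait);
	for (;;) {
		res = make_stripe_request(mddev, conf, &ctx,
					  logical_sector, bi);
		if (res != STRIPE_SCHEDULE_AND_RETRY)
			break;
		/*
		 * make_stripe_request() found (via ahead_of_reshape())
		 * that this io falls into the region the reshape has
		 * not safely passed yet, so the submitter sleeps until
		 * the reshape advances.  With the reshape stuck, this
		 * is where the udev/mdadm io hangs.
		 */
		wait_woken(&wait, TASK_UNINTERRUPTIBLE,
			   MAX_SCHEDULE_TIMEOUT);
	}
	remove_wait_queue(&conf->wait_for_overlap, &wait);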

Thanks,
Kuai

Best regards,

     Johan

On Thu, May 4, 2023 at 1:41 PM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:

Hi,

On 2023/04/24 3:09, Jove wrote:
Hi,

I've added two drives to my raid5 array and tried to migrate
it to raid6 with the following command:

mdadm --grow /dev/md0 --raid-devices 4 --level 6
--backup-file=/root/mdadm_raid6_backup.md

This may have been my first mistake, as there are only 5
drives. It should have been --raid-devices 3, I think.

As soon as I started this grow, the filesystems became
unavailable. All processes trying to access files on them hung.
I searched the web, which said a reboot during a rebuild
is not problematic if things shut down cleanly, so I
rebooted. The reboot hung too. The drive activity
continued, so I let it run overnight. I woke up to a
rebooted system in emergency mode, as it could not
mount all the partitions on the raid array.

The OS tried to reassemble the array and succeeded.
However, the udev processes that try to create the /dev
entries hang.

I went back to Google and found out how I could reboot
my system without this automatic assembly.
I tried reassembling the array with:

mdadm --verbose --assemble --backup-file mdadm_raid6_backup.md0 /dev/md0

This failed with:
No backup metadata on mdadm_raid6_backup.md0
Failed to find final backup of critical section.
Failed to restore critical section for reshape, sorry.

I tried again with:

mdadm --verbose --assemble --backup-file mdadm_raid6_backup.md0
--invalid-backup /dev/md0

This said, in addition to the lines above:

continuing without restoring backup

This seemed to succeed in reassembling the
array, but it also hangs indefinitely.

/proc/mdstat now shows:

md0 : active (read-only) raid6 sdc1[0] sde[4](S) sdf[5] sdd1[3] sdg1[1]
        7813771264 blocks super 1.2 level 6, 512k chunk, algorithm 18 [4/3] [UUU_]
        bitmap: 1/30 pages [4KB], 65536KB chunk

A read-only array can't continue the reshape; see the details in
md_check_recovery(): the reshape can only start if md_is_rdwr(mddev)
passes. Do you know why this array is read-only?
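
For context, a heavily abridged sketch of that gate in
md_check_recovery() (drivers/md/md.c; only the shape of the read-only
check is meant to be accurate here):

/*
 * Heavily abridged sketch of md_check_recovery(), not the literal
 * source.  On a read-only array only light housekeeping is done and
 * the function bails out before it would ever start or resume a
 * resync/reshape, so a read-only assemble can never let the reshape
 * continue.
 */
void md_check_recovery(struct mddev *mddev)
{
	if (!mddev_trylock(mddev))
		return;

	if (!md_is_rdwr(mddev)) {
		/* read-only: handle failed/spare devices only */
		goto unlock;
	}

	/*
	 * Only reached when md_is_rdwr(mddev) passes: decide whether
	 * to start or resume resync/recovery/reshape and wake up the
	 * sync thread.
	 */
unlock:
	mddev_unlock(mddev);
}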


Again, the udev processes trying to access this device hung indefinitely.

Eventually, the kernel dumps this in my journal:

Apr 23 19:17:22 atom kernel: task:systemd-udevd   state:D stack:    0
pid: 8121 ppid:   706 flags:0x00000006
Apr 23 19:17:22 atom kernel: Call Trace:
Apr 23 19:17:22 atom kernel:  <TASK>
Apr 23 19:17:22 atom kernel:  __schedule+0x20a/0x550
Apr 23 19:17:22 atom kernel:  schedule+0x5a/0xc0
Apr 23 19:17:22 atom kernel:  schedule_timeout+0x11f/0x160
Apr 23 19:17:22 atom kernel:  ? make_stripe_request+0x284/0x490 [raid456]
Apr 23 19:17:22 atom kernel:  wait_woken+0x50/0x70

It looks like this normal io is waiting for the reshape to be done;
that's why it hung indefinitely.

This really is a kernel bug. Perhaps it can be bypassed if the
reshape can complete, hopefully automatically once this array is made
read/write. Note: never echo reshape to sync_action, as that will
corrupt data in your case.

Thanks,
Kuai
