Hi,
On 2024/09/16 10:47, William Morgan wrote:
Dragan Milivojević suggested booting a live OS with the same kernel
and mdadm version from the time the array was originally created.
That would have been Ubuntu 21.10, with kernel 5.13 and mdadm v4.2.
(That conversation isn't archived here because we forgot to hit
reply-all.)
Perhaps you can try the latest kernel (6.11)?
I was able to complete this, but I'm afraid it made no difference at
all. Exactly the same behavior is seen.
You should assemble the array read-only so that the reshape won't
start; then you'll be able to copy the data off.
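A minimal sketch of what that could look like (the member device names
and the /mnt mount point are assumptions taken from the -D output
quoted below; adjust to your system):

  sudo mdadm --stop /dev/md127                               # stop the auto-assembled array first
  sudo mdadm --assemble --readonly /dev/md2 /dev/sd[e-n]1    # --readonly keeps the reshape from resuming
  sudo mount -o ro /dev/md2 /mnt                             # mount the filesystem read-only too, then copy off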
Thanks,
Kuai
I'm going to try to enable more verbose logging from my HBA controller.
On Sun, Sep 15, 2024 at 2:36 PM William Morgan <therealbrewer@xxxxxxxxx> wrote:
Hello,
I posted about this problem several months ago and unfortunately I
never received any suggestions. I haven't been able to fix the problem
on my own, so I am hoping someone here can help.
I have a raid10 array that originally consisted of 6x EXOS 16TB drives
connected through an LSI 9201-16e (SAS2116) HBA. Needing more space, I
added 4 more 16TB drives. I used the following commands to add and
grow the array:
[2024-06-24 19:38:12] sudo mdadm /dev/md2 --add /dev/sd[i-l]1
[2024-06-24 19:39:27] sudo mdadm --grow /dev/md2 --raid-devices=10
After 10-11 hours of reshaping (I knew it would take a long time), the
reshape seemed to freeze at 22.1% completed.
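(For completeness, the reshape state can also be cross-checked against
/proc/mdstat through the kernel's standard md sysfs attributes; md2
here matches my array name:

  cat /sys/block/md2/md/sync_action       # "reshape" while running, "idle" once it stops
  cat /sys/block/md2/md/sync_completed    # sectors done / total sectors
  cat /sys/block/md2/md/reshape_position  # reshape progress as recorded in the superblock
)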
In dmesg I saw the following error:
[260007.679410] md: md2: reshape interrupted.
[260144.852441] INFO: task md2_reshape:242508 blocked for more than 122 seconds.
[260144.852459] Tainted: G OE 6.9.3-060903-generic #202405300957
[260144.852466] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[260144.852471] task:md2_reshape state:D stack:0 pid:242508 tgid:242508 ppid:2 flags:0x00004000
[260144.852484] Call Trace:
[260144.852489] <TASK>
[260144.852496] __schedule+0x279/0x6a0
[260144.852512] schedule+0x29/0xd0
[260144.852523] wait_barrier.part.0+0x180/0x1e0 [raid10]
[260144.852544] ? __pfx_autoremove_wake_function+0x10/0x10
[260144.852560] wait_barrier+0x70/0xc0 [raid10]
[260144.852577] raid10_sync_request+0x177e/0x19e3 [raid10]
[260144.852595] ? __schedule+0x281/0x6a0
[260144.852605] md_do_sync+0xa36/0x1390
[260144.852615] ? __pfx_autoremove_wake_function+0x10/0x10
[260144.852628] ? __pfx_md_thread+0x10/0x10
[260144.852635] md_thread+0xa5/0x1a0
[260144.852643] ? __pfx_md_thread+0x10/0x10
[260144.852649] kthread+0xe4/0x110
[260144.852659] ? __pfx_kthread+0x10/0x10
[260144.852667] ret_from_fork+0x47/0x70
[260144.852675] ? __pfx_kthread+0x10/0x10
[260144.852683] ret_from_fork_asm+0x1a/0x30
[260144.852693] </TASK>
Some other info which may be helpful:
bill@bill-desk:~$ mdadm --version
mdadm - v4.3 - 2024-02-15 - Ubuntu 4.3-1ubuntu2
bill@bill-desk:~$ uname -a
Linux bill-desk 6.9.3-060903-generic #202405300957 SMP PREEMPT_DYNAMIC
Thu May 30 11:39:13 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
bill@bill-desk:~$ sudo mdadm -D /dev/md2
/dev/md2:
           Version : 1.2
     Creation Time : Sat Nov 20 14:29:13 2021
        Raid Level : raid10
        Array Size : 46877236224 (43.66 TiB 48.00 TB)
     Used Dev Size : 15625745408 (14.55 TiB 16.00 TB)
      Raid Devices : 10
     Total Devices : 10
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Tue Jun 25 10:05:18 2024
             State : clean, reshaping
    Active Devices : 10
   Working Devices : 10
    Failed Devices : 0
     Spare Devices : 0

            Layout : near=2
        Chunk Size : 512K

Consistency Policy : bitmap

    Reshape Status : 22% complete
     Delta Devices : 4, (6->10)

              Name : bill-desk:2  (local to host bill-desk)
              UUID : 8a321996:5beb9c15:4c3fcf5b:6c8b6005
            Events : 77923

    Number   Major   Minor   RaidDevice State
       0       8       65        0      active sync set-A   /dev/sde1
       1       8       81        1      active sync set-B   /dev/sdf1
       2       8       97        2      active sync set-A   /dev/sdg1
       3       8      113        3      active sync set-B   /dev/sdh1
       5       8      209        4      active sync set-A   /dev/sdn1
       4       8      193        5      active sync set-B   /dev/sdm1
       9       8      177        6      active sync set-A   /dev/sdl1
       8       8      161        7      active sync set-B   /dev/sdk1
       7       8      145        8      active sync set-A   /dev/sdj1
       6       8      129        9      active sync set-B   /dev/sdi1
bill@bill-desk:~$ cat /proc/mdstat
Personalities : [raid10] [raid0] [raid1] [raid6] [raid5] [raid4]
md1 : active raid10 sdd1[3] sdc1[2] sdb1[1] sda1[0]
      15627786240 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]
      bitmap: 0/117 pages [0KB], 65536KB chunk

md2 : active raid10 sdl1[9] sdk1[8] sdj1[7] sdi1[6] sdn1[5] sdh1[3] sdf1[1] sde1[0] sdg1[2] sdm1[4]
      46877236224 blocks super 1.2 512K chunks 2 near-copies [10/10] [UUUUUUUUUU]
      [====>................]  reshape = 22.1% (10380906624/46877236224) finish=2322382.1min speed=261K/sec
      bitmap: 59/146 pages [236KB], 262144KB chunk

unused devices: <none>
In the meantime I have rebooted several times, applied system
software updates, etc. Nothing has improved or fixed it.
I just have no idea how to help this along. It won't finish the
reshape, and I can't mount the array to copy the data off. I have
enough spare disk space to copy the data to a temporary home if I
could access it, but the array won't mount, and I don't know whether
it is even safe to attempt a mount. One difference I've noticed
recently is that after rebooting, the array comes up as md127 instead
of md2 as before.
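(From what I've read, the md127 name by itself is usually harmless:
arrays that aren't matched by an ARRAY line in /etc/mdadm/mdadm.conf
are auto-assembled with names counting down from md127. A sketch of
the usual check and fix, assuming Ubuntu's config path:

  grep ^ARRAY /etc/mdadm/mdadm.conf                                # is md2's UUID listed?
  sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf   # append the running array's identity
  sudo update-initramfs -u                                         # so the initramfs assembles it as md2
)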
BTW, md1 is an unrelated array on the same system, and it's working
fine. After a reboot, the md127 reshape begins, but about four
seconds later it is interrupted:
[ 2.451982] md/raid10:md1: active with 4 out of 4 devices
[ 2.471053] md1: detected capacity change from 0 to 31255572480
[ 2.517951] md/raid10:md127: not clean -- starting background reconstruction
[ 2.517956] md/raid10:md127: active with 10 out of 10 devices
[ 2.541203] md127: detected capacity change from 0 to 93754472448
[ 2.928687] raid6: sse2x4 gen() 7064 MB/s
[ 2.945680] raid6: sse2x2 gen() 8468 MB/s
[ 2.962690] raid6: sse2x1 gen() 6369 MB/s
[ 2.962691] raid6: using algorithm sse2x2 gen() 8468 MB/s
[ 2.979680] raid6: .... xor() 7029 MB/s, rmw enabled
[ 2.979682] raid6: using ssse3x2 recovery algorithm
[ 5.946619] EXT4-fs (md1): mounted filesystem 8f645711-4d2b-42bf-877c-a8c993923a7c r/w with ordered data mode. Quota mode: none.
[ 401.676363] md: reshape of RAID array md127
[ 405.049914] md: md127: reshape interrupted.
[ 615.617649] INFO: task md127_reshape:5304 blocked for more than 122 seconds.
[ 615.617684] task:md127_reshape state:D stack:0 pid:5304 tgid:5304 ppid:2 flags:0x00004000
[ 615.617747] wait_barrier.part.0+0x188/0x1e0 [raid10]
[ 615.617781] wait_barrier+0x70/0xc0 [raid10]
[ 615.617798] raid10_sync_request+0x1545/0x183d [raid10]
[ 615.617831] md_do_sync.cold+0x609/0xa1f
[ 615.617862] md_thread+0xa3/0x1a0
[ 615.617875] ? __pfx_md_thread+0x10/0x10
The portion of dmesg that starts with INFO: repeats several times,
roughly every two minutes.
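(If it helps with collecting logs: the same blocked-task traces can be
dumped on demand via magic sysrq, assuming sysrq is enabled on the
system:

  echo 1 | sudo tee /proc/sys/kernel/sysrq    # enable all sysrq functions
  echo w | sudo tee /proc/sysrq-trigger       # dump tasks stuck in uninterruptible (D) state to dmesg
)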
I really don't know what to do here. I've been struggling for a few
months trying to get this back into working order, with no success.
If the reshape isn't ultimately going to be successful, can anyone
explain how to at least mount the array safely so I can copy the data
off?
I can provide further info as needed.
Thank you,
Bill Morgan