raid5 reshape stuck at 33%

Hello,

I started out with 8 drives, then added 4 more and grew the array in a
single operation, using mdadm 3.4. The reshape got stuck at 33%, and
errors showed up in dmesg about tasks blocked for more than 120
seconds, a tainted kernel, call traces, etc. I tried rebooting, but
the reshape won't progress any further. If I freeze the reshape I can
mount and read the data; otherwise mdadm commands stop responding and
sit at 100% CPU usage.
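
For completeness, this is roughly how I've been freezing it (going
from memory, so treat the exact invocations as approximate):

    # assemble with the reshape paused so the filesystem can be mounted
    mdadm --assemble --freeze-reshape /dev/md127 /dev/sd[a-l]1

    # or, on an already-running array, stop the md sync thread
    echo frozen > /sys/block/md127/md/sync_action

    # confirm the reshape counter is no longer moving
    cat /proc/mdstat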

I also tried booting from the latest Debian live image with mdadm 4.1,
but the reshape still won't progress past 33%. Suspecting drive issues
(failed SMART tests, badblocks errors), I physically removed one drive
at a time and force-assembled the array to see whether the reshape
would progress. I did this twice, each time with a different drive,
which I now think was a bad idea, because after a --re-add the array
treats the 12th drive as a spare.
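
The force-assemble and re-add steps looked something like this (sdX1
stands for whichever drive I had pulled at the time; I no longer have
the exact command history, so this is approximate):

    # stop the array, then assemble without the suspect drive
    mdadm --stop /dev/md127
    mdadm --assemble --force /dev/md127 /dev/sd[a-l]1

    # later, put the pulled drive back
    mdadm /dev/md127 --re-add /dev/sdX1

    # the returned drive now shows up as a spare here
    mdadm --detail /dev/md127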

Personalities : [raid6] [raid5] [raid4]
md127 : active raid5 sdc1[1] sdi1[9](S) sdd1[11] sde1[12] sdk1[13] sdl1[14] sdj1[10] sdf1[8] sdg1[6] sdb1[5] sdh1[4] sda1[3]
      27348203008 blocks super 1.2 level 5, 512k chunk, algorithm 2 [12/11] [_UUUUUUUUUUU]
      bitmap: 5/30 pages [20KB], 65536KB chunk

/dev/md127:
           Version : 1.2
     Creation Time : Fri Mar  4 02:28:46 2016
        Raid Level : raid5
        Array Size : 27348203008 (26081.28 GiB 28004.56 GB)
     Used Dev Size : 3906886144 (3725.90 GiB 4000.65 GB)
      Raid Devices : 12
     Total Devices : 12
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Wed Nov 27 23:10:08 2019
             State : clean, degraded
    Active Devices : 11
   Working Devices : 12
    Failed Devices : 0
     Spare Devices : 1

            Layout : left-symmetric
        Chunk Size : 512K

Consistency Policy : bitmap

     Delta Devices : 4, (8->12)

              Name : debian:one  (local to host debian)
              UUID : a6659be9:4545dfa0:678228ad:294eede4
            Events : 2022146

    Number   Major   Minor   RaidDevice State
       -       0        0        0      removed
       1       8       33        1      active sync   /dev/sdc1
       3       8        1        2      active sync   /dev/sda1
       4       8      113        3      active sync   /dev/sdh1
       5       8       17        4      active sync   /dev/sdb1
       6       8       97        5      active sync   /dev/sdg1
       8       8       81        6      active sync   /dev/sdf1
      10       8      145        7      active sync   /dev/sdj1
      14       8      177        8      active sync   /dev/sdl1
      13       8      161        9      active sync   /dev/sdk1
      12       8       65       10      active sync   /dev/sde1
      11       8       49       11      active sync   /dev/sdd1

       9       8      129        -      spare   /dev/sdi1

The reshape is still frozen, the volume groups are mounted, and I can
read the data. I don't remember exactly when I tried the
revert-reshape option, but it failed with an error saying something
like the reshape is not aligned and to stop and assemble again. I'm
not sure whether that was while the array had 12 active devices or 11.
Is it still possible to get back to the state where all 12 devices
(the original 8 plus the 4 new ones) are active, and then revert the
reshape successfully?
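
If it helps, the revert attempt was something along these lines (again
from memory, so the exact form may be off):

    # stop the array and try to assemble with the reshape reverted
    mdadm --stop /dev/md127
    mdadm --assemble --update=revert-reshape /dev/md127 /dev/sd[a-l]1

    # since the backup file is gone, I gather this may also need
    # --invalid-backup, but I'm not certain I used it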

The initial grow command specified a backup file on a USB drive, but I
can't find the file now. I assume mdadm deleted it intentionally.
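
For reference, the grow was something like this (the device names of
the four new drives and the backup file path are placeholders; I don't
remember the real path):

    # add the four new drives as spares
    mdadm /dev/md127 --add /dev/sdw1 /dev/sdx1 /dev/sdy1 /dev/sdz1

    # grow from 8 to 12 devices, with the backup file on the USB drive
    mdadm --grow /dev/md127 --raid-devices=12 \
          --backup-file=/media/usb/md127-grow.backup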

Thanks for any help.


