RAID5 mdadm --grow wrote nothing (Reshape Status : 0% complete) and cannot assemble anymore

Hi everybody, I need some help to regain access to a very big RAID volume. Remote backups exist for the most important part of its data, and although the volume is completely locked, nothing has been destroyed, so I hope you can help me.

No real change has been written to the disks: the reshape (from 3 to 4 disks) never actually started (it was stuck at 0%). No disk is defective, and I don't think anything wrong or stupid has been attempted. There was no power failure either. I saved the information that was available in dmesg (some lines below). I also used --backup-file=/root/grow_md0.bak, but it does not seem to contain anything useful (3 149 824 null bytes), as far as I can tell.
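
For reference, this is roughly how a file can be checked for containing only null bytes. It is a minimal sketch, assuming GNU cmp and wc; the function name is_all_zero is just something I made up for illustration, not part of mdadm.

```shell
# Succeeds (exit status 0) when every byte of the given file is zero.
# is_all_zero is a hypothetical helper name, not an mdadm tool.
is_all_zero() {
    size=$(wc -c < "$1") || return 2
    # /dev/zero never ends, so limit the comparison to the file's length.
    cmp -s -n "$size" "$1" /dev/zero
}

# Example use (not run here):
#   is_all_zero /root/grow_md0.bak && echo "backup file holds only null bytes"
```

An all-zero result only says that the file holds no saved data; it does not say why mdadm left it that way.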

I have a RAID5 array, which was made of 3 disks: /dev/sdb1, /dev/sdd1 and /dev/sde1.
Each one is 8 TB.

sdc1 is a new disk (I created a GPT partition table and an empty partition, as I always do before adding a disk to a RAID array). Then I ran:

  * mdadm --add /dev/md0 /dev/sdc1
  * mdadm --detail /dev/md0 (fine, the new disk was shown as spare)
  * mdadm --grow --raid-devices=4 --backup-file=/root/grow_md0.bak /dev/md0

When I ran mdadm --detail /dev/md0, everything seemed fine, and it showed:

    State : clean, reshaping
    Reshape Status : 0% complete

But there was no activity on any disk (just a few bytes read when I was reading some file), even after 10 minutes (according to bwm-ng and the HDD LED). This question describes exactly what happened to me: https://serverfault.com/questions/814025/mdadm-reshape-raid6-does-not-start

So I stopped the services that were using the mount point and ran umount /media/RAID-VOLUME. There was still no activity, so I ran mdadm --stop /dev/md0 and restarted the computer.

I can no longer assemble the array, with or without the backup file, and with or without the new disk /dev/sdc1:

    mdadm --assemble /dev/md0 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1
    mdadm: Failed to restore critical section for reshape, sorry.

Below are the commands and their results, along with some information that may be useful to restore access to the data.
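
Since the per-device mdadm --examine output further down is long, here is a small sketch for condensing it to the fields that seem to matter here (device role, event count, reshape position). It only parses text in the format shown in this message; the function name summarize_examine is my own and is not part of mdadm.

```shell
# Condense `mdadm --examine` output (read from stdin) to one line per device.
# summarize_examine is a hypothetical helper name, not an mdadm tool.
summarize_examine() {
    awk '
        /^\/dev\//        { dev = $1; sub(/:$/, "", dev) }   # "/dev/sdb1:" header line
        /Events :/        { events[dev] = $NF }
        /Reshape pos.n :/ { pos[dev] = $NF }                 # "." matches the apostrophe
        /Device Role :/   { role[dev] = $NF; order[++n] = dev }
        END {
            for (i = 1; i <= n; i++) {
                d = order[i]
                printf "%s role=%s events=%s reshape_pos=%s\n", d, role[d], events[d], pos[d]
            }
        }
    '
}

# Example use (not run here):
#   mdadm --examine /dev/sd[bcde]1 | summarize_examine
```

If every member reports the same event count and a reshape position of 0, as in the output below, that is consistent with the reshape never having moved any data.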

Can someone help me recover from this situation?
If something risky has to be attempted, I can remove the new disk and keep only the 3 original drives in place. I would then have 2 spare drives, which should be enough to back up at least 2 disks (out of 3) before attempting anything risky. With a 3-disk RAID5, having 2 disks in perfect condition should be enough to recover everything, shouldn't it?

Thank you in advance !
Julien

Apr 30 02:48:37 Pix-Server-Sorel kernel: [47933.831101] md: bind<sdc1>
Apr 30 02:48:37 Pix-Server-Sorel kernel: [47934.106022] RAID conf printout:
Apr 30 02:48:37 Pix-Server-Sorel kernel: [47934.106025]  --- level:5 rd:3 wd:3
Apr 30 02:48:37 Pix-Server-Sorel kernel: [47934.106028]  disk 0, o:1, dev:sdd1
Apr 30 02:48:37 Pix-Server-Sorel kernel: [47934.106029]  disk 1, o:1, dev:sde1
Apr 30 02:48:37 Pix-Server-Sorel kernel: [47934.106031]  disk 2, o:1, dev:sdb1
Apr 30 02:49:26 Pix-Server-Sorel kernel: [47982.904011] RAID conf printout:
Apr 30 02:49:26 Pix-Server-Sorel kernel: [47982.904014]  --- level:5 rd:4 wd:4
Apr 30 02:49:26 Pix-Server-Sorel kernel: [47982.904016]  disk 0, o:1, dev:sdd1
Apr 30 02:49:26 Pix-Server-Sorel kernel: [47982.904017]  disk 1, o:1, dev:sde1
Apr 30 02:49:26 Pix-Server-Sorel kernel: [47982.904018]  disk 2, o:1, dev:sdb1
Apr 30 02:49:26 Pix-Server-Sorel kernel: [47982.904019]  disk 3, o:1, dev:sdc1
Apr 30 02:49:26 Pix-Server-Sorel kernel: [47982.904087] md: reshape of RAID array md0
Apr 30 02:49:26 Pix-Server-Sorel kernel: [47982.904090] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
Apr 30 02:49:26 Pix-Server-Sorel kernel: [47982.904092] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reshape.
Apr 30 02:49:26 Pix-Server-Sorel kernel: [47982.904096] md: using 128k window, over a total of 7813894144k.
Apr 30 03:02:37 Pix-Server-Sorel kernel: [48773.766672] md: md0: reshape interrupted.
Apr 30 03:02:37 Pix-Server-Sorel kernel: [48773.827995] md: reshape of RAID array md0
Apr 30 03:02:37 Pix-Server-Sorel kernel: [48773.827997] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
Apr 30 03:02:37 Pix-Server-Sorel kernel: [48773.827999] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reshape.
Apr 30 03:02:37 Pix-Server-Sorel kernel: [48773.828021] md: using 128k window, over a total of 7813894144k.
Apr 30 03:02:37 Pix-Server-Sorel kernel: [48774.027993] md: md0: reshape interrupted.
Apr 30 03:02:37 Pix-Server-Sorel kernel: [48774.112612] md0: detected capacity change from 16002855206912 to 0
Apr 30 03:02:37 Pix-Server-Sorel kernel: [48774.112850] md: md0 stopped.
Apr 30 03:02:37 Pix-Server-Sorel kernel: [48774.112860] md: unbind<sdc1>
Apr 30 03:02:37 Pix-Server-Sorel kernel: [48774.132027] md: export_rdev(sdc1)
Apr 30 03:02:37 Pix-Server-Sorel kernel: [48774.132073] md: unbind<sdb1>
Apr 30 03:02:37 Pix-Server-Sorel kernel: [48774.148016] md: export_rdev(sdb1)
Apr 30 03:02:37 Pix-Server-Sorel kernel: [48774.148261] md: unbind<sde1>
Apr 30 03:02:37 Pix-Server-Sorel kernel: [48774.164018] md: export_rdev(sde1)
Apr 30 03:02:37 Pix-Server-Sorel kernel: [48774.164268] md: unbind<sdd1>
Apr 30 03:02:37 Pix-Server-Sorel kernel: [48774.200025] md: export_rdev(sdd1)


root@Pix-Server-Sorel:/home/user# mdadm --assemble /dev/md0 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1
mdadm: Failed to restore critical section for reshape, sorry.
       Possibly you needed to specify the --backup-file
root@Pix-Server-Sorel:/home/user# mdadm --stop /dev/md0
mdadm: stopped /dev/md0
root@Pix-Server-Sorel:/home/user# mdadm --assemble /dev/md0 /dev/sdb1 /dev/sdd1 /dev/sde1
mdadm: Failed to restore critical section for reshape, sorry.
       Possibly you needed to specify the --backup-file
root@Pix-Server-Sorel:/home/user# mdadm --stop /dev/md0
mdadm: stopped /dev/md0
root@Pix-Server-Sorel:/home/user# mdadm --assemble /dev/md0 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 --backup-file /root/grow_md0.bak
mdadm: Failed to restore critical section for reshape, sorry.
root@Pix-Server-Sorel:/home/user# mdadm --stop /dev/md0
mdadm: stopped /dev/md0
root@Pix-Server-Sorel:/home/user# mdadm --assemble /dev/md0 /dev/sdb1 /dev/sdd1 /dev/sde1 --backup-file /root/grow_md0.bak
mdadm: Failed to restore critical section for reshape, sorry.
root@Pix-Server-Sorel:/home/user# mdadm --examine --scan --verbose
ARRAY /dev/md/0  level=raid5 metadata=1.2 num-devices=4 UUID=293c6b6c:de6abd61:0a546f46:9996ba16 name=Pix-Server-Sorel:0
   devices=/dev/sdc1,/dev/sde1,/dev/sdb1,/dev/sdd1
root@Pix-Server-Sorel:/home/user# mdadm --examine /dev/md0
root@Pix-Server-Sorel:/home/user# mdadm --examine /dev/sdb
/dev/sdb:
   MBR Magic : aa55
Partition[0] :   4294967295 sectors at            1 (type ee)
root@Pix-Server-Sorel:/home/user# mdadm --examine /dev/sdb1
/dev/sdb1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x5
     Array UUID : 293c6b6c:de6abd61:0a546f46:9996ba16
           Name : Pix-Server-Sorel:0  (local to host Pix-Server-Sorel)
  Creation Time : Sat Mar 17 22:18:02 2018
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 15627788288 (7451.91 GiB 8001.43 GB)
     Array Size : 23441682432 (22355.73 GiB 24004.28 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=0 sectors
          State : clean
    Device UUID : 656151c6:a45bd737:d6099641:520ed472

Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 0
  Delta Devices : 1 (3->4)

    Update Time : Tue Apr 30 03:02:37 2019
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : e8cc435a - correct
         Events : 80978

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 2
   Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
root@Pix-Server-Sorel:/home/user# mdadm --examine /dev/sdc1
/dev/sdc1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x5
     Array UUID : 293c6b6c:de6abd61:0a546f46:9996ba16
           Name : Pix-Server-Sorel:0  (local to host Pix-Server-Sorel)
  Creation Time : Sat Mar 17 22:18:02 2018
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 15627788288 (7451.91 GiB 8001.43 GB)
     Array Size : 23441682432 (22355.73 GiB 24004.28 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=0 sectors
          State : clean
    Device UUID : 1ba9976c:25477f1b:4d8f0f64:5780a217

Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 0
  Delta Devices : 1 (3->4)

    Update Time : Tue Apr 30 03:02:37 2019
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 3a026e0e - correct
         Events : 80978

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 3
   Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
root@Pix-Server-Sorel:/home/user# mdadm --examine /dev/sdd1
/dev/sdd1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x5
     Array UUID : 293c6b6c:de6abd61:0a546f46:9996ba16
           Name : Pix-Server-Sorel:0  (local to host Pix-Server-Sorel)
  Creation Time : Sat Mar 17 22:18:02 2018
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 15627788288 (7451.91 GiB 8001.43 GB)
     Array Size : 23441682432 (22355.73 GiB 24004.28 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=0 sectors
          State : clean
    Device UUID : 5b2f6332:ade8d470:2a6687eb:4386a7a6

Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 0
  Delta Devices : 1 (3->4)

    Update Time : Tue Apr 30 03:02:37 2019
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 6ba0729c - correct
         Events : 80978

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
root@Pix-Server-Sorel:/home/user# mdadm --examine /dev/sde1
/dev/sde1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x5
     Array UUID : 293c6b6c:de6abd61:0a546f46:9996ba16
           Name : Pix-Server-Sorel:0  (local to host Pix-Server-Sorel)
  Creation Time : Sat Mar 17 22:18:02 2018
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 15627788288 (7451.91 GiB 8001.43 GB)
     Array Size : 23441682432 (22355.73 GiB 24004.28 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262056 sectors, after=0 sectors
          State : clean
    Device UUID : 8ca89464:e4353dea:bd1a45f4:8cc7b9a5

Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 0
  Delta Devices : 1 (3->4)

    Update Time : Tue Apr 30 03:02:37 2019
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 1f0a2ee3 - correct
         Events : 80978

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 1
   Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
root@Pix-Server-Sorel:/home/user# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md0 : inactive sde1[2](S) sdd1[0](S) sdb1[3](S)
      23441682432 blocks super 1.2
       
unused devices: <none>
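
As a sanity check on the sizes reported above, the numbers are consistent with the usual RAID5 rule that usable capacity is (number of disks - 1) times the per-disk size. This is only a sketch of the arithmetic, with the per-disk size taken from the "over a total of 7813894144k" kernel line; the function name raid5_capacity_kib is made up for illustration, and it needs a shell with 64-bit arithmetic (bash does).

```shell
# Usable RAID5 capacity in KiB: one disk's worth of space goes to parity.
# raid5_capacity_kib is a hypothetical helper name.
raid5_capacity_kib() {
    disks=$1
    per_disk_kib=$2
    echo $(( (disks - 1) * per_disk_kib ))
}

per_disk_kib=7813894144   # from the kernel log: "over a total of 7813894144k"

# 3-disk layout, in bytes: matches "detected capacity change from 16002855206912 to 0"
echo $(( $(raid5_capacity_kib 3 "$per_disk_kib") * 1024 ))

# 4-disk layout, in KiB: matches "Array Size : 23441682432" in the superblocks
raid5_capacity_kib 4 "$per_disk_kib"
```

So the kernel was still exposing the 3-disk size when the array was stopped, while the superblocks on all four members already describe the 4-disk geometry.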
