Re: Stuck array after reshape

Some more information:

The backup file (/root/vg_3T_reshape_201405_mdbackup) was never created.

Output of mdadm --examine on the array members (/dev/sdc1 and /dev/sdf1
belong to a separate RAID1 array):
/dev/sdb1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x4
     Array UUID : 0776e3ee:8eadc682:c8ff2ffd:d55da146
           Name : provider:5  (local to host provider)
  Creation Time : Fri Dec 14 19:30:10 2012
     Raid Level : raid6
   Raid Devices : 6

 Avail Dev Size : 5857157120 (2792.91 GiB 2998.86 GB)
     Array Size : 11714312192 (11171.64 GiB 11995.46 GB)
  Used Dev Size : 5857156096 (2792.91 GiB 2998.86 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : active
    Device UUID : aeefb8d7:6ccfae75:44060614:0bfb97d8

  Reshape pos'n : 8192 (8.00 MiB 8.39 MB)
  Delta Devices : 1 (5->6)
     New Layout : left-symmetric

    Update Time : Tue May 20 18:29:25 2014
       Checksum : dfd2dc2b - correct
         Events : 40365

         Layout : left-symmetric-6
     Chunk Size : 512K

   Device Role : Active device 1
   Array State : AAAAAA ('A' == active, '.' == missing)
/dev/sdc1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : f9d7a8cc:b8f830c4:b748a00c:d0712fef
  Creation Time : Mon May 18 20:35:08 2009
     Raid Level : raid1
  Used Dev Size : 979840 (957.04 MiB 1003.36 MB)
     Array Size : 979840 (957.04 MiB 1003.36 MB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0

    Update Time : Tue May 20 18:04:34 2014
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 8151619e - correct
         Events : 2657


      Number   Major   Minor   RaidDevice State
this     0       8       33        0      active sync   /dev/sdc1

   0     0       8       33        0      active sync   /dev/sdc1
   1     1       8       81        1      active sync   /dev/sdf1
/dev/sdd1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x4
     Array UUID : 0776e3ee:8eadc682:c8ff2ffd:d55da146
           Name : provider:5  (local to host provider)
  Creation Time : Fri Dec 14 19:30:10 2012
     Raid Level : raid6
   Raid Devices : 6

 Avail Dev Size : 5857157120 (2792.91 GiB 2998.86 GB)
     Array Size : 11714312192 (11171.64 GiB 11995.46 GB)
  Used Dev Size : 5857156096 (2792.91 GiB 2998.86 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : active
    Device UUID : 4e00dec5:806115fc:f89d0c60:f000afc8

  Reshape pos'n : 8192 (8.00 MiB 8.39 MB)
  Delta Devices : 1 (5->6)
     New Layout : left-symmetric

    Update Time : Tue May 20 18:29:25 2014
       Checksum : 907c1c76 - correct
         Events : 40365

         Layout : left-symmetric-6
     Chunk Size : 512K

   Device Role : Active device 3
   Array State : AAAAAA ('A' == active, '.' == missing)
/dev/sde1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x4
     Array UUID : 0776e3ee:8eadc682:c8ff2ffd:d55da146
           Name : provider:5  (local to host provider)
  Creation Time : Fri Dec 14 19:30:10 2012
     Raid Level : raid6
   Raid Devices : 6

 Avail Dev Size : 5857157120 (2792.91 GiB 2998.86 GB)
     Array Size : 11714312192 (11171.64 GiB 11995.46 GB)
  Used Dev Size : 5857156096 (2792.91 GiB 2998.86 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : active
    Device UUID : a912e9ad:a9c802bb:26d296f2:7efe0ef6

  Reshape pos'n : 8192 (8.00 MiB 8.39 MB)
  Delta Devices : 1 (5->6)
     New Layout : left-symmetric

    Update Time : Tue May 20 18:29:25 2014
       Checksum : f75ec7b8 - correct
         Events : 40365

         Layout : left-symmetric-6
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AAAAAA ('A' == active, '.' == missing)
/dev/sdf1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : f9d7a8cc:b8f830c4:b748a00c:d0712fef
  Creation Time : Mon May 18 20:35:08 2009
     Raid Level : raid1
  Used Dev Size : 979840 (957.04 MiB 1003.36 MB)
     Array Size : 979840 (957.04 MiB 1003.36 MB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0

    Update Time : Tue May 20 18:04:34 2014
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 815161d0 - correct
         Events : 2657


      Number   Major   Minor   RaidDevice State
this     1       8       81        1      active sync   /dev/sdf1

   0     0       8       33        0      active sync   /dev/sdc1
   1     1       8       81        1      active sync   /dev/sdf1
/dev/sdg1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x6
     Array UUID : 0776e3ee:8eadc682:c8ff2ffd:d55da146
           Name : provider:5  (local to host provider)
  Creation Time : Fri Dec 14 19:30:10 2012
     Raid Level : raid6
   Raid Devices : 6

 Avail Dev Size : 5857157120 (2792.91 GiB 2998.86 GB)
     Array Size : 11714312192 (11171.64 GiB 11995.46 GB)
  Used Dev Size : 5857156096 (2792.91 GiB 2998.86 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
Recovery Offset : 4096 sectors
          State : active
    Device UUID : 4f8b8fa9:7b1fa78d:f949e727:e49b5647

  Reshape pos'n : 8192 (8.00 MiB 8.39 MB)
  Delta Devices : 1 (5->6)
     New Layout : left-symmetric

    Update Time : Tue May 20 18:29:25 2014
       Checksum : 4c41bc6d - correct
         Events : 40365

         Layout : left-symmetric-6
     Chunk Size : 512K

   Device Role : Active device 4
   Array State : AAAAAA ('A' == active, '.' == missing)
/dev/sdh1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x4
     Array UUID : 0776e3ee:8eadc682:c8ff2ffd:d55da146
           Name : provider:5  (local to host provider)
  Creation Time : Fri Dec 14 19:30:10 2012
     Raid Level : raid6
   Raid Devices : 6

 Avail Dev Size : 5857157120 (2792.91 GiB 2998.86 GB)
     Array Size : 11714312192 (11171.64 GiB 11995.46 GB)
  Used Dev Size : 5857156096 (2792.91 GiB 2998.86 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : active
    Device UUID : 40d5699d:911a2a3f:d91abf18:49ba8467

  Reshape pos'n : 8192 (8.00 MiB 8.39 MB)
  Delta Devices : 1 (5->6)
     New Layout : left-symmetric

    Update Time : Tue May 20 18:29:25 2014
       Checksum : 2a4e0b6 - correct
         Events : 40365

         Layout : left-symmetric-6
     Chunk Size : 512K

   Device Role : Active device 5
   Array State : AAAAAA ('A' == active, '.' == missing)
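
Comparing six superblocks by eye is error-prone, so here is a minimal Python sketch that parses --examine text like the output above, groups devices by array UUID, and reports any field on which the members of one array disagree. The function names (parse_examine, group_by_array, disagreements) and the abbreviated SAMPLE are illustrative assumptions, not part of mdadm; only the field labels are taken from the output above.

```python
import re
from collections import defaultdict

# Abbreviated sample copied from the --examine output above (three devices).
SAMPLE = """\
/dev/sdb1:
        Version : 1.2
     Array UUID : 0776e3ee:8eadc682:c8ff2ffd:d55da146
  Reshape pos'n : 8192 (8.00 MiB 8.39 MB)
         Events : 40365
    Device Role : Active device 1
/dev/sdc1:
        Version : 0.90.00
           UUID : f9d7a8cc:b8f830c4:b748a00c:d0712fef
         Events : 2657
/dev/sdg1:
        Version : 1.2
     Array UUID : 0776e3ee:8eadc682:c8ff2ffd:d55da146
Recovery Offset : 4096 sectors
  Reshape pos'n : 8192 (8.00 MiB 8.39 MB)
         Events : 40365
    Device Role : Active device 4
"""

DEVICE_RE = re.compile(r"^(/dev/\S+):\s*$")          # e.g. "/dev/sdb1:"
FIELD_RE = re.compile(r"^\s*([^:]+?)\s*:\s*(.*)$")   # e.g. "  Events : 40365"


def parse_examine(text):
    """Return {device: {field label: value}} from mdadm --examine text."""
    devices = {}
    current = None
    for line in text.splitlines():
        m = DEVICE_RE.match(line)
        if m:
            current = devices.setdefault(m.group(1), {})
            continue
        if current is None:
            continue
        m = FIELD_RE.match(line)
        if m:
            # Keep the first occurrence of each label within a device block.
            current.setdefault(m.group(1), m.group(2).strip())
    return devices


def group_by_array(devices):
    """Group devices by array UUID (v1.x prints 'Array UUID', 0.90 prints 'UUID')."""
    groups = defaultdict(dict)
    for dev, fields in devices.items():
        uuid = fields.get("Array UUID") or fields.get("UUID")
        groups[uuid][dev] = fields
    return dict(groups)


def disagreements(members, keys=("Events", "Reshape pos'n")):
    """Return {key: {device: value}} for every key whose values differ."""
    result = {}
    for key in keys:
        values = {dev: fields.get(key) for dev, fields in members.items()}
        if len(set(values.values())) > 1:
            result[key] = values
    return result


if __name__ == "__main__":
    for uuid, members in group_by_array(parse_examine(SAMPLE)).items():
        print(uuid, sorted(members), disagreements(members) or "consistent")
```

To use it on a live system, one would feed it the saved text of the --examine run instead of SAMPLE. Reading saved text keeps the script from touching the hung array.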




On Tue, 20 May 2014 18:56:17 +0000
Davíð Steinn Geirsson <david@xxxxxx> wrote:

> Hi all,
> 
> I tried to reshape an MD RAID array, going from a 4-disk RAID5 to a
> 6-disk RAID6. This seems to have failed and now I'm afraid to turn the
> machine off.
> 
> What I did:
> mdadm --add /dev/md5 /dev/sdh1
> mdadm --add /dev/md5 /dev/sdg1
> mdadm --grow /dev/md5 \
>   --backup-file /root/vg_3T_reshape_201405_mdbackup \
>   --level=6 --raid-devices=6
> 
> The last command returned with no error, the way it usually does.
> However, now everything that tries to access the array hangs:
> mdadm -D /dev/md5 # hangs
> cat /proc/mdstat # hangs
> Trying to read mounted filesystems also hangs.
> 
> The two new drives are on a brand new IBM M1015 (crossflashed to LSI
> 9211). I have not used this controller previously, but before I tried
> the reshape I did write a GPT partition table and successfully read it
> back from the two drives.
> 
> From dmesg around this time:
> [ 1340.951731] md: bind<sdh1>
> [ 1346.150654] scsi_verify_blk_ioctl: 38 callbacks suppressed
> [ 1346.150662] mdadm: sending ioctl 1261 to a partition!
> [ 1346.150669] mdadm: sending ioctl 1261 to a partition!
> [ 1346.155219] mdadm: sending ioctl 1261 to a partition!
> [ 1346.155228] mdadm: sending ioctl 1261 to a partition!
> [ 1346.160528] mdadm: sending ioctl 1261 to a partition!
> [ 1346.160535] mdadm: sending ioctl 1261 to a partition!
> [ 1346.160688] mdadm: sending ioctl 1261 to a partition!
> [ 1346.160694] mdadm: sending ioctl 1261 to a partition!
> [ 1346.160913] mdadm: sending ioctl 1261 to a partition!
> [ 1346.160918] mdadm: sending ioctl 1261 to a partition!
> [ 1346.185864] md: bind<sdg1>
> [ 1370.267086] scsi_verify_blk_ioctl: 38 callbacks suppressed
> [ 1370.267095] mdadm: sending ioctl 1261 to a partition!
> [ 1370.267103] mdadm: sending ioctl 1261 to a partition!
> [ 1461.662068] mdadm: sending ioctl 1261 to a partition!
> [ 1461.662078] mdadm: sending ioctl 1261 to a partition!
> [ 1521.675927] md/raid:md5: device sde1 operational as raid disk 0
> [ 1521.675937] md/raid:md5: device sdd1 operational as raid disk 3
> [ 1521.675943] md/raid:md5: device sda1 operational as raid disk 2
> [ 1521.675949] md/raid:md5: device sdb1 operational as raid disk 1
> [ 1521.677471] md/raid:md5: allocated 5332kB
> [ 1521.692766] md/raid:md5: raid level 6 active with 4 out of 5 devices, algorithm 18
> [ 1521.692849] RAID conf printout:
> [ 1521.692853]  --- level:6 rd:5 wd:4
> [ 1521.692859]  disk 0, o:1, dev:sde1
> [ 1521.692864]  disk 1, o:1, dev:sdb1
> [ 1521.692869]  disk 2, o:1, dev:sda1
> [ 1521.692873]  disk 3, o:1, dev:sdd1
> [ 1522.801181] RAID conf printout:
> [ 1522.801190]  --- level:6 rd:6 wd:5
> [ 1522.801196]  disk 0, o:1, dev:sde1
> [ 1522.801201]  disk 1, o:1, dev:sdb1
> [ 1522.801205]  disk 2, o:1, dev:sda1
> [ 1522.801210]  disk 3, o:1, dev:sdd1
> [ 1522.801215]  disk 4, o:1, dev:sdg1
> [ 1522.801230] RAID conf printout:
> [ 1522.801234]  --- level:6 rd:6 wd:5
> [ 1522.801239]  disk 0, o:1, dev:sde1
> [ 1522.801243]  disk 1, o:1, dev:sdb1
> [ 1522.801248]  disk 2, o:1, dev:sda1
> [ 1522.801252]  disk 3, o:1, dev:sdd1
> [ 1522.801256]  disk 4, o:1, dev:sdg1
> [ 1522.801261]  disk 5, o:1, dev:sdh1
> [ 1522.801374] md: reshape of RAID array md5
> [ 1522.801379] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
> [ 1522.801384] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reshape.
> [ 1522.801396] md: using 128k window, over a total of 2928578048k.
> [ 1522.802248] mdadm: sending ioctl 1261 to a partition!
> [ 1522.802256] mdadm: sending ioctl 1261 to a partition!
> [ 1522.883851] mdadm: sending ioctl 1261 to a partition!
> [ 1522.883860] mdadm: sending ioctl 1261 to a partition!
> [ 1525.134837] md: md_do_sync() got signal ... exiting
> [ 1681.128046] INFO: task jbd2/dm-3-8:1494 blocked for more than 120 seconds.
> [ 1681.128129] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 1681.128206] jbd2/dm-3-8     D ffff88007fc13540     0  1494      2 0x00000000
> [ 1681.128217]  ffff88007beab0a0 0000000000000046 ffff88005d0b1470 ffff88007a3208b0
> [ 1681.128227]  0000000000013540 ffff88007c0dffd8 ffff88007c0dffd8 ffff88007beab0a0
> [ 1681.128236]  059f7b5300000000 ffffffff81065a2f ffff88007bbe4d70 ffff88007fc13d90
> [ 1681.128245] Call Trace:
> [ 1681.128263]  [<ffffffff81065a2f>] ? timekeeping_get_ns+0xd/0x2a
> [ 1681.128273]  [<ffffffff8111bda1>] ? wait_on_buffer+0x28/0x28
> [ 1681.128283]  [<ffffffff813483e4>] ? io_schedule+0x59/0x71
> [ 1681.128289]  [<ffffffff8111bda7>] ? sleep_on_buffer+0x6/0xa
> [ 1681.128296]  [<ffffffff81348827>] ? __wait_on_bit+0x3e/0x71
> [ 1681.128303]  [<ffffffff813488c9>] ? out_of_line_wait_on_bit+0x6f/0x78
> [ 1681.128310]  [<ffffffff8111bda1>] ? wait_on_buffer+0x28/0x28
> [ 1681.128319]  [<ffffffff8105f575>] ? autoremove_wake_function+0x2a/0x2a
> [ 1681.128354]  [<ffffffffa018d9c0>] ? jbd2_journal_commit_transaction+0xb9b/0x1057 [jbd2]
> [ 1681.128366]  [<ffffffff8100d02f>] ? load_TLS+0x7/0xa
> [ 1681.128373]  [<ffffffff8100d6a3>] ? __switch_to+0x133/0x258
> [ 1681.128389]  [<ffffffffa01910ae>] ? kjournald2+0xc0/0x20a [jbd2]
> [ 1681.128397]  [<ffffffff8105f54b>] ? add_wait_queue+0x3c/0x3c
> [ 1681.128412]  [<ffffffffa0190fee>] ? commit_timeout+0x5/0x5 [jbd2]
> [ 1681.128420]  [<ffffffff8105ef05>] ? kthread+0x76/0x7e
> [ 1681.128430]  [<ffffffff813505b4>] ? kernel_thread_helper+0x4/0x10
> [ 1681.128438]  [<ffffffff8105ee8f>] ? kthread_worker_fn+0x139/0x139
> [ 1681.128446]  [<ffffffff813505b0>] ? gs_change+0x13/0x13
> [... more hung task warnings from other processes follow ...]
> 
> 
> This machine is running Debian wheezy. mdadm is version 3.2.5-1 and the
> kernel is 3.2.18-1 (3.2.0-2-amd64), both from wheezy.
> 
> Any help would be much appreciated! Especially if the data is
> recoverable. It's possible that the reshape process never actually got
> started and rebooting the machine without the new disks will make
> everything "just work"... but I don't want to try that just yet, in
> case it prevents future data recovery work.
> 
> Any thoughts? Or more debug info I could provide to diagnose this?
> 
> Best regards,
> Davíð
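
As a sanity check on the numbers above, the short Python sketch below confirms that the superblock sizes and the dmesg reshape total agree with each other. It assumes that Used Dev Size is counted in 512-byte sectors and Array Size in KiB. That assumption is inferred from the GiB figures printed next to them, not taken from mdadm documentation, and the helper names are illustrative.

```python
SECTOR_BYTES = 512                    # assumed unit for "Used Dev Size"

used_dev_size_sectors = 5857156096    # "Used Dev Size" from --examine
array_size_kib = 11714312192          # "Array Size" from --examine (assumed KiB)
reshape_total_kib = 2928578048        # dmesg: "over a total of 2928578048k"
raid_devices = 6                      # "Raid Devices" after the reshape
parity_devices = 2                    # RAID6 keeps two parity blocks per stripe


def per_device_kib(sectors):
    """Convert a per-device size in sectors to KiB."""
    return sectors * SECTOR_BYTES // 1024


def expected_array_kib(device_kib, devices, parity):
    """Usable capacity: per-device size times the number of data devices."""
    return device_kib * (devices - parity)


def kib_to_gib(kib):
    return kib / 2**20


if __name__ == "__main__":
    dev_kib = per_device_kib(used_dev_size_sectors)
    print("per device:", dev_kib, "KiB =", round(kib_to_gib(dev_kib), 2), "GiB")
    print("matches dmesg reshape total:", dev_kib == reshape_total_kib)
    total = expected_array_kib(dev_kib, raid_devices, parity_devices)
    print("expected array size:", total, "KiB =", round(kib_to_gib(total), 2), "GiB")
    print("matches Array Size:", total == array_size_kib)
```

Under those unit assumptions, the per-device size equals the total that the kernel reported for the reshape, and four data devices' worth of it equals the Array Size in the superblocks. So the superblocks already describe the 6-device geometry, even though Reshape pos'n is only at 8 MiB.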


