raid5 reshape failure - restart?

In trying to reshape a raid5 array, I encountered some problems.
I was trying to reshape from raid5 3->4 devices.  The reshape process
started with seemingly no problems, but I noticed in the kernel log
a number of "ata3.00: failed command: WRITE FPDMA QUEUED" errors.
While trying to work out whether this was going to cause trouble, I
disabled NCQ on that device.  Looking at the log, I noticed that
around the same time /dev/sdd reported problems and took itself
offline.  At that point the reshape seemed to be continuing without
issue, even though one of the drives was offline; I wasn't sure that
made sense.
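
(For reference, this is roughly how I disabled NCQ, by dropping the
queue depth to 1 through sysfs; sdX below just stands in for the
drive in question:)

# echo 1 > /sys/block/sdX/device/queue_depth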

Shortly afterwards, I noticed that progress on the reshape had
stalled.  I tried raising stripe_cache_size from 256 to 1024, 2048
and then 4096, but the reshape did not resume.  top reported that the
reshape process was using 100% of one core, and the load average was
climbing into the 50s.
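
(I was changing it roughly like this, through the array's sysfs
entry; the values listed are the ones I tried:)

# echo 1024 > /sys/block/md_d2/md/stripe_cache_size
# echo 2048 > /sys/block/md_d2/md/stripe_cache_size
# echo 4096 > /sys/block/md_d2/md/stripe_cache_size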

At this point I rebooted.   The array does not start.

Can the reshape be restarted?  I cannot figure out where the backup
file ended up.  It does not seem to be where I thought I saved it.
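
(If it can be restarted, I assume it would look something like the
command below, with --backup-file pointing at whatever file mdadm
wrote when the grow started; the path shown is only a placeholder,
and I haven't run this since I can't find the file and don't know
whether --force is safe here:)

# mdadm --assemble --force --backup-file=/path/to/backup-file \
        /dev/md_d2 /dev/sda5 /dev/sdb5 /dev/sdc5 /dev/sdd5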

Can I assemble this array with only the 3 original devices?  Is there
a way to recover at least some of the data on the array?  I have
various backups, but there is some stuff that was not "critical" but
would still be handy not to lose.
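
(If that is even possible partway through a reshape, I imagine the
attempt would look something like the following, with sdX5/sdY5/sdZ5
standing in for the three original member partitions; I'm asking
before trying anything, since I don't want to make matters worse:)

# mdadm --assemble --force /dev/md_d2 /dev/sdX5 /dev/sdY5 /dev/sdZ5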

Various logs and command output that may be helpful are below; md_d2
is the array in question.  Thanks.
--Glen

# mdadm --version
mdadm - v3.1.4 - 31st August 2010

# uname -a
Linux palidor 2.6.36-gentoo-r5 #1 SMP Wed Mar 2 20:54:16 EST 2011
x86_64 Intel(R) Core(TM)2 Quad CPU Q9450 @ 2.66GHz GenuineIntel
GNU/Linux

current state:

# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [multipath] [raid1]
md8 : active raid5 sdh1[0] sdg1[4] sdf1[1] sdi1[3] sde1[2]
      5860542464 blocks level 5, 512k chunk, algorithm 2 [5/5] [UUUUU]

md_d2 : inactive sdb5[1](S) sda5[0](S) sdd5[2](S) sdc5[3](S)
      2799357952 blocks super 0.91

md1 : active raid5 sdd3[2] sdb3[1] sda3[0]
      62926336 blocks level 5, 256k chunk, algorithm 2 [3/3] [UUU]

md0 : active raid1 sdb1[1] sda1[0] sdd1[2]
      208704 blocks [3/3] [UUU]


# mdadm -E /dev/sdb5   (sd[abc]5 are all similar)
/dev/sdb5:
          Magic : a92b4efc
        Version : 0.91.00
           UUID : 2803efc9:c5d2ec1e:9894605d:35c5ea6f
  Creation Time : Sat Oct  3 11:01:02 2009
     Raid Level : raid5
  Used Dev Size : 699839488 (667.42 GiB 716.64 GB)
     Array Size : 2099518464 (2002.26 GiB 2149.91 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 2

  Reshape pos'n : 62731776 (59.83 GiB 64.24 GB)
  Delta Devices : 1 (3->4)

    Update Time : Sun May 15 11:25:21 2011
          State : active
 Active Devices : 3
Working Devices : 3
 Failed Devices : 1
  Spare Devices : 0
       Checksum : 2f2eac3a - correct
         Events : 114069

         Layout : left-symmetric
     Chunk Size : 256K

      Number   Major   Minor   RaidDevice State
this     1       8       21        1      active sync   /dev/sdb5

   0     0       8        5        0      active sync   /dev/sda5
   1     1       8       21        1      active sync   /dev/sdb5
   2     2       0        0        2      faulty removed
   3     3       8       37        3      active sync   /dev/sdc5

# mdadm -E /dev/sdd5
/dev/sdd5:
          Magic : a92b4efc
        Version : 0.91.00
           UUID : 2803efc9:c5d2ec1e:9894605d:35c5ea6f
  Creation Time : Sat Oct  3 11:01:02 2009
     Raid Level : raid5
  Used Dev Size : 699839488 (667.42 GiB 716.64 GB)
     Array Size : 2099518464 (2002.26 GiB 2149.91 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 2

  Reshape pos'n : 18048768 (17.21 GiB 18.48 GB)
  Delta Devices : 1 (3->4)

    Update Time : Sun May 15 10:51:41 2011
          State : clean
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 29dcc275 - correct
         Events : 113870

         Layout : left-symmetric
     Chunk Size : 256K

      Number   Major   Minor   RaidDevice State
this     2       8       53        2      active sync   /dev/sdd5

   0     0       8        5        0      active sync   /dev/sda5
   1     1       8       21        1      active sync   /dev/sdb5
   2     2       8       53        2      active sync   /dev/sdd5
   3     3       8       37        3      active sync   /dev/sdc5