Re: mdadm --grow failed

Marc Marais wrote:
I'm trying to grow my RAID 5 array, as I've just added a new disk. The array was originally 3 drives; I've added a fourth using:

mdadm -a /dev/md6 /dev/sda1

Which added the new drive as a spare. I then did:

mdadm --grow /dev/md6 -n 4

Which started the reshape operation.
Feb 16 23:51:40 xerces kernel: RAID5 conf printout:
Feb 16 23:51:40 xerces kernel:  --- rd:4 wd:4
Feb 16 23:51:40 xerces kernel:  disk 0, o:1, dev:sdb1
Feb 16 23:51:40 xerces kernel:  disk 1, o:1, dev:sdc1
Feb 16 23:51:40 xerces kernel:  disk 2, o:1, dev:sdd1
Feb 16 23:51:40 xerces kernel:  disk 3, o:1, dev:sda1
Feb 16 23:51:40 xerces kernel: md: reshape of RAID array md6
Feb 16 23:51:40 xerces kernel: md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
Feb 16 23:51:40 xerces kernel: md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reshape.
Feb 16 23:51:40 xerces kernel: md: using 128k window, over a total of 156288256 blocks.
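
For reference, the reshape can be watched while it runs; a minimal sketch, assuming the standard /proc/mdstat interface:

  watch cat /proc/mdstat                 # shows a reshape progress bar and ETA
  mdadm -D /dev/md6 | grep -i reshape    # prints a "Reshape Status : N% complete" line while active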

Unfortunately one of the drives timed out during the operation (not a read error, just a timeout, which I would have thought would be retried, but anyway...):

Feb 17 00:19:16 xerces kernel: ata3: command timeout
Feb 17 00:19:16 xerces kernel: ata3: no sense translation for status: 0x40
Feb 17 00:19:16 xerces kernel: ata3: translated ATA stat/err 0x40/00 to SCSI SK/ASC/ASCQ 0xb/00/00
Feb 17 00:19:16 xerces kernel: ata3: status=0x40 { DriveReady }
Feb 17 00:19:16 xerces kernel: sd 3:0:0:0: SCSI error: return code = 0x08000002
Feb 17 00:19:16 xerces kernel: sdc: Current [descriptor]: sense key: Aborted Command
Feb 17 00:19:16 xerces kernel: Additional sense: No additional sense information
Feb 17 00:19:16 xerces kernel: Descriptor sense data with sense descriptors (in hex):
Feb 17 00:19:16 xerces kernel: 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
Feb 17 00:19:16 xerces kernel: 00 00 00 01
Feb 17 00:19:16 xerces kernel: end_request: I/O error, dev sdc, sector 24065423
Feb 17 00:19:16 xerces kernel: raid5: Disk failure on sdc1, disabling device. Operation continuing on 3 devices

Which then unfortunately aborted the reshape operation:

Feb 17 00:19:16 xerces kernel: md: md6: reshape done.
Feb 17 00:19:17 xerces kernel: RAID5 conf printout:
Feb 17 00:19:17 xerces kernel:  --- rd:4 wd:3
Feb 17 00:19:17 xerces kernel:  disk 0, o:1, dev:sdb1
Feb 17 00:19:17 xerces kernel:  disk 1, o:0, dev:sdc1
Feb 17 00:19:17 xerces kernel:  disk 2, o:1, dev:sdd1
Feb 17 00:19:17 xerces kernel:  disk 3, o:1, dev:sda1
Feb 17 00:19:17 xerces kernel: RAID5 conf printout:
Feb 17 00:19:17 xerces kernel:  --- rd:4 wd:3
Feb 17 00:19:17 xerces kernel:  disk 0, o:1, dev:sdb1
Feb 17 00:19:17 xerces kernel:  disk 2, o:1, dev:sdd1
Feb 17 00:19:17 xerces kernel:  disk 3, o:1, dev:sda1

I re-added the failed disk (sdc), which by the way is a brand new disk (this looks like a controller issue, possibly triggered by high I/O load?), and the array then resynced.
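
For anyone repeating this, the re-add goes roughly like so (a sketch; it assumes sdc1 was marked faulty and has to be removed before it can be added back):

  mdadm /dev/md6 -r /dev/sdc1    # remove the failed member from the array
  mdadm /dev/md6 -a /dev/sdc1    # add it back; md treats it as a fresh spare and rebuilds it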

At this point I'm confused as to the state of the array.

mdadm -D /dev/md6 gives:

/dev/md6:
        Version : 00.91.03
  Creation Time : Tue Aug  1 23:31:54 2006
     Raid Level : raid5
     Array Size : 312576512 (298.10 GiB 320.08 GB)
  Used Dev Size : 156288256 (149.05 GiB 160.04 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 6
    Persistence : Superblock is persistent

    Update Time : Sat Feb 17 12:14:22 2007
          State : clean
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 128K

  Delta Devices : 1, (3->4)

           UUID : 603e7ac0:de4df2d1:d44c6b9b:3d20ad32
         Events : 0.7215890

    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       8       33        1      active sync   /dev/sdc1
       2       8       49        2      active sync   /dev/sdd1
       3       8        1        3      active sync   /dev/sda1

Although previously (before I issued the command below) it reported something to the effect of the reshape being 1% complete.

I've attempted to continue the reshape by issuing:

mdadm --grow /dev/md6 -n 4

Which gives the error that the array can't be reshaped without increasing its size!
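
One way to see where the reshape actually stands is to inspect a member's superblock (a sketch; the field names assume the 0.90/0.91 metadata that mdadm -D reports above):

  mdadm --examine /dev/sdb1 | grep -E 'Version|Reshape'
  # a Version of 00.91 together with a "Reshape pos'n" line means the
  # superblock still records a reshape in progress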

Is my array destroyed? Seeing as the sda disk wasn't completely synced, I wonder what it was using to resync the array when sdc went offline. I've got a bad feeling about this :|

Help appreciated. (I do have a full backup, of course, but that's a last resort; with my luck I'd get a read error from the tape drive.)

I have to think maybe a 'check' would have been good before the grow, but since Neil didn't suggest it, please don't run one now, unless he agrees that it's a valid attempt.

However, you certainly can run 'df' and see if the filesystem is resized.
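
For example (a sketch; /data stands in for wherever md6 is mounted):

  df -h /data                      # filesystem size as mounted
  blockdev --getsize64 /dev/md6    # raw array size in bytes, for comparison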

--
bill davidsen <davidsen@xxxxxxx>
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979
