Hum.

Why not simply use the "fail" option to mark the problematic drive as
failed and thus take it offline?

mdadm --fail /your/raid/device /the/drive/you/want/to/fail

You can "remove" the drive afterward with the "remove" command ;-).

I don't think you should do any "physical" operation like disconnecting
the power supply of a live disk, even if it is a dodgy disk.

It "eliminates the possibility that the bad disk will lock up the system"
but "creates the possibility of a short circuit and having no more system
at all".

Pascal Charest

--
Pascal Charest, Free software consultant {GNU/Linux}
http://blog.pacharest.com

On Tue, May 13, 2008 at 12:28 PM, David Lethe <david@xxxxxxxxxxxx> wrote:
> I would also add to Steve's suggestion that you be prepared to
> immediately disconnect the power to the dodgy disk once the rebuild
> starts. That eliminates the possibility that the bad disk will lock up
> the system.
>
> David
>
>
> -----Original Message-----
> From: linux-raid-owner@xxxxxxxxxxxxxxx
> [mailto:linux-raid-owner@xxxxxxxxxxxxxxx] On Behalf Of Steve Fairbairn
> Sent: Tuesday, May 13, 2008 11:11 AM
> To: 'Joshua Johnson'; linux-raid@xxxxxxxxxxxxxxx
> Subject: RE: Help recovering from failed disk on RAID 6
>
> Hi,
>
> It appears no one else has answered, so I'll try. First I'd attempt to
> start the array with the --force parameter, which I believe will start
> the dirty array without the failed drive in it.
>
> The other option to try depends on how long you have before the OS
> freezes, but is to start the array with the dodgy drive in it, but
> immediately tell mdadm to fail the dodgy disk. This should have mdadm
> start a resync with the spare drive.
>
> Hope this helps,
>
> Steve.
>
> > -----Original Message-----
> > From: linux-raid-owner@xxxxxxxxxxxxxxx
> > [mailto:linux-raid-owner@xxxxxxxxxxxxxxx] On Behalf Of Joshua Johnson
> > Sent: 28 April 2008 03:17
> > To: linux-raid@xxxxxxxxxxxxxxx
> > Subject: Help recovering from failed disk on RAID 6
> >
> >
> > I am running a linux server with an 8 disk IDE/SATA RAID 6
> > array. One of the disks is having a problem which caused the
> > machine to freeze. If I boot the machine without the problem
> > disk the array fails to start. If I boot with the problem
> > disk the array starts correctly and begins syncing, but the
> > machine will soon freeze up again when the disk drops out.
> > My number one question is how to get the array back online.
> > It has a spare disk, but since the OS is freezing rather than
> > failing the disk that is having the problem, it never
> > switched to the new disk. When I try to start the array
> > without the problem disk, I get:
> >
> > #mdadm --manage --run /dev/md0
> > raid5: device hda2 operational as raid disk 0
> > raid5: device sdb2 operational as raid disk 7
> > raid5: device sda1 operational as raid disk 6
> > raid5: device hdi2 operational as raid disk 5
> > raid5: device hdg2 operational as raid disk 3
> > raid5: device hde2 operational as raid disk 2
> > raid5: device hdk2 operational as raid disk 1
> > raid5: cannot start dirty degraded array for md0
> > RAID5 conf printout:
> >  --- rd:8 wd:7
> >  disk 0, o:1, dev:hda2
> >  disk 1, o:1, dev:hdk2
> >  disk 2, o:1, dev:hde2
> >  disk 3, o:1, dev:hdg2
> >  disk 5, o:1, dev:hdi2
> >  disk 6, o:1, dev:sda1
> >  disk 7, o:1, dev:sdb2
> > raid5: failed to run raid set md0
> > md: pers->run() failed ...
> > mdadm: failed to run array /dev/md0: Input/output error
> >
> > /proc/mdstat contains:
> > Personalities : [raid1] [raid6] [raid5] [raid4]
> > md1 : active raid1 hdg1[1] hda1[0]
> >       4200896 blocks [2/2] [UU]
> >
> > md0 : inactive hda2[0] sdc2[8](S) sdb2[7] sda1[6] hdi2[5] hdg2[3] hde2[2] hdk2[1]
> >       1529265920 blocks
> >
> >
> > So how do I get this array to run? I can't start it without
> > the problem disk and I can't sync it with the problem disk.
> > I am running RAID 6 to be able to recover from multiple disk
> > failures so it is a little vexing that a single disk going
> > offline renders my array unrunnable. Any help with this
> > issue is greatly appreciated.
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
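
For reference, the two software-only suggestions in this thread (forcing assembly without the bad disk, and failing then removing the bad disk) might be sketched roughly as below. This is only an illustration under stated assumptions: /dev/md0 and the member names are taken from the output quoted above, BAD_DISK is a hypothetical placeholder because the thread never names the failing device, the run/force_assemble/fail_and_remove helpers are made up for this sketch, and stopping the inactive array before reassembling it is my assumption rather than something anyone in the thread spelled out. It defaults to a dry run that only prints the mdadm commands.

```shell
#!/bin/sh
# Sketch only. /dev/md0 and the member names come from the thread's output;
# BAD_DISK is a hypothetical placeholder (the failing device is never named).
ARRAY="${ARRAY:-/dev/md0}"
BAD_DISK="${BAD_DISK:-/dev/hdX2}"
DRY_RUN="${DRY_RUN:-1}"    # 1 = only print the commands, anything else = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Steve's first suggestion: start the array without the bad disk using --force.
# Assumption: the half-assembled, inactive array is stopped first.
force_assemble() {
    array="$1"
    shift
    run mdadm --stop "$array"
    run mdadm --assemble --force "$array" "$@"
}

# Pascal's suggestion: mark the member as failed, then remove it from the array.
fail_and_remove() {
    run mdadm --manage "$1" --fail "$2"
    run mdadm --manage "$1" --remove "$2"
}

# Default demonstration: a dry run of the fail/remove sequence.
fail_and_remove "$ARRAY" "$BAD_DISK"
```

With the healthy members listed in the quoted output, the forced assembly would be called as something like force_assemble /dev/md0 /dev/hda2 /dev/hdk2 /dev/hde2 /dev/hdg2 /dev/hdi2 /dev/sda1 /dev/sdb2 /dev/sdc2. Set DRY_RUN=0 only after checking the printed commands against your own device names.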