Sujit Karataparambil wrote:
> http://www.gagme.com/greg/linux/raid-lvm.php
>
> You can try this with the spare drives you have.
>
> Basically, what you have to do is check whether the drive now being
> linked to another device name is the reason for this problem.
>
> Once it shows unplugged or failed, you can use your new replacement
> drive and reboot.
>
> Kindly read the comments on this article, which are very useful.
>
> On 8/27/08, Sujit Karataparambil <sjt.kar@xxxxxxxxx> wrote:
>
>>> Thanks much for the reply. For the purposes of this discussion you can
>>> assume that I've already re-established confidence in the drive, the
>>> cable, and the controller and that the data on the drives is worthless
>>> and I just want to get maximum uptime without causing a raid assemble
>>> problem on the next reboot.
>>>
>> Good.
>>
>>> Any idea on my original question? If I re-add the drive using the
>>> /dev/sdc name will I have problems on the next boot when the drive is
>>> named /dev/sda?
>>>
>> Since this seems to be a block device it really does not matter.
>>
>>> Based on my experience with Linux and other software raid
>>> implementations, I'm strongly inclined to think that the device naming
>>> doesn't matter - the system will scan the drives at boot looking for
>>>
>> Kindly read some decent kernel documentation before you jump up and
>> say this. Kindly surf the net and read some decent articles before you
>> do any precious upgrades for now.
>>
>> Sujit
>>
>> --
>> --linux(2.4/2.6),bsd(4.5.x+),solaris(2.5+)
>>

Sujit,

Thanks for the replies and the link. I appreciate them. I spent several
hours this week reading the kernel documentation (md.txt), the mdadm
man pages, the linux-raid wiki, and articles on the net before posting
to the list.

I highly recommend this wiki page from IBM for anybody who has an issue
similar to mine (a temporary failure that knocks a drive offline, where
you want to bring it back online without a reboot). It really helped me
understand the process of running a highly available RAID-1 set on the
Linux kernel. The hardware is different, but the same principles apply.

http://www-941.ibm.com/collaboration/wiki/pages/viewpage.action?pageId=3625

For anybody who is interested, I grabbed a test system this morning to
simulate my situation (a temporary drive failure). You can, in fact,
bring the failed drive back online with a different device name,
remirror it, and reboot with no issues on CentOS 5 (rough commands in
the P.S. below). The key, as Steve Fairbairn pointed out, is that the
mdadm.conf file is set up to use the RAID UUID. I also moved the drives
into a different scan order and ran the same test, and that works as
well.

My situation is slightly complicated by the fact that I'm booting off
these same drives, so I had to mirror the MBR onto the second drive.
Since I had already done this previously on the wonky system, this was
easily achieved. See here:

http://www.dirigo.net/tuxTips/avoidingProblems/GrubMdMbr.php

I'm running a couple more tests today to see what happens if I rescan
the device to bring it back online with the same device name - I'll
post the result in case anybody is interested. I expect that I'll have
to initiate the rebuild but don't expect any other problems.

Regards,
--Tony
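
P.S. For the archives, here is roughly what the recovery looked like on
my test box. Treat these as sketches rather than a transcript: the
array and partition names (/dev/md0, /dev/sda1, /dev/sdc1) are from my
setup and will differ on yours. First, bringing the failed member back
under its new device name and remirroring:

    # See what state the array is in; the dropped member usually
    # shows up as faulty or removed.
    cat /proc/mdstat
    mdadm --detail /dev/md0

    # Remove the faulty member under its old name. If udev has
    # already taken the old node away and --detail just shows the
    # slot as "removed", this step can be skipped.
    mdadm /dev/md0 --remove /dev/sda1

    # Hot-add the drive under its new name; mdadm identifies it by
    # superblock, not by device name.
    mdadm /dev/md0 --add /dev/sdc1

    # Watch the remirror progress.
    watch cat /proc/mdstat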
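The UUID-based mdadm.conf that makes the device names irrelevant
doesn't have to be written by hand; mdadm will generate the ARRAY lines
for you. The UUID below is made up for illustration:

    # Append the scan output to the config, then tidy it by hand.
    mdadm --detail --scan >> /etc/mdadm.conf

    # Resulting /etc/mdadm.conf, more or less:
    DEVICE partitions
    ARRAY /dev/md0 level=raid1 num-devices=2 UUID=3aaa0122:29827cfa:5331ad66:ca767371

At assembly time mdadm scans every partition for a superblock carrying
that UUID, which is why it doesn't care whether the disk comes up as
sda or sdc.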
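For the MBR mirroring mentioned above (the dirigo.net page covers the
details), the usual grub-legacy trick is to install grub on the second
disk while telling grub to treat that disk as hd0, so the box can boot
from it alone. A sketch, assuming the second disk is /dev/sdb with
/boot on its first partition:

    grub> device (hd0) /dev/sdb
    grub> root (hd0,0)
    grub> setup (hd0)
    grub> quit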
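And for the test I still owe results on - kicking the disk out of the
SCSI layer and rescanning so that, with luck, it comes back under its
original name - the 2.6 kernel exposes this through sysfs. host0 below
is an assumption; check which host the disk actually hangs off:

    # Drop the stale disk from the kernel's view.
    echo 1 > /sys/block/sda/device/delete

    # Rescan the controller so the disk is re-detected.
    echo "- - -" > /sys/class/scsi_host/host0/scan

    # Then hot-add the partition back into the array as before.
    mdadm /dev/md0 --add /dev/sda1

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html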