Thanks to all for the info. I am back up and working after rebuilding. Now I am looking for a way to monitor my array. I have looked at raidmon - does it monitor hardware RAID? Is there a monitor specific to my controller, a "Compaq smart array 431 (chipset - sym53c896)"?

Thanks,
Doug

-----Original Message-----
From: Dominic RIVERA [mailto:Dominic.Rivera@xxxxxxxxxxx]
Sent: Tuesday, October 14, 2003 3:02 PM
To: redhat-list@xxxxxxxxxx
Subject: Re: HELP !!! raid 5 problem

You should be able to hotswap the drive and then verify integrity before the system goes down. Obviously in some cases you won't be able to, like when the drive fails while the system is powered down. Sometimes it is possible to reconstruct a raid-5 array that has failed, but I'd be very careful from this point on out; you're in some seriously dangerous territory.

One problem you *might* be running into, and may be able to fix, is if you're manually assigning scsi id's to the drives. Linux assigns device names based on the order in which the drives are found, so if you put a new disk in a different place, or if your new drive was discovered in a different order, then your array might be trying to reconstruct itself wrong.

Your comment about ls not working makes me think you had more than just a single drive error. Were you booting off of a raid5 array? You shouldn't be able to put your /boot partition on (software) raid5 as far as I know. What drive was it that failed, your boot drive? Did you install lilo/grub onto multiple disks? Any more information about your config could be helpful.

-Dominic

Dominic Rivera
(503) 947-7308
dominic.rivera@xxxxxxxxxxx

>>> redhat@xxxxxxxxxxx 10/14/03 12:32 PM >>>
Simpson, Doug wrote:

>I have a compaq proliant ml 350 that is running RH 7.0. It is RAIDed to 5,
>however, one of the 4 drives crashed yesterday. We hot swapped the old for a
>new. Rebooted and crossed our fingers. The LILO screen comes up and then
>flashes into a text screen saying "LOADING LINUX ...".
>It does this about 5 or 6 times and then stops on the "LOADING LINUX ....."
>screen.
>We cannot access the machine.
>We can see the drives flashing and it looks like the one drive is going
>through the rebuilding process.
>With a 4 drive RAID5 we should still be able to get in - no?
>Yesterday when the drive crashed the machine was not very usable - i.e. could
>not run "ls" or "find".
>This is a bit strange for a RAID5 setup.
>Compaq/hp has told us to wait 20 minutes a gig for the rebuild - 6 hours.
>We are wondering if this six hours will be wasted time.
>Has anyone seen this behavior before?
>Any thoughts would be much appreciated.
>We think the boot sector is corrupt and that is why it will not boot. So we
>are waiting for the striping to finish and then fix the boot sector.
>
>Anyone?
>
>Thanks,
>Doug

Well, the RAID set is not working then. The whole purpose of RAID is to avoid data loss. If one drive in the set is lost, the RAID set can still reconstruct the missing data using parity information; however, you will suffer a large performance loss for this. The rebuild process should have no effect on the integrity of the RAID set, just a performance hit. Something is either configured wrong or faulty.

-CC

--
redhat-list mailing list
unsubscribe mailto:redhat-list-request@xxxxxxxxxx?subject=unsubscribe
https://www.redhat.com/mailman/listinfo/redhat-list
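
To illustrate the parity point made above (a RAID 5 set can reconstruct the data of one lost drive from the remaining drives), here is a minimal sketch in Python. It assumes plain XOR parity over equal-sized blocks; the block contents and the xor_blocks helper are made up for illustration, and this is not a description of how the Smart Array controller works internally.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Hypothetical stripe on a 4-drive RAID 5: three data blocks plus one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the drive that held the second block, then rebuild that
# block by XOR-ing everything that survived (remaining data plus parity).
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
```

Every read of the missing block needs all surviving drives, which is one reason a degraded array is slow until the rebuild finishes.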