hi ya cajoline

-- try replacing the "bad cable"

-- try blowing air on your hot disks

-- your disk is about to die, or already died

-- run the disk at ata-33 instead of ata-100 and see if it's any better: hdparm -X (rough sketch below)

-- boot into single user... add a new disk... copy the bad disk onto the new disk ( do NOT use dd... you'd just copy the bad data too )

-- if you lose some files... oh well... hope you have backups... or start using disk utilities to recover file segments and manually patch them back together
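for the ata-33 and copy bits, roughly this -- an untested sketch, and the device names ( /dev/hdd, /dev/hde1 ) and mount point are guesses, so check your own setup first:

    hdparm -X66 /dev/hdd       # 64 + 2 = udma mode 2, ata-33 ( -X69 = ata-100 )
    hdparm -d1 /dev/hdd        # turn dma back on if the kernel shut it off
    # in single user, copy file-by-file instead of dd,
    # so a bad sector only costs you that one file:
    mount /dev/hde1 /mnt/new
    cp -a /data /mnt/new       # or tar/cpio... dd would copy the bad data too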
c ya
alvin
http://www.Linux-Backup.net .. free scripts/methodologies ...

On Fri, 8 Feb 2002, Cajoline wrote:

> Thanks for the suggestions, but the array is raid-0 and that can't change
> anymore :)
>
> > -----Original Message-----
> > From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-owner@vger.kernel.org]
> > On Behalf Of Michael Robinton
> > Sent: Friday, February 08, 2002 10:41 PM
> > To: linux-raid@vger.kernel.org
> > Subject: badblocks & raid
> >
> > > I have a disk that seems to have bad sectors:
> > >
> > > hdd: dma_intr: status=0x71 { DriveReady DeviceFault SeekComplete Error }
> > > hdd: dma_intr: error=0x04 { DriveStatusError }
> > > hdd: DMA disabled
> > > ide1: reset: success
> > > hdd: set_geometry_intr: status=0x71 { DriveReady DeviceFault SeekComplete Error }
> > > hdd: set_geometry_intr: error=0x04 { DriveStatusError }
> > > ide1: reset: success
> > > hdd: set_geometry_intr: status=0x71 { DriveReady DeviceFault SeekComplete Error }
> > > hdd: set_geometry_intr: error=0x04 { DriveStatusError }
> > > end_request: I/O error, dev 16:41 (hdd), sector 56223488
> > > hdd: recal_intr: status=0x71 { DriveReady DeviceFault SeekComplete Error }
> > > hdd: recal_intr: error=0x04 { DriveStatusError }
> > > ide1: reset: success
> > > ...
> > > hdd: dma_intr: status=0x51 { DriveReady SeekComplete Error }
> > > hdd: dma_intr: error=0x10 { SectorIdNotFound }, LBAsect=23844548, sector=23844480
> > > ...
> > >
> > > and so on, for a number of sectors.
> > >
> > > This drive, hdd, has one partition, hdd1, which participates in an md0
> > > array. Let's assume I can't just get rid of the problem drive: I have
> > > to keep it, and it has to stay in the array. If it weren't part of the
> > > array, I could run a badblocks -w test, find the numbers of the
> > > failing sectors, and feed them to mke2fs/e2fsck or whatever other
> > > utility the filesystem on hdd1 has for marking bad blocks, and the
> > > problem would (hopefully) end there.
> > >
> > > However, since hdd1 is part of the array, I suppose it's not possible
> > > to run badblocks on that single device and then somehow map the blocks
> > > on hdd1 to blocks on md0, especially since it's striping, raid-0,
> > > right? Is there some way to find out which sectors of md0 are on this
> > > drive, so I can limit the range of sectors to run badblocks on?
> > > Running badblocks read+write on such a huge device can take ages.
> > >
> > > If anyone has any other suggestions, they are also welcome :)
> >
> > If you are running raid1, there is a solution -- kludgy, but it will
> > work.
> >
> > Remove the bad drive from the array.
> >
> > Take the second drive, mark it as a regular set of partitions rather
> > than raid, and restart it as a standalone file system. The raid
> > superblock is at the end of the partition, so if you have formatted as
> > ext2 the partitions will mount normally if the fstab, kernel, etc. are
> > asked to mount the underlying partitions instead of the raid device.
> >
> > Do a complete reformat of the bad raid device -- meanwhile the system
> > should be running with the raid marked failed and the good disk mounted
> > directly as the real file system, /dev/hdx... etc.
> >
> > Better yet, if you have a rescue disk that can mount the old disks, you
> > don't have to alter the fstab, etc.
> >
> > Use cpio to transfer the disk image from the old good disk to the newly
> > formatted BAD disk. I'd suggest doing the first directory levels
> > independently so you can avoid the /dev, /proc, /tmp and /mnt
> > directories. Create those by hand, and copy /dev by hand using cp -a:
> >
> > cd /
> > find ./targetdir | cpio -padm /mnt
> >
> > I've done this many times without backup, though I don't recommend it.
> > If you screw up, you're dead. Better to take a spare disk and sync it
> > to the remaining good one so you have a backup (faster and easier), or
> > run a backup tape. If you choose to go the spare disk route, use it
> > instead of the original -- test it carefully to make sure the files
> > really transferred as you expected (a memory error can eat your lunch).
> >
> > Once the transfer is complete, remount the new BAD disk as the OS file
> > system and do a raid hot add of the old good disk. It will sync with
> > the bad blocks ignored.
> >
> > I used this exact technique ONCE only, and it did work fine. It was a
> > while ago, and as I recall it did produce some errors in the area where
> > the bad blocks reside, but nowhere else. The system has been running
> > for some time and I've encountered no problems with it. raid1 on a PII
> > with 2.4.17, 2 x 3.5 gig IDE drives.
> >
> > Michael
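ps -- putting michael's raid1 steps together as one rough sequence... an untested sketch with raidtools-era commands, and the device names ( /dev/hdd1 = reformatted bad disk, /dev/hdc1 = old good disk ), mount points and directory list are all assumptions -- adjust for your own box:

    # system running from the old good disk, mounted as plain ext2
    mke2fs -c /dev/hdd1            # -c scans for bad blocks while formatting
    mount /dev/hdd1 /mnt
    cd /
    # copy the top-level trees one at a time, skipping /dev /proc /tmp /mnt
    for d in bin boot etc home lib root sbin usr var; do
        find ./$d | cpio -padm /mnt
    done
    mkdir /mnt/proc /mnt/tmp /mnt/mnt
    chmod 1777 /mnt/tmp
    cp -a /dev /mnt                # device nodes copied by hand, per michael
    # once the system is running from the new disk, re-add the good one:
    raidhotadd /dev/md0 /dev/hdc1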