> I have a disk that seems to have bad sectors:
>
> hdd: dma_intr: status=0x71 { DriveReady DeviceFault SeekComplete Error }
> hdd: dma_intr: error=0x04 { DriveStatusError }
> hdd: DMA disabled
> ide1: reset: success
> hdd: set_geometry_intr: status=0x71 { DriveReady DeviceFault SeekComplete Error }
> hdd: set_geometry_intr: error=0x04 { DriveStatusError }
> ide1: reset: success
> hdd: set_geometry_intr: status=0x71 { DriveReady DeviceFault SeekComplete Error }
> hdd: set_geometry_intr: error=0x04 { DriveStatusError }
> end_request: I/O error, dev 16:41 (hdd), sector 56223488
> hdd: recal_intr: status=0x71 { DriveReady DeviceFault SeekComplete Error }
> hdd: recal_intr: error=0x04 { DriveStatusError }
> ide1: reset: success
> ...
> hdd: dma_intr: status=0x51 { DriveReady SeekComplete Error }
> hdd: dma_intr: error=0x10 { SectorIdNotFound }, LBAsect=23844548, sector=23844480
> ...
>
> and so on, for a number of sectors.
>
> This drive, hdd, has one partition, hdd1, that participates in an md0
> array. Let's assume I can't just get rid of the drive that has
> problems; I have to keep it, and it has to stay in the array. If it
> wasn't part of the array, I could run a badblocks -w test, find the
> numbers of the failing sectors and feed them to mke2fs/e2fsck or
> whatever other utility the filesystem on hdd1 has for marking bad
> blocks, and the problem would (hopefully) end there.
>
> However, since hdd1 is part of the array, I suppose it's not possible
> to run badblocks on that single device and then somehow map the blocks
> on hdd1 to blocks on md0, especially since it's striping, raid-0,
> right? Is there some way to find out which sectors of md0 are on this
> drive, so I can limit the range of sectors to run badblocks on?
> Running badblocks read+write on such a huge device can take ages.
>
> If anyone has any other suggestions, they are also welcome :)

If you are running raid1, there is a solution -- kludgy, but it will work.

Remove the bad drive from the array. Take the second drive and mark it
as a regular set of partitions rather than raid, and restart it as a
stand-alone file system. The raid superblock is at the end of the
partition, so if you have formatted it as ext2 the partitions will mount
normally if the fstab, kernel, etc. are asked to mount the underlying
partitions instead of the raid device.

Do a complete reformat of the bad raid device -- it should be running
with the other (good) disk marked failed, while that good disk is really
mounted directly as the real file system, /dev/hdx, etc. If you have a
rescue disk that can do the mounting of the old disks, better yet -- then
you don't have to alter the fstab, etc.

Use cpio to transfer the disk image from the old good disk to the newly
formatted BAD disk. I'd suggest doing the first directory levels
independently so you can avoid the /dev, /proc, /tmp and /mnt
directories. Create those by hand, copy /dev by hand using cp -a, and
for each of the other top-level directories do something like:

    cd / ; find ./targetdir | cpio -padm /mnt

I've done this many times without a backup, though I don't recommend it:
if you screw up, you're dead. Better to take a spare disk and sync it to
the remaining good one so you have a backup (faster, easier), or run a
backup tape. If you choose to go the spare disk route, use it instead of
the original -- test it carefully to make sure the files really
transferred as you expected (a memory error can eat your lunch).

Once the transfer is complete, remount the new BAD disk as the OS file
system and do a raid hot add of the old good disk. It will sync with the
bad blocks ignored.
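The way I read those steps, in shell terms, is roughly the sketch below.
All of the names are assumptions about the setup -- a 2.4-era raid1
/dev/md0 built from /dev/hdc1 (good) and /dev/hdd1 (bad), raidtools
commands, scratch mount points /mnt/good and /mnt/new -- and the -c
badblocks scan on mke2fs is an extra in the spirit of the original
question. mdadm's --fail/--remove/--add do the same jobs if you have it.
Treat this as an outline, not something to paste in blindly:

    # Assumes md0 is NOT the filesystem you are currently running from;
    # a rescue disk, as suggested above, makes this much easier.

    # Fail the good disk out of the array and mount it directly; the raid
    # superblock lives at the end of the partition, so the ext2 filesystem
    # on /dev/hdc1 mounts as-is.
    raidsetfaulty /dev/md0 /dev/hdc1
    raidhotremove /dev/md0 /dev/hdc1
    mount -t ext2 /dev/hdc1 /mnt/good

    # Reformat the array (now containing only the bad disk), letting mke2fs
    # scan for bad blocks so they are kept out of the new filesystem.
    mke2fs -c /dev/md0
    mount /dev/md0 /mnt/new

    # Copy the top-level directories one at a time so /dev, /proc, /tmp and
    # /mnt can be skipped; recreate those by hand and copy /dev with cp -a.
    # (The directory list here is only an example.)
    cd /mnt/good
    find ./etc ./usr ./var ./home -depth -print | cpio -padm /mnt/new
    mkdir /mnt/new/proc /mnt/new/tmp /mnt/new/mnt
    chmod 1777 /mnt/new/tmp
    cp -a /mnt/good/dev /mnt/new/

    # Point fstab and the bootloader at /dev/md0, boot onto the rebuilt
    # array, then hot add the old good disk and let it resync.
    raidhotadd /dev/md0 /dev/hdc1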
I've used this exact technique ONCE only, and it did work fine. It was a
while ago, and as I recall it did produce some errors in the area where
the bad blocks reside, but nowhere else. The system has been running for
some time and I've encountered no problems with it.

raid1 on a PII with 2.4.17, 2 - 3.5 gig IDE drives

Michael