Thanks for the suggestions, but the array is raid-0 and that can't change anymore :)

> -----Original Message-----
> From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-owner@vger.kernel.org] On Behalf Of Michael Robinton
> Sent: Friday, February 08, 2002 10:41 PM
> To: linux-raid@vger.kernel.org
> Subject: badblocks & raid
>
> > I have a disk that seems to have bad sectors:
> >
> >   hdd: dma_intr: status=0x71 { DriveReady DeviceFault SeekComplete Error }
> >   hdd: dma_intr: error=0x04 { DriveStatusError }
> >   hdd: DMA disabled
> >   ide1: reset: success
> >   hdd: set_geometry_intr: status=0x71 { DriveReady DeviceFault SeekComplete Error }
> >   hdd: set_geometry_intr: error=0x04 { DriveStatusError }
> >   ide1: reset: success
> >   hdd: set_geometry_intr: status=0x71 { DriveReady DeviceFault SeekComplete Error }
> >   hdd: set_geometry_intr: error=0x04 { DriveStatusError }
> >   end_request: I/O error, dev 16:41 (hdd), sector 56223488
> >   hdd: recal_intr: status=0x71 { DriveReady DeviceFault SeekComplete Error }
> >   hdd: recal_intr: error=0x04 { DriveStatusError }
> >   ide1: reset: success
> >   ...
> >   hdd: dma_intr: status=0x51 { DriveReady SeekComplete Error }
> >   hdd: dma_intr: error=0x10 { SectorIdNotFound }, LBAsect=23844548, sector=23844480
> >   ...
> >
> > and so on, for a number of sectors.
> >
> > This drive, hdd, has one partition, hdd1, which participates in an md0
> > array. Let's assume I can't just get rid of the drive that has problems;
> > I have to keep it and it has to stay in the array. If it weren't part of
> > the array, I could run a badblocks -w test, find the numbers of the
> > failing sectors, and feed them to mke2fs/e2fsck or whatever other utility
> > the filesystem on hdd1 has for marking bad blocks, and the problem would
> > (hopefully) end there.
> >
> > However, since hdd1 is part of the array, I suppose it's not possible to
> > run badblocks on that single device and then somehow map the blocks on
> > hdd1 to blocks on md0, especially since it's striping, raid-0, right?
> > Is there some way to find out which sectors of md0 are on this drive, so
> > I can limit the range of sectors to run badblocks on? Running badblocks
> > read+write on such a huge device can take ages.
> >
> > If anyone has any other suggestions, they are also welcome :)
>
> If you are running raid1, there is a solution -- kludgy, but it will work.
>
> Remove the bad drive from the array.
>
> Take the second drive and mark it as a regular set of partitions rather
> than raid, and restart it as a stand-alone file system. The raid
> superblock is at the end of the partition, so if you have formatted as
> ext2, the partitions will mount normally if the fstab, kernel, etc. are
> asked to mount the underlying partitions instead of the raid device.
>
> Do a complete reformat of the bad raid device -- meanwhile the other disk
> should be mounted, not as a degraded raid, but directly as the real file
> system, /dev/hdx... etc.
>
> If you have a rescue disk that can do the mounting of the old disks,
> better yet -- then you don't have to alter the fstab, etc.
>
> Use cpio to transfer the disk image from the old good disk to the newly
> formatted BAD disk. I'd suggest doing the first directory levels
> independently so you can avoid the /dev, /proc, /tmp and /mnt
> directories. Create those by hand, and copy /dev by hand using cp -a:
>
>   cd /
>   find /targetdir | cpio -padm /mnt
>
> I've done this many times without backup, though I don't recommend it.
> If you screw up you're dead.
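(An illustrative sketch of that per-directory copy -- untested here, and
the mount points /mnt/good and /mnt/bad plus the directory list are only
placeholders for whatever layout you actually have:)

    # good disk mounted (read-only if possible) on /mnt/good,
    # freshly formatted bad disk on /mnt/bad
    mkdir /mnt/bad/proc /mnt/bad/tmp /mnt/bad/mnt   # recreate these by hand
    chmod 1777 /mnt/bad/tmp
    cp -a /mnt/good/dev /mnt/bad/                   # cp -a keeps device nodes
    cd /mnt/good
    for d in bin boot etc home lib root sbin usr var; do
        find ./$d | cpio -padm /mnt/bad             # pass-through copy, keeping dirs and mtimes
    done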
> Better to take a spare disk and sync it to the remaining good one so you
> have a backup (faster, easier), or run a backup tape. If you choose to go
> the spare disk route, use it instead of the original -- test it carefully
> to make sure the files really transferred as you expected (a memory error
> can eat your lunch).
>
> Once the transfer is complete, remount the new BAD disk as the OS file
> system and do a raid hot add of the old good disk. It will sync with the
> bad blocks ignored.
>
> I used this exact technique ONCE only, and it did work fine. It was a
> while ago, and as I recall it did produce some errors in the area where
> the bad blocks reside, but nowhere else. The system has been running for
> some time and I've encountered no problems with it. raid1 on PII with
> 2.4.17, 2 - 3.5 gig ide drives.
>
> Michael
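(For completeness, the reformat and hot-add steps reduce to one-liners.
This is only a sketch: /dev/hdd1 stands in for the bad-disk partition,
/dev/hdc1 for the good one, and 4096 for the filesystem block size; the
hot add is shown in raidtools syntax, with the mdadm equivalent in a
comment:)

    # reformatting the bad partition is where the bad sectors get mapped
    # out; mke2fs can scan for them itself...
    mke2fs -c /dev/hdd1
    # ...or they can be listed and fed to the filesystem afterwards
    # (the -b value must match the filesystem block size)
    badblocks -b 4096 -o /tmp/hdd1.bad /dev/hdd1
    e2fsck -l /tmp/hdd1.bad /dev/hdd1
    # once the copy is done and the system is running from the bad disk,
    # hot add the good partition back into the array
    raidhotadd /dev/md0 /dev/hdc1    # or: mdadm /dev/md0 --add /dev/hdc1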