Re: raid5: I lost a XFS file system due to a minor IDE cable problem

On Mon, May 28, 2007 at 05:30:52PM +0200, Pallai Roland wrote:
> 
> On Monday 28 May 2007 14:53:55 Pallai Roland wrote:
> > On Friday 25 May 2007 02:05:47 David Chinner wrote:
> > > "-o ro,norecovery" will allow you to mount the filesystem and get any
> > > uncorrupted data off it.
> > >
> > > You still may get shutdowns if you trip across corrupted metadata in
> > > the filesystem, though.
> >
> > This filesystem is completely dead.
> > [...]
> 
>  I tried to make a md patch to stop writes if a raid5 array got 2+ failed 
> drives, but I found it's already done, oops. :) handle_stripe5() ignores 
> writes in this case quietly, I tried and works.

Hmmm - it clears the uptodate bit on the bio, which is supposed to
make the bio return EIO. That looks to be doing the right thing...
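
As a rough illustration of that mechanism, here is a toy model in Python, not kernel code. The names Bio, fail_bio, bio_status and stacked_read are hypothetical and exist only for this sketch. It shows the idea that a request whose "uptodate" flag has been cleared completes with EIO, and that a layer stacked on top should pass that error up instead of handing back whatever is in its buffer.

```python
import errno
import os


class Bio:
    """Toy stand-in for a block I/O request (hypothetical, not the kernel struct)."""

    def __init__(self, nbytes):
        self.uptodate = True       # cleared when the lower layer cannot serve the I/O
        self.data = bytes(nbytes)  # buffer contents are meaningless if not uptodate


def fail_bio(bio):
    """Model of the lower layer giving up on a request: clear the uptodate flag."""
    bio.uptodate = False


def bio_status(bio):
    """Completion status: 0 on success, -EIO if the uptodate flag was cleared."""
    return 0 if bio.uptodate else -errno.EIO


def stacked_read(bio):
    """What a layer stacked on top should do: propagate the error, never the buffer."""
    status = bio_status(bio)
    if status:
        raise OSError(-status, os.strerror(-status))
    return bio.data
```

A stacked layer that skips the status check and returns bio.data anyway would produce exactly the kind of garbage-instead-of-error reads described further down.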

>  There's an another layer I used on this box between md and xfs: loop-aes. I 

Oh, that's a kind of important thing to forget to mention....

> used it since years and rock stable, but now it's my first suspect, cause I 
> found a bug in it today:
>  I assembled my array from n-1 disks, and I failed a second disk for a test 
> and I found /dev/loop1 still provides *random* data where /dev/md1 serves 
> nothing, it's definitely a loop-aes bug:

.....

>  It's not an explanation to my screwed up file system, but for me it's enough 
> to drop loop-aes. Eh.

If you can get random data back instead of an error from the block device,
then I'm not surprised your filesystem is toast. If it's one sector in a
larger block that is corrupted, then the only thing that will protect you from
this sort of corruption causing problems is metadata checksums (yet another
thing on my list of stuff to do).
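
To make the checksum idea concrete, here is a minimal sketch in Python. It is an assumption-laden illustration, not how XFS does or will do it: the 4096-byte block, the 512-byte sector, the 4-byte CRC stored at the front of the block, and the helper names seal, verify and corrupt_sector are all made up for the example, and zlib.crc32 is used simply because it ships with Python.

```python
import zlib

SECTOR = 512    # assumed sector size
BLOCK = 4096    # assumed metadata block size
CRC_LEN = 4     # checksum stored in the first 4 bytes (layout is an assumption)


def seal(payload: bytes) -> bytes:
    """Return a full block: CRC32 of the payload followed by the payload."""
    assert len(payload) == BLOCK - CRC_LEN
    crc = zlib.crc32(payload) & 0xFFFFFFFF
    return crc.to_bytes(CRC_LEN, "little") + payload


def verify(block: bytes) -> bool:
    """Recompute the CRC over the payload and compare with the stored value."""
    stored = int.from_bytes(block[:CRC_LEN], "little")
    return stored == (zlib.crc32(block[CRC_LEN:]) & 0xFFFFFFFF)


def corrupt_sector(block: bytes, index: int, garbage: bytes) -> bytes:
    """Simulate the device returning bad data for one sector of the block."""
    assert len(garbage) == SECTOR
    start = index * SECTOR
    return block[:start] + garbage + block[start + SECTOR:]
```

With something like this, the reader of a metadata block calls verify() before trusting any field in it. A block whose sector 3 came back as garbage fails the check and can be treated as an I/O error, rather than being parsed as if it were valid metadata.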

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group