On Mon, May 28, 2007 at 05:30:52PM +0200, Pallai Roland wrote:
>
> On Monday 28 May 2007 14:53:55 Pallai Roland wrote:
> > On Friday 25 May 2007 02:05:47 David Chinner wrote:
> > > "-o ro,norecovery" will allow you to mount the filesystem and get any
> > > uncorrupted data off it.
> > >
> > > You still may get shutdowns if you trip across corrupted metadata in
> > > the filesystem, though.
> >
> > This filesystem is completely dead.
> > [...]
>
> I tried to make a md patch to stop writes if a raid5 array got 2+ failed
> drives, but I found it's already done, oops. :) handle_stripe5() ignores
> writes in this case quietly, I tried and works.

Hmmm - it clears the uptodate bit on the bio, which is supposed to
make the bio return EIO. That looks to be doing the right thing...

> There's an another layer I used on this box between md and xfs: loop-aes. I

Oh, that's a kind of important thing to forget to mention....

> used it since years and rock stable, but now it's my first suspect, cause I
> found a bug in it today:
> I assembled my array from n-1 disks, and I failed a second disk for a test
> and I found /dev/loop1 still provides *random* data where /dev/md1 serves
> nothing, it's definitely a loop-aes bug:

.....

> It's not an explanation to my screwed up file system, but for me it's enough
> to drop loop-aes. Eh.

If you can get random data back instead of an error from the block
device, then I'm not surprised your filesystem is toast. If it's one
sector in a larger block that is corrupted, then the only thing that
will protect you from this sort of corruption causing problems is
metadata checksums (yet another thing on my list of stuff to do).

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group
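
[Editorial illustration, not part of the original message.] The point about metadata checksums can be sketched roughly as follows. This is a minimal sketch and not XFS code or its on-disk format: the 4-byte CRC prefix, the 4096-byte block, the 512-byte sector size, and the helper names `seal` and `verify` are all assumptions made for the example, and zlib's CRC32 is swapped in as a stand-in checksum. It shows how a block whose lower layer silently returned one sector of garbage (instead of an error) would fail verification on read.

```python
import zlib

SECTOR = 512  # assumed sector size for this illustration


def seal(payload: bytes) -> bytes:
    """Hypothetical helper: prefix a metadata block with a CRC32 of its payload."""
    return zlib.crc32(payload).to_bytes(4, "little") + payload


def verify(block: bytes) -> bool:
    """Hypothetical helper: recompute the CRC32 and compare it with the stored one."""
    stored = int.from_bytes(block[:4], "little")
    return zlib.crc32(block[4:]) == stored


# A 4096-byte "metadata block" with a recognisable pattern.
payload = bytes(range(256)) * 16
block = seal(payload)

# Simulate the failure described in the thread: the block device hands back
# one sector of unrelated data inside an otherwise intact block, with no EIO.
corrupted = bytearray(block)
corrupted[1024:1024 + SECTOR] = b"\xa5" * SECTOR

print("intact block verifies:   ", verify(block))
print("corrupted block verifies:", verify(bytes(corrupted)))
```

Without such a check, the reader has no way to tell the corrupted block from a valid one and would go on to interpret the garbage as metadata. A checksum only detects the damage; it does not repair it.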