Sent: Mon Jan 03 2011 22:33:24 GMT-0700 (Mountain Standard Time)
From: NeilBrown <neilb@xxxxxxx>
To: Patrick H. <linux-raid@xxxxxxxxxxxx> linux-raid@xxxxxxxxxxxxxxx
Subject: Re: filesystem corruption
On Sun, 02 Jan 2011 22:05:06 -0700 "Patrick H." <linux-raid@xxxxxxxxxxxx>
wrote:
Ok, thanks for the info.
I think I'll solve it by creating 2 dedicated hosts for running the
array that do not export any disks themselves. This way, if a
master dies, all the RAID disks are still there and can be picked up by
the other master.
That sounds like it should work OK.
NeilBrown
Well, it didn't solve it. If I power the entire cluster down and start it
back up, I still get corruption, even in old files that weren't being
modified. If I power off just a single node, it seems to handle it fine;
the problem only shows up when the whole cluster goes down.
It also seems to happen much more frequently now. In the previous setup,
probably 1 in 50 failures resulted in corruption. Now it's pretty
much guaranteed that there will be corruption if I kill it.
On the last failure I triggered, when it came back up, it re-assembled the
entire RAID-5 array with all disks active and none of them needing any
sort of re-sync. The disk controller is battery backed, so even if it
was re-ordering the writes, the battery should ensure that it all gets
committed.
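
A clean assembly with no re-sync does not by itself show that the parity
in every stripe still matches the data. As a rough illustration only, here
is a minimal Python sketch of that consistency check, assuming the plain
XOR parity that RAID-5 uses. The helper names (xor_blocks,
stripe_is_consistent) and the byte values are made up for the example;
nothing here is read from the real array.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def stripe_is_consistent(data_blocks, parity_block):
    """Return True if the parity block equals the XOR of the data blocks."""
    return xor_blocks(data_blocks) == parity_block


# Hypothetical 3-data-disk stripe with made-up contents.
data = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0"]
parity = xor_blocks(data)

# Simulate an interrupted write: one data block changed, parity left stale.
torn = [b"\xff\x02", data[1], data[2]]

print(stripe_is_consistent(data, parity))
print(stripe_is_consistent(torn, parity))
```

The second stripe would still assemble with every member present, yet
rebuilding any one of its blocks from the other blocks and the stale parity
would return wrong data.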
Any other ideas?
-Patrick