Re: filesystem corruption

"Patrick H." <linux-raid@xxxxxxxxxxxx> · Tue, 04 Jan 2011 00:50:39 -0700

Sent: Mon Jan 03 2011 22:33:24 GMT-0700 (Mountain Standard Time)
From: NeilBrown <neilb@xxxxxxx>
To: Patrick H. <linux-raid@xxxxxxxxxxxx> linux-raid@xxxxxxxxxxxxxxx
Subject: Re: filesystem corruption
On Sun, 02 Jan 2011 22:05:06 -0700 "Patrick H." <linux-raid@xxxxxxxxxxxx>
wrote:

Ok, thanks for the info.
I think I'll solve it by creating 2 dedicated hosts for running the
array that don't actually export any disks themselves. This way, if a
master dies, all the raid disks are still there and can be picked up by
the other master.

That sounds like it should work OK.

NeilBrown

Well, it didn't solve it. If I power the entire cluster down and start it
back up, I still get corruption, on old files that weren't being
modified. If I power off just a single node, it seems to handle it fine;
the problem only shows up when the whole cluster goes down.
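
One way to pin down exactly which old files change across a full power cycle would be to record checksums before the shutdown and compare them afterwards. A minimal sketch in Python follows; the function names and the manifest layout are hypothetical, made up for illustration, and nothing here is specific to md or to this cluster:

```python
import hashlib
import os


def build_manifest(root):
    """Map each regular file's path (relative to root) to its SHA-256 hex digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            digest = hashlib.sha256()
            with open(path, "rb") as f:
                # Read in 1 MiB chunks so large files do not fill memory.
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    digest.update(chunk)
            manifest[os.path.relpath(path, root)] = digest.hexdigest()
    return manifest


def diff_manifests(before, after):
    """Return (changed, missing): paths whose digest differs, and paths that vanished."""
    changed = sorted(p for p in before if p in after and before[p] != after[p])
    missing = sorted(p for p in before if p not in after)
    return changed, missing
```

The "before" manifest would need to be stored somewhere off the array (for example as JSON on another machine) so that it survives the test. Files that show up as changed point at damaged data blocks, while files that show up as missing point more towards damaged filesystem metadata.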

It also seems to happen fairly frequently now. In the previous setup it
was probably 1 in 50 failures that there was corruption. Now it's pretty
much a guarantee there will be corruption if I kill it.
On the last failure I did, when it came back up, it re-assembled the 
entire raid-5 array with all disks active and none of them needing any 
sort of re-sync. The disk controller is battery backed, so even if it 
was re-ordering the writes, the battery should ensure that it all gets 
committed.

Any other ideas?

-Patrick