On Thu, 12 Nov 2009, Brian Hirt wrote:
Nathan,
This might be a side effect of open files not being replicated. If the
VM has any open files that it has written to, none of the changes will
be propagated to the other node until the file is closed. If a gluster
node goes down, and that's the node with the modified open file, then
you are out of luck. If the VM reopens the file while the correct node
is down, it will get the out of date file on the replicated node. From
there, the VM might go bonkers, or maybe it writes to that file also,
then you end up with a split brain where you have two different copies
on two different nodes.
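
The divergence described above can be checked for by hand. What follows is a minimal sketch, not a GlusterFS tool: it assumes the backend export directories of both nodes are reachable as ordinary local paths, and the names `file_digest`, `copies_diverge` and the example paths are hypothetical. It only compares content hashes, so it is best run while the VM is shut down and the file is closed.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_diverge(relative_path, export_dirs):
    """Return True if the copies of relative_path differ across export_dirs.

    A copy that is missing on one backend but present on another
    also counts as divergence.
    """
    digests = set()
    for export_dir in export_dirs:
        candidate = Path(export_dir) / relative_path
        digests.add(file_digest(candidate) if candidate.is_file() else None)
    return len(digests) > 1
```

For example, `copies_diverge("vm01.img", ["/mnt/node1-export", "/mnt/node2-export"])` (hypothetical paths) would return True if the two backend copies no longer match.
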
Ah ha! So even though I am using distribute, because I am using Xen with
file-backed disks, the file is held open and only one node is accessed?
That would explain a LOT of things.
Is there any way to see which node is being used for which Xen file?
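
One crude way to approach that question is to look at the backend directories directly. This is only a sketch under assumptions: it presumes each node's export directory can be read as a local path, and `locate_file` and the paths shown are hypothetical names, not part of GlusterFS. It reports the size of each hit so that empty placeholder entries can be told apart from the real image.

```python
from pathlib import Path


def locate_file(relative_path, export_dirs):
    """Return (export_dir, size_in_bytes) for each backend holding the file."""
    hits = []
    for export_dir in export_dirs:
        candidate = Path(export_dir) / relative_path
        if candidate.is_file():
            hits.append((str(export_dir), candidate.stat().st_size))
    return hits
```

Calling `locate_file("vm01.img", ["/mnt/node1-export", "/mnt/node2-export"])` (hypothetical paths) would list the backends on which that image exists.
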
3.0 is supposed to address open-file replication, though I haven't
had a chance yet to test whether it fixes all the problems we were
having with 2.0.8. You might want to check out 3.0pre1 and see if that
makes any difference.
I have just been very scared about going with a pre1 release! At the same
time, I must find a stable Xen storage solution, and 3.0pre1 may offer me
more than I can get with the current 2.0.8 anyway.
Nathan Stratton CTO, BlinkMind, Inc.
nathan at robotics.net nathan at blinkmind.com
http://www.robotics.net http://www.blinkmind.com