On 10/27/2014 01:34 PM, Tytus Rogalewski wrote:

My experience with DRBD is really old, but I became a Gluster user because of my experience with DRBD. After it destroyed my filesystem for the 3rd time, it was "replace that or find somewhere else to work" time. I chose Gluster because you can create a fully redundant system from the client to each replica server, all the way through all the hardware, by creating parallel network paths.

What you experienced is a result of the ping timeout. Ping-timeouts happen when the TCP connection is not closed, like when you pull the plug. The timeout exists to allow the filesystem to recover gracefully in the event of a temporary network problem. Without that, there's an increased load on the server while all the file descriptors are re-established. This can be a fairly heavy load, to the point where TCP pings are delayed. If they're delayed longer than ping-timeout, you have a race condition from which you'll never recover. For that reason, the ping-timeout is deliberately long.

You *can* adjust that timeout as long as you sufficiently test around the actual loads you're expecting. Keep in mind your SLA/OLA expectations and engineer for them using the actual mathematical calculations, not just some gut expectations. Your DC power should be more reliable than most industries' requirements.

Each client intends to write to both (all) replicas. The intent count is incremented in extended attributes, the write executes on a replica, and the intent count is decremented for that replica. With the disconnect, each of those files will show pending changes destined for the other replica. When they are reconnected, the self-heal daemon (or a client attempting to access those files) will note the changes destined for the other brick and repair it.

Split-brain occurs when each side of that netsplit writes to the same file. Each copy of that file then indicates pending changes for the other brick.
When the connection returns, they compare those pending flags and see changes on each side that are unwritten on the other. They refuse to heal and leave each copy intact, forcing manual intervention to clear the split-brain.

You can avoid split-brain by using replica 3 and volume-level quorum, or with replica 2 and some 3rd observer server, by using server quorum. It is also possible to have quorum with only 2 servers or replicas, but I wouldn't recommend it. With volume-based quorum, the volume will go read-only if the client loses connection with either server. With server quorum and only two servers, the server will shut down if it loses quorum, completely removing access to the volume.
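
To make the intent-count bookkeeping and the split-brain case more concrete, here is a rough Python sketch of the idea. It is only a toy model under my own assumptions, not Gluster's actual implementation: the names `Replica`, `write`, `classify` and `demo` are made up for illustration, and the real counters live in extended attributes on each brick rather than in a Python dict.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical stand-in for one brick's copy of a single file."""
    name: str
    data: str = ""
    # pending[x] > 0 means: this copy saw writes that replica x may have missed
    pending: dict = field(default_factory=dict)


def write(data, replicas, reachable):
    """Apply one write to the reachable replicas, tracking intent counts."""
    live = [r for r in replicas if r.name in reachable]
    # Step 1: on every live replica, raise the intent count for all replicas.
    for r in live:
        for target in replicas:
            r.pending[target.name] = r.pending.get(target.name, 0) + 1
    # Step 2: execute the write on the replicas we can reach.
    for r in live:
        r.data = data
    # Step 3: lower the count again, but only for replicas that were written.
    for r in live:
        for target in live:
            r.pending[target.name] -= 1


def classify(a, b):
    """Compare pending counters once both replicas are reachable again."""
    a_blames_b = a.pending.get(b.name, 0) > 0
    b_blames_a = b.pending.get(a.name, 0) > 0
    if a_blames_b and b_blames_a:
        return "split-brain"  # each side holds changes the other lacks
    if a_blames_b:
        return f"heal {b.name} from {a.name}"
    if b_blames_a:
        return f"heal {a.name} from {b.name}"
    return "in sync"


def demo():
    a, b = Replica("a"), Replica("b")
    results = []
    write("v1", [a, b], {"a", "b"})  # both replicas connected
    results.append(classify(a, b))
    write("v2", [a, b], {"a"})       # netsplit: only a is reachable
    results.append(classify(a, b))
    write("v3", [a, b], {"b"})       # the other side of the split writes too
    results.append(classify(a, b))
    return results


if __name__ == "__main__":
    print(demo())
```

As long as only one side writes during the disconnect, the counters point in one direction and the heal source is unambiguous. Once both sides have written, each copy blames the other and there is no safe automatic choice, which is why quorum is used to stop one side from writing in the first place.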
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users