Re: One node goes offline, the other node loses its connection to its local Gluster volume

On 3/6/2014 7:48 PM, Greg Scott wrote:
In your real-life concern, the interconnect would not interfere with the existence of either
machine's IP address, so after the ping-timeout, operations would resume in a split-brain
configuration. As long as no changes were made to the same file on both volumes, when the
connection is reestablished, the self-heal will do exactly what you expect.
Except that's not what happens.  If I ifdown that interconnect NIC, I should see the file system after 42 seconds, right?  But I don't.  Email butchers the output below, but what it shows is, I can look at my /firewall-scripts directory just fine when things are steady state.  I ifdown the interconnect NIC, that directory goes away.  I wait more than 2 minutes and it still doesn't come back.  And then when I ifup the NIC, everything goes back to normal after a few seconds.

[root@stylmark-fw1 ~]# ls /firewall-scripts
allow-all           etc                  initial_rc.firewall  rcfirewall.conf            var
allow-all-with-nat  failover-monitor.sh  rc.firewall          start-failover-monitor.sh
[root@stylmark-fw1 ~]# date
Thu Mar  6 18:39:42 CST 2014
[root@stylmark-fw1 ~]# ifdown enp5s4
[root@stylmark-fw1 ~]# ls /firewall-scripts
ls: cannot access /firewall-scripts: Transport endpoint is not connected
[root@stylmark-fw1 ~]# date
Thu Mar  6 18:41:50 CST 2014
[root@stylmark-fw1 ~]# ls /firewall-scripts
ls: cannot access /firewall-scripts: No such file or directory
[root@stylmark-fw1 ~]# ifup enp5s4
[root@stylmark-fw1 ~]# ls /firewall-scripts
ls: cannot access /firewall-scripts: No such file or directory
[root@stylmark-fw1 ~]# df -h
Filesystem                       Size  Used Avail Use% Mounted on
/dev/mapper/fedora-root           17G  2.3G   14G  14% /
devtmpfs                         989M     0  989M   0% /dev
tmpfs                            996M     0  996M   0% /dev/shm
tmpfs                            996M  524K  996M   1% /run
tmpfs                            996M     0  996M   0% /sys/fs/cgroup
tmpfs                            996M     0  996M   0% /tmp
/dev/sda2                        477M   87M  362M  20% /boot
/dev/sda1                        200M  9.6M  191M   5% /boot/efi
/dev/mapper/fedora-gluster--fw1  9.8G   33M  9.8G   1% /gluster-fw1
192.168.253.1:/firewall-scripts  9.8G   33M  9.8G   1% /firewall-scripts
[root@stylmark-fw1 ~]# ls /firewall-scripts
allow-all           etc                  initial_rc.firewall  rcfirewall.conf            var
allow-all-with-nat  failover-monitor.sh  rc.firewall          start-failover-monitor.sh
[root@stylmark-fw1 ~]#
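
For anyone reading along, the two date stamps in that transcript bracket the outage, and the arithmetic can be checked with a few lines of shell. This is only a sketch: the `to_seconds` helper is made up for illustration, and the 42-second figure assumes `network.ping-timeout` is still at its default, which the thread implies but never shows.

```shell
#!/bin/sh
# Convert HH:MM:SS (two-digit fields) to seconds since midnight.
# Hypothetical helper, not part of any Gluster tooling.
to_seconds() {
    old_ifs=$IFS
    IFS=:
    set -- $1
    IFS=$old_ifs
    # Strip one leading zero so fields like "08" are not read as octal.
    echo $(( ${1#0} * 3600 + ${2#0} * 60 + ${3#0} ))
}

# Assumed default for network.ping-timeout, in seconds.
PING_TIMEOUT=42

# Timestamps copied from the two "date" commands in the transcript.
elapsed=$(( $(to_seconds 18:41:50) - $(to_seconds 18:39:42) ))

echo "elapsed: ${elapsed}s, ping-timeout: ${PING_TIMEOUT}s"
if [ "$elapsed" -gt "$PING_TIMEOUT" ]; then
    echo "outage outlasted the ping timeout"
fi
```

The gap works out to 128 seconds, roughly three times the assumed timeout, so the mount not coming back is not just a matter of waiting longer.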

You can avoid the split-brain using a couple of quorum techniques; the one that would seem to satisfy your
requirements leaves your volume read-only for the duration of the outage.
I like this idea - how do I do it?
I don't see a follow-up here, so I will put in (only) my two cents worth.

If I understand correctly, you get the read-only condition by using client-side quorum. The behavior you describe above sounds like that produced by server-side quorum -- the volume goes offline until a quorum is present.
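
To make the distinction concrete, here is a rough sketch of how the two kinds of quorum might be set from the command line. Treat it as a starting point, not a tested recipe: the volume name is guessed from the mount line in the df output above, the wrapper functions are my own invention, and the script only prints the commands unless you set GLUSTER=gluster. Check the option names against the documentation for your Gluster version before applying anything.

```shell
#!/bin/sh
# Sketch only. Volume name assumed from the mount shown earlier
# (192.168.253.1:/firewall-scripts); substitute your own.
VOLUME="${VOLUME:-firewall-scripts}"

# Dry run by default: print the commands instead of running them.
# Set GLUSTER=gluster in the environment to apply them for real.
GLUSTER="${GLUSTER:-echo gluster}"

# Client-side quorum: with a fixed count of 2 on a two-brick replica,
# clients should refuse writes unless both bricks are reachable.
client_quorum() {
    $GLUSTER volume set "$VOLUME" cluster.quorum-type fixed
    $GLUSTER volume set "$VOLUME" cluster.quorum-count 2
}

# Server-side quorum: the servers take their bricks offline when too
# few peers are reachable. Turning it off is shown here because it
# looks like the behavior Greg is seeing.
server_quorum_off() {
    $GLUSTER volume set "$VOLUME" cluster.server-quorum-type none
}

# Show current settings; reconfigured options are listed in the output.
show_settings() {
    $GLUSTER volume info "$VOLUME"
}

client_quorum
```

Running `show_settings` first is probably the sensible move, since it should show whether `cluster.server-quorum-type` is already set on the volume.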

I have suffered through a couple of split-brain situations, and I agree that you do not want to run a two-node setup without quorum.

You may have gotten an answer that I did not see, but even so, I'll leave this here for the next guy who has a question.

Ted Miller
Elkhart, IN

- Greg
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users
