For what it's worth, I've never been able to lose a brick in a 2-brick
replica volume and still be able to write data. I've also found the
documentation confusing as to what cluster.server-quorum-type actually
means:

    Option: cluster.server-quorum-type
    Default Value: (null)
    Description: This feature is on the server-side i.e. in glusterd.
    Whenever the glusterd on a machine observes that the quorum is not
    met, it brings down the bricks to prevent data split-brains. When the
    network connections are brought back up and the quorum is restored
    the bricks in the volume are brought back up.

It seems to be implying a brick quorum, but I think it actually means a
glusterd quorum. In other words, if 2 of the 3 glusterd processes fail,
take the bricks offline. This would seem to make sense in your
configuration. But there are also two other quorum settings which seem to
be more focused on the brick count/ratio needed to form quorum:

    Option: cluster.quorum-type
    Default Value: none
    Description: If value is "fixed" only allow writes if quorum-count
    bricks are present. If value is "auto" only allow writes if more than
    half of the bricks, or exactly half including the first, are present.

    Option: cluster.quorum-count
    Default Value: (null)
    Description: If quorum-type is "fixed" only allow writes if this many
    bricks are present. Other quorum types will OVERWRITE this value.

So you might be able to set the type to 'fixed' and the count to '1', and
with cluster.server-quorum-type: server already enabled, get what you
want (example commands below). But again, I've never had this work
properly, and I always ended up with split-brains, which are difficult to
resolve when you're storing VM images rather than files.

Your other options are: use your 3rd server as another brick and do
replica 3 (which I've had good success with), or, seeing as you're on
3.7, look into arbiter bricks if they're stable in the current version
(see the sketch further down).
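In case it helps, the volume-set commands for the quorum options above
would look roughly like this. This is only a sketch, untested on my end,
so check it against 'gluster volume set help' on your 3.7 install first;
note that cluster.server-quorum-ratio is a cluster-wide option set on
"all" rather than on a single volume:

    # client-side (AFR) quorum: keep allowing writes with only 1 brick up
    gluster volume set cluster1 cluster.quorum-type fixed
    gluster volume set cluster1 cluster.quorum-count 1

    # server-side (glusterd) quorum is already enabled on your volume:
    #   cluster.server-quorum-type: server
    # the ratio it uses can be tuned cluster-wide if needed, e.g.:
    gluster volume set all cluster.server-quorum-ratio 51%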
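And for the replica 3 / arbiter route, again just a rough sketch: the
srv03 name, brick paths and the "newvol" volume name are placeholders for
whatever your third server actually uses, and you should verify the exact
syntax against the 3.7 docs:

    # grow the existing replica 2 volume to replica 3 using the 3rd server
    gluster volume add-brick cluster1 replica 3 srv03:/home/gluster
    gluster volume heal cluster1 full

    # or, for a fresh volume, 3.7 can create an arbiter volume directly,
    # where the third brick holds only metadata and breaks ties:
    gluster volume create newvol replica 3 arbiter 1 \
        srv01:/brick srv02:/brick srv03:/brick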
On Mon, Feb 8, 2016 at 6:20 AM, Dominique Roux
<dominique.roux@xxxxxxxxxxx> wrote:
> Hi guys,
>
> I faced a problem a week ago.
> In our environment we have three servers in a quorum. The gluster
> volume is spread over two bricks and has the type replicated.
>
> We now, for simulating a failure of one brick, isolated one of the two
> bricks with iptables, so that communication to the other two peers
> wasn't possible anymore.
> After that, VMs (opennebula) which had I/O during this time crashed.
> We stopped the glusterfsd hard (kill -9) and restarted it, which made
> things work again (certainly we also had to restart the failed VMs).
> But I think this shouldn't happen, since quorum was still met (2/3
> hosts were still up and connected).
>
> Here some infos of our system:
> OS: CentOS Linux release 7.1.1503
> Glusterfs version: glusterfs 3.7.3
>
> gluster volume info:
>
> Volume Name: cluster1
> Type: Replicate
> Volume ID:
> Status: Started
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: srv01:/home/gluster
> Brick2: srv02:/home/gluster
> Options Reconfigured:
> cluster.self-heal-daemon: enable
> cluster.server-quorum-type: server
> network.remote-dio: enable
> cluster.eager-lock: enable
> performance.stat-prefetch: on
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> server.allow-insecure: on
> nfs.disable: 1
>
> Hope you can help us.
>
> Thanks a lot.
>
> Best regards
> Dominique

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users