Failure of one brick leads to crashed VMs

Dominique Roux <dominique.roux@xxxxxxxxxxx> · Mon, 8 Feb 2016 15:20:25 +0100

Hi guys,

I ran into a problem a week ago.
In our environment we have three servers in a quorum. The gluster volume
is spread over two bricks and is of type Replicate.

To simulate the failure of one brick, we isolated one of the two
bricks with iptables, so that it could no longer communicate with the
other two peers.
After that, the VMs (OpenNebula) that were doing I/O at the time crashed.
We killed glusterfsd hard (kill -9) and restarted it, which made
things work again (of course, we also had to restart the failed VMs). But
I think this shouldn't happen, since quorum was still met (2/3 hosts
were still up and connected).
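
To spell out the arithmetic behind that expectation, here is a minimal sketch. It assumes server quorum is met when strictly more than half of the servers in the pool can reach each other, which is my understanding of the default and not something I have verified in the source. The function name server_quorum_met and its ratio_percent parameter are hypothetical, chosen for illustration only; they are not a gluster API.

```python
def server_quorum_met(total_servers: int,
                      reachable_servers: int,
                      ratio_percent: float = 50.0) -> bool:
    """Return True if strictly more than ratio_percent of servers are reachable.

    Assumption: quorum requires a strict majority by default (more than 50%).
    reachable_servers counts the local server itself plus the peers it can see.
    """
    if total_servers <= 0:
        raise ValueError("total_servers must be positive")
    if not 0 <= reachable_servers <= total_servers:
        raise ValueError("reachable_servers must be between 0 and total_servers")
    # Compare in integer-friendly form to avoid rounding issues.
    return reachable_servers * 100 > ratio_percent * total_servers


# View from the two servers that could still see each other.
print(server_quorum_met(total_servers=3, reachable_servers=2))  # True

# View from the server we isolated with iptables.
print(server_quorum_met(total_servers=3, reachable_servers=1))  # False
```

Under that assumption, the two connected servers keep quorum and only the isolated one loses it, which is why I expected the VMs to keep running.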

Here is some information about our system:
OS: CentOS Linux release 7.1.1503
Glusterfs version: glusterfs 3.7.3

gluster volume info:

Volume Name: cluster1
Type: Replicate
Volume ID:
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: srv01:/home/gluster
Brick2: srv02:/home/gluster
Options Reconfigured:
cluster.self-heal-daemon: enable
cluster.server-quorum-type: server
network.remote-dio: enable
cluster.eager-lock: enable
performance.stat-prefetch: on
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
server.allow-insecure: on
nfs.disable: 1
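
To make it easier to see which quorum-related options are explicitly set, here is a small sketch that parses the "Options Reconfigured" section of the output above. The VOLUME_INFO text is copied from that output; parse_options is a hypothetical helper written for this mail, not part of any gluster tool.

```python
VOLUME_INFO = """\
Options Reconfigured:
cluster.self-heal-daemon: enable
cluster.server-quorum-type: server
network.remote-dio: enable
cluster.eager-lock: enable
performance.stat-prefetch: on
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
server.allow-insecure: on
nfs.disable: 1
"""


def parse_options(text: str) -> dict:
    """Collect 'key: value' lines that follow the 'Options Reconfigured:' header."""
    options = {}
    in_options = False
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line == "Options Reconfigured:":
            in_options = True
            continue
        if in_options and ": " in line:
            key, value = line.split(": ", 1)
            options[key] = value
    return options


options = parse_options(VOLUME_INFO)

# Keep only the options whose name mentions quorum.
quorum_options = {k: v for k, v in options.items() if "quorum" in k}
print(quorum_options)  # {'cluster.server-quorum-type': 'server'}
```

Only cluster.server-quorum-type shows up; no other quorum option is listed in this output, so any others are at whatever their defaults are.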

Hope you can help us.

Thanks a lot.

Best regards
Dominique