Hi all,
I work at a small, 8-person company that uses Gluster for its primary data storage. We have a volume called "data" that is replicated over two servers (details below). This worked perfectly for over a year, but lately we've been noticing mismatches between the two bricks, so it seems a split-brain situation has occurred that is not being detected or resolved. I have two questions about this:
1) I expected Gluster to (eventually) detect a situation like this; why doesn't it?
2) How do I fix this situation? I've tried an explicit 'heal', but that didn't seem to change anything.
Thanks a lot for your help!
Sjors
------8<------
Volume & peer info: http://pastebin.com/PN7tRXdU
curacao# md5sum /export/sdb1/data/Case/21000355/studies.dat
7bc2daec6be953ffae920d81fe6fa25c /export/sdb1/data/Case/21000355/studies.dat
bonaire# md5sum /export/sdb1/data/Case/21000355/studies.dat
28c950a1e2a5f33c53a725bf8cd72681 /export/sdb1/data/Case/21000355/studies.dat
# mallorca is one of the clients
mallorca# md5sum /data/Case/21000355/studies.dat
7bc2daec6be953ffae920d81fe6fa25c /data/Case/21000355/studies.dat
I expected an input/output error when reading this file, because of the split-brain situation, but got none. There are no entries in the GlusterFS logs on either bonaire or curacao.
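To see how many other files are affected, the per-file md5sum comparison above could be scripted. Below is a minimal sketch, not something that has been run against this volume. It assumes a checksum listing has been generated on each brick beforehand and copied to one machine. The function name diff_checksums and the listing file names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the paths whose md5 differs between two listings.
# Each listing is plain `md5sum` output ("<hash>  <path>"), assumed to have
# been produced on one brick with relative paths, for example:
#   cd /export/sdb1/data && find Case -type f -exec md5sum {} + > /tmp/curacao.md5
diff_checksums() {
    awk '
        # First listing: remember the hash for every path.
        NR == FNR { h = $1; sub(/^[^ ]+ +/, ""); seen[$0] = h; next }
        # Second listing: report paths present in both with a different hash.
        { h = $1; sub(/^[^ ]+ +/, ""); if (($0 in seen) && seen[$0] != h) print $0 }
    ' "$1" "$2"
}
```

Usage would be something like `diff_checksums curacao.md5 bonaire.md5`, which prints one mismatching path per line. Files that exist on only one brick are not reported by this sketch.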
bonaire# gluster volume heal data full
Launching heal operation to perform full self heal on volume data has been successful
Use heal info commands to check status
bonaire# gluster volume heal data info
Brick bonaire:/export/sdb1/data/
Number of entries: 0
Brick curacao:/export/sdb1/data/
Number of entries: 0
(Same output on curacao, and hours after this, the md5sums on both bricks still differ.)
curacao# gluster --version
glusterfs 3.6.2 built on Mar 2 2015 14:05:34
Repository revision: git://git.gluster.com/glusterfs.git
(Same version on bonaire.)