As per the quickstart guide, I'm setting up a replicated volume on two test (KVM) VMs, fiori2 and torchio2, as follows:

  mkfs -t xfs -i size=512 -f /dev/vdb1        # on both
  mount /dev/vdb1 /vol/brick0                 # on both
  gluster peer probe torchio2                 # on fiori2
  gluster peer probe fiori2                   # on torchio2
  mkdir /vol/brick0/vmimages                  # on both
  gluster volume create vmimages replica 2 \
      torchio2:/vol/brick0/vmimages fiori2:/vol/brick0/vmimages   # on fiori2
  mount -t glusterfs fiori2:/vmimages /mnt    # on both

Then I pull the virtual network cable out of one host (with 'virsh domif-setlink fiori2 vnet10 down') and run:

  ls /mnt                     # on both (wait for timeouts to elapse)
  uname -n > /mnt/hostname    # on both (create conflict)

Then I reconnect the cable, wait a bit, and run:

  torchio2# cat /mnt/hostname
  cat: /mnt/hostname: Input/output error
  torchio2#

I'm deliberately trying to provoke split-brain, so this I/O error is no surprise. The real problem comes when I try to recover from it:

  fiori2# gluster volume heal vmimages info
  Brick torchio2:/vol/brick0/vmimages
  / - Is in split-brain
  /hostname
  Number of entries: 2

  Brick fiori2:/vol/brick0/vmimages
  / - Is in split-brain
  /hostname
  Number of entries: 2

  fiori2# gluster volume heal vmimages split-brain source-brick torchio2:/vol/brick0/vmimages
  'source-brick' option used on a directory (gfid:00000000-0000-0000-0000-000000000001). Performing conservative merge.
  Healing gfid:00000000-0000-0000-0000-000000000001 failed:Operation not permitted.
  Healing gfid:73dce70e-bb3e-40a2-bec9-4741399b6b72 failed:Transport endpoint is not connected.
  Number of healed entries: 0
  fiori2#

and the I/O error remains. I've also tried the manual getfattr/setfattr way, but that itself produces I/O errors:

  fiori2# getfattr -d -m . -e hex /mnt/hostname
  getfattr: /mnt/hostname: Input/output error
  fiori2#

I've done some googling, but haven't turned up any references to split-brain with "Operation not permitted" or "Transport endpoint is not connected". Am I doing something wrong?
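For completeness, here is the manual recovery I had expected to work, based on my reading of the 3.7 split-brain docs. The xattr inspection is done on the brick paths rather than through the mount, since the mount returns EIO. Note that the gfid used below is the second one from the heal output, which I'm assuming belongs to /hostname — if that assumption is wrong, so is the rm:

  # Inspect the AFR changelog xattrs on the brick copies directly --
  # the FUSE mount returns EIO for a split-brained file, but reading
  # the bricks themselves should still work:
  getfattr -d -m . -e hex /vol/brick0/vmimages/hostname    # on both servers

  # Per-file variant of the heal command, naming the copy to keep
  # (I believe 3.7 accepts a file path after the source brick):
  gluster volume heal vmimages split-brain \
      source-brick torchio2:/vol/brick0/vmimages /hostname  # on fiori2

  # Manual fallback: delete the losing copy and its .glusterfs gfid
  # hardlink on that brick, then stat the file through the mount so
  # self-heal recreates it from the surviving copy:
  rm /vol/brick0/vmimages/hostname                           # on fiori2
  rm /vol/brick0/vmimages/.glusterfs/73/dc/73dce70e-bb3e-40a2-bec9-4741399b6b72  # on fiori2
  stat /mnt/hostname                                         # on fiori2

If I've misread the docs and any of that is wrong, that may well explain my trouble — corrections welcome.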
Is this a known bug? Is there a workaround?

For info, I'm using:

  fiori2# cat /etc/issue
  Ubuntu 16.04 LTS \n \l

  fiori2# uname -a
  Linux fiori2 4.4.0-24-generic #43-Ubuntu SMP Wed Jun 8 19:27:37 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

  fiori2# dpkg -l | grep gluster
  ii  glusterfs-client  3.7.6-1ubuntu1  amd64  clustered file-system (client package)
  ii  glusterfs-common  3.7.6-1ubuntu1  amd64  GlusterFS common libraries and translator modules
  ii  glusterfs-server  3.7.6-1ubuntu1  amd64  clustered file-system (server package)
  fiori2#

I understand that two nodes are not optimal; occasional split-brain is acceptable so long as I can recover from it. Up to now I've been using DRBD+OCFS2 for a clustered filesystem on my VM servers, but the NFSv3 interaction has been glitchy, so I'm now doing some tests with GlusterFS.

Any advice gratefully received! Thanks!

Alexis

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users