Hi all!

Another incident, now a real "split brain" situation: on server pair 12 & 13, a set of files can't be repaired and throws errors. Is there a way to interpret the AFR code in order to decide which files should be deleted/overwritten?

There are no errors in opt-profitbricks-storage.log on pserver12, but opt-profitbricks-storage.log on pserver13 says:

[2011-05-03 18:14:29.343512] I [afr-common.c:672:afr_lookup_done] 0-storage0-replicate-2: split brain detected during lookup of /pserver3-11.
[2011-05-03 18:14:29.344467] E [afr-self-heal-data.c:645:afr_sh_data_fix] 0-storage0-replicate-2: Unable to self-heal contents of '/pserver3-11' (possible split-brain). Please delete the file from all but the preferred subvolume.
[2011-05-03 18:14:29.347376] I [afr-common.c:672:afr_lookup_done] 0-storage0-replicate-2: split brain detected during lookup of /pserver3-16.
[2011-05-03 18:14:29.348157] E [afr-self-heal-data.c:645:afr_sh_data_fix] 0-storage0-replicate-2: Unable to self-heal contents of '/pserver3-16' (possible split-brain). Please delete the file from all but the preferred subvolume.
[2011-05-03 18:14:29.349013] I [afr-common.c:672:afr_lookup_done] 0-storage0-replicate-2: split brain detected during lookup of /pserver3-17.
[2011-05-03 18:14:29.349817] E [afr-self-heal-data.c:645:afr_sh_data_fix] 0-storage0-replicate-2: Unable to self-heal contents of '/pserver3-17' (possible split-brain). Please delete the file from all but the preferred subvolume.
[2011-05-03 18:14:29.351252] I [afr-common.c:672:afr_lookup_done] 0-storage0-replicate-2: split brain detected during lookup of /pserver3-19.
[2011-05-03 18:14:29.352043] E [afr-self-heal-data.c:645:afr_sh_data_fix] 0-storage0-replicate-2: Unable to self-heal contents of '/pserver3-19' (possible split-brain). Please delete the file from all but the preferred subvolume.
[2011-05-03 18:14:29.353477] I [afr-common.c:672:afr_lookup_done] 0-storage0-replicate-2: split brain detected during lookup of /pserver3-20.
[2011-05-03 18:14:29.354242] E [afr-self-heal-data.c:645:afr_sh_data_fix] 0-storage0-replicate-2: Unable to self-heal contents of '/pserver3-20' (possible split-brain). Please delete the file from all but the preferred subvolume.
[2011-05-03 18:14:29.356343] I [afr-common.c:672:afr_lookup_done] 0-storage0-replicate-2: split brain detected during lookup of /pserver3-23.
[2011-05-03 18:14:29.357198] E [afr-self-heal-data.c:645:afr_sh_data_fix] 0-storage0-replicate-2: Unable to self-heal contents of '/pserver3-23' (possible split-brain). Please delete the file from all but the preferred subvolume.
[2011-05-03 18:14:29.358030] I [afr-common.c:672:afr_lookup_done] 0-storage0-replicate-2: split brain detected during lookup of /pserver3-24.
[2011-05-03 18:14:29.358877] E [afr-self-heal-data.c:645:afr_sh_data_fix] 0-storage0-replicate-2: Unable to self-heal contents of '/pserver3-24' (possible split-brain). Please delete the file from all but the preferred subvolume.
[2011-05-03 18:14:29.362652] I [afr-common.c:672:afr_lookup_done] 0-storage0-replicate-2: split brain detected during lookup of /pserver3-3.
[2011-05-03 18:14:29.363431] E [afr-self-heal-data.c:645:afr_sh_data_fix] 0-storage0-replicate-2: Unable to self-heal contents of '/pserver3-3' (possible split-brain). Please delete the file from all but the preferred subvolume.
[2011-05-03 18:14:29.364261] I [afr-common.c:672:afr_lookup_done] 0-storage0-replicate-2: split brain detected during lookup of /pserver3-30.
[2011-05-03 18:14:29.365041] E [afr-self-heal-data.c:645:afr_sh_data_fix] 0-storage0-replicate-2: Unable to self-heal contents of '/pserver3-30' (possible split-brain). Please delete the file from all but the preferred subvolume.
[2011-05-03 18:14:29.368924] I [afr-common.c:672:afr_lookup_done] 0-storage0-replicate-2: split brain detected during lookup of /pserver3-36.
[2011-05-03 18:14:29.369682] E [afr-self-heal-data.c:645:afr_sh_data_fix] 0-storage0-replicate-2: Unable to self-heal contents of '/pserver3-36' (possible split-brain). Please delete the file from all but the preferred subvolume.
[2011-05-03 18:14:29.371696] I [afr-common.c:672:afr_lookup_done] 0-storage0-replicate-2: split brain detected during lookup of /pserver3-39.
[2011-05-03 18:14:29.372451] E [afr-self-heal-data.c:645:afr_sh_data_fix] 0-storage0-replicate-2: Unable to self-heal contents of '/pserver3-39' (possible split-brain). Please delete the file from all but the preferred subvolume.
[2011-05-03 18:14:29.373939] I [afr-common.c:672:afr_lookup_done] 0-storage0-replicate-2: split brain detected during lookup of /pserver3-5.
[2011-05-03 18:14:29.374705] E [afr-self-heal-data.c:645:afr_sh_data_fix] 0-storage0-replicate-2: Unable to self-heal contents of '/pserver3-5' (possible split-brain). Please delete the file from all but the preferred subvolume.

0 root@de-dc1-c1-pserver12:/var/log/glusterfs # getfattr -R -d -e hex -m "trusted.afr." /mnt/gluster/brick?/storage | grep -v 0x000000000000000000000000 | grep -B1 -A1 trusted
getfattr: Removing leading '/' from absolute path names
# file: mnt/gluster/brick0/storage/pserver3-19
trusted.afr.storage0-client-5=0x3f0000010000000000000000
--
# file: mnt/gluster/brick0/storage/pserver3-3
trusted.afr.storage0-client-5=0x3f0000010000000000000000
--
# file: mnt/gluster/brick0/storage/pserver3-30
trusted.afr.storage0-client-5=0x3f0000010000000000000000
--
# file: mnt/gluster/brick0/storage/pserver3-17
trusted.afr.storage0-client-5=0x3f0000010000000000000000

# file: mnt/gluster/brick0/storage/pserver3-11
trusted.afr.storage0-client-5=0x3f0000010000000000000000
--
# file: mnt/gluster/brick0/storage/pserver3-20
trusted.afr.storage0-client-5=0x3f0000010000000000000000
--
# file: mnt/gluster/brick0/storage/pserver3-16
trusted.afr.storage0-client-5=0x3f0000010000000000000000
--
# file: mnt/gluster/brick0/storage/pserver3-5
trusted.afr.storage0-client-5=0x3f0000010000000000000000
--
# file: mnt/gluster/brick0/storage/pserver3-39
trusted.afr.storage0-client-5=0x3f0000010000000000000000
--
# file: mnt/gluster/brick0/storage/pserver3-23
trusted.afr.storage0-client-5=0x3f0000010000000000000000
--
# file: mnt/gluster/brick0/storage/pserver3-24
trusted.afr.storage0-client-5=0x3f0000010000000000000000
--
# file: mnt/gluster/brick0/storage/pserver3-36
trusted.afr.storage0-client-5=0x3f0000010000000000000000

130 root@de-dc1-c1-pserver13:/var/log/glusterfs # getfattr -R -d -e hex -m "trusted.afr." /mnt/gluster/brick?/storage | grep -v 0x000000000000000000000000 | grep -B1 -A1 trusted
getfattr: Removing leading '/' from absolute path names
# file: mnt/gluster/brick0/storage/pserver3-23
trusted.afr.storage0-client-4=0xd70000010000000000000000
--
# file: mnt/gluster/brick0/storage/pserver3-20
trusted.afr.storage0-client-4=0xce00000a0000000000000000
--
# file: mnt/gluster/brick0/storage/pserver3-11
trusted.afr.storage0-client-4=0xd70000010000000000000000

# file: mnt/gluster/brick0/storage/pserver3-5
trusted.afr.storage0-client-4=0xd70000010000000000000000
--
# file: mnt/gluster/brick0/storage/pserver3-30
trusted.afr.storage0-client-4=0xd70000010000000000000000
--
# file: mnt/gluster/brick0/storage/pserver3-39
trusted.afr.storage0-client-4=0xd70000010000000000000000
--
# file: mnt/gluster/brick0/storage/pserver3-16
trusted.afr.storage0-client-4=0xd70000010000000000000000
--
# file: mnt/gluster/brick0/storage/pserver3-17
trusted.afr.storage0-client-4=0xd70000010000000000000000
--
# file: mnt/gluster/brick0/storage/pserver3-24
trusted.afr.storage0-client-4=0xd70000010000000000000000
--
# file: mnt/gluster/brick0/storage/pserver3-36
trusted.afr.storage0-client-4=0xd70000010000000000000000

# file: mnt/gluster/brick0/storage/pserver3-3
trusted.afr.storage0-client-4=0xd70000010000000000000000
--
# file: mnt/gluster/brick0/storage/pserver3-19
trusted.afr.storage0-client-4=0xd70000010000000000000000
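
For reference, here is a rough sketch of how the trusted.afr.* values in the getfattr output might be decoded. It assumes the commonly described layout of the AFR changelog xattr: 12 bytes holding three big-endian 32-bit counters of pending data, metadata and entry operations, in that order. It also assumes that client-4 refers to the brick on pserver12 and client-5 to the brick on pserver13, which is only inferred from which server carries which attribute. The function names decode_afr, blames and is_split_brain are made up for illustration and are not part of GlusterFS.

```python
import struct


def decode_afr(value: str) -> dict:
    """Split a 12-byte trusted.afr.* hex value into its three counters.

    Assumed layout: data, metadata, entry, each a big-endian uint32.
    """
    hex_digits = value[2:] if value.lower().startswith("0x") else value
    raw = bytes.fromhex(hex_digits)
    if len(raw) != 12:
        raise ValueError(f"expected 12 bytes, got {len(raw)}")
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


def blames(value: str) -> bool:
    """True if any counter is non-zero, i.e. this copy records
    operations it believes are still pending on the other brick."""
    return any(decode_afr(value).values())


def is_split_brain(a_about_b: str, b_about_a: str) -> bool:
    """Both copies accuse each other, so neither can be picked
    automatically as the source for self-heal."""
    return blames(a_about_b) and blames(b_about_a)


if __name__ == "__main__":
    # Values for /pserver3-20 copied from the getfattr output.
    on_pserver12 = "0x3f0000010000000000000000"  # trusted.afr.storage0-client-5
    on_pserver13 = "0xce00000a0000000000000000"  # trusted.afr.storage0-client-4

    print("pserver12 about client-5:", decode_afr(on_pserver12))
    print("pserver13 about client-4:", decode_afr(on_pserver13))
    print("mutual accusation:", is_split_brain(on_pserver12, on_pserver13))
```

Under these assumptions only the data counter is non-zero on both sides, so each brick claims pending data writes against the other. The decoded numbers show that the two copies disagree, but they do not by themselves say which copy is the better one to keep.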