I got a split brain with an all-null pending matrix on 3.4.1qa2, while the machines were all i386 (we previously thought that there was an LP64 issue somewhere).

Here is the end of the log. Does it ring a bell for anyone?

(...)
[2013-09-16 23:50:57.788509] E [dht-helper.c:751:dht_migration_complete_check_task] 0-gfs340-dht: /manu/netbsd-20130915/usr/src/external/bsd/bind/lib/libisc/obj/sha1.po: failed to lookup the file on gfs340-replicate-0
[2013-09-16 23:50:57.809062] W [client-rpc-fops.c:1538:client3_3_inodelk_cbk] 0-gfs340-client-3: remote operation failed: No such file or directory
[2013-09-16 23:50:59.420905] W [defaults.c:1274:default_forget] 0-fuse: xlator does not implement forget_cbk
[2013-09-16 23:50:59.421812] W [defaults.c:1274:default_forget] 0-fuse: xlator does not implement forget_cbk
[2013-09-16 23:51:12.141445] E [afr-self-heal-common.c:197:afr_sh_print_split_brain_log] 0-gfs340-replicate-1: Unable to self-heal contents of '/manu/netbsd-20130915/usr/src/external/bsd/bind/lib/libisc/obj/sha2.pico' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix: [ [ 0 0 ] [ 0 0 ] ]
[2013-09-16 23:51:12.142528] E [afr-self-heal-common.c:2212:afr_self_heal_completion_cbk] 0-gfs340-replicate-1: background data self-heal failed on /manu/netbsd-20130915/usr/src/external/bsd/bind/lib/libisc/obj/sha2.pico
[2013-09-16 23:51:17.804439] W [client-rpc-fops.c:1994:client3_3_setattr_cbk] 0-gfs340-client-2: remote operation failed: No such file or directory
[2013-09-16 23:51:17.804753] W [client-rpc-fops.c:1994:client3_3_setattr_cbk] 0-gfs340-client-3: remote operation failed: No such file or directory
[2013-09-16 23:51:17.806121] W [client-rpc-fops.c:1755:client3_3_xattrop_cbk] 0-gfs340-client-2: remote operation failed: Undefined error: 0. Path: (null) (--)
[2013-09-16 23:51:17.806938] W [client-rpc-fops.c:1755:client3_3_xattrop_cbk] 0-gfs340-client-3: remote operation failed: Undefined error: 0. Path: (null) (--)
[2013-09-16 23:51:17.836769] W [client-rpc-fops.c:1538:client3_3_inodelk_cbk] 0-gfs340-client-2: remote operation failed: No such file or directory
[2013-09-16 23:51:17.837065] W [client-rpc-fops.c:1538:client3_3_inodelk_cbk] 0-gfs340-client-3: remote operation failed: No such file or directory
[2013-09-16 23:51:17.838588] W [client-rpc-fops.c:1538:client3_3_inodelk_cbk] 0-gfs340-client-2: remote operation failed: No such file or directory
[2013-09-16 23:51:17.845643] W [client-rpc-fops.c:1538:client3_3_inodelk_cbk] 0-gfs340-client-3: remote operation failed: No such file or directory
[2013-09-16 23:51:17.845827] I [afr-lk-common.c:1075:afr_lock_blocking] 0-gfs340-replicate-1: unable to lock on even one child
[2013-09-16 23:51:17.845951] I [afr-transaction.c:1063:afr_post_blocking_inodelk_cbk] 0-gfs340-replicate-1: Blocking inodelks failed.
[2013-09-16 23:51:17.846192] W [fuse-bridge.c:986:fuse_setattr_cbk] 0-glusterfs-fuse: 29322945: SETATTR() /manu/netbsd-20130915/usr/src/external/bsd/bind/lib/libisc/obj/sha2.po => -1 (No such file or directory)
(...)
[2013-09-16 23:56:51.640121] E [afr-self-heal-common.c:197:afr_sh_print_split_brain_log] 0-gfs340-replicate-1: Unable to self-heal contents of '/manu/netbsd-20130915/usr/src/external/bsd/bind/lib/libisc/obj/sha2.pico' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix: [ [ 0 0 ] [ 0 0 ] ]
[2013-09-16 23:56:51.642231] E [afr-self-heal-common.c:2212:afr_self_heal_completion_cbk] 0-gfs340-replicate-1: background data self-heal failed on /manu/netbsd-20130915/usr/src/external/bsd/bind/lib/libisc/obj/sha2.pico
[2013-09-16 23:56:51.643156] W [page.c:991:__ioc_page_error] 0-gfs340-io-cache: page error for page = 0xab653c80 & waitq = 0x8bdae660
[2013-09-16 23:56:51.643671] W [fuse-bridge.c:2082:fuse_readv_cbk] 0-glusterfs-fuse: 29400837: READ => -1 (Input/output error)
[2013-09-16 23:56:51.644174] W [page.c:991:__ioc_page_error] 0-gfs340-io-cache: page error for page = 0x942ab940 & waitq = 0x8a3fc120
[2013-09-16 23:56:51.644451] W [fuse-bridge.c:2082:fuse_readv_cbk] 0-glusterfs-fuse: 29400838: READ => -1 (Input/output error)
[2013-09-16 23:56:52.877452] W [page.c:991:__ioc_page_error] 0-gfs340-io-cache: page error for page = 0xab6341c0 & waitq = 0x8a3fc680
[2013-09-16 23:56:52.877799] W [fuse-bridge.c:2082:fuse_readv_cbk] 0-glusterfs-fuse: 29401347: READ => -1 (Input/output error)
[2013-09-16 23:56:52.993734] W [page.c:991:__ioc_page_error] 0-gfs340-io-cache: page error for page = 0x8c8ce260 & waitq = 0x8a3fc660
[2013-09-16 23:56:52.994202] W [fuse-bridge.c:2082:fuse_readv_cbk] 0-glusterfs-fuse: 29401606: READ => -1 (Input/output error)

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
manu@xxxxxxxxxx