On 06/29/2013 04:15 PM, Emmanuel Dreyfus wrote:
Hi, I get the all-null pending matrix issue again. This time it is on 3.4.0beta4, and no brick has crashed since the volume was created and mounted on the client. Here are the last lines leading to the error on the client. I have many warnings like the one for cbc128.pico before it, and they seem harmless. The error on cms_env.po is the first message about this file. Since no brick crashed, I just don't understand why it would want to self-heal.
Are there any disconnections between the client and bricks?
[2013-06-29 07:48:13.503301] W [client-rpc-fops.c:1538:client3_3_inodelk_cbk] 0-gfs34-client-1: remote operation failed: No such file or directory
[2013-06-29 07:48:13.503493] W [client-rpc-fops.c:1538:client3_3_inodelk_cbk] 0-gfs34-client-0: remote operation failed: No such file or directory
[2013-06-29 07:48:13.504936] W [client-rpc-fops.c:1538:client3_3_inodelk_cbk] 0-gfs34-client-0: remote operation failed: No such file or directory
[2013-06-29 07:48:13.506062] W [client-rpc-fops.c:1538:client3_3_inodelk_cbk] 0-gfs34-client-1: remote operation failed: No such file or directory
[2013-06-29 07:48:13.506107] I [afr-lk-common.c:1075:afr_lock_blocking] 0-gfs34-replicate-0: unable to lock on even one child
[2013-06-29 07:48:13.506138] I [afr-transaction.c:1063:afr_post_blocking_inodelk_cbk] 0-gfs34-replicate-0: Blocking inodelks failed.
[2013-06-29 07:48:13.506283] W [fuse-bridge.c:953:fuse_setattr_cbk] 0-glusterfs-fuse: 15006882: SETATTR() /manu/netbsd/usr/src/crypto/external/bsd/openssl/lib/libcrypto/obj/cbc128.pico => -1 (No such file or directory)
[2013-06-29 07:56:05.000389] E [afr-self-heal-common.c:197:afr_sh_print_split_brain_log] 0-gfs34-replicate-0: Unable to self-heal contents of '/manu/netbsd/usr/src/crypto/external/bsd/openssl/lib/libcrypto/obj/cms_env.po' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix: [ [ 0 0 ] [ 0 0 ] ]
[2013-06-29 07:56:05.000949] E [afr-self-heal-common.c:2212:afr_self_heal_completion_cbk] 0-gfs34-replicate-0: background data self-heal failed on /manu/netbsd/usr/src/crypto/external/bsd/openssl/lib/libcrypto/obj/cms_env.po
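For anyone sifting through a large client log for this symptom, the following is a minimal Python sketch that pulls the pending matrix out of a split-brain log line and reports whether it is all zeros. The function names (parse_pending_matrix, is_all_zero) are made up for illustration, and the regular expression assumes the matrix is always printed in the form seen in the single line quoted above; it is not derived from the GlusterFS sources.

```python
import re

# Assumption: the matrix is printed as "Pending matrix: [ [ a b ] [ c d ] ]",
# as in the one log line quoted in this thread.
PENDING_RE = re.compile(r"Pending matrix:\s*\[((?:\s*\[[\s\d]*\])+)\s*\]")
ROW_RE = re.compile(r"\[([\s\d]*)\]")


def parse_pending_matrix(line):
    """Return the pending matrix as a list of integer rows, or None if absent."""
    match = PENDING_RE.search(line)
    if match is None:
        return None
    return [[int(n) for n in row.split()] for row in ROW_RE.findall(match.group(1))]


def is_all_zero(matrix):
    """True when every counter in the matrix is zero."""
    return all(value == 0 for row in matrix for value in row)


# The split-brain line from the log excerpt above.
sample = (
    "[2013-06-29 07:56:05.000389] E "
    "[afr-self-heal-common.c:197:afr_sh_print_split_brain_log] "
    "0-gfs34-replicate-0: Unable to self-heal contents of "
    "'/manu/netbsd/usr/src/crypto/external/bsd/openssl/lib/libcrypto/obj/cms_env.po' "
    "(possible split-brain). Please delete the file from all but the "
    "preferred subvolume.- Pending matrix: [ [ 0 0 ] [ 0 0 ] ]"
)

matrix = parse_pending_matrix(sample)
print(matrix, "all zero:", is_all_zero(matrix))
```

Running each line of the client log through parse_pending_matrix and keeping only those where is_all_zero returns True would list every file that hit this particular case.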
Is this seen with a striped-replicate setup? Do we observe the same behavior when the stripe xlator is not involved, say in a distributed-replicate setup?
Thanks, Vijay