Ok. It happens on all slave nodes (and on the interim master as well). It's as I assumed. These are the logs of one of the slaves:

gsyncd.log

[2018-09-24 13:52:25.418382] I [repce(slave slave/bricks/brick1/brick):80:service_loop] RepceServer: terminating on reaching EOF.
[2018-09-24 13:52:37.95297] W [gsyncd(slave slave/bricks/brick1/brick):293:main] <top>: Session config file not exists, using the default config path=/var/lib/glusterd/geo-replication/glustervol1_slave_glustervol1/gsyncd.conf
[2018-09-24 13:52:37.109643] I [resource(slave slave/bricks/brick1/brick):1096:connect] GLUSTER: Mounting gluster volume locally...
[2018-09-24 13:52:38.303920] I [resource(slave slave/bricks/brick1/brick):1119:connect] GLUSTER: Mounted gluster volume duration=1.1941
[2018-09-24 13:52:38.304771] I [resource(slave slave/bricks/brick1/brick):1146:service_loop] GLUSTER: slave listening
[2018-09-24 13:52:41.981554] I [resource(slave slave/bricks/brick1/brick):598:entry_ops] <top>: Special case: rename on mkdir gfid=29d1d60d-1ad6-45fc-87e0-93d478f7331e entry='.gfid/6b97b987-8aef-46c3-af27-20d3aa883016/New folder'
[2018-09-24 13:52:42.45641] E [repce(slave slave/bricks/brick1/brick):105:worker] <top>: call failed:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 101, in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 599, in entry_ops
    src_entry = get_slv_dir_path(slv_host, slv_volume, gfid)
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 682, in get_slv_dir_path
    [ENOENT], [ESTALE])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 540, in errno_wrap
    return call(*arg)
OSError: [Errno 13] Permission denied: '/bricks/brick1/brick/.glusterfs/29/d1/29d1d60d-1ad6-45fc-87e0-93d478f7331e'
[2018-09-24 13:52:42.81794] I [repce(slave slave/bricks/brick1/brick):80:service_loop] RepceServer: terminating on reaching EOF.
[2018-09-24 13:52:53.459676] W [gsyncd(slave slave/bricks/brick1/brick):293:main] <top>: Session config file not exists, using the default config path=/var/lib/glusterd/geo-replication/glustervol1_slave_glustervol1/gsyncd.conf
[2018-09-24 13:52:53.473500] I [resource(slave slave/bricks/brick1/brick):1096:connect] GLUSTER: Mounting gluster volume locally...
[2018-09-24 13:52:54.659044] I [resource(slave slave/bricks/brick1/brick):1119:connect] GLUSTER: Mounted gluster volume duration=1.1854
[2018-09-24 13:52:54.659837] I [resource(slave slave/bricks/brick1/brick):1146:service_loop] GLUSTER: slave listening

The folder "New folder" was created via Samba and was renamed by my colleague right after creation.

[root@slave glustervol1_slave_glustervol1]# ls /bricks/brick1/brick/.glusterfs/29/d1/29d1d60d-1ad6-45fc-87e0-93d478f7331e/
[root@slave glustervol1_slave_glustervol1]# ls /bricks/brick1/brick/.glusterfs/29/d1/ -al
total 0
drwx--S---+  2 root AD+group 50 Sep 21 09:39 .
drwx--S---+ 11 root AD+group 96 Sep 21 09:39 ..
lrwxrwxrwx.  1 root AD+group 75 Sep 21 09:39 29d1d60d-1ad6-45fc-87e0-93d478f7331e -> ../../6b/97/6b97b987-8aef-46c3-af27-20d3aa883016/vRealize Operation Manager

I created the folder in /bricks/brick1/brick/.glusterfs/6b/97/6b97b987-8aef-46c3-af27-20d3aa883016/, but it didn't change anything.

mnt-slave-bricks-brick1-brick.log

[2018-09-24 13:51:10.625723] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-glustervol1-client-0: error returned while attempting to connect to host:(null), port:0
[2018-09-24 13:51:10.626092] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-glustervol1-client-0: error returned while attempting to connect to host:(null), port:0
[2018-09-24 13:51:10.626181] I [rpc-clnt.c:2105:rpc_clnt_reconfig] 0-glustervol1-client-0: changing port to 49152 (from 0)
[2018-09-24 13:51:10.643111] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-glustervol1-client-0: error returned while attempting to connect to host:(null), port:0
[2018-09-24 13:51:10.643489] W [dict.c:923:str_to_data] (-->/usr/lib64/glusterfs/4.1.3/xlator/protocol/client.so(+0x4131a) [0x7fafb023831a] -->/lib64/libglusterfs.so.0(dict_set_str+0x16) [0x7fafbdb83266] -->/lib64/libglusterfs.so.0(str_to_data+0x91) [0x7fafbdb7fea1] ) 0-dict: value is NULL [Invalid argument]
[2018-09-24 13:51:10.643507] I [MSGID: 114006] [client-handshake.c:1308:client_setvolume] 0-glustervol1-client-0: failed to set process-name in handshake msg
[2018-09-24 13:51:10.643541] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-glustervol1-client-0: error returned while attempting to connect to host:(null), port:0
[2018-09-24 13:51:10.671460] I [MSGID: 114046] [client-handshake.c:1176:client_setvolume_cbk] 0-glustervol1-client-0: Connected to glustervol1-client-0, attached to remote volume '/bricks/brick1/brick'.
[2018-09-24 13:51:10.672694] I [fuse-bridge.c:4294:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel 7.22
[2018-09-24 13:51:10.672715] I [fuse-bridge.c:4927:fuse_graph_sync] 0-fuse: switched to graph 0
[2018-09-24 13:51:10.673329] I [MSGID: 109005] [dht-selfheal.c:2342:dht_selfheal_directory] 0-glustervol1-dht: Directory selfheal failed: Unable to form layout for directory /
[2018-09-24 13:51:16.116458] I [fuse-bridge.c:5199:fuse_thread_proc] 0-fuse: initating unmount of /var/mountbroker-root/user1300/mtpt-geoaccount-ARDW1E
[2018-09-24 13:51:16.116595] W [glusterfsd.c:1514:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7e25) [0x7fafbc9eee25] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x55d5dac5dd65] -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x55d5dac5db8b] ) 0-: received signum (15), shutting down
[2018-09-24 13:51:16.116616] I [fuse-bridge.c:5981:fini] 0-fuse: Unmounting '/var/mountbroker-root/user1300/mtpt-geoaccount-ARDW1E'.
[2018-09-24 13:51:16.116625] I [fuse-bridge.c:5986:fini] 0-fuse: Closing fuse connection to '/var/mountbroker-root/user1300/mtpt-geoaccount-ARDW1E'.

Regards,
Christian

From:
Kotresh Hiremath Ravishankar <khiremat@xxxxxxxxxx>

The problem occurred on the slave side; its error is propagated to the master. In most cases, a traceback involving repce indicates a problem on the slave. Just check a few lines above in the log to find which slave node the crashed worker is connected to, and fetch that node's geo-replication logs to debug further.

On Fri, 21 Sep 2018, 20:10 Kotte, Christian (Ext), <christian.kotte@xxxxxxxxxxxx> wrote:
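For anyone following along, the failure mode in the traceback above can be sketched in a few lines of Python. This is a simplified illustration, not the actual syncdutils code: it shows the gfid-to-backend-path mapping visible in the ls output, and an errno_wrap-style whitelist that tolerates ENOENT/ESTALE (a missing entry is expected during sync) but lets EACCES ("[Errno 13] Permission denied") propagate, which is what crashes the worker here.

```python
import errno
import os

def gfid_backend_path(brick_root, gfid):
    # GlusterFS keeps a symlink for each directory's gfid under
    # .glusterfs/<first two hex chars>/<next two hex chars>/<full gfid>
    return os.path.join(brick_root, ".glusterfs", gfid[:2], gfid[2:4], gfid)

# The path from the traceback is exactly this mapping:
path = gfid_backend_path("/bricks/brick1/brick",
                         "29d1d60d-1ad6-45fc-87e0-93d478f7331e")
print(path)  # /bricks/brick1/brick/.glusterfs/29/d1/29d1d60d-...

def errno_wrap(call, args=(), tolerated=()):
    # Sketch of the whitelist idea: OSErrors whose errno is listed in
    # `tolerated` are swallowed; anything else (e.g. EACCES) propagates
    # and, in gsyncd, kills the worker.
    try:
        return call(*args)
    except OSError as e:
        if e.errno in tolerated:
            return None
        raise

# A missing symlink is tolerated and simply yields None ...
result = errno_wrap(os.readlink, ("/nonexistent/gfid",),
                    tolerated=(errno.ENOENT, errno.ESTALE))
print(result)  # None

# ... but a permission error on the same call is re-raised.
def denied(_path):
    raise OSError(errno.EACCES, "Permission denied")
```

So if root on the slave gets EACCES resolving that symlink, the restrictive mode/ACLs on the .glusterfs subdirectories (the drwx--S---+ entries above) are the place to look.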
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users