Re: Transport Endpoint Not Connected When Writing a Lot of Files

Thank you.
I checked the logs, but I could not make sense of them.

I am attaching the logs from two different crashes. I plan to upgrade to GlusterFS 6 in a few weeks. At the moment I cannot interrupt user activity on these servers, since we are in the middle of the university semester.

If these log files reveal anything interesting to you, I would appreciate a hint.


ol-data-client-2. Client process will keep trying to connect to glusterd until brick's port is available
[2019-09-16 19:05:34.028164] E [rpc-clnt.c:348:saved_frames_unwind] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7ff167753ddb] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(+0xc021)[0x7ff167523021] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(+0xc14e)[0x7ff16752314e] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x8e)[0x7ff1675246be] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(+0xe268)[0x7ff167525268] ))))) 0-vol-data-client-2: forced unwinding frame type(GlusterFS 4.x v1) op(FSTAT(25)) called at 2019-09-16 19:05:28.736873 (xid=0x113aecf)
[2019-09-16 19:05:34.028206] W [MSGID: 114031] [client-rpc-fops_v2.c:1260:client4_0_fstat_cbk] 0-vol-data-client-2: remote operation failed [Transport endpoint is not connected]
[2019-09-16 19:05:44.970828] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-2: error returned while attempting to connect to host:(null), port:0
[2019-09-16 19:05:44.971030] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-2: error returned while attempting to connect to host:(null), port:0
[2019-09-16 19:05:44.971165] E [MSGID: 114058] [client-handshake.c:1442:client_query_portmap_cbk] 0-vol-data-client-2: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2019-09-16 19:05:47.971375] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-2: error returned while attempting to connect to host:(null), port:0

[2019-09-16 19:05:44.971200] I [MSGID: 114018] [client.c:2254:client_rpc_notify] 0-vol-data-client-2: disconnected from vol-data-client-2. Client process will keep trying to connect to glusterd until brick's port is available
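
Client logs like the ones in this mail are easier to scan once every entry starts on its own line and only the warnings and errors are kept. The following is a minimal Python sketch of that idea, not a GlusterFS tool. It assumes that every entry begins with a bracketed timestamp followed by a one-letter severity (I, W, E), as in the excerpts here. The names `split_entries` and `only_problems` are made up for this illustration. Summary text such as 'The message "..." repeated N times' has no timestamp of its own and therefore stays attached to the entry before it.

```python
import re

# Assumption: an entry starts with "[YYYY-MM-DD HH:MM:SS.ffffff] " followed by
# a single uppercase severity letter and a space, e.g. "... ] E [MSGID: ...".
ENTRY_START = re.compile(
    r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\] ([A-Z]) "
)


def split_entries(text):
    """Split run-together log text into (timestamp, severity, message) tuples."""
    matches = list(ENTRY_START.finditer(text))
    entries = []
    for i, m in enumerate(matches):
        # The message runs up to the start of the next entry (or end of text).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        # Collapse line breaks that a mail client inserted inside an entry.
        message = " ".join(text[m.end():end].split())
        entries.append((m.group(1), m.group(2), message))
    return entries


def only_problems(entries, severities=("E", "W")):
    """Keep only entries whose severity is in `severities`."""
    return [e for e in entries if e[1] in severities]
```

To use it, read a mount log such as export-data.log into a string, pass it to `split_entries`, and print the result of `only_problems` one tuple per line.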



[2019-09-17 07:43:44.807182] E [MSGID: 114058] [client-handshake.c:1442:client_query_portmap_cbk] 0-vol-data-client-0: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2019-09-17 07:43:44.807217] I [MSGID: 114018] [client.c:2254:client_rpc_notify] 0-vol-data-client-0: disconnected from vol-data-client-0. Client process will keep trying to connect to glusterd until brick's port is available
[2019-09-17 07:43:44.807228] E [MSGID: 108006] [afr-common.c:5413:__afr_handle_child_down_event] 0-vol-data-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up.
Final graph:
+------------------------------------------------------------------------------+
  1: volume vol-data-client-0
  2:     type protocol/client
  3:     option ping-timeout 42
  4:     option remote-host diufnas20
  5:     option remote-subvolume /bigdisk/brick1/vol-data
  6:     option transport-type socket
  7:     option transport.address-family inet
  8:     option username a14ffa1b-b64e-410c-894d-435c18e81b2d
  9:     option password 37ba4281-166d-40fd-9ef0-08a187d1107b
 10:     option transport.tcp-user-timeout 0
 11:     option transport.socket.keepalive-time 20
 12:     option transport.socket.keepalive-interval 2
 13:     option transport.socket.keepalive-count 9
 14:     option send-gids true
 15: end-volume
 16:
 17: volume vol-data-client-1
 18:     type protocol/client
 19:     option ping-timeout 42
 20:     option remote-host diufnas21
 21:     option remote-subvolume /bigdisk/brick2/vol-data
 22:     option transport-type socket
 23:     option transport.address-family inet
 24:     option username a14ffa1b-b64e-410c-894d-435c18e81b2d
 25:     option password 37ba4281-166d-40fd-9ef0-08a187d1107b
 26:     option transport.tcp-user-timeout 0
 27:     option transport.socket.keepalive-time 20
 29:     option transport.socket.keepalive-count 9
 30:     option send-gids true
 31: end-volume
 32:
 33: volume vol-data-client-2
 34:     type protocol/client
 35:     option ping-timeout 42
 36:     option remote-host diufnas22
 37:     option remote-subvolume /bigdisk/brick3/vol-data
 38:     option transport-type socket
 39:     option transport.address-family inet
 40:     option username a14ffa1b-b64e-410c-894d-435c18e81b2d
 41:     option password 37ba4281-166d-40fd-9ef0-08a187d1107b
 42:     option transport.tcp-user-timeout 0
 43:     option transport.socket.keepalive-time 20
 44:     option transport.socket.keepalive-interval 2
 45:     option transport.socket.keepalive-count 9
 46:     option send-gids true
 47: end-volume
 48:
 49: volume vol-data-replicate-0
 50:     type cluster/replicate
 51:     option afr-pending-xattr vol-data-client-0,vol-data-client-1,vol-data-client-2
 52:     option arbiter-count 1
 53:     option use-compound-fops off
 54:     subvolumes vol-data-client-0 vol-data-client-1 vol-data-client-2
 55: end-volume
 56:
 57: volume vol-data-dht
 58:     type cluster/distribute
 59:     option min-free-disk 10%
 60:     option lock-migration off
 61:     option force-migration off
 62:     subvolumes vol-data-replicate-0
 63: end-volume
 64:
 65: volume vol-data-write-behind
 66:     type performance/write-behind
 67:     subvolumes vol-data-dht
 68: end-volume
 69:
 70: volume vol-data-read-ahead
 71:     type performance/read-ahead
 72:     subvolumes vol-data-write-behind
 73: end-volume
 74:
 75: volume vol-data-readdir-ahead
 76:     type performance/readdir-ahead
 77:     option parallel-readdir off
 78:     option rda-request-size 131072
 79:     option rda-cache-limit 10MB
 80:     subvolumes vol-data-read-ahead
 81: end-volume
 82:
 83: volume vol-data-io-cache
 84:     type performance/io-cache
 85:     option max-file-size 256MB
 86:     option cache-size 28GB
 87:     subvolumes vol-data-readdir-ahead
 88: end-volume
 89:
 90: volume vol-data-quick-read
 91:     type performance/quick-read
 92:     option cache-size 28GB
 93:     subvolumes vol-data-io-cache
 94: end-volume
 95:
 96: volume vol-data-open-behind
 97:     type performance/open-behind
 98:     subvolumes vol-data-quick-read
 99: end-volume
100:
101: volume vol-data-md-cache
102:     type performance/md-cache
103:     subvolumes vol-data-open-behind
104: end-volume
105:
106: volume vol-data-io-threads
107:     type performance/io-threads
108:     subvolumes vol-data-md-cache
109: end-volume
110:
111: volume vol-data
112:     type debug/io-stats
113:     option log-level INFO
114:     option latency-measurement off
115:     option count-fop-hits off
116:     subvolumes vol-data-io-threads
117: end-volume
118:
119: volume meta-autoload
120:     type meta
121:     subvolumes vol-data
122: end-volume
123:
+------------------------------------------------------------------------------+
[2019-09-17 07:43:47.249546] E [socket.c:2524:socket_connect_finish] 0-vol-data-client-2: connection to 134.21.57.122:24007 failed (No route to host); disconnecting socket
[2019-09-17 07:43:48.801700] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0



root@nas20:/var/log/glusterfs# dmesg |grep error
[    2.463658] i8042: probe of i8042 failed with error -5
[    8.180404] EXT4-fs (sdb1): re-mounted. Opts: errors=remount-ro
[   10.024111] EXT4-fs (sda): mounted filesystem with ordered data mode. Opts: errors=remount-ro
[   64.432042] ureadahead[1478]: segfault at 7f4b99d3d2c0 ip 00005629096fe2d1 sp 00007fff9dc98250 error 6 in ureadahead[5629096fa000+8000]


root@nas20:/var/log/glusterfs# cat export-users.log | grep "2019-10-08 20"
[2019-10-08 20:10:33.695082] I [MSGID: 100030] [glusterfsd.c:2741:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 4.1.8 (args: /usr/sbin/glusterfs --process-name fuse --volfile-server=localhost --volfile-id=/vol-users /export/users) [2019-10-08 20:10:33.712430] I [MSGID: 101190] [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2019-10-08 20:10:33.816594] I [MSGID: 101190] [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2 [2019-10-08 20:10:33.820975] I [MSGID: 114020] [client.c:2328:notify] 0-vol-users-client-0: parent translators are ready, attempting connect on transport [2019-10-08 20:10:33.821257] I [MSGID: 114020] [client.c:2328:notify] 0-vol-users-client-1: parent translators are ready, attempting connect on transport [2019-10-08 20:10:33.821466] I [MSGID: 114020] [client.c:2328:notify] 0-vol-users-client-2: parent translators are ready, attempting connect on transport [2019-10-08 20:10:33.822271] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-users-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:33.822425] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-users-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:33.822484] E [MSGID: 114058] [client-handshake.c:1442:client_query_portmap_cbk] 0-vol-users-client-0: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running. [2019-10-08 20:10:33.822518] I [MSGID: 114018] [client.c:2254:client_rpc_notify] 0-vol-users-client-0: disconnected from vol-users-client-0. Client process will keep trying to connect to glusterd until brick's port is available [2019-10-08 20:10:33.822528] E [MSGID: 108006] [afr-common.c:5413:__afr_handle_child_down_event] 0-vol-users-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up. 
[2019-10-08 20:10:36.387074] E [socket.c:2524:socket_connect_finish] 0-vol-users-client-2: connection to 134.21.57.122:24007 failed (No route to host); disconnecting socket [2019-10-08 20:10:36.387120] E [socket.c:2524:socket_connect_finish] 0-vol-users-client-1: connection to 192.168.1.121:24007 failed (No route to host); disconnecting socket [2019-10-08 20:10:36.388236] I [fuse-bridge.c:4294:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel 7.23 [2019-10-08 20:10:36.388254] I [fuse-bridge.c:4927:fuse_graph_sync] 0-fuse: switched to graph 0 The message "E [MSGID: 108006] [afr-common.c:5413:__afr_handle_child_down_event] 0-vol-users-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up." repeated 2 times between [2019-10-08 20:10:33.822528] and [2019-10-08 20:10:36.387272] [2019-10-08 20:10:36.388596] I [MSGID: 108006] [afr-common.c:5677:afr_local_init] 0-vol-users-replicate-0: no subvolumes up [2019-10-08 20:10:36.388667] E [MSGID: 101046] [dht-common.c:1502:dht_lookup_dir_cbk] 0-vol-users-dht: dict is null [2019-10-08 20:10:36.388724] E [fuse-bridge.c:4362:fuse_first_lookup] 0-fuse: first lookup on root failed (Transport endpoint is not connected) [2019-10-08 20:10:36.388847] I [MSGID: 108006] [afr-common.c:5677:afr_local_init] 0-vol-users-replicate-0: no subvolumes up [2019-10-08 20:10:36.388864] E [MSGID: 101046] [dht-common.c:1502:dht_lookup_dir_cbk] 0-vol-users-dht: dict is null [2019-10-08 20:10:36.388883] W [fuse-resolve.c:132:fuse_resolve_gfid_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001: failed to resolve (Transport endpoint is not connected) [2019-10-08 20:10:36.388893] E [fuse-bridge.c:928:fuse_getattr_resume] 0-glusterfs-fuse: 2: GETATTR 1 (00000000-0000-0000-0000-000000000001) resolution failed [2019-10-08 20:10:36.391191] I [MSGID: 108006] [afr-common.c:5677:afr_local_init] 0-vol-users-replicate-0: no subvolumes up [2019-10-08 20:10:36.391218] E [MSGID: 101046] 
[dht-common.c:1502:dht_lookup_dir_cbk] 0-vol-users-dht: dict is null [2019-10-08 20:10:36.391241] W [fuse-resolve.c:132:fuse_resolve_gfid_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001: failed to resolve (Transport endpoint is not connected) [2019-10-08 20:10:36.391250] E [fuse-bridge.c:928:fuse_getattr_resume] 0-glusterfs-fuse: 3: GETATTR 1 (00000000-0000-0000-0000-000000000001) resolution failed [2019-10-08 20:10:36.391317] I [MSGID: 108006] [afr-common.c:5677:afr_local_init] 0-vol-users-replicate-0: no subvolumes up [2019-10-08 20:10:36.391333] E [MSGID: 101046] [dht-common.c:1502:dht_lookup_dir_cbk] 0-vol-users-dht: dict is null [2019-10-08 20:10:36.391352] W [fuse-resolve.c:132:fuse_resolve_gfid_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001: failed to resolve (Transport endpoint is not connected) [2019-10-08 20:10:36.391360] E [fuse-bridge.c:928:fuse_getattr_resume] 0-glusterfs-fuse: 4: GETATTR 1 (00000000-0000-0000-0000-000000000001) resolution failed [2019-10-08 20:10:36.406967] I [fuse-bridge.c:5199:fuse_thread_proc] 0-fuse: initating unmount of /export/users [2019-10-08 20:10:36.407298] W [glusterfsd.c:1514:cleanup_and_exit] (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba) [0x7f88cc59b6ba] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xed) [0x55c01427f70d] -->/usr/sbin/glusterfs(cleanup_and_exit+0x54) [0x55c01427f524] ) 0-: received signum (15), shutting down [2019-10-08 20:10:36.407318] I [fuse-bridge.c:5981:fini] 0-fuse: Unmounting '/export/users'. [2019-10-08 20:10:36.407326] I [fuse-bridge.c:5986:fini] 0-fuse: Closing fuse connection to '/export/users'. 
[2019-10-08 20:10:43.925719] I [MSGID: 100030] [glusterfsd.c:2741:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 4.1.8 (args: /usr/sbin/glusterfs --process-name fuse --volfile-server=localhost --volfile-id=/vol-users /export/users) [2019-10-08 20:10:43.929529] I [MSGID: 101190] [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2019-10-08 20:10:43.933210] I [MSGID: 101190] [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2 [2019-10-08 20:10:43.933789] I [MSGID: 114020] [client.c:2328:notify] 0-vol-users-client-0: parent translators are ready, attempting connect on transport [2019-10-08 20:10:43.934151] I [MSGID: 114020] [client.c:2328:notify] 0-vol-users-client-1: parent translators are ready, attempting connect on transport [2019-10-08 20:10:43.934174] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-users-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:43.934269] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-users-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:43.934331] E [MSGID: 114058] [client-handshake.c:1442:client_query_portmap_cbk] 0-vol-users-client-0: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running. [2019-10-08 20:10:43.934369] I [MSGID: 114018] [client.c:2254:client_rpc_notify] 0-vol-users-client-0: disconnected from vol-users-client-0. Client process will keep trying to connect to glusterd until brick's port is available [2019-10-08 20:10:43.934379] E [MSGID: 108006] [afr-common.c:5413:__afr_handle_child_down_event] 0-vol-users-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up. 
[2019-10-08 20:10:43.934434] I [MSGID: 114020] [client.c:2328:notify] 0-vol-users-client-2: parent translators are ready, attempting connect on transport [2019-10-08 20:10:43.934574] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-users-client-1: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:43.934782] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-users-client-1: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:43.934859] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-users-client-2: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:43.934931] I [rpc-clnt.c:2105:rpc_clnt_reconfig] 0-vol-users-client-1: changing port to 49154 (from 0) [2019-10-08 20:10:43.935152] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-users-client-2: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:43.935286] I [rpc-clnt.c:2105:rpc_clnt_reconfig] 0-vol-users-client-2: changing port to 49154 (from 0) [2019-10-08 20:10:43.935314] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-users-client-1: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:43.935515] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-users-client-1: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:43.935711] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-users-client-2: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:43.935919] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-users-client-2: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:43.936354] I [MSGID: 114046] [client-handshake.c:1095:client_setvolume_cbk] 0-vol-users-client-1: Connected to vol-users-client-1, attached to remote volume '/bigdisk/brick2/vol-users'. 
[2019-10-08 20:10:43.936375] I [MSGID: 108005] [afr-common.c:5336:__afr_handle_child_up_event] 0-vol-users-replicate-0: Subvolume 'vol-users-client-1' came back up; going online. [2019-10-08 20:10:43.936728] I [MSGID: 114046] [client-handshake.c:1095:client_setvolume_cbk] 0-vol-users-client-2: Connected to vol-users-client-2, attached to remote volume '/bigdisk/brick3/vol-users'. [2019-10-08 20:10:43.936742] I [MSGID: 108002] [afr-common.c:5611:afr_notify] 0-vol-users-replicate-0: Client-quorum is met [2019-10-08 20:10:43.937579] I [fuse-bridge.c:4294:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel 7.23 [2019-10-08 20:10:43.937595] I [fuse-bridge.c:4927:fuse_graph_sync] 0-fuse: switched to graph 0 [2019-10-08 20:10:43.939789] I [MSGID: 109005] [dht-selfheal.c:2342:dht_selfheal_directory] 0-vol-users-dht: Directory selfheal failed: Unable to form layout for directory / [2019-10-08 20:10:47.927439] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-users-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:47.927555] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-users-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:47.927627] I [rpc-clnt.c:2105:rpc_clnt_reconfig] 0-vol-users-client-0: changing port to 49152 (from 0) [2019-10-08 20:10:47.928087] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-users-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:47.928201] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-users-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:47.928717] I [MSGID: 114046] [client-handshake.c:1095:client_setvolume_cbk] 0-vol-users-client-0: Connected to vol-users-client-0, attached to remote volume '/bigdisk/brick1/vol-users'.
root@nas20:/var/log/glusterfs# cat export-users.log | grep "2019-10-08 22"
root@nas20:/var/log/glusterfs# cat export-users.log | grep "2019-10-08 21"
root@nas20:/var/log/glusterfs# cat export-users.log | grep "2019-10-08 23"
root@nas20:/var/log/glusterfs# cat export-data.log.log | grep "2019-10-08 23"
cat: export-data.log.log: No such file or directory
root@nas20:/var/log/glusterfs# cat export-data.log | grep "2019-10-08 15"
root@nas20:/var/log/glusterfs# cat export-data.log | grep "2019-10-08 16"
root@nas20:/var/log/glusterfs# cat export-data.log | grep "2019-10-08 17"
root@nas20:/var/log/glusterfs# cat export-data.log | grep "2019-10-08 19"
root@nas20:/var/log/glusterfs# cat export-data.log | grep "2019-10-08 1"
root@nas20:/var/log/glusterfs# cat export-data.log | grep "2019-10-08 20"
[2019-10-08 20:10:33.695000] I [MSGID: 100030] [glusterfsd.c:2741:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 4.1.8 (args: /usr/sbin/glusterfs --process-name fuse --volfile-server=localhost --volfile-id=/vol-data /export/data) [2019-10-08 20:10:33.737302] I [MSGID: 101190] [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2019-10-08 20:10:33.816578] I [MSGID: 101190] [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2 [2019-10-08 20:10:33.820946] I [MSGID: 114020] [client.c:2328:notify] 0-vol-data-client-0: parent translators are ready, attempting connect on transport [2019-10-08 20:10:33.821255] I [MSGID: 114020] [client.c:2328:notify] 0-vol-data-client-1: parent translators are ready, attempting connect on transport [2019-10-08 20:10:33.821467] I [MSGID: 114020] [client.c:2328:notify] 0-vol-data-client-2: parent translators are ready, attempting connect on transport [2019-10-08 20:10:33.822144] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:33.822243] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:33.822374] E [MSGID: 114058] [client-handshake.c:1442:client_query_portmap_cbk] 0-vol-data-client-0: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running. [2019-10-08 20:10:33.822412] I [MSGID: 114018] [client.c:2254:client_rpc_notify] 0-vol-data-client-0: disconnected from vol-data-client-0. Client process will keep trying to connect to glusterd until brick's port is available [2019-10-08 20:10:33.822423] E [MSGID: 108006] [afr-common.c:5413:__afr_handle_child_down_event] 0-vol-data-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up. 
[2019-10-08 20:10:36.387062] E [socket.c:2524:socket_connect_finish] 0-vol-data-client-2: connection to 134.21.57.122:24007 failed (No route to host); disconnecting socket [2019-10-08 20:10:36.387091] E [socket.c:2524:socket_connect_finish] 0-vol-data-client-1: connection to 192.168.1.121:24007 failed (No route to host); disconnecting socket [2019-10-08 20:10:36.388218] I [fuse-bridge.c:4294:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel 7.23 [2019-10-08 20:10:36.388237] I [fuse-bridge.c:4927:fuse_graph_sync] 0-fuse: switched to graph 0 The message "E [MSGID: 108006] [afr-common.c:5413:__afr_handle_child_down_event] 0-vol-data-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up." repeated 2 times between [2019-10-08 20:10:33.822423] and [2019-10-08 20:10:36.387268] [2019-10-08 20:10:36.388590] I [MSGID: 108006] [afr-common.c:5677:afr_local_init] 0-vol-data-replicate-0: no subvolumes up [2019-10-08 20:10:36.388630] E [MSGID: 101046] [dht-common.c:1502:dht_lookup_dir_cbk] 0-vol-data-dht: dict is null [2019-10-08 20:10:36.388723] E [fuse-bridge.c:4362:fuse_first_lookup] 0-fuse: first lookup on root failed (Transport endpoint is not connected) [2019-10-08 20:10:36.388855] I [MSGID: 108006] [afr-common.c:5677:afr_local_init] 0-vol-data-replicate-0: no subvolumes up [2019-10-08 20:10:36.388871] E [MSGID: 101046] [dht-common.c:1502:dht_lookup_dir_cbk] 0-vol-data-dht: dict is null [2019-10-08 20:10:36.388892] W [fuse-resolve.c:132:fuse_resolve_gfid_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001: failed to resolve (Transport endpoint is not connected) [2019-10-08 20:10:36.388902] E [fuse-bridge.c:928:fuse_getattr_resume] 0-glusterfs-fuse: 2: GETATTR 1 (00000000-0000-0000-0000-000000000001) resolution failed [2019-10-08 20:10:36.390447] I [MSGID: 108006] [afr-common.c:5677:afr_local_init] 0-vol-data-replicate-0: no subvolumes up [2019-10-08 20:10:36.390480] E [MSGID: 101046] 
[dht-common.c:1502:dht_lookup_dir_cbk] 0-vol-data-dht: dict is null [2019-10-08 20:10:36.390503] W [fuse-resolve.c:132:fuse_resolve_gfid_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001: failed to resolve (Transport endpoint is not connected) [2019-10-08 20:10:36.390513] E [fuse-bridge.c:928:fuse_getattr_resume] 0-glusterfs-fuse: 3: GETATTR 1 (00000000-0000-0000-0000-000000000001) resolution failed [2019-10-08 20:10:36.390580] I [MSGID: 108006] [afr-common.c:5677:afr_local_init] 0-vol-data-replicate-0: no subvolumes up [2019-10-08 20:10:36.390595] E [MSGID: 101046] [dht-common.c:1502:dht_lookup_dir_cbk] 0-vol-data-dht: dict is null [2019-10-08 20:10:36.390614] W [fuse-resolve.c:132:fuse_resolve_gfid_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001: failed to resolve (Transport endpoint is not connected) [2019-10-08 20:10:36.390622] E [fuse-bridge.c:928:fuse_getattr_resume] 0-glusterfs-fuse: 4: GETATTR 1 (00000000-0000-0000-0000-000000000001) resolution failed [2019-10-08 20:10:36.410905] I [fuse-bridge.c:5199:fuse_thread_proc] 0-fuse: initating unmount of /export/data [2019-10-08 20:10:36.411091] W [glusterfsd.c:1514:cleanup_and_exit] (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba) [0x7ff189f586ba] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xed) [0x55946f24b70d] -->/usr/sbin/glusterfs(cleanup_and_exit+0x54) [0x55946f24b524] ) 0-: received signum (15), shutting down [2019-10-08 20:10:36.411113] I [fuse-bridge.c:5981:fini] 0-fuse: Unmounting '/export/data'. [2019-10-08 20:10:36.411122] I [fuse-bridge.c:5986:fini] 0-fuse: Closing fuse connection to '/export/data'. 
[2019-10-08 20:10:36.845106] I [MSGID: 100030] [glusterfsd.c:2741:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 4.1.8 (args: /usr/sbin/glusterfs --process-name fuse --volfile-server=localhost --volfile-id=/vol-data /export/data) [2019-10-08 20:10:36.848865] I [MSGID: 101190] [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2019-10-08 20:10:36.852064] I [MSGID: 101190] [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2 [2019-10-08 20:10:36.852477] I [MSGID: 114020] [client.c:2328:notify] 0-vol-data-client-0: parent translators are ready, attempting connect on transport [2019-10-08 20:10:36.852694] I [MSGID: 114020] [client.c:2328:notify] 0-vol-data-client-1: parent translators are ready, attempting connect on transport [2019-10-08 20:10:36.852773] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:36.852877] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:36.852917] I [MSGID: 114020] [client.c:2328:notify] 0-vol-data-client-2: parent translators are ready, attempting connect on transport [2019-10-08 20:10:36.852947] E [MSGID: 114058] [client-handshake.c:1442:client_query_portmap_cbk] 0-vol-data-client-0: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running. [2019-10-08 20:10:36.852980] I [MSGID: 114018] [client.c:2254:client_rpc_notify] 0-vol-data-client-0: disconnected from vol-data-client-0. Client process will keep trying to connect to glusterd until brick's port is available [2019-10-08 20:10:36.852990] E [MSGID: 108006] [afr-common.c:5413:__afr_handle_child_down_event] 0-vol-data-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up. 
[2019-10-08 20:10:37.387355] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-1: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:37.387579] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-1: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:37.387706] I [rpc-clnt.c:2105:rpc_clnt_reconfig] 0-vol-data-client-1: changing port to 49156 (from 0) [2019-10-08 20:10:37.388065] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-1: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:37.388253] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-1: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:37.389087] I [MSGID: 114046] [client-handshake.c:1095:client_setvolume_cbk] 0-vol-data-client-1: Connected to vol-data-client-1, attached to remote volume '/bigdisk/brick2/vol-data'. [2019-10-08 20:10:37.389102] I [MSGID: 108005] [afr-common.c:5336:__afr_handle_child_up_event] 0-vol-data-replicate-0: Subvolume 'vol-data-client-1' came back up; going online. 
[2019-10-08 20:10:39.387062] E [socket.c:2524:socket_connect_finish] 0-vol-data-client-2: connection to 134.21.57.122:24007 failed (No route to host); disconnecting socket [2019-10-08 20:10:39.389703] I [fuse-bridge.c:4294:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel 7.23 [2019-10-08 20:10:39.389740] I [fuse-bridge.c:4927:fuse_graph_sync] 0-fuse: switched to graph 0 [2019-10-08 20:10:39.411859] I [glusterfsd-mgmt.c:53:mgmt_cbk_spec] 0-mgmt: Volume file changed [2019-10-08 20:10:40.832633] E [MSGID: 101046] [dht-common.c:1502:dht_lookup_dir_cbk] 0-vol-data-dht: dict is null [2019-10-08 20:10:40.832712] E [fuse-bridge.c:4362:fuse_first_lookup] 0-fuse: first lookup on root failed (Transport endpoint is not connected) [2019-10-08 20:10:40.834248] W [fuse-resolve.c:132:fuse_resolve_gfid_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001: failed to resolve (Transport endpoint is not connected) [2019-10-08 20:10:40.834281] E [fuse-bridge.c:928:fuse_getattr_resume] 0-glusterfs-fuse: 2: GETATTR 1 (00000000-0000-0000-0000-000000000001) resolution failed [2019-10-08 20:10:40.837624] W [fuse-resolve.c:132:fuse_resolve_gfid_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001: failed to resolve (Transport endpoint is not connected) [2019-10-08 20:10:40.837659] E [fuse-bridge.c:928:fuse_getattr_resume] 0-glusterfs-fuse: 3: GETATTR 1 (00000000-0000-0000-0000-000000000001) resolution failed [2019-10-08 20:10:40.839468] W [fuse-resolve.c:132:fuse_resolve_gfid_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001: failed to resolve (Transport endpoint is not connected) [2019-10-08 20:10:40.839503] E [fuse-bridge.c:928:fuse_getattr_resume] 0-glusterfs-fuse: 4: GETATTR 1 (00000000-0000-0000-0000-000000000001) resolution failed [2019-10-08 20:10:40.847013] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-2: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:40.847219] W [rpc-clnt.c:1753:rpc_clnt_submit] 
0-vol-data-client-2: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:40.847368] I [rpc-clnt.c:2105:rpc_clnt_reconfig] 0-vol-data-client-2: changing port to 49158 (from 0) [2019-10-08 20:10:40.847725] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-2: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:40.847906] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-2: error returned while attempting to connect to host:(null), port:0 The message "E [MSGID: 101046] [dht-common.c:1502:dht_lookup_dir_cbk] 0-vol-data-dht: dict is null" repeated 3 times between [2019-10-08 20:10:40.832633] and [2019-10-08 20:10:40.839454] [2019-10-08 20:10:40.848759] I [MSGID: 114046] [client-handshake.c:1095:client_setvolume_cbk] 0-vol-data-client-2: Connected to vol-data-client-2, attached to remote volume '/bigdisk/brick3/vol-data'. [2019-10-08 20:10:40.848785] I [MSGID: 108002] [afr-common.c:5611:afr_notify] 0-vol-data-replicate-0: Client-quorum is met [2019-10-08 20:10:40.874884] I [fuse-bridge.c:5199:fuse_thread_proc] 0-fuse: initating unmount of /export/data [2019-10-08 20:10:40.875054] W [glusterfsd.c:1514:cleanup_and_exit] (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba) [0x7fdc50b646ba] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xed) [0x563108ee670d] -->/usr/sbin/glusterfs(cleanup_and_exit+0x54) [0x563108ee6524] ) 0-: received signum (15), shutting down [2019-10-08 20:10:40.875079] I [fuse-bridge.c:5981:fini] 0-fuse: Unmounting '/export/data'. [2019-10-08 20:10:40.875087] I [fuse-bridge.c:5986:fini] 0-fuse: Closing fuse connection to '/export/data'. 
[2019-10-08 20:10:47.464875] I [MSGID: 100030] [glusterfsd.c:2741:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 4.1.8 (args: /usr/sbin/glusterfs --process-name fuse --volfile-server=localhost --volfile-id=/vol-data /export/data) [2019-10-08 20:10:47.468743] I [MSGID: 101190] [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2019-10-08 20:10:47.472050] I [MSGID: 101190] [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2 [2019-10-08 20:10:47.472465] I [MSGID: 114020] [client.c:2328:notify] 0-vol-data-client-0: parent translators are ready, attempting connect on transport [2019-10-08 20:10:47.472803] I [MSGID: 114020] [client.c:2328:notify] 0-vol-data-client-1: parent translators are ready, attempting connect on transport [2019-10-08 20:10:47.472865] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:47.472968] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:47.473036] I [rpc-clnt.c:2105:rpc_clnt_reconfig] 0-vol-data-client-0: changing port to 49156 (from 0) [2019-10-08 20:10:47.473121] I [MSGID: 114020] [client.c:2328:notify] 0-vol-data-client-2: parent translators are ready, attempting connect on transport [2019-10-08 20:10:47.473466] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-1: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:47.473511] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:47.473681] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-2: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:47.473850] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-1: error returned while attempting to connect to 
host:(null), port:0 [2019-10-08 20:10:47.473928] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:47.474019] I [rpc-clnt.c:2105:rpc_clnt_reconfig] 0-vol-data-client-1: changing port to 49156 (from 0) [2019-10-08 20:10:47.474072] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-2: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:47.474309] I [rpc-clnt.c:2105:rpc_clnt_reconfig] 0-vol-data-client-2: changing port to 49158 (from 0) [2019-10-08 20:10:47.474621] I [MSGID: 114046] [client-handshake.c:1095:client_setvolume_cbk] 0-vol-data-client-0: Connected to vol-data-client-0, attached to remote volume '/bigdisk/brick1/vol-data'. [2019-10-08 20:10:47.474638] I [MSGID: 108005] [afr-common.c:5336:__afr_handle_child_up_event] 0-vol-data-replicate-0: Subvolume 'vol-data-client-0' came back up; going online. [2019-10-08 20:10:47.474750] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-1: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:47.474927] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-2: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:47.474958] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-1: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:47.475216] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-2: error returned while attempting to connect to host:(null), port:0 [2019-10-08 20:10:47.476030] I [MSGID: 114046] [client-handshake.c:1095:client_setvolume_cbk] 0-vol-data-client-1: Connected to vol-data-client-1, attached to remote volume '/bigdisk/brick2/vol-data'. 
[2019-10-08 20:10:47.476052] I [MSGID: 108002] [afr-common.c:5611:afr_notify] 0-vol-data-replicate-0: Client-quorum is met [2019-10-08 20:10:47.476152] I [MSGID: 114046] [client-handshake.c:1095:client_setvolume_cbk] 0-vol-data-client-2: Connected to vol-data-client-2, attached to remote volume '/bigdisk/brick3/vol-data'. [2019-10-08 20:10:47.477159] I [fuse-bridge.c:4294:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel 7.23 [2019-10-08 20:10:47.477210] I [fuse-bridge.c:4927:fuse_graph_sync] 0-fuse: switched to graph 0 [2019-10-08 20:10:47.478960] I [MSGID: 108031] [afr-common.c:2597:afr_local_discovery_cbk] 0-vol-data-replicate-0: selecting local read_child vol-data-client-0 [2019-10-08 20:10:47.479971] I [MSGID: 108031] [afr-common.c:2597:afr_local_discovery_cbk] 0-vol-data-replicate-0: selecting local read_child vol-data-client-0 [2019-10-08 20:10:47.480094] I [MSGID: 109005] [dht-selfheal.c:2342:dht_selfheal_directory] 0-vol-data-dht: Directory selfheal failed: Unable to form layout for directory /
root@nas20:/var/log/glusterfs# cat export-data.log | grep "2019-10-09 1"
root@nas20:/var/log/glusterfs# cat export-data.log | grep "2019-10-09 7"
root@nas20:/var/log/glusterfs# cat export-data.log | grep "2019-10-09 0"
[2019-10-09 04:25:02.165330] I [MSGID: 100011] [glusterfsd.c:1599:reincarnate] 0-glusterfsd: Fetching the volume file from server... [2019-10-09 04:25:02.191948] I [glusterfsd-mgmt.c:1953:mgmt_getspec_cbk] 0-glusterfs: No change in volfile,continuing [2019-10-09 07:12:03.955619] I [MSGID: 100030] [glusterfsd.c:2741:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 4.1.8 (args: /usr/sbin/glusterfs --process-name fuse --volfile-server=localhost --volfile-id=/vol-data /export/data) [2019-10-09 07:12:03.981652] I [MSGID: 101190] [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2019-10-09 07:12:04.002485] I [MSGID: 101190] [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2 [2019-10-09 07:12:04.003899] I [MSGID: 114020] [client.c:2328:notify] 0-vol-data-client-0: parent translators are ready, attempting connect on transport [2019-10-09 07:12:04.004147] I [MSGID: 114020] [client.c:2328:notify] 0-vol-data-client-1: parent translators are ready, attempting connect on transport [2019-10-09 07:12:04.004366] I [MSGID: 114020] [client.c:2328:notify] 0-vol-data-client-2: parent translators are ready, attempting connect on transport [2019-10-09 07:12:04.004628] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:04.004923] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:04.005244] E [MSGID: 114058] [client-handshake.c:1442:client_query_portmap_cbk] 0-vol-data-client-0: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running. [2019-10-09 07:12:04.005286] I [MSGID: 114018] [client.c:2254:client_rpc_notify] 0-vol-data-client-0: disconnected from vol-data-client-0. 
Client process will keep trying to connect to glusterd until brick's port is available [2019-10-09 07:12:04.005297] E [MSGID: 108006] [afr-common.c:5413:__afr_handle_child_down_event] 0-vol-data-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up. [2019-10-09 07:12:06.690631] E [socket.c:2524:socket_connect_finish] 0-vol-data-client-2: connection to 134.21.57.122:24007 failed (No route to host); disconnecting socket [2019-10-09 07:12:06.690792] E [socket.c:2524:socket_connect_finish] 0-vol-data-client-1: connection to 192.168.1.121:24007 failed (No route to host); disconnecting socket [2019-10-09 07:12:06.691746] I [fuse-bridge.c:4294:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel 7.23 [2019-10-09 07:12:06.691771] I [fuse-bridge.c:4927:fuse_graph_sync] 0-fuse: switched to graph 0 The message "E [MSGID: 108006] [afr-common.c:5413:__afr_handle_child_down_event] 0-vol-data-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up." 
repeated 2 times between [2019-10-09 07:12:04.005297] and [2019-10-09 07:12:06.690811] [2019-10-09 07:12:06.692647] I [MSGID: 108006] [afr-common.c:5677:afr_local_init] 0-vol-data-replicate-0: no subvolumes up [2019-10-09 07:12:06.692695] E [MSGID: 101046] [dht-common.c:1502:dht_lookup_dir_cbk] 0-vol-data-dht: dict is null [2019-10-09 07:12:06.692807] E [fuse-bridge.c:4362:fuse_first_lookup] 0-fuse: first lookup on root failed (Transport endpoint is not connected) [2019-10-09 07:12:06.692955] I [MSGID: 108006] [afr-common.c:5677:afr_local_init] 0-vol-data-replicate-0: no subvolumes up [2019-10-09 07:12:06.692980] E [MSGID: 101046] [dht-common.c:1502:dht_lookup_dir_cbk] 0-vol-data-dht: dict is null [2019-10-09 07:12:06.693003] W [fuse-resolve.c:132:fuse_resolve_gfid_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001: failed to resolve (Transport endpoint is not connected) [2019-10-09 07:12:06.693013] E [fuse-bridge.c:928:fuse_getattr_resume] 0-glusterfs-fuse: 2: GETATTR 1 (00000000-0000-0000-0000-000000000001) resolution failed [2019-10-09 07:12:06.695503] I [MSGID: 108006] [afr-common.c:5677:afr_local_init] 0-vol-data-replicate-0: no subvolumes up [2019-10-09 07:12:06.695526] E [MSGID: 101046] [dht-common.c:1502:dht_lookup_dir_cbk] 0-vol-data-dht: dict is null [2019-10-09 07:12:06.695547] W [fuse-resolve.c:132:fuse_resolve_gfid_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001: failed to resolve (Transport endpoint is not connected) [2019-10-09 07:12:06.695556] E [fuse-bridge.c:928:fuse_getattr_resume] 0-glusterfs-fuse: 3: GETATTR 1 (00000000-0000-0000-0000-000000000001) resolution failed [2019-10-09 07:12:06.695619] I [MSGID: 108006] [afr-common.c:5677:afr_local_init] 0-vol-data-replicate-0: no subvolumes up [2019-10-09 07:12:06.695633] E [MSGID: 101046] [dht-common.c:1502:dht_lookup_dir_cbk] 0-vol-data-dht: dict is null [2019-10-09 07:12:06.695650] W [fuse-resolve.c:132:fuse_resolve_gfid_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001: failed to resolve 
(Transport endpoint is not connected) [2019-10-09 07:12:06.695658] E [fuse-bridge.c:928:fuse_getattr_resume] 0-glusterfs-fuse: 4: GETATTR 1 (00000000-0000-0000-0000-000000000001) resolution failed [2019-10-09 07:12:06.714499] I [fuse-bridge.c:5199:fuse_thread_proc] 0-fuse: initating unmount of /export/data [2019-10-09 07:12:06.714753] W [glusterfsd.c:1514:cleanup_and_exit] (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba) [0x7f133ffef6ba] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xed) [0x562b2312c70d] -->/usr/sbin/glusterfs(cleanup_and_exit+0x54) [0x562b2312c524] ) 0-: received signum (15), shutting down [2019-10-09 07:12:06.714773] I [fuse-bridge.c:5981:fini] 0-fuse: Unmounting '/export/data'. [2019-10-09 07:12:06.714779] I [fuse-bridge.c:5986:fini] 0-fuse: Closing fuse connection to '/export/data'. [2019-10-09 07:12:07.109206] I [MSGID: 100030] [glusterfsd.c:2741:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 4.1.8 (args: /usr/sbin/glusterfs --process-name fuse --volfile-server=localhost --volfile-id=/vol-data /export/data) [2019-10-09 07:12:07.112870] I [MSGID: 101190] [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2019-10-09 07:12:07.116011] I [MSGID: 101190] [event-epoll.c:617:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2 [2019-10-09 07:12:07.116421] I [MSGID: 114020] [client.c:2328:notify] 0-vol-data-client-0: parent translators are ready, attempting connect on transport [2019-10-09 07:12:07.116655] I [MSGID: 114020] [client.c:2328:notify] 0-vol-data-client-1: parent translators are ready, attempting connect on transport [2019-10-09 07:12:07.116676] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:07.116767] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:07.116833] E [MSGID: 114058] 
[client-handshake.c:1442:client_query_portmap_cbk] 0-vol-data-client-0: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running. [2019-10-09 07:12:07.116835] I [MSGID: 114020] [client.c:2328:notify] 0-vol-data-client-2: parent translators are ready, attempting connect on transport [2019-10-09 07:12:07.116887] I [MSGID: 114018] [client.c:2254:client_rpc_notify] 0-vol-data-client-0: disconnected from vol-data-client-0. Client process will keep trying to connect to glusterd until brick's port is available [2019-10-09 07:12:07.116898] E [MSGID: 108006] [afr-common.c:5413:__afr_handle_child_down_event] 0-vol-data-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up. [2019-10-09 07:12:07.691005] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-1: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:09.690613] E [socket.c:2524:socket_connect_finish] 0-vol-data-client-2: connection to 134.21.57.122:24007 failed (No route to host); disconnecting socket [2019-10-09 07:12:11.111975] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:11.112083] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:11.112200] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-2: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:11.112397] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-2: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:11.112518] I [rpc-clnt.c:2105:rpc_clnt_reconfig] 0-vol-data-client-2: changing port to 49158 (from 0) [2019-10-09 07:12:11.112820] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-2: error returned while attempting to connect to host:(null), port:0 
[2019-10-09 07:12:11.113013] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-2: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:09.690664] E [MSGID: 108006] [afr-common.c:5413:__afr_handle_child_down_event] 0-vol-data-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up. [2019-10-09 07:12:11.114003] I [MSGID: 114046] [client-handshake.c:1095:client_setvolume_cbk] 0-vol-data-client-2: Connected to vol-data-client-2, attached to remote volume '/bigdisk/brick3/vol-data'. [2019-10-09 07:12:11.114045] I [MSGID: 108005] [afr-common.c:5336:__afr_handle_child_up_event] 0-vol-data-replicate-0: Subvolume 'vol-data-client-2' came back up; going online. [2019-10-09 07:12:11.290914] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-1: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:11.291239] I [rpc-clnt.c:2105:rpc_clnt_reconfig] 0-vol-data-client-1: changing port to 49156 (from 0) [2019-10-09 07:12:11.291676] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-1: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:11.291919] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-1: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:11.293266] I [MSGID: 114046] [client-handshake.c:1095:client_setvolume_cbk] 0-vol-data-client-1: Connected to vol-data-client-1, attached to remote volume '/bigdisk/brick2/vol-data'. 
[2019-10-09 07:12:11.293306] I [MSGID: 108002] [afr-common.c:5611:afr_notify] 0-vol-data-replicate-0: Client-quorum is met [2019-10-09 07:12:11.295955] I [fuse-bridge.c:4294:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel 7.23 [2019-10-09 07:12:11.296014] I [fuse-bridge.c:4927:fuse_graph_sync] 0-fuse: switched to graph 0 [2019-10-09 07:12:11.299181] I [MSGID: 109005] [dht-selfheal.c:2342:dht_selfheal_directory] 0-vol-data-dht: Directory selfheal failed: Unable to form layout for directory / [2019-10-09 07:12:14.112691] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:14.112772] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:17.113224] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:17.113319] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:20.113917] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:20.114031] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:24.393064] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:24.393253] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:26.393776] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:26.393880] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error 
returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:29.394504] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:29.394614] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:32.395375] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:32.395534] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:35.395920] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:35.396027] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:38.396531] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:38.396618] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:41.397419] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:41.397526] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:44.398189] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:44.398312] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:47.399045] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while 
attempting to connect to host:(null), port:0 [2019-10-09 07:12:47.399166] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:50.399735] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:50.399855] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:53.400507] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:53.400616] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:56.401284] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:56.401402] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:59.402080] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:12:59.402200] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:02.402863] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:02.402984] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:05.404125] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:05.404320] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect 
to host:(null), port:0 [2019-10-09 07:13:08.404977] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:08.405172] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:11.405694] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:11.405884] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:14.406443] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:14.406629] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:17.407255] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:17.407445] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:20.408092] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:20.408277] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:23.409546] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:23.409735] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:26.410420] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 
[2019-10-09 07:13:26.410600] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:29.411353] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:29.411528] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:32.412325] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:32.412505] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:35.413311] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:35.413491] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:38.414345] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:38.414540] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:41.415407] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:41.415597] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:44.416490] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:44.416672] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 
07:13:47.417664] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:47.417851] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:50.418814] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:50.419005] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:53.419982] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:53.420166] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:56.421200] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:56.421388] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:59.422450] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:13:59.422630] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:14:02.423757] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:14:02.423952] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:14:05.425051] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:14:05.425243] W 
[rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:14:08.425832] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:14:08.426011] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:14:11.426636] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:14:11.426846] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:14:16.310279] I [glusterfsd-mgmt.c:53:mgmt_cbk_spec] 0-mgmt: Volume file changed [2019-10-09 07:14:19.393266] I [glusterfsd-mgmt.c:53:mgmt_cbk_spec] 0-mgmt: Volume file changed [2019-10-09 07:14:19.465709] I [glusterfsd-mgmt.c:1953:mgmt_getspec_cbk] 0-glusterfs: No change in volfile,continuing [2019-10-09 07:14:19.467466] I [glusterfsd-mgmt.c:1953:mgmt_getspec_cbk] 0-glusterfs: No change in volfile,continuing [2019-10-09 07:14:29.457122] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:14:29.457312] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:14:29.457431] I [rpc-clnt.c:2105:rpc_clnt_reconfig] 0-vol-data-client-0: changing port to 49157 (from 0) [2019-10-09 07:14:29.458078] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:14:29.458264] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-vol-data-client-0: error returned while attempting to connect to host:(null), port:0 [2019-10-09 07:14:29.459212] I [MSGID: 114046] [client-handshake.c:1095:client_setvolume_cbk] 0-vol-data-client-0: 
Connected to vol-data-client-0, attached to remote volume '/bigdisk/brick1/vol-data'.
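
Because the log above was pasted as one long run, individual entries are hard to pick out. The following is only a rough sketch, assuming each entry starts with a bracketed timestamp followed by a one-letter severity, as in the excerpt. The helper names (split_entries, connect_failures) are made up for illustration and are not part of any gluster tool. It splits the run back into entries and counts the socket_connect_finish failures per client and address.

```python
import re
from collections import Counter

# Assumption: every entry begins with "[YYYY-MM-DD HH:MM:SS.ffffff] X ",
# where X is a one-letter severity (E, W, I, ...), as in the pasted log.
ENTRY_START = re.compile(r"\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+\] [A-Z] ")

# Matches the "connection to HOST:PORT failed (REASON)" entries from socket.c.
CONNECT_FAIL = re.compile(
    r"\[(?P<ts>[^\]]+)\] E \[socket\.c:\d+:socket_connect_finish\] "
    r"\d+-(?P<client>\S+): connection to (?P<addr>\S+) failed \((?P<reason>[^)]*)\)"
)


def split_entries(text):
    """Split a fused log dump into one string per entry."""
    # Collapse line wraps first, so a timestamp broken across two lines
    # is rejoined before matching.
    flat = " ".join(text.split())
    starts = [m.start() for m in ENTRY_START.finditer(flat)]
    if not starts:
        return []
    starts.append(len(flat))
    # Text such as 'The message "..." repeated N times' carries no leading
    # timestamp, so it stays attached to the entry before it.
    return [flat[a:b].strip() for a, b in zip(starts, starts[1:])]


def connect_failures(text):
    """Count failed connection attempts per (client, address, reason)."""
    counts = Counter()
    for entry in split_entries(text):
        m = CONNECT_FAIL.match(entry)
        if m:
            counts[(m["client"], m["addr"], m["reason"])] += 1
    return counts
```

Feeding the contents of export-data.log to connect_failures() should list which client translator could not reach which address, and how often.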

Regards,
Birgit

On 13/10/19 08:13, Amar Tumballi wrote:
'Transport endpoint not connected' (i.e., ENOTCONN) is returned when the network connection between the client and the server is not established. I recommend checking the logs for the specific reason. The brick (server-side) logs in particular should have some hints on this.
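
Since ENOTCONN points at the connection between client and server, one quick check is whether a plain TCP connection to the relevant ports succeeds from the client host. This is a minimal sketch using only Python's standard socket module. The function names are illustrative, and port 24007 is taken from the "connection to ...:24007 failed" lines in the client log in this thread.

```python
import socket

# Port 24007 appears in the client log in this thread. Treat it as an
# assumption about this setup and adjust it if yours differs.
GLUSTERD_PORT = 24007


def can_connect(host, port, timeout=3.0):
    """Try a plain TCP connect; return (ok, message)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except OSError as exc:
        # Covers refused connections, timeouts and "No route to host".
        return False, str(exc)


def probe(hosts, ports, timeout=3.0):
    """Check every host/port pair and collect the results in a dict."""
    return {
        (host, port): can_connect(host, port, timeout)
        for host in hosts
        for port in ports
    }
```

For example, probe(["134.21.57.122", "192.168.1.121"], [GLUSTERD_PORT]) would test the two addresses that failed with "No route to host" in the log. The brick ports reported there (49156 to 49158) could be added to the port list in the same way. A successful connect only shows that the port is reachable, not that the brick process is healthy.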

As for the crash, we treat it as a bug. Since no specific backtrace or logs were shared with the email, it is hard to tell whether it has already been fixed in a later version.

Since you are on version 4.1.8 and there have been many releases since then, upgrading is also an option.

Regards,
Amar


On Fri, Oct 11, 2019 at 4:13 PM DUCARROZ Birgit <birgit.ducarroz@xxxxxxxx> wrote:

    Hi list,

    Does anyone know what I can do to avoid "Transport Endpoint not
    connected" (and the server then becoming blocked) when writing a lot
    of small files on a volume?

    I'm running glusterfs 4.1.8 on 6 servers. With 3 servers I never have
    problems, but the other 3 servers are acting as HA storage for people
    who sometimes write thousands of small files. This seems to provoke a
    crash of the gluster daemon.

    I have 3 bricks, of which the 3rd acts as arbiter.


    # Location of the bricks:
    #-------$HOST1-------  -------$HOST3-------
    # brick1            |  | brick3           | brick3 = arbiter
    #                   |  |                  |
    #-------$HOST2-------  --------------------
    # brick2            |
    #--------------------

    Checked:
    The underlying ext4 filesystem and the HDs seem to be without errors.
    The ports in the firewall should not be the problem, since it also
    occurs when the firewall is disabled.

    Any help appreciated!
    Kind regards,
    Birgit
    ________

    Community Meeting Calendar:

    APAC Schedule -
    Every 2nd and 4th Tuesday at 11:30 AM IST
    Bridge: https://bluejeans.com/118564314

    NA/EMEA Schedule -
    Every 1st and 3rd Tuesday at 01:00 PM EDT
    Bridge: https://bluejeans.com/118564314

    Gluster-users mailing list
    Gluster-users@xxxxxxxxxxx
    https://lists.gluster.org/mailman/listinfo/gluster-users
