All four clients were running 3.10.2 as well.
The volumes had been running fine until we upgraded to 3.10, when we hit some issues with port mismatches. We restarted all the volumes, the servers and the clients, and now hit this issue.
We've since backed up the files, removed the volumes, removed the bricks, removed gluster, installed glusterfs 3.7.20, created new volumes on new bricks and restored the files, and we still hit the same issue on the clients that run on the same nodes as the servers. We've got two clients connected to one of the volumes that have been working fine the whole time.
These are the debug logs from one of the mounts as the client gets disconnected:
The message "D [MSGID: 0] [dht-common.c:979:dht_revalidate_cbk] 0-mule-dht: revalidate lookup of / returned with op_ret 0 [Structure needs cleaning]" repeated 26 times between [2017-05-31 13:48:51.680757] and [2017-05-31 13:50:46.325368]
/DAEMON/DEBUG [2017-05-31T15:50:50.589272+02:00] [] [] [logging.c:1830:gf_log_flush_timeout_cbk] 0-logging-infra: Log timer timed out. About to flush outstanding messages if present
/DAEMON/DEBUG [2017-05-31T15:50:50.589520+02:00] [] [] [logging.c:1792:__gf_log_inject_timer_event] 0-logging-infra: Starting timer now. Timeout = 120, current buf size = 5
[2017-05-31 13:50:51.908797] D [MSGID: 0] [dht-common.c:979:dht_revalidate_cbk] 0-mule-dht: revalidate lookup of / returned with op_ret 0 [Structure needs cleaning]
/DAEMON/DEBUG [2017-05-31T15:51:24.592190+02:00] [] [] [rpc-clnt-ping.c:300:rpc_clnt_start_ping] 0-mule-client-0: returning as transport is already disconnected OR there are no frames (0 || 0)
/DAEMON/DEBUG [2017-05-31T15:51:24.592469+02:00] [] [] [rpc-clnt-ping.c:300:rpc_clnt_start_ping] 0-mule-client-1: returning as transport is already disconnected OR there are no frames (0 || 0)
/DAEMON/DEBUG [2017-05-31T15:51:26.324867+02:00] [] [] [rpc-clnt-ping.c:98:rpc_clnt_remove_ping_timer_locked] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7f36b3260192] (--> /lib64/libgfrpc.so.0(rpc_clnt_remove_ping_timer_locked+0x8b)[0x7f36b302f9db] (--> /lib64/libgfrpc.so.0(+0x13fd4)[0x7f36b302ffd4] (--> /lib64/libgfrpc.so.0(rpc_clnt_submit+0x451)[0x7f36b302cf01] (--> /usr/lib64/glusterfs/3.7.20/xlator/protocol/client.so(client_submit_request+0x1fc)[0x7f36a599c33c] ))))) 0-: 10.3.48.179:49155: ping timer event already removed
/DAEMON/DEBUG [2017-05-31T15:51:26.325230+02:00] [] [] [rpc-clnt-ping.c:98:rpc_clnt_remove_ping_timer_locked] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7f36b3260192] (--> /lib64/libgfrpc.so.0(rpc_clnt_remove_ping_timer_locked+0x8b)[0x7f36b302f9db] (--> /lib64/libgfrpc.so.0(+0x13fd4)[0x7f36b302ffd4] (--> /lib64/libgfrpc.so.0(rpc_clnt_submit+0x451)[0x7f36b302cf01] (--> /usr/lib64/glusterfs/3.7.20/xlator/protocol/client.so(client_submit_request+0x1fc)[0x7f36a599c33c] ))))) 0-: 10.3.48.180:49155: ping timer event already removed
/DAEMON/DEBUG [2017-05-31T15:52:08.595536+02:00] [] [] [rpc-clnt-ping.c:300:rpc_clnt_start_ping] 0-mule-client-0: returning as transport is already disconnected OR there are no frames (0 || 0)
/DAEMON/DEBUG [2017-05-31T15:52:08.595735+02:00] [] [] [rpc-clnt-ping.c:300:rpc_clnt_start_ping] 0-mule-client-1: returning as transport is already disconnected OR there are no frames (0 || 0)
/DAEMON/DEBUG [2017-05-31T15:52:12.059895+02:00] [] [] [rpc-clnt-ping.c:98:rpc_clnt_remove_ping_timer_locked] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7f36b3260192] (--> /lib64/libgfrpc.so.0(rpc_clnt_remove_ping_timer_locked+0x8b)[0x7f36b302f9db] (--> /lib64/libgfrpc.so.0(+0x13fd4)[0x7f36b302ffd4] (--> /lib64/libgfrpc.so.0(rpc_clnt_submit+0x451)[0x7f36b302cf01] (--> /usr/lib64/glusterfs/3.7.20/xlator/protocol/client.so(client_submit_request+0x1fc)[0x7f36a599c33c] ))))) 0-: 10.3.48.179:49155: ping timer event already removed
/DAEMON/DEBUG [2017-05-31T15:52:12.060170+02:00] [] [] [rpc-clnt-ping.c:98:rpc_clnt_remove_ping_timer_locked] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7f36b3260192] (--> /lib64/libgfrpc.so.0(rpc_clnt_remove_ping_timer_locked+0x8b)[0x7f36b302f9db] (--> /lib64/libgfrpc.so.0(+0x13fd4)[0x7f36b302ffd4] (--> /lib64/libgfrpc.so.0(rpc_clnt_submit+0x451)[0x7f36b302cf01] (--> /usr/lib64/glusterfs/3.7.20/xlator/protocol/client.so(client_submit_request+0x1fc)[0x7f36a599c33c] ))))) 0-: 10.3.48.180:49155: ping timer event already removed
The message "D [MSGID: 0] [dht-common.c:979:dht_revalidate_cbk] 0-mule-dht: revalidate lookup of / returned with op_ret 0 [Structure needs cleaning]" repeated 26 times between [2017-05-31 13:50:51.908797] and [2017-05-31 13:52:46.326381]
/DAEMON/DEBUG [2017-05-31T15:52:50.598987+02:00] [] [] [logging.c:1830:gf_log_flush_timeout_cbk] 0-logging-infra: Log timer timed out. About to flush outstanding messages if present
/DAEMON/DEBUG [2017-05-31T15:52:50.599226+02:00] [] [] [logging.c:1792:__gf_log_inject_timer_event] 0-logging-infra: Starting timer now. Timeout = 120, current buf size = 5
[2017-05-31 13:52:52.138032] D [MSGID: 0] [dht-common.c:979:dht_revalidate_cbk] 0-mule-dht: revalidate lookup of / returned with op_ret 0 [Structure needs cleaning]
/DAEMON/DEBUG [2017-05-31T15:52:54.599435+02:00] [] [] [rpc-clnt-ping.c:300:rpc_clnt_start_ping] 0-mule-client-0: returning as transport is already disconnected OR there are no frames (0 || 0)
/DAEMON/DEBUG [2017-05-31T15:52:54.599718+02:00] [] [] [rpc-clnt-ping.c:300:rpc_clnt_start_ping] 0-mule-client-1: returning as transport is already disconnected OR there are no frames (0 || 0)
/DAEMON/DEBUG [2017-05-31T15:52:56.325482+02:00] [] [] [rpc-clnt-ping.c:98:rpc_clnt_remove_ping_timer_locked] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7f36b3260192] (--> /lib64/libgfrpc.so.0(rpc_clnt_remove_ping_timer_locked+0x8b)[0x7f36b302f9db] (--> /lib64/libgfrpc.so.0(+0x13fd4)[0x7f36b302ffd4] (--> /lib64/libgfrpc.so.0(rpc_clnt_submit+0x451)[0x7f36b302cf01] (--> /usr/lib64/glusterfs/3.7.20/xlator/protocol/client.so(client_submit_request+0x1fc)[0x7f36a599c33c] ))))) 0-: 10.3.48.179:49155: ping timer event already removed
/DAEMON/DEBUG [2017-05-31T15:52:56.325731+02:00] [] [] [rpc-clnt-ping.c:98:rpc_clnt_remove_ping_timer_locked] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7f36b3260192] (--> /lib64/libgfrpc.so.0(rpc_clnt_remove_ping_timer_locked+0x8b)[0x7f36b302f9db] (--> /lib64/libgfrpc.so.0(+0x13fd4)[0x7f36b302ffd4] (--> /lib64/libgfrpc.so.0(rpc_clnt_submit+0x451)[0x7f36b302cf01] (--> /usr/lib64/glusterfs/3.7.20/xlator/protocol/client.so(client_submit_request+0x1fc)[0x7f36a599c33c] ))))) 0-: 10.3.48.180:49155: ping timer event already removed
/DAEMON/DEBUG [2017-05-31T15:53:38.603305+02:00] [] [] [rpc-clnt-ping.c:300:rpc_clnt_start_ping] 0-mule-client-0: returning as transport is already disconnected OR there are no frames (0 || 0)
/DAEMON/DEBUG [2017-05-31T15:53:38.603533+02:00] [] [] [rpc-clnt-ping.c:300:rpc_clnt_start_ping] 0-mule-client-1: returning as transport is already disconnected OR there are no frames (0 || 0)
/DAEMON/DEBUG [2017-05-31T15:53:42.226123+02:00] [] [] [rpc-clnt-ping.c:98:rpc_clnt_remove_ping_timer_locked] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7f36b3260192] (--> /lib64/libgfrpc.so.0(rpc_clnt_remove_ping_timer_locked+0x8b)[0x7f36b302f9db] (--> /lib64/libgfrpc.so.0(+0x13fd4)[0x7f36b302ffd4] (--> /lib64/libgfrpc.so.0(rpc_clnt_submit+0x451)[0x7f36b302cf01] (--> /usr/lib64/glusterfs/3.7.20/xlator/protocol/client.so(client_submit_request+0x1fc)[0x7f36a599c33c] ))))) 0-: 10.3.48.179:49155: ping timer event already removed
/DAEMON/DEBUG [2017-05-31T15:53:42.226345+02:00] [] [] [rpc-clnt-ping.c:98:rpc_clnt_remove_ping_timer_locked] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7f36b3260192] (--> /lib64/libgfrpc.so.0(rpc_clnt_remove_ping_timer_locked+0x8b)[0x7f36b302f9db] (--> /lib64/libgfrpc.so.0(+0x13fd4)[0x7f36b302ffd4] (--> /lib64/libgfrpc.so.0(rpc_clnt_submit+0x451)[0x7f36b302cf01] (--> /usr/lib64/glusterfs/3.7.20/xlator/protocol/client.so(client_submit_request+0x1fc)[0x7f36a599c33c] ))))) 0-: 10.3.48.180:49155: ping timer event already removed
/DAEMON/DEBUG [2017-05-31T15:54:24.607225+02:00] [] [] [rpc-clnt-ping.c:300:rpc_clnt_start_ping] 0-mule-client-0: returning as transport is already disconnected OR there are no frames (0 || 0)
/DAEMON/DEBUG [2017-05-31T15:54:24.607479+02:00] [] [] [rpc-clnt-ping.c:300:rpc_clnt_start_ping] 0-mule-client-1: returning as transport is already disconnected OR there are no frames (0 || 0)
The message "D [MSGID: 0] [dht-common.c:979:dht_revalidate_cbk] 0-mule-dht: revalidate lookup of / returned with op_ret 0 [Structure needs cleaning]" repeated 20 times between [2017-05-31 13:52:52.138032] and [2017-05-31 13:54:22.302116]
[2017-05-31 13:54:25.284676] W [glusterfsd.c:1251:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7dc5) [0x7f36b20c6dc5] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x7f36b3740915] -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x7f36b374078b] ) 0-: received signum (15), shutting down
/DAEMON/DEBUG [2017-05-31T15:54:25.285804+02:00] [] [] [logging.c:1766:gf_log_flush_extra_msgs] 0-logging-infra: Log buffer size reduced. About to flush 1 extra log messages
/DAEMON/DEBUG [2017-05-31T15:54:25.286570+02:00] [] [] [logging.c:1769:gf_log_flush_extra_msgs] 0-logging-infra: Just flushed 1 extra log messages
/DAEMON/DEBUG [2017-05-31T15:54:25.287431+02:00] [] [] [glusterfsd-mgmt.c:2361:glusterfs_mgmt_pmap_signout] 0-fsd-mgmt: portmapper signout arguments not given
/DAEMON/INFO [2017-05-31T15:54:25.287785+02:00] [] [] [fuse-bridge.c:5720:fini] 0-fuse: Unmounting '/mnt/gluster/mule'.
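
A note on reading the excerpt above: it mixes two timestamp styles, plain `[2017-05-31 13:54:25.284676]` entries and `/DAEMON/...` entries with a `+02:00` offset. The following is only a minimal sketch for putting both on one timeline. It assumes the plain entries are in UTC, which is consistent with how the two styles line up here but is an assumption, and the function name `first_timestamp_utc` is made up for illustration.

```python
import re
from datetime import datetime, timezone

# Timestamp without an offset, e.g. [2017-05-31 13:54:25.284676]; assumed to be UTC.
PLAIN = re.compile(r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6})\]")
# Timestamp with an explicit offset, e.g. [2017-05-31T15:54:25.285804+02:00].
ISO = re.compile(r"\[(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{6}[+-]\d{2}:\d{2})\]")


def first_timestamp_utc(line):
    """Return the first timestamp found in a log line as an aware UTC datetime, or None."""
    m = ISO.search(line)
    if m:
        # fromisoformat understands the +02:00 offset (Python 3.7+).
        return datetime.fromisoformat(m.group(1)).astimezone(timezone.utc)
    m = PLAIN.search(line)
    if m:
        parsed = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
        return parsed.replace(tzinfo=timezone.utc)  # assumption: plain entries are UTC
    return None
```

Sorting the lines by this value should place the `received signum (15)` warning just before the log flush and unmount entries that follow it.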
Cheers
Gabbe
_______________________________________________ Gluster-users mailing list Gluster-users@xxxxxxxxxxx http://lists.gluster.org/mailman/listinfo/gluster-users