Hello,
We have a cluster with two nodes, "sg" and "br", which were running GlusterFS 9.1, installed via the Ubuntu package manager. We updated the Ubuntu packages on "sg" to version 9.6, and now have big problems. The "br" node is still on version 9.1.
Running "gluster volume status" on either host gives "Error : Request timed out". On "sg" not all processes are running, compared to "br", as below. Restarting the services on "sg" doesn't help. Can anyone advise how we should proceed? This is a production system.
root@sg:~# ps -ef | grep gluster
root 15196 1 0 22:37 ? 00:00:00 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
root 15426 1 0 22:39 ? 00:00:00 /usr/bin/python3 /usr/sbin/glustereventsd --pid-file /var/run/glustereventsd.pid
root 15457 15426 0 22:39 ? 00:00:00 /usr/bin/python3 /usr/sbin/glustereventsd --pid-file /var/run/glustereventsd.pid
root 19341 13695 0 23:24 pts/1 00:00:00 grep --color=auto gluster
root@br:~# ps -ef | grep gluster
root 2052 1 0 2022 ? 00:00:00 /usr/bin/python3 /usr/sbin/glustereventsd --pid-file /var/run/glustereventsd.pid
root 2062 1 3 2022 ? 10-11:57:16 /usr/sbin/glusterfs --fuse-mountopts=noatime --process-name fuse --volfile-server=br --volfile-server=sg --volfile-id=/gvol0 --fuse-mountopts=noatime /mnt/glusterfs
root 2379 2052 0 2022 ? 00:00:00 /usr/bin/python3 /usr/sbin/glustereventsd --pid-file /var/run/glustereventsd.pid
root 5884 1 5 2022 ? 18-16:08:53 /usr/sbin/glusterfsd -s br --volfile-id gvol0.br.nodirectwritedata-gluster-gvol0 -p /var/run/gluster/vols/gvol0/br-nodirectwritedata-gluster-gvol0.pid -S /var/run/gluster/61df1d4e1c65300e.socket --brick-name /nodirectwritedata/gluster/gvol0 -l /var/log/glusterfs/bricks/nodirectwritedata-gluster-gvol0.log --xlator-option *-posix.glusterd-uuid=11e528b0-8c69-4b5d-82ed-c41dd25536d6 --process-name brick --brick-port 49152 --xlator-option gvol0-server.listen-port=49152
root 10463 18747 0 23:24 pts/1 00:00:00 grep --color=auto gluster
root 27744 1 0 2022 ? 03:55:10 /usr/sbin/glusterfsd -s br --volfile-id gvol0.br.nodirectwritedata-gluster-gvol0 -p /var/run/gluster/vols/gvol0/br-nodirectwritedata-gluster-gvol0.pid -S /var/run/gluster/61df1d4e1c65300e.socket --brick-name /nodirectwritedata/gluster/gvol0 -l /var/log/glusterfs/bricks/nodirectwritedata-gluster-gvol0.log --xlator-option *-posix.glusterd-uuid=11e528b0-8c69-4b5d-82ed-c41dd25536d6 --process-name brick --brick-port 49153 --xlator-option gvol0-server.listen-port=49153
root 48227 1 0 Feb17 ? 00:00:26 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
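To make the difference between the two nodes explicit, here is a minimal sketch that reduces each "ps -ef" listing to the set of Gluster daemon names and compares them. The function name gluster_daemons is hypothetical, and the code assumes the default "ps -ef" column layout shown above (eight columns, with the command last).

```python
import os


def gluster_daemons(ps_output):
    """Return the set of gluster* program names found in `ps -ef` output."""
    names = set()
    for line in ps_output.splitlines():
        # Assumed layout: UID PID PPID C STIME TTY TIME CMD (CMD may contain spaces).
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue  # shell prompts, blank lines, truncated rows
        argv = fields[7].split()
        exe = os.path.basename(argv[0])
        # glustereventsd runs under python3, so the script path is the real name.
        if exe.startswith("python") and len(argv) > 1:
            exe = os.path.basename(argv[1])
        # This also drops the "grep --color=auto gluster" row itself.
        if exe.startswith("gluster"):
            names.add(exe)
    return names


def missing_on(node_output, reference_output):
    """Daemons present in the reference listing but absent on the node."""
    return sorted(gluster_daemons(reference_output) - gluster_daemons(node_output))
```

Calling missing_on() with the "sg" listing as the node and the "br" listing as the reference returns glusterfs (the FUSE mount) and glusterfsd (the brick process), which are the processes not running on "sg".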
On "sg" in glusterd.log we're seeing:
[2023-02-23 20:26:57.619318 +0000] E [rpc-clnt.c:181:call_bail] 0-management: bailing out frame type(glusterd mgmt v3), op(--(6)), xid = 0x11, unique = 27, sent = 2023-02-23 20:16:50.596447 +0000, timeout = 600 for 10.20.20.11:24007
[2023-02-23 20:26:57.619425 +0000] E [MSGID: 106115] [glusterd-mgmt.c:122:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on br. Please check log file for details.
[2023-02-23 20:26:57.619545 +0000] E [MSGID: 106151] [glusterd-syncop.c:1655:gd_unlock_op_phase] 0-management: Failed to unlock on some peer(s)
[2023-02-23 20:26:57.619693 +0000] W [glusterd-locks.c:817:glusterd_mgmt_v3_unlock] (-->/usr/lib/x86_64-linux-gnu/glusterfs/9.6/xlator/mgmt/glusterd.so(+0xe19b9) [0x7fadf47fa9b9] -->/usr/lib/x86_64-linux-gnu/glusterfs/9.6/xlator/mgmt/glusterd.so(+0xe0e20) [0x7fadf47f9e20] -->/usr/lib/x86_64-linux-gnu/glusterfs/9.6/xlator/mgmt/glusterd.so(+0xe7904) [0x7fadf4800904] ) 0-management: Lock owner mismatch. Lock for vol gvol0 held by 11e528b0-8c69-4b5d-82ed-c41dd25536d6
[2023-02-23 20:26:57.619780 +0000] E [MSGID: 106117] [glusterd-syncop.c:1679:gd_unlock_op_phase] 0-management: Unable to release lock for gvol0
[2023-02-23 20:26:57.619939 +0000] I [socket.c:3811:socket_submit_outgoing_msg] 0-socket.management: not connected (priv->connected = -1)
[2023-02-23 20:26:57.619969 +0000] E [rpcsvc.c:1567:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x3, Program: GlusterD svc cli, ProgVers: 2, Proc: 27) to rpc-transport (socket.management)
[2023-02-23 20:26:57.619995 +0000] E [MSGID: 106430] [glusterd-utils.c:678:glusterd_submit_reply] 0-glusterd: Reply submission failed
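For anyone wanting to triage a longer glusterd.log than the excerpt above, the following is a rough sketch that counts error and warning entries by the function that logged them. The regular expression is inferred only from the line format visible in this excerpt, and the names LOG_LINE and summarize_errors are hypothetical; it is not an official Gluster tool.

```python
import re
from collections import Counter

# Header of a log line as it appears in the excerpt above (an assumption
# based on these samples, not a format specification):
# [timestamp] LEVEL [MSGID: n] [file.c:line:function] domain: message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"(?:\[MSGID:\s*(?P<msgid>\d+)\]\s+)?"   # the MSGID field is optional
    r"\[(?P<src>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\]\s+"
    r"(?P<rest>.*)$"
)


def summarize_errors(lines, levels=("E", "W")):
    """Count log entries at the given levels, keyed by (level, function)."""
    counts = Counter()
    for raw in lines:
        match = LOG_LINE.match(raw.strip())
        if match and match.group("level") in levels:
            counts[(match.group("level"), match.group("func"))] += 1
    return counts
```

Passing an open log file to summarize_errors() and printing counts.most_common() shows which functions (for example gd_unlock_op_phase or call_bail) account for most of the errors.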
And in the brick log:
[2023-02-23 20:22:56.717721 +0000] I [addr.c:54:compare_addr_and_update] 0-/nodirectwritedata/gluster/gvol0: allowed = "*", received addr = "10.20.20.11"
[2023-02-23 20:22:56.717817 +0000] I [login.c:110:gf_auth] 0-auth/login: allowed user names: a26c7de4-1236-4e0a-944a-cb82de7f7f0e
[2023-02-23 20:22:56.717840 +0000] I [MSGID: 115029] [server-handshake.c:561:server_setvolume] 0-gvol0-server: accepted client from CTX_ID:46b23c19-5114-4a20-9306-9ea6faf02d51-GRAPH_ID:0-PID:35568-HOST:br.m5voip.com-PC_NAME:gvol0-client-0-RECON_NO:-0 (version: 9.1) with subvol /nodirectwritedata/gluster/gvol0
[2023-02-23 20:22:56.741545 +0000] W [socket.c:766:__socket_rwv] 0-tcp.gvol0-server: readv on 10.20.20.11:49144 failed (No data available)
[2023-02-23 20:22:56.741599 +0000] I [MSGID: 115036] [server.c:500:server_rpc_notify] 0-gvol0-server: disconnecting connection [{client-uid=CTX_ID:46b23c19-5114-4a20-9306-9ea6faf02d51-GRAPH_ID:0-PID:35568-HOST:br.m5voip.com-PC_NAME:gvol0-client-0-RECON_NO:-0}]
[2023-02-23 20:22:56.741866 +0000] I [MSGID: 101055] [client_t.c:397:gf_client_unref] 0-gvol0-server: Shutting down connection CTX_ID:46b23c19-5114-4a20-9306-9ea6faf02d51-GRAPH_ID:0-PID:35568-HOST:br.m5voip.com-PC_NAME:gvol0-client-0-RECON_NO:-0
Thanks for your help,
--
David Cunningham, Voisonics Limited
http://voisonics.com/
USA: +1 213 221 1092
New Zealand: +64 (0)28 2558 3782
________ Community Meeting Calendar: Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC Bridge: https://meet.google.com/cpu-eiue-hvk Gluster-users mailing list Gluster-users@xxxxxxxxxxx https://lists.gluster.org/mailman/listinfo/gluster-users