Hi Ravi and Joe,
The command "gluster volume status gvol0" shows all 3 nodes as being online, even on gfs3 as below. I've attached the glfsheal-gvol0.log, in which I can't see anything like a connection error. Would you have any further suggestions? Thank you.
[root@gfs3 glusterfs]# gluster volume status gvol0
Status of volume: gvol0
Gluster process                              TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick gfs1:/nodirectwritedata/gluster/gvol0  49152     0          Y       7706
Brick gfs2:/nodirectwritedata/gluster/gvol0  49152     0          Y       7625
Brick gfs3:/nodirectwritedata/gluster/gvol0  49152     0          Y       7307
Self-heal Daemon on localhost                N/A       N/A        Y       7316
Self-heal Daemon on gfs1                     N/A       N/A        Y       40591
Self-heal Daemon on gfs2                     N/A       N/A        Y       7634

Task Status of Volume gvol0
------------------------------------------------------------------------------
There are no active volume tasks
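For reference, this is roughly how I searched the attached log for connection problems (a simple grep over the usual error wording; adjust the pattern or path if your logs live elsewhere):

[root@gfs3 glusterfs]# grep -iE "disconnect|failed|refused|not connected" glfsheal-gvol0.log

which didn't turn up anything that looks like a connection error.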
On Wed, 29 May 2019 at 16:26, Ravishankar N <ravishankar@xxxxxxxxxx> wrote:
On 29/05/19 6:21 AM, David Cunningham wrote:
Hello all,
We are seeing a strange issue where a new node, gfs3, shows another node, gfs2, as not connected in the "gluster volume heal info" output:
[root@gfs3 bricks]# gluster volume heal gvol0 info
Brick gfs1:/nodirectwritedata/gluster/gvol0
Status: Connected
Number of entries: 0
Brick gfs2:/nodirectwritedata/gluster/gvol0
Status: Transport endpoint is not connected
Number of entries: -
Brick gfs3:/nodirectwritedata/gluster/gvol0
Status: Connected
Number of entries: 0
However, "gluster peer status" does show the same node as connected. Does anyone know why this would be?
[root@gfs3 bricks]# gluster peer status
Number of Peers: 2
Hostname: gfs2
Uuid: 91863102-23a8-43e1-b3d3-f0a1bd57f350
State: Peer in Cluster (Connected)
Hostname: gfs1
Uuid: 32c99e7d-71f2-421c-86fc-b87c0f68ad1b
State: Peer in Cluster (Connected)
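As I understand it, "gluster peer status" only reflects the glusterd-to-glusterd connections on TCP 24007, while "heal info" needs its own client connection to each brick port (49152 in our case), so the two can disagree. A quick probe of both ports on gfs2 from gfs3, using bash's built-in /dev/tcp redirection so no extra tools are needed:

[root@gfs3 ~]# timeout 3 bash -c '</dev/tcp/gfs2/24007' && echo glusterd port open
[root@gfs3 ~]# timeout 3 bash -c '</dev/tcp/gfs2/49152' && echo brick port open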
In nodirectwritedata-gluster-gvol0.log on gfs3 we see this logged with regard to gfs2:

You need to check glfsheal-$volname.log on the node where you ran the command and look for any connection-related errors.
-Ravi
[2019-05-29 00:17:50.646360] I [MSGID: 115029] [server-handshake.c:537:server_setvolume] 0-gvol0-server: accepted client from CTX_ID:30d74196-fece-4380-adc0-338760188b81-GRAPH_ID:0-PID:7718-HOST:gfs2.xxx.com-PC_NAME:gvol0-client-2-RECON_NO:-0 (version: 5.6)
[2019-05-29 00:17:50.761120] I [MSGID: 115036] [server.c:469:server_rpc_notify] 0-gvol0-server: disconnecting connection from CTX_ID:30d74196-fece-4380-adc0-338760188b81-GRAPH_ID:0-PID:7718-HOST:gfs2.xxx.com-PC_NAME:gvol0-client-2-RECON_NO:-0
[2019-05-29 00:17:50.761352] I [MSGID: 101055] [client_t.c:435:gf_client_unref] 0-gvol0-server: Shutting down connection CTX_ID:30d74196-fece-4380-adc0-338760188b81-GRAPH_ID:0-PID:7718-HOST:gfs2.xxx.com-PC_NAME:gvol0-client-2-RECON_NO:-0
Thanks in advance for any assistance.
--
David Cunningham, Voisonics Limited
http://voisonics.com/
USA: +1 213 221 1092
New Zealand: +64 (0)28 2558 3782
--
David Cunningham, Voisonics Limited
http://voisonics.com/
USA: +1 213 221 1092
New Zealand: +64 (0)28 2558 3782
[2019-05-29 08:44:11.435439] I [MSGID: 104045] [glfs-master.c:86:notify] 0-gfapi: New graph 67667333-2e74-656c-6562-726f61642e63 (0) coming up
[2019-05-29 08:44:11.435532] I [MSGID: 114020] [client.c:2358:notify] 0-gvol0-client-0: parent translators are ready, attempting connect on transport
[2019-05-29 08:44:11.441023] I [MSGID: 114020] [client.c:2358:notify] 0-gvol0-client-1: parent translators are ready, attempting connect on transport
[2019-05-29 08:44:11.444160] I [rpc-clnt.c:2042:rpc_clnt_reconfig] 0-gvol0-client-0: changing port to 49152 (from 0)
[2019-05-29 08:44:11.445679] I [MSGID: 114020] [client.c:2358:notify] 0-gvol0-client-2: parent translators are ready, attempting connect on transport
[2019-05-29 08:44:11.455002] I [MSGID: 114046] [client-handshake.c:1106:client_setvolume_cbk] 0-gvol0-client-0: Connected to gvol0-client-0, attached to remote volume '/nodirectwritedata/gluster/gvol0'.
Final graph:
+------------------------------------------------------------------------------+
  1: volume gvol0-client-0
  2:     type protocol/client
  3:     option opversion 50400
  4:     option clnt-lk-version 1
  5:     option volfile-checksum 0
  6:     option volfile-key gvol0
  7:     option client-version 5.6
  8:     option process-name gfapi.glfsheal
  9:     option process-uuid CTX_ID:bda2caab-106e-4097-9b8a-b2c66fbce168-GRAPH_ID:0-PID:14552-HOST:gfs3.example.com-PC_NAME:gvol0-client-0-RECON_NO:-0
[2019-05-29 08:44:11.455154] I [MSGID: 108005] [afr-common.c:5237:__afr_handle_child_up_event] 0-gvol0-replicate-0: Subvolume 'gvol0-client-0' came back up; going online.
 10:     option fops-version 1298437
 11:     option ping-timeout 42
 12:     option remote-host gfs1
 13:     option remote-subvolume /nodirectwritedata/gluster/gvol0
 14:     option transport-type socket
 15:     option transport.address-family inet
 16:     option username 59692084-f74d-498e-a4a7-949bcaa9d484
 17:     option password 4a76fa00-cc94-40fe-b1f0-1d43XXXXXXXX
 18:     option transport.tcp-user-timeout 0
 19:     option transport.socket.keepalive-time 20
 20:     option transport.socket.keepalive-interval 2
 21:     option transport.socket.keepalive-count 9
 22:     option send-gids true
 23: end-volume
 24:
 25: volume gvol0-client-1
 26:     type protocol/client
 27:     option ping-timeout 42
 28:     option remote-host gfs2
 29:     option remote-subvolume /nodirectwritedata/gluster/gvol0
 30:     option transport-type socket
 31:     option transport.address-family inet
 32:     option username 59692084-f74d-498e-a4a7-949bcaa9d484
 33:     option password 4a76fa00-cc94-40fe-b1f0-1d43XXXXXXXX
 34:     option transport.tcp-user-timeout 0
 35:     option transport.socket.keepalive-time 20
 36:     option transport.socket.keepalive-interval 2
 37:     option transport.socket.keepalive-count 9
 38:     option send-gids true
 39: end-volume
 40:
 41: volume gvol0-client-2
 42:     type protocol/client
 43:     option ping-timeout 42
 44:     option remote-host gfs3
 45:     option remote-subvolume /nodirectwritedata/gluster/gvol0
 46:     option transport-type socket
 47:     option transport.address-family inet
 48:     option username 59692084-f74d-498e-a4a7-949bcaa9d484
 49:     option password 4a76fa00-cc94-40fe-b1f0-1d43XXXXXXXX
 50:     option transport.tcp-user-timeout 0
 51:     option transport.socket.keepalive-time 20
 52:     option transport.socket.keepalive-interval 2
 53:     option transport.socket.keepalive-count 9
 54:     option send-gids true
 55: end-volume
 56:
 57: volume gvol0-replicate-0
 58:     type cluster/replicate
 59:     option background-self-heal-count 0
 60:     option afr-pending-xattr gvol0-client-0,gvol0-client-1,gvol0-client-2
 61:     option arbiter-count 1
 62:     option use-compound-fops off
 63:     subvolumes gvol0-client-0 gvol0-client-1 gvol0-client-2
 64: end-volume
 65:
 66: volume gvol0-dht
 67:     type cluster/distribute
 68:     option lock-migration off
 69:     option force-migration off
 70:     subvolumes gvol0-replicate-0
 71: end-volume
 72:
 73: volume gvol0-write-behind
 74:     type performance/write-behind
 75:     subvolumes gvol0-dht
 76: end-volume
 77:
 78: volume gvol0-read-ahead
 79:     type performance/read-ahead
 80:     subvolumes gvol0-write-behind
 81: end-volume
 82:
 83: volume gvol0-readdir-ahead
 84:     type performance/readdir-ahead
 85:     option parallel-readdir off
 86:     option rda-request-size 131072
 87:     option rda-cache-limit 10MB
 88:     subvolumes gvol0-read-ahead
 89: end-volume
 90:
 91: volume gvol0-io-cache
 92:     type performance/io-cache
 93:     subvolumes gvol0-readdir-ahead
 94: end-volume
 95:
 96: volume gvol0-quick-read
 97:     type performance/quick-read
 98:     subvolumes gvol0-io-cache
 99: end-volume
100:
101: volume gvol0-open-behind
102:     type performance/open-behind
103:     subvolumes gvol0-quick-read
104: end-volume
105:
106: volume gvol0-md-cache
107:     type performance/md-cache
108:     subvolumes gvol0-open-behind
109: end-volume
110:
111: volume gvol0
112:     type debug/io-stats
113:     option log-level INFO
114:     option latency-measurement off
115:     option count-fop-hits off
116:     subvolumes gvol0-md-cache
117: end-volume
118:
119: volume meta-autoload
120:     type meta
121:     subvolumes gvol0
122: end-volume
123:
+------------------------------------------------------------------------------+
[2019-05-29 08:44:11.456196] I [rpc-clnt.c:2042:rpc_clnt_reconfig] 0-gvol0-client-1: changing port to 49152 (from 0)
[2019-05-29 08:44:11.460861] I [rpc-clnt.c:2042:rpc_clnt_reconfig] 0-gvol0-client-2: changing port to 49152 (from 0)
[2019-05-29 08:44:11.466405] I [MSGID: 114046] [client-handshake.c:1106:client_setvolume_cbk] 0-gvol0-client-2: Connected to gvol0-client-2, attached to remote volume '/nodirectwritedata/gluster/gvol0'.
[2019-05-29 08:44:11.491737] I [MSGID: 108002] [afr-common.c:5588:afr_notify] 0-gvol0-replicate-0: Client-quorum is met
[2019-05-29 08:44:22.425447] I [MSGID: 104041] [glfs-resolve.c:954:__glfs_active_subvol] 0-gvol0: switched to graph 67667333-2e74-656c-6562-726f61642e63 (0)
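One thing I do notice in the log above: gvol0-client-1 (the gfs2 brick, per "option remote-host gfs2" in the graph) logs the port change to 49152 but never logs a "Connected to gvol0-client-1" line the way client-0 and client-2 do. A quick way to pull out all three legs for comparison (standard log path assumed):

[root@gfs3 glusterfs]# grep -E "gvol0-client-[0-2]" /var/log/glusterfs/glfsheal-gvol0.log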
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users