I have two production machines (web1, web2) that are currently using GlusterFS. I added two new machines, web3 and web4. Web1 and web2 are peered and running well. Web3 and web4 will peer with each other, but web1 and web2 will not peer with web3 or web4, and vice versa.
All machines are running Gluster 3.6.1:
ii glusterfs-client 3.6.1-1 amd64 clustered file-system (client package)
ii glusterfs-common 3.6.1-1 amd64 GlusterFS common libraries and translator modules
ii glusterfs-server 3.6.1-1 amd64 clustered file-system (server package)
I ran "gluster volume set all cluster.op-version 30600" on web2; all four machines now have "operating-version=30600" in /var/lib/glusterd/glusterd.info.
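One way to double-check the op-version on each machine is to parse the file rather than read it by eye. The following is only a sketch: it assumes glusterd.info consists of plain key=value lines, the helper name parse_glusterd_info is made up for illustration, and the UUID in the sample is invented.

```python
def parse_glusterd_info(text):
    """Parse key=value lines (as found in glusterd.info) into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and anything that is not a key=value pair.
        if not line or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key.strip()] = value.strip()
    return info


# Hypothetical file contents; the UUID is invented for illustration.
sample = (
    "UUID=00000000-0000-0000-0000-000000000000\n"
    "operating-version=30600\n"
)
info = parse_glusterd_info(sample)
print(info["operating-version"])
```

Running this against the real file on each of web1 through web4 and comparing the results would confirm that the value is identical everywhere.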
All machines have the same kernel:
Linux web1 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt2-1 (2014-12-08) x86_64 GNU/Linux
/etc/hosts is managed with Ansible; all machines resolve each other by name (web1, web2, web3, web4) to their internal IPs.
"hping3 -8 111,2049,24007,49152-49162 -S [web1-web4]" shows all Gluster ports open from all servers.
root@web1:~# gluster peer probe web2
peer probe: success. Host web2 port 24007 already in peer list
root@web1:~# gluster peer probe web3
peer probe: failed: Probe returned with unknown errno 107
root@web1:~# gluster peer probe web4
peer probe: failed: Probe returned with unknown errno 107
root@web3:~# gluster peer probe web1
peer probe: failed: Probe returned with unknown errno 107
root@web3:~# gluster peer probe web2
peer probe: failed: Probe returned with unknown errno 107
root@web3:~# gluster peer probe web4
peer probe: success. Host web4 port 24007 already in peer list
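A SYN probe only shows that a port answers. A full TCP connect to the glusterd management port (24007) from each machine is a slightly stronger check, because it completes the handshake. The sketch below is an assumption about how one might test that, not part of Gluster itself; port_open and check_hosts are hypothetical helper names, and the hostnames are the ones used in this post.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a full TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_hosts(hosts, port=24007):
    """Map each hostname to whether its port accepted a connection."""
    return {host: port_open(host, port) for host in hosts}


# Example usage (run on each of the four machines):
# print(check_hosts(["web1", "web2", "web3", "web4"]))
```

If every machine reports True for every other machine, plain TCP reachability of port 24007 is unlikely to be the cause of the failed probes.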
from /var/log/glusterfs/cli.log:
[2015-01-07 20:38:33.978304] T [cli.c:264:cli_rpc_notify] 0-glusterfs: got RPC_CLNT_CONNECT
[2015-01-07 20:38:33.978322] T [cli-quotad-client.c:94:cli_quotad_notify] 0-glusterfs: got RPC_CLNT_CONNECT
[2015-01-07 20:38:33.978333] I [socket.c:2344:socket_event_handler] 0-transport: disconnecting now
[2015-01-07 20:38:33.978354] T [cli-quotad-client.c:100:cli_quotad_notify] 0-glusterfs: got RPC_CLNT_DISCONNECT
[2015-01-07 20:38:33.978622] T [rpc-clnt.c:1381:rpc_clnt_record] 0-glusterfs: Auth Info: pid: 0, uid: 0, gid: 0, owner:
[2015-01-07 20:38:33.978681] T [rpc-clnt.c:1238:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen 144, payload: 80, rpc hdr: 64
[2015-01-07 20:38:33.979037] T [socket.c:2863:socket_connect] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1cd)[0x7f9e8ef0ac1d] (--> /usr/lib/x86_64-linux-gnu/glusterfs/3.6.1/rpc-transport/socket.so(+0x637c)[0x7f9e8bcc637c] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_submit+0x3a4)[0x7f9e8e40fad4] (--> gluster(cli_submit_request+0x15f)[0x7f9e8f3c52df] (--> gluster(cli_cmd_submit+0x8b)[0x7f9e8f3c6eeb] ))))) 0-glusterfs: connect () called on transport already connected
[2015-01-07 20:38:33.979100] T [rpc-clnt.c:1573:rpc_clnt_submit] 0-rpc-clnt: submitted request (XID: 0x1 Program: Gluster CLI, ProgVers: 2, Proc: 1) to rpc-transport (glusterfs)
[2015-01-07 20:38:33.979111] D [rpc-clnt-ping.c:231:rpc_clnt_start_ping] 0-glusterfs: ping timeout is 0, returning
[2015-01-07 20:38:33.986269] T [rpc-clnt.c:660:rpc_clnt_reply_init] 0-glusterfs: received rpc message (RPC XID: 0x1 Program: Gluster CLI, ProgVers: 2, Proc: 1) from rpc-transport (glusterfs)
[2015-01-07 20:38:33.986311] I [cli-rpc-ops.c:131:gf_cli_probe_cbk] 0-cli: Received resp to probe
[2015-01-07 20:38:33.986321] E [cli-rpc-ops.c:136:gf_cli_probe_cbk] 0-cli: Probe returned with unknown errno 107
[2015-01-07 20:38:33.986620] D [cli-cmd.c:384:cli_cmd_submit] 0-cli: Returning -1
[2015-01-07 20:38:33.986638] D [cli-rpc-ops.c:3021:gf_cli_probe] 0-cli: Returning -1
[2015-01-07 20:38:33.986650] D [cli-cmd-peer.c:96:cli_cmd_peer_probe_cbk] 0-cli: frame->local is not NULL (0x7f9e7c000980)
[2015-01-07 20:38:33.986667] I [input.c:36:cli_batch] 0-: Exiting with: -1
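The CLI reports errno 107 as "unknown", but on Linux that number is ENOTCONN ("Transport endpoint is not connected"). A quick way to translate a numeric errno on the machine in question is sketched below; describe_errno is a hypothetical helper, and the mapping of 107 to ENOTCONN is Linux-specific.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for a numeric errno on this platform."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


# 107 is the value from the failed probe; its meaning depends on the platform.
print(describe_errno(107))
```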
Thanks in advance for any help.